Analysis of float/double precision in 32 decimal digits - c++

From a .c file of another guy, I saw this:
const float c = 0.70710678118654752440084436210485f;
where he wants to avoid the computation of sqrt(1/2).
Can this be really stored somehow with plain C/C++? I mean without loosing precision. It seems impossible to me.
I am using C++, but I do not believe that precision difference between this two languages are too big (if any), that' why I did not test it.
So, I wrote these few lines, to have a look at the behaviour of the code:
std::cout << "Number: 0.70710678118654752440084436210485\n";
const float f = 0.70710678118654752440084436210485f;
std::cout << "float: " << std::setprecision(32) << f << std::endl;
const double d = 0.70710678118654752440084436210485; // no f extension
std::cout << "double: " << std::setprecision(32) << d << std::endl;
const double df = 0.70710678118654752440084436210485f;
std::cout << "doublef: " << std::setprecision(32) << df << std::endl;
const long double ld = 0.70710678118654752440084436210485;
std::cout << "l double: " << std::setprecision(32) << ld << std::endl;
const long double ldl = 0.70710678118654752440084436210485l; // l suffix!
std::cout << "l doublel: " << std::setprecision(32) << ldl << std::endl;
The output is this:
* ** ***
v v v
Number: 0.70710678118654752440084436210485 // 32 decimal digits
float: 0.707106769084930419921875 // 24 >> >>
double: 0.70710678118654757273731092936941
doublef: 0.707106769084930419921875 // same as float
l double: 0.70710678118654757273731092936941 // same as double
l doublel: 0.70710678118654752438189403651592 // suffix l
where * is the last accurate digit of float, ** the last accurate digit of double and *** the last accurate digit of long double.
The output of double has 32 decimal digits, since I have set the precision of std::cout at that value.
float output has 24, as expected, as said here:
float has 24 binary bits of precision, and double has 53.
I would expect the last output to be the same with the pre-last, i.e. that the f suffix would not prevent the number from becoming a double. I think that when I write this:
const double df = 0.70710678118654752440084436210485f;
what happens is that first the number becomes a float one and then stored as a double, so after the 24th decimal digits, it has zeroes and that's why the double precision stops there.
Am I correct?
From this answer I found some relevant information:
float x = 0 has an implicit typecast from int to float.
float x = 0.0f does not have such a typecast.
float x = 0.0 has an implicit typecast from double to float.
[EDIT]
About __float128, it is not standard, thus it's out of the competition. See more here.

From the standard:
There are three floating point types: float, double, and long double.
The type double provides at least as much precision as float, and the
type long double provides at least as much precision as double. The
set of values of the type float is a subset of the set of values of
the type double; the set of values of the type double is a subset of
the set of values of the type long double. The value representation of
floating-point types is implementation-defined.
So you can see your issue with this question: the standard doesn't actually say how precise floats are.
In terms of standard implementations, you need to look at IEEE754, which means the other two answers from Irineau and Davidmh are perfectly valid approaches to the problem.
As to suffix letters to indicate type, again looking at the standard:
The type of a floating literal is double unless explicitly specified by
a suffix. The suffixes f and F specify float, the suffixes l and L specify
long double.
So your attempt to create a long double will just have the same precision as the double literal you are assigning to it unless you use the L suffix.
I understand that some of these answers may not seem satisfactory, but there is a lot of background reading to be done on the relevant standards before you can dismiss answers. This answer is already longer than intended so I won't try and explain everything here.
And as a final note: Since the precision is not clearly defined, why not have a constant that's longer than it needs to be? Seems to make sense to always define a constant that is precise enough to always be representable regardless of type.

Python's numerical library, numpy, has a very convenient float info function. All the types are the equivalent to C:
For C's float:
print numpy.finfo(numpy.float32)
Machine parameters for float32
---------------------------------------------------------------------
precision= 6 resolution= 1.0000000e-06
machep= -23 eps= 1.1920929e-07
negep = -24 epsneg= 5.9604645e-08
minexp= -126 tiny= 1.1754944e-38
maxexp= 128 max= 3.4028235e+38
nexp = 8 min= -max
---------------------------------------------------------------------
For C's double:
print numpy.finfo(numpy.float64)
Machine parameters for float64
---------------------------------------------------------------------
precision= 15 resolution= 1.0000000000000001e-15
machep= -52 eps= 2.2204460492503131e-16
negep = -53 epsneg= 1.1102230246251565e-16
minexp= -1022 tiny= 2.2250738585072014e-308
maxexp= 1024 max= 1.7976931348623157e+308
nexp = 11 min= -max
---------------------------------------------------------------------
And for C's long float:
print numpy.finfo(numpy.float128)
Machine parameters for float128
---------------------------------------------------------------------
precision= 18 resolution= 1e-18
machep= -63 eps= 1.08420217249e-19
negep = -64 epsneg= 5.42101086243e-20
minexp=-16382 tiny= 3.36210314311e-4932
maxexp= 16384 max= 1.18973149536e+4932
nexp = 15 min= -max
---------------------------------------------------------------------
So, not even long float (128 bits) will give you the 32 digits you want. But, do you really need them all?

Some compilers have an implementation of the binary128 floating point format, normalized by IEEE 754-2008. Using gcc, for example, the type is __float128. That floating point format have about 34 decimal precision (log(2^113)/log(10)).
You can use the Boost Multiprecision library, to use their wrapper float128. That implementation will either use native types, if available, or use a drop-in replacement.
Let's extend your experiment with that new non-standard type __float128, with a recent g++ (4.8):
// Compiled with g++ -Wall -lquadmath essai.cpp
#include <iostream>
#include <iomanip>
#include <quadmath.h>
#include <sstream>
std::ostream& operator<<(std::ostream& out, __float128 f) {
char buf[200];
std::ostringstream format;
format << "%." << (std::min)(190L, out.precision()) << "Qf";
quadmath_snprintf(buf, 200, format.str().c_str(), f);
out << buf;
return out;
}
int main() {
std::cout.precision(32);
std::cout << "Number: 0.70710678118654752440084436210485\n";
const float f = 0.70710678118654752440084436210485f;
std::cout << "float: " << std::setprecision(32) << f << std::endl;
const double d = 0.70710678118654752440084436210485; // no f extension
std::cout << "double: " << std::setprecision(32) << d << std::endl;
const double df = 0.70710678118654752440084436210485f;
std::cout << "doublef: " << std::setprecision(32) << df << std::endl;
const long double ld = 0.70710678118654752440084436210485;
std::cout << "l double: " << std::setprecision(32) << ld << std::endl;
const long double ldl = 0.70710678118654752440084436210485l; // l suffix!
std::cout << "l doublel: " << std::setprecision(32) << ldl << std::endl;
const __float128 f128 = 0.70710678118654752440084436210485;
const __float128 f128f = 0.70710678118654752440084436210485f; // f suffix
const __float128 f128l = 0.70710678118654752440084436210485l; // l suffix
const __float128 f128q = 0.70710678118654752440084436210485q; // q suffix
std::cout << "f128: " << f128 << std::endl;
std::cout << "f f128: " << f128f << std::endl;
std::cout << "l f128: " << f128l << std::endl;
std::cout << "q f128: " << f128q << std::endl;
}
The output is:
* ** *** ****
v v v v
Number: 0.70710678118654752440084436210485
float: 0.707106769084930419921875
double: 0.70710678118654757273731092936941
doublef: 0.707106769084930419921875
l double: 0.70710678118654757273731092936941
l doublel: 0.70710678118654752438189403651592
f128: 0.70710678118654757273731092936941
f f128: 0.70710676908493041992187500000000
l f128: 0.70710678118654752438189403651592
q f128: 0.70710678118654752440084436210485
where * is the last accurate digit of float, ** the last accurate digit of
double, *** the last accurate digit of long double, and **** is the
last accurate digit of __float128.
As said by another answer, the C++ standard does not say what is the precision of the various floating point types (like it does not says what is the size of the integral types). It only specifies minimal precision/size of those types. But the norm IEEE754 does specify all that! The FPU of all lot of architectures does implement that norm IEEE745, and the recent versions of gcc implement the type binary128 of the norm with the extension __float128.
As for the explanation of your code, or mine, an expression like 0.70710678118654752440084436210485f is a floating-point literal. It has a type, that is defined by its suffix, here f for float. And thus the value of the literal correspond to the nearest value of the given type from the given number. That explains why, for example, the precision of "doublef" is the same as for "float", in your code. In recent gcc versions, there is an extension, that allows to define floating-point literals of type __float128, with the Q suffix (Quadruple-precision).

Related

How to read txt of floats to a 2d vector? [duplicate]

In my earlier question I was printing a double using cout that got rounded when I wasn't expecting it. How can I make cout print a double using full precision?
You can set the precision directly on std::cout and use the std::fixed format specifier.
double d = 3.14159265358979;
cout.precision(17);
cout << "Pi: " << fixed << d << endl;
You can #include <limits> to get the maximum precision of a float or double.
#include <limits>
typedef std::numeric_limits< double > dbl;
double d = 3.14159265358979;
cout.precision(dbl::max_digits10);
cout << "Pi: " << d << endl;
Use std::setprecision:
#include <iomanip>
std::cout << std::setprecision (15) << 3.14159265358979 << std::endl;
Here is what I would use:
std::cout << std::setprecision (std::numeric_limits<double>::digits10 + 1)
<< 3.14159265358979
<< std::endl;
Basically the limits package has traits for all the build in types.
One of the traits for floating point numbers (float/double/long double) is the digits10 attribute. This defines the accuracy (I forget the exact terminology) of a floating point number in base 10.
See: http://www.cplusplus.com/reference/std/limits/numeric_limits.html
For details about other attributes.
How do I print a double value with full precision using cout?
Use hexfloat or
use scientific and set the precision
std::cout.precision(std::numeric_limits<double>::max_digits10 - 1);
std::cout << std::scientific << 1.0/7.0 << '\n';
// C++11 Typical output
1.4285714285714285e-01
Too many answers address only one of 1) base 2) fixed/scientific layout or 3) precision. Too many answers with precision do not provide the proper value needed. Hence this answer to a old question.
What base?
A double is certainly encoded using base 2. A direct approach with C++11 is to print using std::hexfloat.
If a non-decimal output is acceptable, we are done.
std::cout << "hexfloat: " << std::hexfloat << exp (-100) << '\n';
std::cout << "hexfloat: " << std::hexfloat << exp (+100) << '\n';
// output
hexfloat: 0x1.a8c1f14e2af5dp-145
hexfloat: 0x1.3494a9b171bf5p+144
Otherwise: fixed or scientific?
A double is a floating point type, not fixed point.
Do not use std::fixed as that fails to print small double as anything but 0.000...000. For large double, it prints many digits, perhaps hundreds of questionable informativeness.
std::cout << "std::fixed: " << std::fixed << exp (-100) << '\n';
std::cout << "std::fixed: " << std::fixed << exp (+100) << '\n';
// output
std::fixed: 0.000000
std::fixed: 26881171418161356094253400435962903554686976.000000
To print with full precision, first use std::scientific which will "write floating-point values in scientific notation". Notice the default of 6 digits after the decimal point, an insufficient amount, is handled in the next point.
std::cout << "std::scientific: " << std::scientific << exp (-100) << '\n';
std::cout << "std::scientific: " << std::scientific << exp (+100) << '\n';
// output
std::scientific: 3.720076e-44
std::scientific: 2.688117e+43
How much precision (how many total digits)?
A double encoded using the binary base 2 encodes the same precision between various powers of 2. This is often 53 bits.
[1.0...2.0) there are 253 different double,
[2.0...4.0) there are 253 different double,
[4.0...8.0) there are 253 different double,
[8.0...10.0) there are 2/8 * 253 different double.
Yet if code prints in decimal with N significant digits, the number of combinations [1.0...10.0) is 9/10 * 10N.
Whatever N (precision) is chosen, there will not be a one-to-one mapping between double and decimal text. If a fixed N is chosen, sometimes it will be slightly more or less than truly needed for certain double values. We could error on too few (a) below) or too many (b) below).
3 candidate N:
a) Use an N so when converting from text-double-text we arrive at the same text for all double.
std::cout << dbl::digits10 << '\n';
// Typical output
15
b) Use an N so when converting from double-text-double we arrive at the same double for all double.
// C++11
std::cout << dbl::max_digits10 << '\n';
// Typical output
17
When max_digits10 is not available, note that due to base 2 and base 10 attributes, digits10 + 2 <= max_digits10 <= digits10 + 3, we can use digits10 + 3 to insure enough decimal digits are printed.
c) Use an N that varies with the value.
This can be useful when code wants to display minimal text (N == 1) or the exact value of a double (N == 1000-ish in the case of denorm_min). Yet since this is "work" and not likely OP's goal, it will be set aside.
It is usually b) that is used to "print a double value with full precision". Some applications may prefer a) to error on not providing too much information.
With .scientific, .precision() sets the number of digits to print after the decimal point, so 1 + .precision() digits are printed. Code needs max_digits10 total digits so .precision() is called with a max_digits10 - 1.
typedef std::numeric_limits< double > dbl;
std::cout.precision(dbl::max_digits10 - 1);
std::cout << std::scientific << exp (-100) << '\n';
std::cout << std::scientific << exp (+100) << '\n';
// Typical output
3.7200759760208361e-44
2.6881171418161356e+43
//2345678901234567 17 total digits
Similar C question
In C++20 you'll be able to use std::format to do this:
std::cout << std::format("{}", M_PI);
Output (assuming IEEE754 double):
3.141592653589793
The default floating-point format is the shortest decimal representation with a round-trip guarantee. The advantage of this method compared to the setprecision I/O manipulator is that it doesn't print unnecessary digits.
In the meantime you can use the {fmt} library, std::format is based on. {fmt} also provides the print function that makes this even easier and more efficient (godbolt):
fmt::print("{}", M_PI);
Disclaimer: I'm the author of {fmt} and C++20 std::format.
The iostreams way is kind of clunky. I prefer using boost::lexical_cast because it calculates the right precision for me. And it's fast, too.
#include <string>
#include <boost/lexical_cast.hpp>
using boost::lexical_cast;
using std::string;
double d = 3.14159265358979;
cout << "Pi: " << lexical_cast<string>(d) << endl;
Output:
Pi: 3.14159265358979
Here is how to display a double with full precision:
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::setprecision(precision) << d << std::endl;
This displays:
100.0000000000005
max_digits10 is the number of digits that are necessary to uniquely represent all distinct double values. max_digits10 represents the number of digits before and after the decimal point.
Don't use set_precision(max_digits10) with std::fixed.
On fixed notation, set_precision() sets the number of digits only after the decimal point. This is incorrect as max_digits10 represents the number of digits before and after the decimal point.
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::fixed << std::setprecision(precision) << d << std::endl;
This displays incorrect result:
100.00000000000049738
Note: Header files required
#include <iomanip>
#include <limits>
By full precision, I assume mean enough precision to show the best approximation to the intended value, but it should be pointed out that double is stored using base 2 representation and base 2 can't represent something as trivial as 1.1 exactly. The only way to get the full-full precision of the actual double (with NO ROUND OFF ERROR) is to print out the binary bits (or hex nybbles).
One way of doing that is using a union to type-pun the double to a integer and then printing the integer, since integers do not suffer from truncation or round-off issues. (Type punning like this is not supported by the C++ standard, but it is supported in C. However, most C++ compilers will probably print out the correct value anyways. I think g++ supports this.)
union {
double d;
uint64_t u64;
} x;
x.d = 1.1;
std::cout << std::hex << x.u64;
This will give you the 100% accurate precision of the double... and be utterly unreadable because humans can't read IEEE double format ! Wikipedia has a good write up on how to interpret the binary bits.
In newer C++, you can do
std::cout << std::hexfloat << 1.1;
C++20 std::format
This great new C++ library feature has the advantage of not affecting the state of std::cout as std::setprecision does:
#include <format>
#include <string>
int main() {
std::cout << std::format("{:.2} {:.3}\n", 3.1415, 3.1415);
}
Expected output:
3.14 3.142
As mentioned at https://stackoverflow.com/a/65329803/895245 if you don't pass the precision explicitly it prints the shortest decimal representation with a round-trip guarantee. TODO understand in more detail how it compares to: dbl::max_digits10 as shown at https://stackoverflow.com/a/554134/895245 with {:.{}}:
#include <format>
#include <limits>
#include <string>
int main() {
std::cout << std::format("{:.{}}\n",
3.1415926535897932384626433, dbl::max_digits10);
}
See also:
Set back default floating point print precision in C++ for how to restore the initial precision in pre-c++20
std::string formatting like sprintf
https://en.cppreference.com/w/cpp/utility/format/formatter#Standard_format_specification
IEEE 754 floating point values are stored using base 2 representation. Any base 2 number can be represented as a decimal (base 10) to full precision. None of the proposed answers, however, do. They all truncate the decimal value.
This seems to be due to a misinterpretation of what std::numeric_limits<T>::max_digits10 represents:
The value of std::numeric_limits<T>::max_digits10 is the number of base-10 digits that are necessary to uniquely represent all distinct values of the type T.
In other words: It's the (worst-case) number of digits required to output if you want to roundtrip from binary to decimal to binary, without losing any information. If you output at least max_digits10 decimals and reconstruct a floating point value, you are guaranteed to get the exact same binary representation you started with.
What's important: max_digits10 in general neither yields the shortest decimal, nor is it sufficient to represent the full precision. I'm not aware of a constant in the C++ Standard Library that encodes the maximum number of decimal digits required to contain the full precision of a floating point value. I believe it's something like 767 for doubles1. One way to output a floating point value with full precision would be to use a sufficiently large value for the precision, like so2, and have the library strip any trailing zeros:
#include <iostream>
int main() {
double d = 0.1;
std::cout.precision(767);
std::cout << "d = " << d << std::endl;
}
This produces the following output, that contains the full precision:
d = 0.1000000000000000055511151231257827021181583404541015625
Note that this has significantly more decimals than max_digits10 would suggest.
While that answers the question that was asked, a far more common goal would be to get the shortest decimal representation of any given floating point value, that retains all information. Again, I'm not aware of any way to instruct the Standard I/O library to output that value. Starting with C++17 the possibility to do that conversion has finally arrived in C++ in the form of std::to_chars. By default, it produces the shortest decimal representation of any given floating point value that retains the entire information.
Its interface is a bit clunky, and you'd probably want to wrap this up into a function template that returns something you can output to std::cout (like a std::string), e.g.
#include <charconv>
#include <array>
#include <string>
#include <system_error>
#include <iostream>
#include <cmath>
template<typename T>
std::string to_string(T value)
{
// 24 characters is the longest decimal representation of any double value
std::array<char, 24> buffer {};
auto const res { std::to_chars(buffer.data(), buffer.data() + buffer.size(), value) };
if (res.ec == std::errc {})
{
// Success
return std::string(buffer.data(), res.ptr);
}
// Error
return { "FAILED!" };
}
int main()
{
auto value { 0.1f };
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
}
This would print out (using Microsoft's C++ Standard Library):
0.1
0.10000001
0.10000002
1 From Stephan T. Lavavej's CppCon 2019 talk titled Floating-Point <charconv>: Making Your Code 10x Faster With C++17's Final Boss. (The entire talk is worth watching.)
2 This would also require using a combination of scientific and fixed, whichever is shorter. I'm not aware of a way to set this mode using the C++ Standard I/O library.
printf("%.12f", M_PI);
%.12f means floating point, with precision of 12 digits.
The best option is to use std::setprecision, and the solution works like this:
# include <iostream>
# include <iomanip>
int main()
{
double a = 34.34322;
std::cout<<std::fixed<<a<<std::setprecision(0)<<std::endl;
return 0;
}
Note: you do not need to use cout.setprecision to do it and I fill up 0 at std::setprecision because it must have a argument.
Most portably...
#include <limits>
using std::numeric_limits;
...
cout.precision(numeric_limits<double>::digits10 + 1);
cout << d;
In this question there is a description on how to convert a double to string losselessly (in Octave, but it can be easily reproduced in C++). De idea is to have a short human readable description of the float and a losseless description in hexa form, for instance: pi -> 3.14{54442d18400921fb}.
Here is a function that works for any floating-point type, not just double, and also puts the stream back the way it was found afterwards. Unfortunately it won't interact well with threads, but that's the nature of iostreams. You'll need these includes at the start of your file:
#include <limits>
#include <iostream>
Here's the function, you could it in a header file if you use it a lot:
template <class T>
void printVal(std::ostream& os, T val)
{
auto oldFlags = os.flags();
auto oldPrecision = os.precision();
os.flags(oldFlags & ~std::ios_base::floatfield);
os.precision(std::numeric_limits<T>::digits10);
os << val;
os.flags(oldFlags);
os.precision(oldPrecision);
}
Use it like this:
double d = foo();
float f = bar();
printVal(std::cout, d);
printVal(std::cout, f);
If you want to be able to use the normal insertion << operator, you can use this extra wrapper code:
template <class T>
struct PrintValWrapper { T val; };
template <class T>
std::ostream& operator<<(std::ostream& os, PrintValWrapper<T> pvw) {
printVal(os, pvw.val);
return os;
}
template <class T>
PrintValWrapper<T> printIt(T val) {
return PrintValWrapper<T>{val};
}
Now you can use it like this:
double d = foo();
float f = bar();
std::cout << "The values are: " << printIt(d) << ", " << printIt(f) << '\n';
This will show the value up to two decimal places after the dot.
#include <iostream>
#include <iomanip>
double d = 2.0;
int n = 2;
cout << fixed << setprecision(n) << d;
See here: Fixed-point notation
std::fixed
Use fixed floating-point notation Sets the floatfield format flag for
the str stream to fixed.
When floatfield is set to fixed, floating-point values are written
using fixed-point notation: the value is represented with exactly as
many digits in the decimal part as specified by the precision field
(precision) and with no exponent part.
std::setprecision
Set decimal precision Sets the decimal precision to be used to format
floating-point values on output operations.
If you're familiar with the IEEE standard for representing the floating-points, you would know that it is impossible to show floating-points with full-precision out of the scope of the standard, that is to say, it will always result in a rounding of the real value.
You need to first check whether the value is within the scope, if yes, then use:
cout << defaultfloat << d ;
std::defaultfloat
Use default floating-point notation Sets the floatfield format flag
for the str stream to defaultfloat.
When floatfield is set to defaultfloat, floating-point values are
written using the default notation: the representation uses as many
meaningful digits as needed up to the stream's decimal precision
(precision), counting both the digits before and after the decimal
point (if any).
That is also the default behavior of cout, which means you don't use it explicitly.
With ostream::precision(int)
cout.precision( numeric_limits<double>::digits10 + 1);
cout << M_PI << ", " << M_E << endl;
will yield
3.141592653589793, 2.718281828459045
Why you have to say "+1" I have no clue, but the extra digit you get out of it is correct.

Why does C++ automatically rounds float [duplicate]

In my earlier question I was printing a double using cout that got rounded when I wasn't expecting it. How can I make cout print a double using full precision?
You can set the precision directly on std::cout and use the std::fixed format specifier.
double d = 3.14159265358979;
cout.precision(17);
cout << "Pi: " << fixed << d << endl;
You can #include <limits> to get the maximum precision of a float or double.
#include <limits>
typedef std::numeric_limits< double > dbl;
double d = 3.14159265358979;
cout.precision(dbl::max_digits10);
cout << "Pi: " << d << endl;
Use std::setprecision:
#include <iomanip>
std::cout << std::setprecision (15) << 3.14159265358979 << std::endl;
Here is what I would use:
std::cout << std::setprecision (std::numeric_limits<double>::digits10 + 1)
<< 3.14159265358979
<< std::endl;
Basically the limits package has traits for all the build in types.
One of the traits for floating point numbers (float/double/long double) is the digits10 attribute. This defines the accuracy (I forget the exact terminology) of a floating point number in base 10.
See: http://www.cplusplus.com/reference/std/limits/numeric_limits.html
For details about other attributes.
How do I print a double value with full precision using cout?
Use hexfloat or
use scientific and set the precision
std::cout.precision(std::numeric_limits<double>::max_digits10 - 1);
std::cout << std::scientific << 1.0/7.0 << '\n';
// C++11 Typical output
1.4285714285714285e-01
Too many answers address only one of 1) base 2) fixed/scientific layout or 3) precision. Too many answers with precision do not provide the proper value needed. Hence this answer to a old question.
What base?
A double is certainly encoded using base 2. A direct approach with C++11 is to print using std::hexfloat.
If a non-decimal output is acceptable, we are done.
std::cout << "hexfloat: " << std::hexfloat << exp (-100) << '\n';
std::cout << "hexfloat: " << std::hexfloat << exp (+100) << '\n';
// output
hexfloat: 0x1.a8c1f14e2af5dp-145
hexfloat: 0x1.3494a9b171bf5p+144
Otherwise: fixed or scientific?
A double is a floating point type, not fixed point.
Do not use std::fixed as that fails to print small double as anything but 0.000...000. For large double, it prints many digits, perhaps hundreds of questionable informativeness.
std::cout << "std::fixed: " << std::fixed << exp (-100) << '\n';
std::cout << "std::fixed: " << std::fixed << exp (+100) << '\n';
// output
std::fixed: 0.000000
std::fixed: 26881171418161356094253400435962903554686976.000000
To print with full precision, first use std::scientific which will "write floating-point values in scientific notation". Notice the default of 6 digits after the decimal point, an insufficient amount, is handled in the next point.
std::cout << "std::scientific: " << std::scientific << exp (-100) << '\n';
std::cout << "std::scientific: " << std::scientific << exp (+100) << '\n';
// output
std::scientific: 3.720076e-44
std::scientific: 2.688117e+43
How much precision (how many total digits)?
A double encoded using the binary base 2 encodes the same precision between various powers of 2. This is often 53 bits.
[1.0...2.0) there are 253 different double,
[2.0...4.0) there are 253 different double,
[4.0...8.0) there are 253 different double,
[8.0...10.0) there are 2/8 * 253 different double.
Yet if code prints in decimal with N significant digits, the number of combinations [1.0...10.0) is 9/10 * 10N.
Whatever N (precision) is chosen, there will not be a one-to-one mapping between double and decimal text. If a fixed N is chosen, sometimes it will be slightly more or less than truly needed for certain double values. We could error on too few (a) below) or too many (b) below).
3 candidate N:
a) Use an N so when converting from text-double-text we arrive at the same text for all double.
std::cout << dbl::digits10 << '\n';
// Typical output
15
b) Use an N so when converting from double-text-double we arrive at the same double for all double.
// C++11
std::cout << dbl::max_digits10 << '\n';
// Typical output
17
When max_digits10 is not available, note that due to base 2 and base 10 attributes, digits10 + 2 <= max_digits10 <= digits10 + 3, we can use digits10 + 3 to insure enough decimal digits are printed.
c) Use an N that varies with the value.
This can be useful when code wants to display minimal text (N == 1) or the exact value of a double (N == 1000-ish in the case of denorm_min). Yet since this is "work" and not likely OP's goal, it will be set aside.
It is usually b) that is used to "print a double value with full precision". Some applications may prefer a) to error on not providing too much information.
With .scientific, .precision() sets the number of digits to print after the decimal point, so 1 + .precision() digits are printed. Code needs max_digits10 total digits so .precision() is called with a max_digits10 - 1.
typedef std::numeric_limits< double > dbl;
std::cout.precision(dbl::max_digits10 - 1);
std::cout << std::scientific << exp (-100) << '\n';
std::cout << std::scientific << exp (+100) << '\n';
// Typical output
3.7200759760208361e-44
2.6881171418161356e+43
//2345678901234567 17 total digits
Similar C question
In C++20 you'll be able to use std::format to do this:
std::cout << std::format("{}", M_PI);
Output (assuming IEEE754 double):
3.141592653589793
The default floating-point format is the shortest decimal representation with a round-trip guarantee. The advantage of this method compared to the setprecision I/O manipulator is that it doesn't print unnecessary digits.
In the meantime you can use the {fmt} library, std::format is based on. {fmt} also provides the print function that makes this even easier and more efficient (godbolt):
fmt::print("{}", M_PI);
Disclaimer: I'm the author of {fmt} and C++20 std::format.
The iostreams way is kind of clunky. I prefer using boost::lexical_cast because it calculates the right precision for me. And it's fast, too.
#include <string>
#include <boost/lexical_cast.hpp>
using boost::lexical_cast;
using std::string;
double d = 3.14159265358979;
cout << "Pi: " << lexical_cast<string>(d) << endl;
Output:
Pi: 3.14159265358979
Here is how to display a double with full precision:
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::setprecision(precision) << d << std::endl;
This displays:
100.0000000000005
max_digits10 is the number of digits that are necessary to uniquely represent all distinct double values. max_digits10 represents the number of digits before and after the decimal point.
Don't use set_precision(max_digits10) with std::fixed.
On fixed notation, set_precision() sets the number of digits only after the decimal point. This is incorrect as max_digits10 represents the number of digits before and after the decimal point.
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::fixed << std::setprecision(precision) << d << std::endl;
This displays incorrect result:
100.00000000000049738
Note: Header files required
#include <iomanip>
#include <limits>
By full precision, I assume mean enough precision to show the best approximation to the intended value, but it should be pointed out that double is stored using base 2 representation and base 2 can't represent something as trivial as 1.1 exactly. The only way to get the full-full precision of the actual double (with NO ROUND OFF ERROR) is to print out the binary bits (or hex nybbles).
One way of doing that is using a union to type-pun the double to a integer and then printing the integer, since integers do not suffer from truncation or round-off issues. (Type punning like this is not supported by the C++ standard, but it is supported in C. However, most C++ compilers will probably print out the correct value anyways. I think g++ supports this.)
union {
double d;
uint64_t u64;
} x;
x.d = 1.1;
std::cout << std::hex << x.u64;
This will give you the 100% accurate precision of the double... and be utterly unreadable because humans can't read IEEE double format ! Wikipedia has a good write up on how to interpret the binary bits.
In newer C++, you can do
std::cout << std::hexfloat << 1.1;
C++20 std::format
This great new C++ library feature has the advantage of not affecting the state of std::cout as std::setprecision does:
#include <format>
#include <string>
int main() {
std::cout << std::format("{:.2} {:.3}\n", 3.1415, 3.1415);
}
Expected output:
3.14 3.142
As mentioned at https://stackoverflow.com/a/65329803/895245 if you don't pass the precision explicitly it prints the shortest decimal representation with a round-trip guarantee. TODO understand in more detail how it compares to: dbl::max_digits10 as shown at https://stackoverflow.com/a/554134/895245 with {:.{}}:
#include <format>
#include <limits>
#include <string>
int main() {
std::cout << std::format("{:.{}}\n",
3.1415926535897932384626433, dbl::max_digits10);
}
See also:
Set back default floating point print precision in C++ for how to restore the initial precision in pre-c++20
std::string formatting like sprintf
https://en.cppreference.com/w/cpp/utility/format/formatter#Standard_format_specification
IEEE 754 floating point values are stored using base 2 representation. Any base 2 number can be represented as a decimal (base 10) to full precision. None of the proposed answers, however, do. They all truncate the decimal value.
This seems to be due to a misinterpretation of what std::numeric_limits<T>::max_digits10 represents:
The value of std::numeric_limits<T>::max_digits10 is the number of base-10 digits that are necessary to uniquely represent all distinct values of the type T.
In other words: It's the (worst-case) number of digits required to output if you want to roundtrip from binary to decimal to binary, without losing any information. If you output at least max_digits10 decimals and reconstruct a floating point value, you are guaranteed to get the exact same binary representation you started with.
What's important: max_digits10 in general neither yields the shortest decimal, nor is it sufficient to represent the full precision. I'm not aware of a constant in the C++ Standard Library that encodes the maximum number of decimal digits required to contain the full precision of a floating point value. I believe it's something like 767 for doubles1. One way to output a floating point value with full precision would be to use a sufficiently large value for the precision, like so2, and have the library strip any trailing zeros:
#include <iostream>
int main() {
double d = 0.1;
std::cout.precision(767);
std::cout << "d = " << d << std::endl;
}
This produces the following output, that contains the full precision:
d = 0.1000000000000000055511151231257827021181583404541015625
Note that this has significantly more decimals than max_digits10 would suggest.
While that answers the question that was asked, a far more common goal would be to get the shortest decimal representation of any given floating point value, that retains all information. Again, I'm not aware of any way to instruct the Standard I/O library to output that value. Starting with C++17 the possibility to do that conversion has finally arrived in C++ in the form of std::to_chars. By default, it produces the shortest decimal representation of any given floating point value that retains the entire information.
Its interface is a bit clunky, and you'd probably want to wrap this up into a function template that returns something you can output to std::cout (like a std::string), e.g.
#include <charconv>
#include <array>
#include <string>
#include <system_error>
#include <iostream>
#include <cmath>
template<typename T>
std::string to_string(T value)
{
// 24 characters is the longest decimal representation of any double value
std::array<char, 24> buffer {};
auto const res { std::to_chars(buffer.data(), buffer.data() + buffer.size(), value) };
if (res.ec == std::errc {})
{
// Success
return std::string(buffer.data(), res.ptr);
}
// Error
return { "FAILED!" };
}
int main()
{
auto value { 0.1f };
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
}
This would print out (using Microsoft's C++ Standard Library):
0.1
0.10000001
0.10000002
1 From Stephan T. Lavavej's CppCon 2019 talk titled Floating-Point <charconv>: Making Your Code 10x Faster With C++17's Final Boss. (The entire talk is worth watching.)
2 This would also require using a combination of scientific and fixed, whichever is shorter. I'm not aware of a way to set this mode using the C++ Standard I/O library.
printf("%.12f", M_PI);
%.12f means floating point, with precision of 12 digits.
The best option is to use std::setprecision, and the solution works like this:
# include <iostream>
# include <iomanip>
int main()
{
double a = 34.34322;
std::cout<<std::fixed<<a<<std::setprecision(0)<<std::endl;
return 0;
}
Note: you do not need to use cout.setprecision to do it and I fill up 0 at std::setprecision because it must have a argument.
Most portably...
#include <limits>
using std::numeric_limits;
...
cout.precision(numeric_limits<double>::digits10 + 1);
cout << d;
In this question there is a description on how to convert a double to string losselessly (in Octave, but it can be easily reproduced in C++). De idea is to have a short human readable description of the float and a losseless description in hexa form, for instance: pi -> 3.14{54442d18400921fb}.
Here is a function that works for any floating-point type, not just double, and also puts the stream back the way it was found afterwards. Unfortunately it won't interact well with threads, but that's the nature of iostreams. You'll need these includes at the start of your file:
#include <limits>
#include <iostream>
Here's the function, you could it in a header file if you use it a lot:
template <class T>
void printVal(std::ostream& os, T val)
{
auto oldFlags = os.flags();
auto oldPrecision = os.precision();
os.flags(oldFlags & ~std::ios_base::floatfield);
os.precision(std::numeric_limits<T>::digits10);
os << val;
os.flags(oldFlags);
os.precision(oldPrecision);
}
Use it like this:
double d = foo();
float f = bar();
printVal(std::cout, d);
printVal(std::cout, f);
If you want to be able to use the normal insertion << operator, you can use this extra wrapper code:
template <class T>
struct PrintValWrapper { T val; };
template <class T>
std::ostream& operator<<(std::ostream& os, PrintValWrapper<T> pvw) {
printVal(os, pvw.val);
return os;
}
template <class T>
PrintValWrapper<T> printIt(T val) {
return PrintValWrapper<T>{val};
}
Now you can use it like this:
double d = foo();
float f = bar();
std::cout << "The values are: " << printIt(d) << ", " << printIt(f) << '\n';
This will show the value up to two decimal places after the dot.
#include <iostream>
#include <iomanip>
double d = 2.0;
int n = 2;
cout << fixed << setprecision(n) << d;
See here: Fixed-point notation
std::fixed
Use fixed floating-point notation Sets the floatfield format flag for
the str stream to fixed.
When floatfield is set to fixed, floating-point values are written
using fixed-point notation: the value is represented with exactly as
many digits in the decimal part as specified by the precision field
(precision) and with no exponent part.
std::setprecision
Set decimal precision Sets the decimal precision to be used to format
floating-point values on output operations.
If you're familiar with the IEEE standard for representing the floating-points, you would know that it is impossible to show floating-points with full-precision out of the scope of the standard, that is to say, it will always result in a rounding of the real value.
You need to first check whether the value is within the scope, if yes, then use:
cout << defaultfloat << d ;
std::defaultfloat
Use default floating-point notation Sets the floatfield format flag
for the str stream to defaultfloat.
When floatfield is set to defaultfloat, floating-point values are
written using the default notation: the representation uses as many
meaningful digits as needed up to the stream's decimal precision
(precision), counting both the digits before and after the decimal
point (if any).
That is also the default behavior of cout, which means you don't use it explicitly.
With ostream::precision(int)
cout.precision( numeric_limits<double>::digits10 + 1);
cout << M_PI << ", " << M_E << endl;
will yield
3.141592653589793, 2.718281828459045
Why you have to say "+1" I have no clue, but the extra digit you get out of it is correct.

Double does not work properly when executed [duplicate]

In my earlier question I was printing a double using cout that got rounded when I wasn't expecting it. How can I make cout print a double using full precision?
You can set the precision directly on std::cout and use the std::fixed format specifier.
double d = 3.14159265358979;
cout.precision(17);
cout << "Pi: " << fixed << d << endl;
You can #include <limits> to get the maximum precision of a float or double.
#include <limits>
typedef std::numeric_limits< double > dbl;
double d = 3.14159265358979;
cout.precision(dbl::max_digits10);
cout << "Pi: " << d << endl;
Use std::setprecision:
#include <iomanip>
std::cout << std::setprecision (15) << 3.14159265358979 << std::endl;
Here is what I would use:
std::cout << std::setprecision (std::numeric_limits<double>::digits10 + 1)
<< 3.14159265358979
<< std::endl;
Basically the limits package has traits for all the build in types.
One of the traits for floating point numbers (float/double/long double) is the digits10 attribute. This defines the accuracy (I forget the exact terminology) of a floating point number in base 10.
See: http://www.cplusplus.com/reference/std/limits/numeric_limits.html
For details about other attributes.
How do I print a double value with full precision using cout?
Use hexfloat or
use scientific and set the precision
std::cout.precision(std::numeric_limits<double>::max_digits10 - 1);
std::cout << std::scientific << 1.0/7.0 << '\n';
// C++11 Typical output
1.4285714285714285e-01
Too many answers address only one of 1) base 2) fixed/scientific layout or 3) precision. Too many answers with precision do not provide the proper value needed. Hence this answer to a old question.
What base?
A double is certainly encoded using base 2. A direct approach with C++11 is to print using std::hexfloat.
If a non-decimal output is acceptable, we are done.
std::cout << "hexfloat: " << std::hexfloat << exp (-100) << '\n';
std::cout << "hexfloat: " << std::hexfloat << exp (+100) << '\n';
// output
hexfloat: 0x1.a8c1f14e2af5dp-145
hexfloat: 0x1.3494a9b171bf5p+144
Otherwise: fixed or scientific?
A double is a floating point type, not fixed point.
Do not use std::fixed as that fails to print small double as anything but 0.000...000. For large double, it prints many digits, perhaps hundreds of questionable informativeness.
std::cout << "std::fixed: " << std::fixed << exp (-100) << '\n';
std::cout << "std::fixed: " << std::fixed << exp (+100) << '\n';
// output
std::fixed: 0.000000
std::fixed: 26881171418161356094253400435962903554686976.000000
To print with full precision, first use std::scientific which will "write floating-point values in scientific notation". Notice the default of 6 digits after the decimal point, an insufficient amount, is handled in the next point.
std::cout << "std::scientific: " << std::scientific << exp (-100) << '\n';
std::cout << "std::scientific: " << std::scientific << exp (+100) << '\n';
// output
std::scientific: 3.720076e-44
std::scientific: 2.688117e+43
How much precision (how many total digits)?
A double encoded using the binary base 2 encodes the same precision between various powers of 2. This is often 53 bits.
[1.0...2.0) there are 253 different double,
[2.0...4.0) there are 253 different double,
[4.0...8.0) there are 253 different double,
[8.0...10.0) there are 2/8 * 253 different double.
Yet if code prints in decimal with N significant digits, the number of combinations [1.0...10.0) is 9/10 * 10N.
Whatever N (precision) is chosen, there will not be a one-to-one mapping between double and decimal text. If a fixed N is chosen, sometimes it will be slightly more or less than truly needed for certain double values. We could error on too few (a) below) or too many (b) below).
3 candidate N:
a) Use an N so when converting from text-double-text we arrive at the same text for all double.
std::cout << dbl::digits10 << '\n';
// Typical output
15
b) Use an N so when converting from double-text-double we arrive at the same double for all double.
// C++11
std::cout << dbl::max_digits10 << '\n';
// Typical output
17
When max_digits10 is not available, note that due to base 2 and base 10 attributes, digits10 + 2 <= max_digits10 <= digits10 + 3, we can use digits10 + 3 to insure enough decimal digits are printed.
c) Use an N that varies with the value.
This can be useful when code wants to display minimal text (N == 1) or the exact value of a double (N == 1000-ish in the case of denorm_min). Yet since this is "work" and not likely OP's goal, it will be set aside.
It is usually b) that is used to "print a double value with full precision". Some applications may prefer a) to error on not providing too much information.
With .scientific, .precision() sets the number of digits to print after the decimal point, so 1 + .precision() digits are printed. Code needs max_digits10 total digits so .precision() is called with a max_digits10 - 1.
typedef std::numeric_limits< double > dbl;
std::cout.precision(dbl::max_digits10 - 1);
std::cout << std::scientific << exp (-100) << '\n';
std::cout << std::scientific << exp (+100) << '\n';
// Typical output
3.7200759760208361e-44
2.6881171418161356e+43
//2345678901234567 17 total digits
Similar C question
In C++20 you'll be able to use std::format to do this:
std::cout << std::format("{}", M_PI);
Output (assuming IEEE754 double):
3.141592653589793
The default floating-point format is the shortest decimal representation with a round-trip guarantee. The advantage of this method compared to the setprecision I/O manipulator is that it doesn't print unnecessary digits.
In the meantime you can use the {fmt} library, std::format is based on. {fmt} also provides the print function that makes this even easier and more efficient (godbolt):
fmt::print("{}", M_PI);
Disclaimer: I'm the author of {fmt} and C++20 std::format.
The iostreams way is kind of clunky. I prefer using boost::lexical_cast because it calculates the right precision for me. And it's fast, too.
#include <string>
#include <boost/lexical_cast.hpp>
using boost::lexical_cast;
using std::string;
double d = 3.14159265358979;
cout << "Pi: " << lexical_cast<string>(d) << endl;
Output:
Pi: 3.14159265358979
Here is how to display a double with full precision:
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::setprecision(precision) << d << std::endl;
This displays:
100.0000000000005
max_digits10 is the number of digits that are necessary to uniquely represent all distinct double values. max_digits10 represents the number of digits before and after the decimal point.
Don't use set_precision(max_digits10) with std::fixed.
On fixed notation, set_precision() sets the number of digits only after the decimal point. This is incorrect as max_digits10 represents the number of digits before and after the decimal point.
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::fixed << std::setprecision(precision) << d << std::endl;
This displays incorrect result:
100.00000000000049738
Note: Header files required
#include <iomanip>
#include <limits>
By full precision, I assume mean enough precision to show the best approximation to the intended value, but it should be pointed out that double is stored using base 2 representation and base 2 can't represent something as trivial as 1.1 exactly. The only way to get the full-full precision of the actual double (with NO ROUND OFF ERROR) is to print out the binary bits (or hex nybbles).
One way of doing that is using a union to type-pun the double to a integer and then printing the integer, since integers do not suffer from truncation or round-off issues. (Type punning like this is not supported by the C++ standard, but it is supported in C. However, most C++ compilers will probably print out the correct value anyways. I think g++ supports this.)
union {
double d;
uint64_t u64;
} x;
x.d = 1.1;
std::cout << std::hex << x.u64;
This will give you the 100% accurate precision of the double... and be utterly unreadable because humans can't read IEEE double format ! Wikipedia has a good write up on how to interpret the binary bits.
In newer C++, you can do
std::cout << std::hexfloat << 1.1;
C++20 std::format
This great new C++ library feature has the advantage of not affecting the state of std::cout as std::setprecision does:
#include <format>
#include <string>
int main() {
std::cout << std::format("{:.2} {:.3}\n", 3.1415, 3.1415);
}
Expected output:
3.14 3.142
As mentioned at https://stackoverflow.com/a/65329803/895245 if you don't pass the precision explicitly it prints the shortest decimal representation with a round-trip guarantee. TODO understand in more detail how it compares to: dbl::max_digits10 as shown at https://stackoverflow.com/a/554134/895245 with {:.{}}:
#include <format>
#include <limits>
#include <string>
int main() {
std::cout << std::format("{:.{}}\n",
3.1415926535897932384626433, dbl::max_digits10);
}
See also:
Set back default floating point print precision in C++ for how to restore the initial precision in pre-c++20
std::string formatting like sprintf
https://en.cppreference.com/w/cpp/utility/format/formatter#Standard_format_specification
IEEE 754 floating point values are stored using base 2 representation. Any base 2 number can be represented as a decimal (base 10) to full precision. None of the proposed answers, however, do. They all truncate the decimal value.
This seems to be due to a misinterpretation of what std::numeric_limits<T>::max_digits10 represents:
The value of std::numeric_limits<T>::max_digits10 is the number of base-10 digits that are necessary to uniquely represent all distinct values of the type T.
In other words: It's the (worst-case) number of digits required to output if you want to roundtrip from binary to decimal to binary, without losing any information. If you output at least max_digits10 decimals and reconstruct a floating point value, you are guaranteed to get the exact same binary representation you started with.
What's important: max_digits10 in general neither yields the shortest decimal, nor is it sufficient to represent the full precision. I'm not aware of a constant in the C++ Standard Library that encodes the maximum number of decimal digits required to contain the full precision of a floating point value. I believe it's something like 767 for doubles1. One way to output a floating point value with full precision would be to use a sufficiently large value for the precision, like so2, and have the library strip any trailing zeros:
#include <iostream>
int main() {
double d = 0.1;
std::cout.precision(767);
std::cout << "d = " << d << std::endl;
}
This produces the following output, that contains the full precision:
d = 0.1000000000000000055511151231257827021181583404541015625
Note that this has significantly more decimals than max_digits10 would suggest.
While that answers the question that was asked, a far more common goal would be to get the shortest decimal representation of any given floating point value, that retains all information. Again, I'm not aware of any way to instruct the Standard I/O library to output that value. Starting with C++17 the possibility to do that conversion has finally arrived in C++ in the form of std::to_chars. By default, it produces the shortest decimal representation of any given floating point value that retains the entire information.
Its interface is a bit clunky, and you'd probably want to wrap this up into a function template that returns something you can output to std::cout (like a std::string), e.g.
#include <charconv>
#include <array>
#include <string>
#include <system_error>
#include <iostream>
#include <cmath>
template<typename T>
std::string to_string(T value)
{
// 24 characters is the longest decimal representation of any double value
std::array<char, 24> buffer {};
auto const res { std::to_chars(buffer.data(), buffer.data() + buffer.size(), value) };
if (res.ec == std::errc {})
{
// Success
return std::string(buffer.data(), res.ptr);
}
// Error
return { "FAILED!" };
}
int main()
{
auto value { 0.1f };
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
}
This would print out (using Microsoft's C++ Standard Library):
0.1
0.10000001
0.10000002
1 From Stephan T. Lavavej's CppCon 2019 talk titled Floating-Point <charconv>: Making Your Code 10x Faster With C++17's Final Boss. (The entire talk is worth watching.)
2 This would also require using a combination of scientific and fixed, whichever is shorter. I'm not aware of a way to set this mode using the C++ Standard I/O library.
printf("%.12f", M_PI);
%.12f means floating point, with precision of 12 digits.
The best option is to use std::setprecision, and the solution works like this:
# include <iostream>
# include <iomanip>
int main()
{
double a = 34.34322;
std::cout<<std::fixed<<a<<std::setprecision(0)<<std::endl;
return 0;
}
Note: you do not need to use cout.setprecision to do it and I fill up 0 at std::setprecision because it must have a argument.
Most portably...
#include <limits>
using std::numeric_limits;
...
cout.precision(numeric_limits<double>::digits10 + 1);
cout << d;
In this question there is a description on how to convert a double to string losselessly (in Octave, but it can be easily reproduced in C++). De idea is to have a short human readable description of the float and a losseless description in hexa form, for instance: pi -> 3.14{54442d18400921fb}.
Here is a function that works for any floating-point type, not just double, and also puts the stream back the way it was found afterwards. Unfortunately it won't interact well with threads, but that's the nature of iostreams. You'll need these includes at the start of your file:
#include <limits>
#include <iostream>
Here's the function, you could it in a header file if you use it a lot:
template <class T>
void printVal(std::ostream& os, T val)
{
auto oldFlags = os.flags();
auto oldPrecision = os.precision();
os.flags(oldFlags & ~std::ios_base::floatfield);
os.precision(std::numeric_limits<T>::digits10);
os << val;
os.flags(oldFlags);
os.precision(oldPrecision);
}
Use it like this:
double d = foo();
float f = bar();
printVal(std::cout, d);
printVal(std::cout, f);
If you want to be able to use the normal insertion << operator, you can use this extra wrapper code:
template <class T>
struct PrintValWrapper { T val; };
template <class T>
std::ostream& operator<<(std::ostream& os, PrintValWrapper<T> pvw) {
printVal(os, pvw.val);
return os;
}
template <class T>
PrintValWrapper<T> printIt(T val) {
return PrintValWrapper<T>{val};
}
Now you can use it like this:
double d = foo();
float f = bar();
std::cout << "The values are: " << printIt(d) << ", " << printIt(f) << '\n';
This will show the value up to two decimal places after the dot.
#include <iostream>
#include <iomanip>
double d = 2.0;
int n = 2;
cout << fixed << setprecision(n) << d;
See here: Fixed-point notation
std::fixed
Use fixed floating-point notation Sets the floatfield format flag for
the str stream to fixed.
When floatfield is set to fixed, floating-point values are written
using fixed-point notation: the value is represented with exactly as
many digits in the decimal part as specified by the precision field
(precision) and with no exponent part.
std::setprecision
Set decimal precision Sets the decimal precision to be used to format
floating-point values on output operations.
If you're familiar with the IEEE standard for representing the floating-points, you would know that it is impossible to show floating-points with full-precision out of the scope of the standard, that is to say, it will always result in a rounding of the real value.
You need to first check whether the value is within the scope, if yes, then use:
cout << defaultfloat << d ;
std::defaultfloat
Use default floating-point notation Sets the floatfield format flag
for the str stream to defaultfloat.
When floatfield is set to defaultfloat, floating-point values are
written using the default notation: the representation uses as many
meaningful digits as needed up to the stream's decimal precision
(precision), counting both the digits before and after the decimal
point (if any).
That is also the default behavior of cout, which means you don't use it explicitly.
With ostream::precision(int)
cout.precision( numeric_limits<double>::digits10 + 1);
cout << M_PI << ", " << M_E << endl;
will yield
3.141592653589793, 2.718281828459045
Why you have to say "+1" I have no clue, but the extra digit you get out of it is correct.

C++ float / double : write to / read from file "exact" values [duplicate]

In my earlier question I was printing a double using cout that got rounded when I wasn't expecting it. How can I make cout print a double using full precision?
You can set the precision directly on std::cout and use the std::fixed format specifier.
double d = 3.14159265358979;
cout.precision(17);
cout << "Pi: " << fixed << d << endl;
You can #include <limits> to get the maximum precision of a float or double.
#include <limits>
typedef std::numeric_limits< double > dbl;
double d = 3.14159265358979;
cout.precision(dbl::max_digits10);
cout << "Pi: " << d << endl;
Use std::setprecision:
#include <iomanip>
std::cout << std::setprecision (15) << 3.14159265358979 << std::endl;
Here is what I would use:
std::cout << std::setprecision (std::numeric_limits<double>::digits10 + 1)
<< 3.14159265358979
<< std::endl;
Basically the limits package has traits for all the build in types.
One of the traits for floating point numbers (float/double/long double) is the digits10 attribute. This defines the accuracy (I forget the exact terminology) of a floating point number in base 10.
See: http://www.cplusplus.com/reference/std/limits/numeric_limits.html
For details about other attributes.
How do I print a double value with full precision using cout?
Use hexfloat or
use scientific and set the precision
std::cout.precision(std::numeric_limits<double>::max_digits10 - 1);
std::cout << std::scientific << 1.0/7.0 << '\n';
// C++11 Typical output
1.4285714285714285e-01
Too many answers address only one of 1) base 2) fixed/scientific layout or 3) precision. Too many answers with precision do not provide the proper value needed. Hence this answer to a old question.
What base?
A double is certainly encoded using base 2. A direct approach with C++11 is to print using std::hexfloat.
If a non-decimal output is acceptable, we are done.
std::cout << "hexfloat: " << std::hexfloat << exp (-100) << '\n';
std::cout << "hexfloat: " << std::hexfloat << exp (+100) << '\n';
// output
hexfloat: 0x1.a8c1f14e2af5dp-145
hexfloat: 0x1.3494a9b171bf5p+144
Otherwise: fixed or scientific?
A double is a floating point type, not fixed point.
Do not use std::fixed as that fails to print small double as anything but 0.000...000. For large double, it prints many digits, perhaps hundreds of questionable informativeness.
std::cout << "std::fixed: " << std::fixed << exp (-100) << '\n';
std::cout << "std::fixed: " << std::fixed << exp (+100) << '\n';
// output
std::fixed: 0.000000
std::fixed: 26881171418161356094253400435962903554686976.000000
To print with full precision, first use std::scientific which will "write floating-point values in scientific notation". Notice the default of 6 digits after the decimal point, an insufficient amount, is handled in the next point.
std::cout << "std::scientific: " << std::scientific << exp (-100) << '\n';
std::cout << "std::scientific: " << std::scientific << exp (+100) << '\n';
// output
std::scientific: 3.720076e-44
std::scientific: 2.688117e+43
How much precision (how many total digits)?
A double encoded using the binary base 2 encodes the same precision between various powers of 2. This is often 53 bits.
[1.0...2.0) there are 253 different double,
[2.0...4.0) there are 253 different double,
[4.0...8.0) there are 253 different double,
[8.0...10.0) there are 2/8 * 253 different double.
Yet if code prints in decimal with N significant digits, the number of combinations [1.0...10.0) is 9/10 * 10N.
Whatever N (precision) is chosen, there will not be a one-to-one mapping between double and decimal text. If a fixed N is chosen, sometimes it will be slightly more or less than truly needed for certain double values. We could error on too few (a) below) or too many (b) below).
3 candidate N:
a) Use an N so when converting from text-double-text we arrive at the same text for all double.
std::cout << dbl::digits10 << '\n';
// Typical output
15
b) Use an N so when converting from double-text-double we arrive at the same double for all double.
// C++11
std::cout << dbl::max_digits10 << '\n';
// Typical output
17
When max_digits10 is not available, note that due to base 2 and base 10 attributes, digits10 + 2 <= max_digits10 <= digits10 + 3, we can use digits10 + 3 to insure enough decimal digits are printed.
c) Use an N that varies with the value.
This can be useful when code wants to display minimal text (N == 1) or the exact value of a double (N == 1000-ish in the case of denorm_min). Yet since this is "work" and not likely OP's goal, it will be set aside.
It is usually b) that is used to "print a double value with full precision". Some applications may prefer a) to error on not providing too much information.
With .scientific, .precision() sets the number of digits to print after the decimal point, so 1 + .precision() digits are printed. Code needs max_digits10 total digits so .precision() is called with a max_digits10 - 1.
typedef std::numeric_limits< double > dbl;
std::cout.precision(dbl::max_digits10 - 1);
std::cout << std::scientific << exp (-100) << '\n';
std::cout << std::scientific << exp (+100) << '\n';
// Typical output
3.7200759760208361e-44
2.6881171418161356e+43
//2345678901234567 17 total digits
Similar C question
In C++20 you'll be able to use std::format to do this:
std::cout << std::format("{}", M_PI);
Output (assuming IEEE754 double):
3.141592653589793
The default floating-point format is the shortest decimal representation with a round-trip guarantee. The advantage of this method compared to the setprecision I/O manipulator is that it doesn't print unnecessary digits.
In the meantime you can use the {fmt} library, std::format is based on. {fmt} also provides the print function that makes this even easier and more efficient (godbolt):
fmt::print("{}", M_PI);
Disclaimer: I'm the author of {fmt} and C++20 std::format.
The iostreams way is kind of clunky. I prefer using boost::lexical_cast because it calculates the right precision for me. And it's fast, too.
#include <string>
#include <boost/lexical_cast.hpp>
using boost::lexical_cast;
using std::string;
double d = 3.14159265358979;
cout << "Pi: " << lexical_cast<string>(d) << endl;
Output:
Pi: 3.14159265358979
Here is how to display a double with full precision:
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::setprecision(precision) << d << std::endl;
This displays:
100.0000000000005
max_digits10 is the number of digits that are necessary to uniquely represent all distinct double values. max_digits10 represents the number of digits before and after the decimal point.
Don't use set_precision(max_digits10) with std::fixed.
On fixed notation, set_precision() sets the number of digits only after the decimal point. This is incorrect as max_digits10 represents the number of digits before and after the decimal point.
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::fixed << std::setprecision(precision) << d << std::endl;
This displays incorrect result:
100.00000000000049738
Note: Header files required
#include <iomanip>
#include <limits>
By full precision, I assume mean enough precision to show the best approximation to the intended value, but it should be pointed out that double is stored using base 2 representation and base 2 can't represent something as trivial as 1.1 exactly. The only way to get the full-full precision of the actual double (with NO ROUND OFF ERROR) is to print out the binary bits (or hex nybbles).
One way of doing that is using a union to type-pun the double to a integer and then printing the integer, since integers do not suffer from truncation or round-off issues. (Type punning like this is not supported by the C++ standard, but it is supported in C. However, most C++ compilers will probably print out the correct value anyways. I think g++ supports this.)
union {
double d;
uint64_t u64;
} x;
x.d = 1.1;
std::cout << std::hex << x.u64;
This will give you the 100% accurate precision of the double... and be utterly unreadable because humans can't read IEEE double format ! Wikipedia has a good write up on how to interpret the binary bits.
In newer C++, you can do
std::cout << std::hexfloat << 1.1;
C++20 std::format
This great new C++ library feature has the advantage of not affecting the state of std::cout as std::setprecision does:
#include <format>
#include <string>
int main() {
std::cout << std::format("{:.2} {:.3}\n", 3.1415, 3.1415);
}
Expected output:
3.14 3.142
As mentioned at https://stackoverflow.com/a/65329803/895245 if you don't pass the precision explicitly it prints the shortest decimal representation with a round-trip guarantee. TODO understand in more detail how it compares to: dbl::max_digits10 as shown at https://stackoverflow.com/a/554134/895245 with {:.{}}:
#include <format>
#include <limits>
#include <string>
int main() {
std::cout << std::format("{:.{}}\n",
3.1415926535897932384626433, dbl::max_digits10);
}
See also:
Set back default floating point print precision in C++ for how to restore the initial precision in pre-c++20
std::string formatting like sprintf
https://en.cppreference.com/w/cpp/utility/format/formatter#Standard_format_specification
IEEE 754 floating point values are stored using base 2 representation. Any base 2 number can be represented as a decimal (base 10) to full precision. None of the proposed answers, however, do. They all truncate the decimal value.
This seems to be due to a misinterpretation of what std::numeric_limits<T>::max_digits10 represents:
The value of std::numeric_limits<T>::max_digits10 is the number of base-10 digits that are necessary to uniquely represent all distinct values of the type T.
In other words: It's the (worst-case) number of digits required to output if you want to roundtrip from binary to decimal to binary, without losing any information. If you output at least max_digits10 decimals and reconstruct a floating point value, you are guaranteed to get the exact same binary representation you started with.
What's important: max_digits10 in general neither yields the shortest decimal, nor is it sufficient to represent the full precision. I'm not aware of a constant in the C++ Standard Library that encodes the maximum number of decimal digits required to contain the full precision of a floating point value. I believe it's something like 767 for doubles1. One way to output a floating point value with full precision would be to use a sufficiently large value for the precision, like so2, and have the library strip any trailing zeros:
#include <iostream>
int main() {
double d = 0.1;
std::cout.precision(767);
std::cout << "d = " << d << std::endl;
}
This produces the following output, that contains the full precision:
d = 0.1000000000000000055511151231257827021181583404541015625
Note that this has significantly more decimals than max_digits10 would suggest.
While that answers the question that was asked, a far more common goal would be to get the shortest decimal representation of any given floating point value, that retains all information. Again, I'm not aware of any way to instruct the Standard I/O library to output that value. Starting with C++17 the possibility to do that conversion has finally arrived in C++ in the form of std::to_chars. By default, it produces the shortest decimal representation of any given floating point value that retains the entire information.
Its interface is a bit clunky, and you'd probably want to wrap this up into a function template that returns something you can output to std::cout (like a std::string), e.g.
#include <charconv>
#include <array>
#include <string>
#include <system_error>
#include <iostream>
#include <cmath>
template<typename T>
std::string to_string(T value)
{
// 24 characters is the longest decimal representation of any double value
std::array<char, 24> buffer {};
auto const res { std::to_chars(buffer.data(), buffer.data() + buffer.size(), value) };
if (res.ec == std::errc {})
{
// Success
return std::string(buffer.data(), res.ptr);
}
// Error
return { "FAILED!" };
}
int main()
{
auto value { 0.1f };
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
}
This would print out (using Microsoft's C++ Standard Library):
0.1
0.10000001
0.10000002
1 From Stephan T. Lavavej's CppCon 2019 talk titled Floating-Point <charconv>: Making Your Code 10x Faster With C++17's Final Boss. (The entire talk is worth watching.)
2 This would also require using a combination of scientific and fixed, whichever is shorter. I'm not aware of a way to set this mode using the C++ Standard I/O library.
printf("%.12f", M_PI);
%.12f means floating point, with precision of 12 digits.
The best option is to use std::setprecision, and the solution works like this:
# include <iostream>
# include <iomanip>
int main()
{
double a = 34.34322;
std::cout<<std::fixed<<a<<std::setprecision(0)<<std::endl;
return 0;
}
Note: you do not need to use cout.setprecision to do it and I fill up 0 at std::setprecision because it must have a argument.
Most portably...
#include <limits>
using std::numeric_limits;
...
cout.precision(numeric_limits<double>::digits10 + 1);
cout << d;
In this question there is a description on how to convert a double to string losselessly (in Octave, but it can be easily reproduced in C++). De idea is to have a short human readable description of the float and a losseless description in hexa form, for instance: pi -> 3.14{54442d18400921fb}.
Here is a function that works for any floating-point type, not just double, and also puts the stream back the way it was found afterwards. Unfortunately it won't interact well with threads, but that's the nature of iostreams. You'll need these includes at the start of your file:
#include <limits>
#include <iostream>
Here's the function, you could it in a header file if you use it a lot:
template <class T>
void printVal(std::ostream& os, T val)
{
auto oldFlags = os.flags();
auto oldPrecision = os.precision();
os.flags(oldFlags & ~std::ios_base::floatfield);
os.precision(std::numeric_limits<T>::digits10);
os << val;
os.flags(oldFlags);
os.precision(oldPrecision);
}
Use it like this:
double d = foo();
float f = bar();
printVal(std::cout, d);
printVal(std::cout, f);
If you want to be able to use the normal insertion << operator, you can use this extra wrapper code:
template <class T>
struct PrintValWrapper { T val; };
template <class T>
std::ostream& operator<<(std::ostream& os, PrintValWrapper<T> pvw) {
printVal(os, pvw.val);
return os;
}
template <class T>
PrintValWrapper<T> printIt(T val) {
return PrintValWrapper<T>{val};
}
Now you can use it like this:
double d = foo();
float f = bar();
std::cout << "The values are: " << printIt(d) << ", " << printIt(f) << '\n';
This will show the value up to two decimal places after the dot.
#include <iostream>
#include <iomanip>
double d = 2.0;
int n = 2;
cout << fixed << setprecision(n) << d;
See here: Fixed-point notation
std::fixed
Use fixed floating-point notation Sets the floatfield format flag for
the str stream to fixed.
When floatfield is set to fixed, floating-point values are written
using fixed-point notation: the value is represented with exactly as
many digits in the decimal part as specified by the precision field
(precision) and with no exponent part.
std::setprecision
Set decimal precision Sets the decimal precision to be used to format
floating-point values on output operations.
If you're familiar with the IEEE standard for representing the floating-points, you would know that it is impossible to show floating-points with full-precision out of the scope of the standard, that is to say, it will always result in a rounding of the real value.
You need to first check whether the value is within the scope, if yes, then use:
cout << defaultfloat << d ;
std::defaultfloat
Use default floating-point notation Sets the floatfield format flag
for the str stream to defaultfloat.
When floatfield is set to defaultfloat, floating-point values are
written using the default notation: the representation uses as many
meaningful digits as needed up to the stream's decimal precision
(precision), counting both the digits before and after the decimal
point (if any).
That is also the default behavior of cout, which means you don't use it explicitly.
With ostream::precision(int)
cout.precision( numeric_limits<double>::digits10 + 1);
cout << M_PI << ", " << M_E << endl;
will yield
3.141592653589793, 2.718281828459045
Why you have to say "+1" I have no clue, but the extra digit you get out of it is correct.

Show more more float precision in armadillo matrix [duplicate]

In my earlier question I was printing a double using cout that got rounded when I wasn't expecting it. How can I make cout print a double using full precision?
You can set the precision directly on std::cout and use the std::fixed format specifier.
double d = 3.14159265358979;
cout.precision(17);
cout << "Pi: " << fixed << d << endl;
You can #include <limits> to get the maximum precision of a float or double.
#include <limits>
typedef std::numeric_limits< double > dbl;
double d = 3.14159265358979;
cout.precision(dbl::max_digits10);
cout << "Pi: " << d << endl;
Use std::setprecision:
#include <iomanip>
std::cout << std::setprecision (15) << 3.14159265358979 << std::endl;
Here is what I would use:
std::cout << std::setprecision (std::numeric_limits<double>::digits10 + 1)
<< 3.14159265358979
<< std::endl;
Basically the limits package has traits for all the build in types.
One of the traits for floating point numbers (float/double/long double) is the digits10 attribute. This defines the accuracy (I forget the exact terminology) of a floating point number in base 10.
See: http://www.cplusplus.com/reference/std/limits/numeric_limits.html
For details about other attributes.
How do I print a double value with full precision using cout?
Use hexfloat or
use scientific and set the precision
std::cout.precision(std::numeric_limits<double>::max_digits10 - 1);
std::cout << std::scientific << 1.0/7.0 << '\n';
// C++11 Typical output
1.4285714285714285e-01
Too many answers address only one of 1) base 2) fixed/scientific layout or 3) precision. Too many answers with precision do not provide the proper value needed. Hence this answer to a old question.
What base?
A double is certainly encoded using base 2. A direct approach with C++11 is to print using std::hexfloat.
If a non-decimal output is acceptable, we are done.
std::cout << "hexfloat: " << std::hexfloat << exp (-100) << '\n';
std::cout << "hexfloat: " << std::hexfloat << exp (+100) << '\n';
// output
hexfloat: 0x1.a8c1f14e2af5dp-145
hexfloat: 0x1.3494a9b171bf5p+144
Otherwise: fixed or scientific?
A double is a floating point type, not fixed point.
Do not use std::fixed as that fails to print small double as anything but 0.000...000. For large double, it prints many digits, perhaps hundreds of questionable informativeness.
std::cout << "std::fixed: " << std::fixed << exp (-100) << '\n';
std::cout << "std::fixed: " << std::fixed << exp (+100) << '\n';
// output
std::fixed: 0.000000
std::fixed: 26881171418161356094253400435962903554686976.000000
To print with full precision, first use std::scientific which will "write floating-point values in scientific notation". Notice the default of 6 digits after the decimal point, an insufficient amount, is handled in the next point.
std::cout << "std::scientific: " << std::scientific << exp (-100) << '\n';
std::cout << "std::scientific: " << std::scientific << exp (+100) << '\n';
// output
std::scientific: 3.720076e-44
std::scientific: 2.688117e+43
How much precision (how many total digits)?
A double encoded using the binary base 2 encodes the same precision between various powers of 2. This is often 53 bits.
[1.0...2.0) there are 253 different double,
[2.0...4.0) there are 253 different double,
[4.0...8.0) there are 253 different double,
[8.0...10.0) there are 2/8 * 253 different double.
Yet if code prints in decimal with N significant digits, the number of combinations [1.0...10.0) is 9/10 * 10N.
Whatever N (precision) is chosen, there will not be a one-to-one mapping between double and decimal text. If a fixed N is chosen, sometimes it will be slightly more or less than truly needed for certain double values. We could error on too few (a) below) or too many (b) below).
3 candidate N:
a) Use an N so when converting from text-double-text we arrive at the same text for all double.
std::cout << dbl::digits10 << '\n';
// Typical output
15
b) Use an N so when converting from double-text-double we arrive at the same double for all double.
// C++11
std::cout << dbl::max_digits10 << '\n';
// Typical output
17
When max_digits10 is not available, note that due to base 2 and base 10 attributes, digits10 + 2 <= max_digits10 <= digits10 + 3, we can use digits10 + 3 to insure enough decimal digits are printed.
c) Use an N that varies with the value.
This can be useful when code wants to display minimal text (N == 1) or the exact value of a double (N == 1000-ish in the case of denorm_min). Yet since this is "work" and not likely OP's goal, it will be set aside.
It is usually b) that is used to "print a double value with full precision". Some applications may prefer a) to error on not providing too much information.
With .scientific, .precision() sets the number of digits to print after the decimal point, so 1 + .precision() digits are printed. Code needs max_digits10 total digits so .precision() is called with a max_digits10 - 1.
typedef std::numeric_limits< double > dbl;
std::cout.precision(dbl::max_digits10 - 1);
std::cout << std::scientific << exp (-100) << '\n';
std::cout << std::scientific << exp (+100) << '\n';
// Typical output
3.7200759760208361e-44
2.6881171418161356e+43
//2345678901234567 17 total digits
Similar C question
In C++20 you'll be able to use std::format to do this:
std::cout << std::format("{}", M_PI);
Output (assuming IEEE754 double):
3.141592653589793
The default floating-point format is the shortest decimal representation with a round-trip guarantee. The advantage of this method compared to the setprecision I/O manipulator is that it doesn't print unnecessary digits.
In the meantime you can use the {fmt} library, std::format is based on. {fmt} also provides the print function that makes this even easier and more efficient (godbolt):
fmt::print("{}", M_PI);
Disclaimer: I'm the author of {fmt} and C++20 std::format.
The iostreams way is kind of clunky. I prefer using boost::lexical_cast because it calculates the right precision for me. And it's fast, too.
#include <string>
#include <boost/lexical_cast.hpp>
using boost::lexical_cast;
using std::string;
double d = 3.14159265358979;
cout << "Pi: " << lexical_cast<string>(d) << endl;
Output:
Pi: 3.14159265358979
Here is how to display a double with full precision:
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::setprecision(precision) << d << std::endl;
This displays:
100.0000000000005
max_digits10 is the number of digits that are necessary to uniquely represent all distinct double values. max_digits10 represents the number of digits before and after the decimal point.
Don't use set_precision(max_digits10) with std::fixed.
On fixed notation, set_precision() sets the number of digits only after the decimal point. This is incorrect as max_digits10 represents the number of digits before and after the decimal point.
double d = 100.0000000000005;
int precision = std::numeric_limits<double>::max_digits10;
std::cout << std::fixed << std::setprecision(precision) << d << std::endl;
This displays incorrect result:
100.00000000000049738
Note: Header files required
#include <iomanip>
#include <limits>
By full precision, I assume mean enough precision to show the best approximation to the intended value, but it should be pointed out that double is stored using base 2 representation and base 2 can't represent something as trivial as 1.1 exactly. The only way to get the full-full precision of the actual double (with NO ROUND OFF ERROR) is to print out the binary bits (or hex nybbles).
One way of doing that is using a union to type-pun the double to a integer and then printing the integer, since integers do not suffer from truncation or round-off issues. (Type punning like this is not supported by the C++ standard, but it is supported in C. However, most C++ compilers will probably print out the correct value anyways. I think g++ supports this.)
union {
double d;
uint64_t u64;
} x;
x.d = 1.1;
std::cout << std::hex << x.u64;
This will give you the 100% accurate precision of the double... and be utterly unreadable because humans can't read IEEE double format ! Wikipedia has a good write up on how to interpret the binary bits.
In newer C++, you can do
std::cout << std::hexfloat << 1.1;
C++20 std::format
This great new C++ library feature has the advantage of not affecting the state of std::cout as std::setprecision does:
#include <format>
#include <string>
int main() {
std::cout << std::format("{:.2} {:.3}\n", 3.1415, 3.1415);
}
Expected output:
3.14 3.142
As mentioned at https://stackoverflow.com/a/65329803/895245 if you don't pass the precision explicitly it prints the shortest decimal representation with a round-trip guarantee. TODO understand in more detail how it compares to: dbl::max_digits10 as shown at https://stackoverflow.com/a/554134/895245 with {:.{}}:
#include <format>
#include <limits>
#include <string>
int main() {
std::cout << std::format("{:.{}}\n",
3.1415926535897932384626433, dbl::max_digits10);
}
See also:
Set back default floating point print precision in C++ for how to restore the initial precision in pre-c++20
std::string formatting like sprintf
https://en.cppreference.com/w/cpp/utility/format/formatter#Standard_format_specification
IEEE 754 floating point values are stored using base 2 representation. Any base 2 number can be represented as a decimal (base 10) to full precision. None of the proposed answers, however, do. They all truncate the decimal value.
This seems to be due to a misinterpretation of what std::numeric_limits<T>::max_digits10 represents:
The value of std::numeric_limits<T>::max_digits10 is the number of base-10 digits that are necessary to uniquely represent all distinct values of the type T.
In other words: It's the (worst-case) number of digits required to output if you want to roundtrip from binary to decimal to binary, without losing any information. If you output at least max_digits10 decimals and reconstruct a floating point value, you are guaranteed to get the exact same binary representation you started with.
What's important: max_digits10 in general neither yields the shortest decimal, nor is it sufficient to represent the full precision. I'm not aware of a constant in the C++ Standard Library that encodes the maximum number of decimal digits required to contain the full precision of a floating point value. I believe it's something like 767 for doubles1. One way to output a floating point value with full precision would be to use a sufficiently large value for the precision, like so2, and have the library strip any trailing zeros:
#include <iostream>
int main() {
double d = 0.1;
std::cout.precision(767);
std::cout << "d = " << d << std::endl;
}
This produces the following output, that contains the full precision:
d = 0.1000000000000000055511151231257827021181583404541015625
Note that this has significantly more decimals than max_digits10 would suggest.
While that answers the question that was asked, a far more common goal would be to get the shortest decimal representation of any given floating point value, that retains all information. Again, I'm not aware of any way to instruct the Standard I/O library to output that value. Starting with C++17 the possibility to do that conversion has finally arrived in C++ in the form of std::to_chars. By default, it produces the shortest decimal representation of any given floating point value that retains the entire information.
Its interface is a bit clunky, and you'd probably want to wrap this up into a function template that returns something you can output to std::cout (like a std::string), e.g.
#include <charconv>
#include <array>
#include <string>
#include <system_error>
#include <iostream>
#include <cmath>
template<typename T>
std::string to_string(T value)
{
// 24 characters is the longest decimal representation of any double value
std::array<char, 24> buffer {};
auto const res { std::to_chars(buffer.data(), buffer.data() + buffer.size(), value) };
if (res.ec == std::errc {})
{
// Success
return std::string(buffer.data(), res.ptr);
}
// Error
return { "FAILED!" };
}
int main()
{
auto value { 0.1f };
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
value = std::nextafter(value, INFINITY);
std::cout << to_string(value) << std::endl;
}
This would print out (using Microsoft's C++ Standard Library):
0.1
0.10000001
0.10000002
1 From Stephan T. Lavavej's CppCon 2019 talk titled Floating-Point <charconv>: Making Your Code 10x Faster With C++17's Final Boss. (The entire talk is worth watching.)
2 This would also require using a combination of scientific and fixed, whichever is shorter. I'm not aware of a way to set this mode using the C++ Standard I/O library.
printf("%.12f", M_PI);
%.12f means floating point, with precision of 12 digits.
The best option is to use std::setprecision, and the solution works like this:
# include <iostream>
# include <iomanip>
int main()
{
double a = 34.34322;
std::cout<<std::fixed<<a<<std::setprecision(0)<<std::endl;
return 0;
}
Note: you do not need to use cout.setprecision to do it and I fill up 0 at std::setprecision because it must have a argument.
Most portably...
#include <limits>
using std::numeric_limits;
...
cout.precision(numeric_limits<double>::digits10 + 1);
cout << d;
In this question there is a description on how to convert a double to string losselessly (in Octave, but it can be easily reproduced in C++). De idea is to have a short human readable description of the float and a losseless description in hexa form, for instance: pi -> 3.14{54442d18400921fb}.
Here is a function that works for any floating-point type, not just double, and also puts the stream back the way it was found afterwards. Unfortunately it won't interact well with threads, but that's the nature of iostreams. You'll need these includes at the start of your file:
#include <limits>
#include <iostream>
Here's the function, you could it in a header file if you use it a lot:
template <class T>
void printVal(std::ostream& os, T val)
{
auto oldFlags = os.flags();
auto oldPrecision = os.precision();
os.flags(oldFlags & ~std::ios_base::floatfield);
os.precision(std::numeric_limits<T>::digits10);
os << val;
os.flags(oldFlags);
os.precision(oldPrecision);
}
Use it like this:
double d = foo();
float f = bar();
printVal(std::cout, d);
printVal(std::cout, f);
If you want to be able to use the normal insertion << operator, you can use this extra wrapper code:
template <class T>
struct PrintValWrapper { T val; };
template <class T>
std::ostream& operator<<(std::ostream& os, PrintValWrapper<T> pvw) {
printVal(os, pvw.val);
return os;
}
template <class T>
PrintValWrapper<T> printIt(T val) {
return PrintValWrapper<T>{val};
}
Now you can use it like this:
double d = foo();
float f = bar();
std::cout << "The values are: " << printIt(d) << ", " << printIt(f) << '\n';
This will show the value up to two decimal places after the dot.
#include <iostream>
#include <iomanip>
double d = 2.0;
int n = 2;
cout << fixed << setprecision(n) << d;
See here: Fixed-point notation
std::fixed
Use fixed floating-point notation Sets the floatfield format flag for
the str stream to fixed.
When floatfield is set to fixed, floating-point values are written
using fixed-point notation: the value is represented with exactly as
many digits in the decimal part as specified by the precision field
(precision) and with no exponent part.
std::setprecision
Set decimal precision Sets the decimal precision to be used to format
floating-point values on output operations.
If you're familiar with the IEEE standard for representing the floating-points, you would know that it is impossible to show floating-points with full-precision out of the scope of the standard, that is to say, it will always result in a rounding of the real value.
You need to first check whether the value is within the scope, if yes, then use:
cout << defaultfloat << d ;
std::defaultfloat
Use default floating-point notation Sets the floatfield format flag
for the str stream to defaultfloat.
When floatfield is set to defaultfloat, floating-point values are
written using the default notation: the representation uses as many
meaningful digits as needed up to the stream's decimal precision
(precision), counting both the digits before and after the decimal
point (if any).
That is also the default behavior of cout, which means you don't use it explicitly.
With ostream::precision(int)
cout.precision( numeric_limits<double>::digits10 + 1);
cout << M_PI << ", " << M_E << endl;
will yield
3.141592653589793, 2.718281828459045
Why you have to say "+1" I have no clue, but the extra digit you get out of it is correct.