So I'm trying to make a C++ program that can find the average of very large numbers (the range was < 10^19).
Here's my attempt:
#include <iostream>
int main()
{
long double a,b,result;
std::cin>>a;
std::cin>>b;
result=(a+b)/2;
std::cout<<result<<"\n";
}
But somehow I did not get the result I expected. My teacher said there was a "trick" and there was no need to even use double. I searched and researched but did not find the trick. Any help?
When using floating point numbers you have to consider their precision. In base 10 it is given by std::numeric_limits<T>::digits10, and the following program prints it for each type (the values may depend on your platform):
#include <iostream>
#include <limits>
int main() {
std::cout << "float: " << std::numeric_limits<float>::digits10 << "\n";
std::cout << "double: " << std::numeric_limits<double>::digits10 << "\n";
std::cout << "long double: " << std::numeric_limits<long double>::digits10 << "\n";
return 0;
}
On ideone I get:
float: 6
double: 15
long double: 18
Which is consistent with 32 bits, 64 bits and 80 bits floating point numbers (respectively).
Since 10^19 is above 18 digits (it has 20), the type you have chosen lacks the necessary precision to represent all numbers below it, and no amount of computation can recover the lost data.
Let's switch back to integers: while their range is more limited, they have a higher degree of precision for the same number of bits. A 64-bit signed integer has a maximum of 9,223,372,036,854,775,807 and the unsigned version goes up to 18,446,744,073,709,551,615. For comparison, 10^19 is 10,000,000,000,000,000,000.
A uint64_t (from <cstdint>) gives you the necessary building block; however, you'll be teetering on the edge of overflow: 2 times 10^19 is too much.
You now have to find a way to compute the average without adding the two numbers together.
Supposing two integers M, N such that M <= N, (M + N) / 2 = M + (N - M) / 2
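The identity above can be sketched as a small helper. This is only an illustration: the name average is hypothetical, it assumes both inputs fit in a uint64_t, and it rounds down when the sum is odd.

```cpp
#include <cstdint>

// Average of two unsigned 64-bit integers without ever forming a + b,
// so the intermediate result cannot overflow. Rounds down on odd sums.
// (average is a hypothetical name used for illustration.)
constexpr std::uint64_t average(std::uint64_t a, std::uint64_t b)
{
    return (a <= b) ? a + (b - a) / 2   // M + (N - M) / 2 with M = a
                    : b + (a - b) / 2;  // same identity with M = b
}
```

Reading the two inputs with std::cin into std::uint64_t variables and printing average(a, b) would then stay within range for any inputs below 10^19.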
For a number a = 1.263839, we can do:
float a = 1.263839;
cout << fixed << setprecision(2) << a << endl;
Output: 1.26
But what if I want to set the precision of a number and store it? For example,
convert 1.263839 to 1.26 without printing it.
But what if i want set precision of a number and store it
You can store the desired precision in a variable:
int precision = 2;
You can then later use this stored precision when converting the float to a string:
std::cout << std::setprecision(precision) << a;
I think OP wants to convert from 1.263839 to 1.26 without printing the number.
If this is your goal, then you must first realise that 1.26 is not representable in the most commonly used floating point representations. The closest representable 32-bit binary IEEE-754 value is 1.2599999904632568359375.
So, assuming such representation, the best that you can hope for is some value that is very close to 1.26. In best case the one I showed, but since we need to calculate the value, keep in mind that some tiny error may be involved beyond the inability to precisely represent the value (at least in theory; there is no error with your example input using the algorithm below, but the possibility of accuracy loss should always be considered with floating point math).
The calculation is as follows:
Let P be the number of digits after the decimal point that you want to round to (2 in this case).
Let D be 10^P (100 in this case).
Multiply input by D
std::round to nearest integer.
Divide by D.
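The steps above can be sketched as follows. The function name round_to_places is hypothetical, and the sketch assumes a small non-negative number of places so that 10^P is exactly representable.

```cpp
#include <cmath>

// Rounds value to `places` digits after the decimal point.
// The result is the nearest representable float, not an exact decimal.
// (round_to_places is a hypothetical name used for illustration.)
float round_to_places(float value, int places)
{
    const float d = std::pow(10.0f, places);  // D = 10^P
    return std::round(value * d) / d;         // multiply, round, divide
}
```

For the input from the question, round_to_places(1.263839f, 2) should yield the float closest to 1.26.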
P.S. Sometimes you might not want to round to the nearest, but instead want std::floor or std::ceil to the precision. This is slightly trickier. Simply std::floor(val * D) / D is wrong. For example 9.70 floored to two decimals that way would become 9.69, which would be undesirable.
What you can do in this case is multiply with one extra magnitude of precision, round to nearest, then divide out the extra magnitude and proceed:
Let P be the number of digits after the decimal point that you want to round to (2 in this case).
Let D be 10^P (100 in this case).
Multiply input by D * 10
std::round to nearest integer.
Divide by 10
std::floor or std::ceil
Divide by D.
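A minimal sketch of the floor variant described above, written with double because that is where the 9.70 example misbehaves. The name floor_to_places is hypothetical, and the same idea would apply to std::ceil.

```cpp
#include <cmath>

// Floors value to `places` digits after the decimal point, first snapping
// away representation noise with one extra decimal digit of rounding.
// (floor_to_places is a hypothetical name used for illustration.)
double floor_to_places(double value, int places)
{
    const double d = std::pow(10.0, places);                     // D = 10^P
    const double snapped = std::round(value * d * 10.0) / 10.0;  // round at P + 1 digits
    return std::floor(snapped) / d;                              // floor at P digits
}
```

The trade-off is that an input genuinely within half a unit of the extra digit below a boundary is pushed up to that boundary, so this is only appropriate when such inputs are considered noise.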
You would need to truncate it. Possibly the easiest way is to multiply it by a factor (in case of 2 decimal places, by a factor of 100), then truncate or round it, and lastly divide by the very same factor.
Now, mind you, that floating-point precision issues might occur, and that even after those operations your float might not be 1.26, but 1.26000000000003 instead.
If your goal is to store a number with a small, fixed number of digits of precision after the decimal point, you can do that by storing it as an integer with an implicit power-of-ten multiplier:
#include <stdio.h>
#include <math.h>
// Given a floating point value and the number of digits
// after the decimal-point that you want to preserve,
// returns an integer encoding of the value.
int ConvertFloatToFixedPrecision(float floatVal, int numDigitsAfterDecimalPoint)
{
return (int) roundf(floatVal*powf(10.0f, numDigitsAfterDecimalPoint));
}
// Given an integer encoding of your value (as returned
// by the above function), converts it back into a floating
// point value again.
float ConvertFixedPrecisionBackToFloat(int fixedPrecision, int numDigitsAfterDecimalPoint)
{
return ((float) fixedPrecision) / powf(10.0f, numDigitsAfterDecimalPoint);
}
int main(int argc, char ** arg)
{
const float val = 1.263839;
int fixedTwoDigits = ConvertFloatToFixedPrecision(val, 2);
printf("fixedTwoDigits=%i\n", fixedTwoDigits);
float backToFloat = ConvertFixedPrecisionBackToFloat(fixedTwoDigits, 2);
printf("backToFloat=%f\n", backToFloat);
return 0;
}
When run, the above program prints this output:
fixedTwoDigits=126
backToFloat=1.260000
If you're talking about storing exactly 1.26 in your variable, chances are you can't (there may be an off chance that exactly 1.26 works, but let's assume it doesn't for a moment) because floating point numbers don't work like that. There are always little inaccuracies because of the way computers handle floating point decimal numbers. Even if you could get 1.26 exactly, the inaccuracies can come back the moment you try to use it in a calculation.
That said, you can use some math and truncation tricks to get very close:
#include <iostream>
using std::cout;
int main()
{
// our float
float a = 1.263839;
// the precision we're trying to accomplish
int precision = 100; // 2 decimal places
// because we're an int, this will keep the 126 but lose everything else
int truncated = a * precision; // multiplying by the precision ensures we keep that many digits
// convert it back to a float
// Of course, we need to ensure we're doing floating point division
float b = static_cast<float>(truncated) / precision;
cout << "a: " << a << "\n";
cout << "b: " << b << "\n";
return 0;
}
Output:
a: 1.26384
b: 1.26
Note that this is not really 1.26 here. But it is very close.
This can be demonstrated by using setprecision():
cout << "a: " << std::setprecision(10) << a << "\n";
cout << "b: " << std::setprecision(10) << b << "\n";
Output:
a: 1.263839006
b: 1.25999999
So again, it's not exactly 1.26, but very close, and slightly closer than you were before.
Using a stringstream would be an easy way to achieve that:
#include <iostream>
#include <iomanip>
#include <sstream>
using namespace std;
int main() {
stringstream s("");
s << fixed << setprecision(2) << 1.263839;
float a;
s >> a;
cout << a; //Outputs 1.26
return 0;
}
I'm trying to write a program that uses the series to compute the value of PI. The user will input how far they want the program to compute the series, and then the program should output its calculated value of PI. I believe I've successfully written the code for this; however, it does not do well with large numbers and only gives me a few decimal places. When I tried to use cout << fixed << setprecision(42); it just gave me "nan" as the value of PI.
int main() {
long long seqNum; // sequence number users will input
long double val; // the series output
cout << "Welcome to the compute PI program." << endl; // welcome message
cout << "Please enter the sequence number in the form of an integer." << endl;
cin >> seqNum; // user input
while ( seqNum < 0) // validation, number must be positive
{
cout << "Please enter a positive number." << endl;
cin >> seqNum;
} // end while
if (seqNum > 0)
{
for ( long int i = 0; i < seqNum; i++ )
{
val = val + 4*(pow(-1.00,i)/(1 + 2*i)); // Gregory-Leibniz sum calculation
}// end for
cout << val;
} // end if
return 0;
}
Any help would be really appreciated. Thank you
Your problem involves an elementary, fundamental principle related to double values: a double, or any floating point type, can hold only a fixed upper limit of significant digits. There are no unlimited digits of precision with plain, garden-variety doubles. There's a hard upper limit. The exact limit is implementation defined, but on modern C++ implementations the typical limit is just 16 or 17 digits of precision, not even close to your desired 42 digits of precision.
#include <limits>
#include <iostream>
int main()
{
std::cout << std::numeric_limits<double>::max_digits10 << std::endl;
return 0;
}
This gives you the maximum digits of precision with your platform/C++ compiler. This shows a maximum of 17 digits of precision with g++ 9.2 on Linux (max_digits10 is C++11 or later, use digits10 with old C++ compilers to show a closely-related metric).
Your desired 42 digits of precision likely far exceed what your modest doubles can handle. There are various special-purpose math libraries that can perform calculations with higher levels of precision, you can investigate those, if you wish.
You did not initialize or assign any value to val, but you are reading it when you get to the first iteration of
val = val + 4*(pow(-1.00,i)/(1 + 2*i));
This causes your program to have undefined behavior. Initialize val, probably to zero:
long double val = 0; // the series output
That aside, as mentioned in the answer of @SamVarshavchik, there is a hard limit on the precision you can reach with the built-in floating point types, and 42 places of significance is almost certainly outside of that. Similarly, the integer types that you are using are limited in size to probably at most 2^64, which is approximately 10^19.
Even if these limits weren't the problem, the series requires summation of roughly 10^42 terms to get PI to a precision of 42 places. It would take you longer than the universe has been around to calculate to that precision with all of Earth's current computing power combined.
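To see how slowly the series converges, here is a minimal sketch of the sum with the accumulator initialized and the sign flipped directly instead of calling pow. The function name leibniz_pi is hypothetical. For an alternating series like this one the error after N terms is bounded by the first omitted term, 4/(2N+1), so each extra correct digit costs about ten times more terms.

```cpp
#include <cstdint>

// Partial sum of the Gregory-Leibniz series: 4 * sum over i < terms of (-1)^i / (2i + 1).
// (leibniz_pi is a hypothetical name used for illustration.)
long double leibniz_pi(std::uint64_t terms)
{
    long double sum = 0.0L;   // initialized, unlike val in the question
    long double sign = 1.0L;  // alternates +1, -1, +1, ...
    for (std::uint64_t i = 0; i < terms; ++i) {
        sum += sign / (2.0L * i + 1.0L);
        sign = -sign;
    }
    return 4.0L * sum;
}
```

With one million terms the result is still only correct to about five or six decimal places.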
How does long double fit so many characters in just 12 bytes?
I made an example, a C++ factorial.
When entering a large number, 1754 for example, it calculates a number that apparently would not fit in a long double type.
#include <iostream>
#include <string>
using namespace std;
int main()
{
unsigned int n;
long double fatorial = 1;
cout << "Enter number: ";
cin >> n;
for(int i = 1; i <=n; ++i)
{
fatorial *= i;
}
string s = to_string(fatorial);
cout << "Factorial of " << n << " = " <<fatorial << " = " << s;
return 0;
}
Important note:
I am using the GCC compiler on Windows; in Visual Studio, long double behaves like a double.
Is the problem in how it is stored, or in the to_string function?
std::to_string(factorial) will return a string containing the same result as std::sprintf(buf, "%Lf", value).
In turn, %Lf prints the entire integer part of a long double, a period and 6 decimal digits of the fractional part.
Since factorial is a very large number, you end up with a very long string.
However, note that this has nothing to do with long double. A simpler example with e.g. double is:
#include <iostream>
#include <string>
int main()
{
std::cout << std::to_string(1e300) << '\n';
return 0;
}
This will print:
10000000000000000525047602 [...300 decimal digits...] 540160.000000
The decimal digits are not exactly zero because the number is not exactly 1e300 but the closest to it that can be represented in the floating-point type.
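As a small check of the point above, the length of the "%f" output can be measured without building the string. The helper name fixed_length is hypothetical; it relies only on std::snprintf returning the number of characters that would have been written.

```cpp
#include <cstddef>
#include <cstdio>

// Number of characters "%f" produces for a double, measured by asking
// snprintf for the required length with a null buffer.
// (fixed_length is a hypothetical name used for illustration.)
std::size_t fixed_length(double value)
{
    const int n = std::snprintf(nullptr, 0, "%f", value);
    return n < 0 ? 0 : static_cast<std::size_t>(n);
}
```

For 1e300 this is the 301 integer digits, the period, and the 6 fractional digits, even though the double itself typically occupies 8 bytes.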
It doesn't fit that many characters. Rather, to_string produces that many characters from the data.
Here is a toy program:
std::string my_to_string( bool b ) {
if (b)
return "This is a string that never ends, it goes on and on my friend, some people started typing it not knowing what it was, and now they still are typing it it just because this is the string that never ends, it goes on and on my friend, some people started typing it not knowing what it was, and now they still are typing it just because...";
else
return "no it isn't, I can see the end right ^ there";
}
bool stores exactly 1 bit of data. But the string it produces from calling my_to_string can be as long as you want.
double's to_string is like that. It generates far more characters than there is "information" in the double.
This is because it is encoded as a base 10 number on output. Inside the double, it is encoded as a combination of an unsigned number, a sign bit, and an exponential part.
The "value" is then roughly "1+number/2^constant", times +/- one for the sign, times "2^exponential part".
There are only a certain number of "bits of precision" in base 2; if you printed it in base 2 (or hex, or any power-of-2 base) the double would have a few non-zero digits, then a pile of 0s afterwards (or, if small, it would have 0.0000...000 then a handful of non-zero digits).
But when converted to base 10 there isn't a pile of zero digits in it.
Take 0b100000000 -- aka 2^8. This is 256 in base 10 -- it has no trailing 0s at all!
This is because floating point numbers only store an approximation of the actual value. If you look at the actual exact value of 1754! you'll see that your result becomes completely different after the first ~18 digits. The digits after that are just the result of writing (a multiple of) a large power of two in decimal.
The following code throws an std::out_of_range exception in Visual Studio 2013 where in my opinion it shouldn't:
#include <string>
#include <limits>
int main(int argc, char ** argv)
{
double maxDbl = std::stod(std::to_string(std::numeric_limits<double>::max()));
return 0;
}
I tested the code also with gcc 4.9.2 and there it does not throw an exception. The issue seems to be caused by an inaccurate string representation after the conversion to string. In Visual Studio std::to_string(std::numeric_limits<double>::max()) yields
179769313486231610000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000
which indeed seems too large. In gcc, however, it yields
179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
which seems to be smaller than the passed value.
However, isn't std::numeric_limits<double>::max() supposed to return the
maximum finite representable floating-point number?
So why do the string representations get off? What am I missing here?
Direct answer
Gcc (and Clang and VS2015) correctly return the integer value (2^1024 - 1) - (2^(1024-53) - 1), which is what is represented by a significand of all one bits (52 stored bits plus the implicit leading one) and an unbiased exponent of 1023. (2^1024 - 1 would be the integer value with 1024 one bits, and I just subtract all the bits below the 53 significant bits of the IEEE 754 format.)
I can confirm that a large integer library give 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368L
The previous exactly representable floating point value would be 2^971 less (971 = 1023 - 52), that is: 179769313486231550856124328384506240234343437157459335924404872448581845754556114388470639943126220321960804027157371570809852884964511743044087662767600909594331927728237078876188760579532563768698654064825262115771015791463983014857704008123419459386245141723703148097529108423358883457665451722744025579520L
The next value, which is not representable, would be 2^971 greater, that is:
179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216L
But the value used by MSVC2013 and earlier is close to 2^1024 + 2^971, that is: 179769313486231610731333614426100589925524828262616317947942685512308090830973387504827396012048193870699768806228404251083258210739369062217227314575410731769485876273179688476358949112102859294830297395714877595371718127781702814782017661749531126051903195165027873311156314696040132728420308633064323416064L
As it is greater than any value representable in IEEE 754 double precision, it cannot be decoded to a double.
At most, one could say that any value between 2^1024 - 2^971 (std::numeric_limits<double>::max()) and 2^1024 could be rounded to std::numeric_limits<double>::max(), but values greater than 2^1024 are clearly an overflow.
Discussion on accuracy
Only about 16 decimal digits are accurate in a double; all other digits can be seen as garbage or random values, since they do not depend on the value itself but only on the way you choose to calculate them. Just try to subtract 1e+288 (that's already a big value) from maxDbl and see what happens:
double maxLess = maxDbl - 1.e+288;
if (maxLess == maxDbl) {
std::cout << "Unchanged" << std::endl;
}
else std::cout << "Changed" << std::endl;
You should see ... Unchanged.
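The reason the subtraction disappears can be checked by measuring the gap between the largest double and its nearest smaller neighbour. This is a minimal sketch with a hypothetical helper name, gap_below_max. On an IEEE 754 binary64 implementation the gap is 2^971, roughly 2e292, so subtracting 1e288 is far below half of it and the result rounds back to the same value.

```cpp
#include <cmath>
#include <limits>

// Gap between the largest finite double and the next representable value below it.
// (gap_below_max is a hypothetical name used for illustration.)
double gap_below_max()
{
    const double max = std::numeric_limits<double>::max();
    return max - std::nextafter(max, 0.0);  // step one double toward zero
}
```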
It just looks like VS 2013 is a little inconsistent in the way it rounds floating point values: it rounded maxDbl up to one bit higher than the maximum actually representable value, and could not decode it later.
The problem is that the standard chose to use a %f format, which gives a false sense of accuracy. If you want to see an equivalent problem in gcc, just use:
#include <iostream>
#include <string>
#include <limits>
#include <iomanip>
#include <sstream>
int main() {
double max = std::numeric_limits<double>::max();
std::ostringstream ostr;
ostr << std::setprecision(16) << max;
std::string smax = ostr.str();
std::cout << smax << std::endl;
double m2 = std::stod(smax);
std::cout << m2 << std::endl;
return 0;
}
Rounded to 16 digits, maxDbl is written (correctly) as 1.797693134862316e+308, but that can no longer be decoded back.
And this one :
#include <iostream>
#include <string>
#include <limits>
int main() {
double maxDbl = std::numeric_limits<double>::max();
std::string smax = std::to_string(maxDbl);
std::cout << smax << std::endl;
std::string smax2 = "179769313486231570800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000";
double max2 = std::stod(smax2);
if (max2 == maxDbl) {
std::cout << smax2 << " is same double as " << smax << std::endl;
}
return 0;
}
Displays :
179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
179769313486231570800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000 is same double as 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
TL;DR: What I mean is that a big enough double value can of course be represented by an exact integer (per IEEE 754). But it also stands for all integers between halfway to the previous double and halfway to the next one. So any integer in that range could be an acceptable representation for the double, and a value rounded to 16 decimal digits should be acceptable; for the maximum floating point value, however, current standard libraries only accept it truncated (not rounded up) at 16 decimal digits. VS2013 gave a number above that range, which was in any case an error.
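If the goal is simply a string that decodes back to the same double, one option (not what std::to_string does) is to print std::numeric_limits<double>::max_digits10 significant digits. A minimal sketch, with the hypothetical helper name to_round_trip_string:

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Formats a double with max_digits10 significant digits, which is enough
// for the text to identify the original value uniquely.
// (to_round_trip_string is a hypothetical name used for illustration.)
std::string to_round_trip_string(double value)
{
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10) << value;
    return out.str();
}
```

Passing the result to std::stod should give back the original value, including for std::numeric_limits<double>::max(), because 17 significant digits never round the maximum above the representable range.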
I am having a problem with precision of a double after performing some operations on a converted string to double.
#include <iostream>
#include <sstream>
#include <math.h>
using namespace std;
// conversion function
void convert(const char * a, const int i, double &out)
{
double val;
istringstream in(a);
in >> val;
cout << "char a -- " << a << endl;
cout << "val ----- " << val << endl;
val *= i;
cout << "modified val --- " << val << endl;
cout << "FMOD ----- " << fmod(val, 1) << endl;
out = val;
}
This isn't the case for all numbers entered as a string, so the error isn't constant.
It only affects some numbers (34.38 seems to be constant).
At the moment, it prints this when I pass in a = 34.38 and i = 100:
char a -- 34.38
val ----- 34.38
modified val --- 3438
FMOD ----- 4.54747e-13
This will work if I change val to a float, as there is lower precision, but I need a double.
This also reproduces when I use atof, sscanf and strtod instead of sstream.
In C++, what is the best way to correctly convert a string to a double, and actually return an accurate value?
Thanks.
This is almost an exact duplicate of so many questions here - basically there is no exact representation of 34.38 in binary floating point, so your 34 + 19/50 is represented as a 34 + k/n where n is a power of two, and there is no exact power of two which has 50 as a factor, so there is no exact value of k possible.
If you set the output precision, you can see that the best double representation is not exact:
cout << fixed << setprecision ( 20 );
gives
char a -- 34.38
val ----- 34.38000000000000255795
modified val --- 3438.00000000000045474735
FMOD ----- 0.00000000000045474735
So in answer to your question, you are already using the best way to convert a string to a double (though boost lexical cast wraps up your two or three lines into one line, so might save you writing your own function). The result is due to the representation used by doubles, and would apply to any finite representation based on binary floating point.
With floats, the multiplication happens to be rounded down rather than up, so you happen to get an exact result. This is not behaviour you can depend on.
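If what is ultimately needed is the scaled integer (3438 here) rather than a double, one common workaround is to round after multiplying, which discards the tiny representation error. This is a sketch under stated assumptions: parse_scaled is a hypothetical name, the input has no more decimal places than the factor provides, and the scaled value fits in a long long.

```cpp
#include <cmath>
#include <cstdlib>

// Parses decimal text and returns the value scaled by `factor`, rounded to
// the nearest integer so that representation error in the double is dropped.
// (parse_scaled is a hypothetical name used for illustration.)
long long parse_scaled(const char* text, int factor)
{
    const double value = std::strtod(text, nullptr);
    return std::llround(value * factor);
}
```

The integer result can then be tested with the % operator instead of fmod.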
The "problem" here is simply that 34.38 cannot be exactly represented in double-precision floating point. You should read this article which describes why it's impossible to represent decimal values exactly in floating point.
If you were to examine "34.38 * 100" in hex (as per "format hex" in MATLAB for example), you'd see:
40aadc0000000001
Notice the final digit.