file.txt
123.56
89.78
8.89
468.98
567.90
5.78
178908.90
12.56
6789.90
12.56
16.780
0.00
I've parsed the numbers into an array called float A[150] = {0}.
So, A looks like:
A[0] = 123.56
A[1] = 89.78
....
....
A[10] = 16.780
A[11] = 0.00
A[12] = 0
A[13] = 0
...
...
Then, I have a sorting Algorithm
Sort(A, i) // where i is the number of elements (12)
Now A[] looks like
A[0] = 0
A[1] = 5.78
...
...
A[10] = 6789.90
A[11] = 178908.90
Then, I write it to a file called "Final.txt"
std::ofstream File (Name);
if (File.is_open())
{
for (j = 0; j < i; j++)
{
File << A[j] << std::endl;
}
}
The file output "Final.txt" looks like:
0
5.78
8.89
12.56
12.56
16.78
89.78
123.56
468.98
567.9
6789.9
178909 // Why it is NOT CORRECT !!!!!!!
The problem is with A[11] after sorting: why is it not correct in "Final.txt", even though it is correct when I inspect A[11] in the debugger?
Floating point values are printed with several constraints to make the values readable. For instance you would probably not want the output of
16.78 to read
16.799999999999998
Which might well be a more correct representation of 16.78. To avoid that, operator<< on ostreams uses a precision field to determine how many significant digits to print (6 by default). This value is evidently too small for your application.
Reference http://en.cppreference.com/w/cpp/io/manip/setprecision.
Other formatting settings are described in the link πάντα ῥεῖ provided.
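As a minimal sketch of what the precision field does, the two helpers below (format_default and format_fixed are hypothetical names, not part of the question's code) format a value once with the stream defaults and once with std::fixed and std::setprecision. They write to a std::ostringstream so the result can be inspected, but the same manipulators work on a std::ofstream.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats a value with the stream defaults (6 significant digits).
std::string format_default(double value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Formats a value in fixed notation with the requested number of decimals.
std::string format_fixed(double value, int decimals) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(decimals) << value;
    return out.str();
}
```

In the question's loop the equivalent would be File << std::fixed << std::setprecision(2) << A[j]. With a float array the stored value is only the nearest float to 178908.90, so the last printed digit can still differ from the input; storing the values as double avoids that for numbers of this size.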
You have two different problems when trying to output 178908.90.
First, the default C++ ostream formatting normally outputs 178909 because, as explained by Captain Giraffe, the default format uses a limited precision (6 significant digits) to avoid printing irrelevant digits.
You could force fixed-point output with 2 decimal digits, but then you will get 178908.91 (if you use float and not double). That is the second problem: in C++, floating point numbers (float or double) normally use the IEEE-754 format, in which a float offers only about 7 significant decimal digits, so internally the value is more like 178908.906.
So the rule is: if you want to keep roughly the input precision, do not use numbers of more than 6 decimal digits for float and 14 for double, and if you want to keep the input precision strictly, do not use floating point at all!
You can try to use decimal types as shown in that other answer, or build your own class.
As part of a homework, I'm writing a program that takes a float decimal number as input entered from terminal, and return IEEE754 binary32 of that number AND return 1 if the binary exactly represents the number, 0 otherwise. We are only allowed to use iostream and cmath.
I already wrote the part that returns binary32 format, but I don't understand how to see if there's rounding to that format.
My idea to see the rounding was to calculate the decimal number back from binary32 form and compare it with the original number. But I am having difficulty with saving the returned binary32 as some type of data, since I can't use the vector header. I've tried using for loops and pow, but I still get the indices wrong.
Also, I'm having trouble understanding what exactly is df or *df? I wrote the code myself, but I only know that I needed to convert address pointed to float to address pointed to char.
My other idea was to compare binary32 with binary64, which gives more precision. Again, I don't know how to do this without using vector.
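#include <cstdlib>   // atof
#include <iostream>
using namespace std;

int main(int argc, char* argv[]){
    if (argc < 2) return 1;  // expects the number as the first argument
    int i, j;
    float num;
    num = atof(argv[1]);
    // View the float's bytes; unsigned char avoids sign surprises when masking.
    unsigned char* numf = (unsigned char*)(&num);
    // Walk from the highest address down, which prints the sign bit first
    // on a little-endian machine.
    for (i = sizeof(float) - 1; i >= 0; i--){
        for (j = 7; j >= 0; j--)
            if (numf[i] & (1 << j)) {
                cout << "1";
            }else{
                cout << "0";
            }
    }
    cout << endl;
}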
//////
Update:
Since there's no other way around without using header files, I hard coded for loops to convert binary32 back to decimal.
Since x = 1.b22b21...b0 * 2^p, I use one for loop to find the exponent and one for loop to find the significand.
Basic idea: Convert your number d back to a string (e.g. with to_string) and compare it to the input. If the strings are different, there was some loss because of the limitations of float.
Of course, this means your input always has to be in the same string format that to_string uses: no additional unneeded 0's, no whitespace, etc.
...
That said, doing the float conversion without a cast (by manually parsing the input and calculating the IEEE754 bits) is more work initially, but in return it solves this problem automatically. And, as noted in the comments, your cast might not work the way you want.
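The question's other idea, comparing binary32 with binary64, could be sketched roughly as follows. This is only an approximation of the real test: fits_in_float_exactly is a hypothetical helper, it uses std::strtod from <cstdlib> (which the homework rules may not allow), it assumes the value is within float range, and it cannot detect input that was already rounded when it was parsed into a double.

```cpp
#include <cstdlib>

// Returns true if narrowing the parsed double to float and widening it
// again gives back the same double, i.e. binary32 loses nothing that
// binary64 had. Input with more digits than a double can hold may still
// be reported as exact, because the rounding happened during parsing.
bool fits_in_float_exactly(const char* text) {
    double wide = std::strtod(text, nullptr);
    float narrow = static_cast<float>(wide);
    return static_cast<double>(narrow) == wide;
}
```

Values such as 0.5 pass the check because they are sums of a few powers of two, while 0.1 does not.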
Let's say I have an input 1.251564.
How can I find how many elements are after "." to have an output as follows:
int numFloating;
// code to go here that leads to
// numFloating == 6
p.s. Sorry for not providing any code, I just have no idea how that should be implemented :(
Thanks for your answers!
Let us consider your number, 1.251564. When you store this in a double, it is stored in the binary IEEE754 format. And you might find that the number is not representable. So, let us check for this number. The closest representable double is:
1.25156 39999 99999 89880 45035 73046 53152 82344 81811 52343 75
This probably comes as something of a surprise to you. There are 52 decimal digits following the decimal point.
The lesson that you need to take away from this is that if you want to ask questions about decimal representations, you need to use a decimal data type rather than double. Once you can actually represent the value exactly, then you will be able to reason about it in a manner that matches your expectations.
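To see the stored value for yourself, one option is to print the double with far more digits than the default. The sketch below assumes a standard library whose fixed formatting prints the exact binary value beyond 17 significant digits (some older libraries pad with zeros instead); stored_value is a hypothetical helper name.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats x in fixed notation with the requested number of decimals.
// 52 decimals are enough to show a double in [1, 2) exactly.
std::string stored_value(double x, int digits = 52) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(digits) << x;
    return out.str();
}
```

Calling stored_value(1.251564) shows the long expansion, while stored_value(1.251564, 6) rounds it back to the six decimals that were typed.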
Simplest way would be to store it in string.
std::string str("1.1234");
size_t length = str.length();
size_t found = str.find('.', 0);
// If there is no '.', there are no decimals to count.
size_t count = (found == std::string::npos) ? 0 : length - found - 1;
int finallyGotTheCount = static_cast<int>(count);
This won't end well. The problem is that rounding errors sometimes occur when numbers are represented in binary (which is what your computer does).
For example, when adding 1 / 3 + 1 / 3 + 1 / 3 you might get 0.999999... and the number of decimal places varies greatly.
ravi already provided a good way to calculate it, so I'll provide a different one:
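double number = 0; // should be equal to the number you want to check
int numFloating = 0;
// floor (from <cmath>) avoids overflowing an int cast; the upper bound keeps
// the loop finite when rounding error never yields an exact integer.
while (numFloating < 15 && floor(number) != number){
number *= 10;
numFloating++;
}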
number is a double variable that holds the number you want to check for decimal places.
Say you have a fractional number, for example .1234.
Repeatedly multiply by 10 and throw away the integer portion of the number until you get zero. The number of steps will be the number of decimals, e.g.:
.1234 * 10 = 1.234
.234 * 10 = 2.34
.34 * 10 = 3.4
.4 * 10 = 4.0
Problems will however occur when you have a number that is "floating" like 1.199999999.
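One way to make the multiply-by-ten idea tolerate small representation errors is to stop as soon as the scaled value is close enough to an integer. The sketch below is an assumption-laden illustration, not a general solution: count_decimals, the relative tolerance of 1e-12 and the cap of 12 digits are all choices made for this example, and values with more significant digits than the tolerance allows will be miscounted.

```cpp
#include <algorithm>
#include <cmath>

// Counts decimals by scaling x by powers of ten until the result is
// within a small tolerance of an integer. Gives up at max_digits.
int count_decimals(double x, int max_digits = 12) {
    double scale = 1.0;
    for (int n = 0; n <= max_digits; ++n) {
        double scaled = x * scale;
        double nearest = std::round(scaled);
        // Tolerance grows with the magnitude of the scaled value.
        double tol = 1e-12 * std::max(1.0, std::fabs(scaled));
        if (std::fabs(scaled - nearest) <= tol) {
            return n;
        }
        scale *= 10.0;
    }
    return max_digits;
}
```

The result describes the shortest decimal that is indistinguishable from x at that tolerance, which is not necessarily what the user typed.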
int numFloating = 0;
double orgin = 1.251564;
double value = orgin - floor(orgin);
while(value != 0) // keep going while a fractional part remains
{
value *= 10;
value = value - floor(value);
numFloating ++;
}
With this code the answer is sometimes wrong, because the fractional part may not reach exactly zero after the expected number of steps.
The output depends on how the value is really stored.
How can I transform rational numbers like 1.24234 or 45.314 into integers like 124234 or 45314 also getting the number of decimal digits?
Convert to a string
Find the position of the decimal point.
Subtract that from the length of the above string, for the number of decimals.
Then take the point out of the string.
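The four steps above can be sketched as follows, assuming the number arrives as text (for example read with std::getline) rather than as a float. ScaledInteger and to_scaled_integer are hypothetical names introduced for this example, and the input is assumed to be a plain decimal such as 45.314 with no exponent or whitespace.

```cpp
#include <string>

struct ScaledInteger {
    long long value;  // the digits with the decimal point removed
    int decimals;     // how many digits followed the point
};

ScaledInteger to_scaled_integer(std::string text) {
    int decimals = 0;
    std::string::size_type dot = text.find('.');
    if (dot != std::string::npos) {
        // Digits after the point = length minus position of the point minus 1.
        decimals = static_cast<int>(text.size() - dot - 1);
        text.erase(dot, 1);  // take the point out of the string
    }
    return ScaledInteger{std::stoll(text), decimals};
}
```

Working on the text sidesteps the rounding that happens as soon as the value is stored in a float or double.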
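// needs <cmath> for pow, fabs, round and llround
int i = 0;
double a = 1.24234;
for (; i < 20; i++) {
    double scaled = a * pow(10.0, i);
    // % is not defined for floating point values, so test whether the
    // scaled value is (nearly) a whole number instead.
    if (fabs(scaled - round(scaled)) < 1e-9)
        break;
}
// i is now the number of decimal digits; scale once more to get the integer.
long long result = llround(a * pow(10.0, i));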
I think this code will help you.
If your number is W.D (Whole.Decimal)
To get W just do (int)W.D.
To get D you can do W.D - (int) W.D
Now you have your whole number and your decimal part separated. To figure out the power-of-10 multiplier for W, keep multiplying D by 10 until nothing is left after the decimal point.
Now: W × 10^N + D
(where N is the number of times you multiplied by 10, and D is the scaled decimal part)
Note: I didn't write the code as an example, because I feel this may be a homework assignment. Also, if you are using very long (i.e. very precise) floating point values this won't hold, and could likely overflow. Check your bounds before implementing something like this.
I would like to know how to write my own floor function to round a float down.
Is it possible to do this by setting the bits of a float that represent the numbers after the comma to 0?
If yes, then how can I access and modify those bits?
Thanks.
You can do bit twiddling on floating point numbers, but getting it right depends on knowing exactly what the floating point binary representation is. For most machines these days it's IEEE-754, which is reasonably straightforward. For example, IEEE-754 32-bit floats have 1 sign bit, 8 exponent bits, and 23 mantissa bits, so you can use shifts and masks to extract those fields and do things with them. So doing trunc (round to integer towards 0) is pretty easy:
#include <stdint.h> /* for uint32_t */
float trunc(float x) {
union {
float f;
uint32_t i;
} val;
val.f = x;
int exponent = (val.i >> 23) & 0xff; // extract the exponent field;
int fractional_bits = 127 + 23 - exponent;
if (fractional_bits > 23) // abs(x) < 1.0
return 0.0;
if (fractional_bits > 0)
val.i &= ~((1U << fractional_bits) - 1);
return val.f;
}
First, we extract the exponent field, and use that to calculate how many bits after the binary point are present in the number. If there are more than the size of the mantissa, then we just return 0. Otherwise, if there's at least 1, we mask off (clear) that many low bits. Pretty simple. We're ignoring denormals, NaN, and infinity here, but that works out OK, as they have exponents of all 0s or all 1s, which means we end up converting denormals to 0 (they get caught in the first if, along with small normal numbers), and leaving NaN/Inf unchanged.
To do a floor, you'd also need to look at the sign, and round negative numbers 'up' in magnitude, towards negative infinity.
Note that this is almost certainly slower than using dedicated floating point instructions, so this sort of thing is really only useful if you need to use floating point numbers on hardware that has no native floating point support. Or if you just want to play around and learn how these things work at a low level.
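Building on the same masking idea, a floor could be sketched like this. It is an illustration under assumptions rather than a drop-in replacement: it assumes float is a 32-bit IEEE-754 value, it copies the bits with std::memcpy instead of a union, and trunc_bits and floor_bits are hypothetical names chosen to avoid clashing with the standard library.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit IEEE-754 float");

// Truncates toward zero by clearing the mantissa bits that lie after
// the binary point.
float trunc_bits(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    int exponent = (bits >> 23) & 0xff;         // biased exponent field
    int fractional_bits = 127 + 23 - exponent;  // mantissa bits after the point
    if (fractional_bits > 23) {
        bits &= 0x80000000u;                    // |x| < 1: keep only the sign
    } else if (fractional_bits > 0) {
        bits &= ~((1u << fractional_bits) - 1u);
    }
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Floor: truncation already rounds positive values down; negative
// non-integers were rounded up, so step one further down.
float floor_bits(float x) {
    float t = trunc_bits(x);
    return (t > x) ? t - 1.0f : t;
}
```

NaN and infinities pass through unchanged, because their exponent field makes fractional_bits negative and the final comparison is false for NaN.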
Define from scratch. And no, setting the bits of your floating point number representing the numbers after the comma to 0 will not work. If you look at IEEE-754, you will see that you basically have all your floating-point numbers in the form:
1.xyzxyzxyz × 2^(abc)
So to implement flooring, you can take the significand bits 1xyzxyzxyz, keep the first abc+1 of them, and drop the rest. I suggest you read up on the binary representation of a floating point number (link above); this should shed light on the solution I suggested.
NOTE: You also need to take care of the sign bit. And the stored exponent of your number is offset (biased) by 127.
Here is an example. Let's say you have the number pi: 3.14..., and you want to get 3.
Pi is represented in binary as
0 10000000 10010010000111111011011
This translates to
sign = 0 ; e = 1 ; s = 110010010000111111011011 (with the implicit leading 1 written out)
The above I get directly from Wikipedia. Since e is 1, you keep the first 1 + 1 = 2 bits of s, so you get 11 => 3.
#include <iostream>
#include <iomanip>
double round(double input, double roundto) {
return int(input / roundto) * roundto;
}
int main() {
double pi = 3.1415926353898;
double almostpi = round(pi, 0.0001);
std::cout << std::setprecision(14) << pi << '\n' << std::setprecision(14) << almostpi;
}
http://ideone.com/mdqFA
output:
3.1415926353898
3.1415
This will pretty much be faster than any bit twiddling you can come up with. And it works on all computers (with floats) instead of just one type.
Casting to unsigned while returning as a double does what you are seeking, but under the hood. This simple piece of code works for any POSITIVE number.
#include <iostream>
double floor(const double& num) {
return (unsigned long long) num;
}
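The cast-based idea can be extended to negative numbers with one extra comparison. The following is a sketch under stated assumptions: floor_via_cast is a hypothetical name, and values whose magnitude is 2^53 or more (as well as NaN and infinities) are returned unchanged, since such doubles have no fractional part to remove.

```cpp
#include <cmath>

double floor_via_cast(double x) {
    // At 2^53 and beyond every double is already an integer; this test
    // also lets NaN and infinities pass through untouched.
    if (!(std::fabs(x) < 9007199254740992.0)) {
        return x;
    }
    long long truncated = static_cast<long long>(x);  // rounds toward zero
    double result = static_cast<double>(truncated);
    // For negative non-integers truncation rounded up, so step down by one.
    return (result > x) ? result - 1.0 : result;
}
```

Unlike the standard floor, this sketch returns +0.0 for an input of -0.0.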
This has been tested on tio.run (Try It Online) and onlinegdb.com. The function itself doesn't require any #include files, but to print out the answers, I have included stdio.h (in the tio.run and onlinegdb.com, not here). Here it is:
long double myFloor(long double x) /* Change this to your liking: long double might
be float in your situation. */
{
long double xcopy=x<0?x*-1:x;
unsigned int zeros=0;
long double n=1;
for(n=1;xcopy>n*10;n*=10,++zeros);
for(xcopy-=n;zeros!=-1;xcopy-=n)
if(xcopy<0)
{
xcopy+=n;
n/=10;
--zeros;
}
xcopy+=n;
return x<0?(xcopy==0?x:x-(1-xcopy)):(x-xcopy);
}
This function works everywhere (pretty sure) because it just removes all of the non-decimal parts instead of trying to work with the parts of floats.
The floor of a floating point number is the biggest integer less than or equal to it. Here are some examples:
floor(5.7) = 5
floor(3) = 3
floor(9.9) = 9
floor(7.0) = 7
floor(-7.9) = -8
floor(-5.0) = -5
floor(-3.3) = -4
floor(0) = 0
floor(-0.0) = -0
floor(-0) = -0
Note: this is almost an exact copy from my other answer which answered a question that was basically the same as this one.
Hey, so I'm making a function to return the number of digits in a value of a given numeric type, but I'm having some trouble with doubles.
I figure out how many digits are in it by multiplying it by 10 billion and then taking away digits one by one until the double ends up being 0. However, when I pass in a double such as .7904, the function never exits: it keeps taking away digits that never end up being 0, because .7904 times 10 billion ends up being 7,903,999,988 and not 7,904,000,000.
How can I solve this problem? Thanks =)! Oh, and any other feedback on my code is WELCOME!
here's the code of my function:
/////////////////////// Numb_Digits() ////////////////////////////////////////////////////
enum{DECIMALS = 10, WHOLE_NUMBS = 20, ALL = 30};
template<typename T>
unsigned long int Numb_Digits(T numb, int scope)
{
unsigned long int length= 0;
switch(scope){
case DECIMALS: numb-= (int)numb; numb*=10000000000; // 10 bil (10 zeros)
for(; numb != 0; length++)
numb-=((int)(numb/pow((double)10, (double)(9-length))))* pow((double)10, (double)(9-length)); break;
case WHOLE_NUMBS: numb= (int)numb; numb*=10000000000;
for(; numb != 0; length++)
numb-=((int)(numb/pow((double)10, (double)(9-length))))* pow((double)10, (double)(9-length)); break;
case ALL: numb = numb; numb*=10000000000;
for(; numb != 0; length++)
numb-=((int)(numb/pow((double)10, (double)(9-length))))* pow((double)10, (double)(9-length)); break;
default: break;}
return length;
};
int main()
{
double test = 345.6457;
cout << Numb_Digits(test, ALL) << endl;
cout << Numb_Digits(test, DECIMALS) << endl;
cout << Numb_Digits(test, WHOLE_NUMBS) << endl;
return 0;
}
It's because of their binary representation, which is discussed in depth here:
http://en.wikipedia.org/wiki/IEEE_754-2008
Basically, when a number can't be represented exactly, the nearest representable approximation is used instead.
To compare floats for equality, check whether their difference is smaller than a chosen tolerance.
The easy summary about floating point arithmetic :
http://floating-point-gui.de/
Read this and you'll see the light.
If you're more on the math side, Goldberg's paper is always nice:
http://cr.yp.to/2005-590/goldberg.pdf
Long story short: real numbers are stored with a fixed number of significant bits, so the gap between neighbouring values varies with magnitude, leading to non-obvious behaviors. This is unrelated to the language; it is a design choice about how to handle real numbers as a whole.
This is because C++ (like most other languages) cannot store floating point numbers with infinite precision.
Floating points are stored like this:
sign * coefficient * 10^exponent if you're using base 10 (hardware floats use base 2).
The problem is that both the coefficient and exponent are stored as finite integers.
This is a common problem with storing floating point in computer programs: you usually get a tiny rounding error.
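To look at that decomposition for an actual double, std::frexp from <cmath> splits a value into a coefficient and a power of two. Note that this is the base-2 form the hardware uses, not the base-10 form written above, and that Parts and decompose are hypothetical names for this sketch.

```cpp
#include <cmath>

struct Parts {
    bool negative;       // sign
    double coefficient;  // in [0.5, 1) for nonzero finite input
    int exponent;        // power of two
};

// Splits x so that |x| == coefficient * 2^exponent.
Parts decompose(double x) {
    int exponent = 0;
    double coefficient = std::frexp(std::fabs(x), &exponent);
    return Parts{std::signbit(x), coefficient, exponent};
}
```

std::ldexp(parts.coefficient, parts.exponent) puts the magnitude back together exactly, which shows that the decomposition itself loses nothing; the rounding happened earlier, when the decimal literal was converted to binary.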
The most common ways of dealing with this are:
Store the number as a fraction (x/y)
Use a delta that allows small deviations (if abs(x-y) < delta)
Use a third-party library such as GMP that can store numbers with arbitrary precision.
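The first option in the list above, storing a fraction, could look roughly like this. It is a bare-bones sketch: Fraction, make_fraction, add and equal are hypothetical names, denominators are assumed to be nonzero, and nothing guards against overflow of long long.

```cpp
struct Fraction {
    long long num;
    long long den;  // always positive after make_fraction
};

// Greatest common divisor, returned as a non-negative value.
long long gcd_ll(long long a, long long b) {
    while (b != 0) {
        long long t = a % b;
        a = b;
        b = t;
    }
    return a < 0 ? -a : a;
}

// Builds a fraction in lowest terms; den must not be zero.
Fraction make_fraction(long long num, long long den) {
    long long g = gcd_ll(num, den);
    if (den < 0) {
        num = -num;
        den = -den;
    }
    return Fraction{num / g, den / g};
}

Fraction add(Fraction a, Fraction b) {
    return make_fraction(a.num * b.den + b.num * a.den, a.den * b.den);
}

bool equal(Fraction a, Fraction b) {
    return a.num == b.num && a.den == b.den;
}
```

With this representation 1/3 + 1/3 + 1/3 compares equal to 1, and 1/10 + 2/10 compares equal to 3/10, which is not the case for the corresponding double arithmetic.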
Regarding your question about counting decimals.
There is no way of dealing with this if you get a double as input. You cannot be sure that the user actually sent 1.819999999645634565360 and not 1.82.
Either you have to change your input or change the way your function works.
More info on floating point can be found here: http://en.wikipedia.org/wiki/Floating_point
This is because of the way the IEEE floating point standard represents numbers: results are approximations, and the rounding depends on the operations performed. Never use logic of the form if(float == float), ever!
Float numbers are represented in the form significant digits × base^exponent (IEEE 754). In your case, float 1.82 = 1 + 0.5 + 0.25 + 0.0625 + ...
Since only a limited number of digits can be stored, there will be a rounding error if the number cannot be represented as a terminating expansion in the relevant base (base 2 in this case).
You should always check relative differences with floating point numbers, not absolute values.
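A relative comparison might be sketched like this. The function name nearly_equal and both default tolerances are assumptions chosen for illustration; suitable values depend on how much rounding error the preceding computation can accumulate.

```cpp
#include <algorithm>
#include <cmath>

// True if a and b differ by at most rel_tol times the larger magnitude.
// abs_tol is a floor so that values very close to zero can still match.
bool nearly_equal(double a, double b,
                  double rel_tol = 1e-9, double abs_tol = 1e-12) {
    double diff = std::fabs(a - b);
    double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(rel_tol * scale, abs_tol);
}
```

For instance, nearly_equal(0.1 + 0.2, 0.3) is true even though the two sides are not bit-for-bit identical.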
You need to read this, too.
Computers don't store floating point numbers exactly. To accomplish what you are doing, you could store the original input as a string, and count the number of characters.