Representing probability in C++ - c++

I'm trying to represent a simple set of 3 probabilities in C++. For example:
a = 0.1
b = 0.2
c = 0.7
(As far as I know probabilities must add up to 1)
My problem is that when I try to represent 0.7 in C++ as a float I end up with 0.69999999, which won't help when I am doing my calculations later. The same for 0.8, 0.80000001.
Is there a better way of representing numbers between 0.0 and 1.0 in C++?
Bear in mind that this relates to how the numbers are stored in memory so that when it comes to doing tests on the values they are correct, I'm not concerned with how they are display/printed out.

This has nothing to do with C++ and everything to do with how floating point numbers are represented in memory. You should never use the equality operator to compare floating point values, see here for better methods: http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

My problem is that when I try to
represent 0.7 in C++ as a float I end
up with 0.69999999, which won't help
when I am doing my calculations later.
The same for 0.8, 0.80000001.
Is it really a problem? If you just need more precision, use a double instead of a float. That should get you about 15 digits precision, more than enough for most work.
Consider your source data. Is 0.7 really significantly more correct than 0.69999999?
If so, you could use a rational number library such as:
http://www.boost.org/doc/libs/1_40_0/libs/rational/index.html
If the problem is that probabilities add up to 1 by definition, then store them as a collection of numbers, omitting the last one. Infer the last value by subtracting the sum of the others from 1.

How much precision do you need? You might consider scaling the values and quantizing them in a fixed-point representation.

The tests you want to do with your numbers will be incorrect.
There is no exact floating point representation in a base-2 number system for a number like 0.1, because it is a infinte periodic number. Consider one third, that is exactly representable as 0.1 in a base-3 system, but 0.333... in the base-10 system.
So any test you do with a number 0.1 in floating point will be prone to be flawed.
A solution would be using rational numbers (boost has a rational lib), which will be always exact for, ermm, rationals, or use a selfmade base-10 system by multiplying the numbers with a power of ten.

If you really need the precision, and are sticking with rational numbers, I suppose you could go with a fixed point arithemtic. I've not done this before so I can't recommend any libraries.
Alternatively, you can set a threshold when comparing fp numbers, but you'd have to err on one side or another -- say
bool fp_cmp(float a, float b) {
return (a < b + epsilon);
}
Note that excess precision is automatically truncated in each calculation, so you should take care when operating at many different orders of magnitude in your algorithm. A contrived example to illustrate:
a = 15434355e10 + 22543634e10
b = a / 1e20 + 1.1534634
c = b * 1e20
versus
c = b + 1.1534634e20
The two results will be very different. Using the first method a lot of the precision of the first two numbers will be lost in the divide by 1e20. Assuming that the final value you want is on the order of 1e20, the second method will give you more precision.

If you only need a few digits of precision then just use an integer. If you need better precision then you'll have to look to different libraries that provide guarantees on precision.

The issue here is that floating point numbers are stored in base 2. You can not exactly represent a decimal in base 10 with a floating point number in base 2.
Lets step back a second. What does .1 mean? Or .7? They mean 1x10-1 and 7x10-1. If you're using binary for your number, instead of base 10 as we normally do, .1 means 1x2-1, or 1/2. .11 means 1x2-1 + 1x2-2, or 1/2+1/4, or 3/4.
Note how in this system, the denominator is always a power of 2. You cannot represent a number without a denominator that is a power of 2 in a finite number of digits. For instance, .1 (in decimal) means 1/10, but in binary that is an infinite repeating fraction, 0.000110011... (with the 0011 pattern repeating forever). This is similar to how in base 10, 1/3 is an infinite fraction, 0.3333....; base 10 can only represent numbers exactly with a denominator that is a multiple of powers of 2 and 5. (As an aside, base 12 and base 60 are actually really convenient bases, since 12 is divisible by 2, 3, and 4, and 60 is divisible by 2, 3, 4, and 5; but for some reason we use decimal anyhow, and we use binary in computers).
Since floating point numbers (or fixed point numbers) always have a finite number of digits, they cannot represent these infinite repeating fractions exactly. So, they either truncate or round the values to be as close as possible to the real value, but are not equal to the real value exactly. Once you start adding up these rounded values, you start getting more error. In decimal, if your representation of 1/3 is .333, then three copies of that will add up to .999, not 1.
There are four possible solutions. If all you care about is exactly representing decimal fractions like .1 and .7 (as in, you don't care that 1/3 will have the same problem you mention), then you can represent your numbers as decimal, for instance using binary coded decimal, and manipulate those. This is a common solution in finance, where many operations are defined in terms of decimal. This has the downside that you will need to implement all of your own arithmetic operations yourself, without the benefits of the computer's FPU, or find a decimal arithmetic library. This also, as mentioned, does not help with fractions that can't be represented exactly in decimal.
Another solution is to use fractions to represent your numbers. If you use fractions, with bignums (arbitrarily large numbers) for your numerators and denominators, you can represent any rational number that will fit in the memory of your computer. Again, the downside is that arithmetic will be slower, and you'll need to implement arithmetic yourself or use an existing library. This will solve your problem for all rational numbers, but if you wind up with a probability that is computed based on π or √2, you will still have the same issues with not being able to represent them exactly, and need to also use one of the later solutions.
A third solution, if all you care about is getting your numbers to add up to 1 exactly, is for events where you have n possibilities, to only store the values of n-1 of those probabilities, and compute the probability of the last as 1 minus the sum of the rest of the probabilities.
And a fourth solution is to do what you always need to remember when working with floating point numbers (or any inexact numbers, such as fractions being used to represent irrational numbers), and never compare two numbers for equality. Again in base 10, if you add up 3 copies of 1/3, you will wind up with .999. When you want to compare that number to 1, you have to instead compare to see if it is close enough to 1; check that the absolute value of the difference, 1-.999, is less than a threshold, such as .01.

Binary machines always round decimal fractions (except .0 and .5, .25, .75, etc) to values that don't have an exact representation in floating point. This has nothing to do with the language C++. There is no real way around it except to deal with it from a numerical perspective within your code.
As for actually producing the probabilities you seek:
float pr[3] = {0.1, 0.2, 0.7};
float accPr[3];
float prev = 0.0;
int i = 0;
for (i = 0; i < 3; i++) {
accPr[i] = prev + pr[i];
prev = accPr[i];
}
float frand = rand() / (1 + RAND_MAX);
for (i = 0; i < 2; i++) {
if (frand < accPr[i]) break;
}
return i;

I'm sorry to say there's not really an easy answer to your problem.
It falls into a field of study called "Numerical Analysis" that deals with these types of problems (which goes far beyond just making sure you don't check for equality between 2 floating point values). And by field of study, I mean there are a slew of books, journal articles, courses etc. dealing with it. There are people who do their PhD thesis on it.
All I can say is that that I'm thankful I don't have to deal with these issues very much, because the problems and the solutions are often very non-intuitive.
What you might need to do to deal with representing the numbers and calculations you're working on is very dependent on exactly what operations you're doing, the order of those operations and the range of values that you expect to deal with in those operations.

Depending on the requirements of your applications any one of several solutions could be best:
You live with the inherent lack of precision and use floats or doubles. You cannot test either for equality and this implies that you cannot test the sum of your probabilities for equality with 1.0.
As proposed before, you can use integers if you require a fixed precision. You represent 0.7 as 7, 0.1 as 1, 0.2 as 2 and they will add up perfectly to 10, i.e., 1.0. If you have to calculate with your probabilities, especially if you do division and multiplication, you need to round the results correctly. This will introduce an imprecision again.
Represent your numbers as fractions with a pair of integers (1,2) = 1/2 = 0.5. Precise, more flexible than 2) but you don't want to calculate with those.
You can go all the way and use a library that implements rational numbers (e.g. gmp). Precise, with arbitrary precision, you can calculate with it, but slow.

yeah, I'd scale the numbers (0-100)(0-1000) or whatever fixed size you need if you're worried about such things. It also makes for faster math computation in most cases. Back in the bad-old-days, we'd define entire cos/sine tables and other such bleh in integer form to reduce floating fuzz and increase computation speed.
I do find it a bit interesting that a "0.7" fuzzes like that on storage.

Related

Is a floating-point value of 0.0 represented differently from other floating-point values?

I've been going back through my C++ book, and I came across a statement that says zero can be represented exactly as a floating-point number. I was wondering how this is possible unless the value of 0.0 is stored as a type other than a floating point value. I wrote the following code to test this:
#include <iomanip>
#include <iostream>
int main()
{
float value1 {0.0};
float value2 {0.1};
std::cout << std::setprecision(10) << std::fixed;
std::cout << value1 << '\n'
<< value2 << std::endl;
}
Running this code gave the following output:
0.0000000000
0.1000000015
To 10 digits of precision, 0.0 is still 0, and 0.1 has some inaccuracies (which is to be expected). Is a value of 0.0 different from other floating point numbers in the way it is represented, and is this a feature of the compiler or the computer's architecture?
How can 2 be represented as an exact number? 4? 15? 0.5? The answer is just that some numbers can be represented exactly in the floating-point format (which is based on base-2/binary) and others can't.
This is no different from in decimal. You can't represent 1/3 exactly in decimal, but that doesn't mean you can't represent 0.
Zero is special in a way, because (like the other real numbers) it's more trivial to prove this property than for some arbitrary fractional number. But that's about it.
So:
what is it about these values (0, 1/16, 1/2048, ...) that allows them to be represented exactly.
Simple mathematics. In any given base, in the sort of representation we're talking about, some numbers can be written out with a fixed number of decimal places; others can't. That's it.
You can play online with H. Schmidt's IEEE-754 Floating Point Converter for different numbers to see a bunch of different representations, and what errors come about as a result of encoding into those representations. For starters, try 0.5, 0.2 and 0.1.
It was my (perhaps naive) understanding that all floating point values contained some instability.
No, absolutely not.
You want to treat every floating point value in your program as potentially having some small error on it, because you generally don't know what sequence of calculations led to it. You can't trust it, in general. I expect someone half-taught this to you in the past, and that's what led to your misunderstanding.
But, if you do know the error (or lack thereof) involved at each step in the creation of the value (e.g. "all I've done is initialised it to zero"), then that's fine! No need to worry about it then.
Here is one way to look at the situation: with 64 bits to store a number, there are 2^64 bit patterns. Some of these are "not-a-number" representations, but most of the 2^64 patterns represent numbers. The number that is represented is represented exactly, with no error. This might seem strange after learning about floating point math; a caveat lurks ahead.
However, as huge as 2^64 is, there are infinitely many more real numbers. When a calculation produces a non-integer result, the odds are pretty good that the answer will not be a number represented by one of the 2^64 patterns. There are exceptions. For example, 1/2 is represented by one of the patterns. If you store 0.5 in a floating point variable, it will actually store 0.5. Let's try that for other single-digit denominators. (Note: I am writing fractions for their expressive power; I do not intend integer arithmetic.)
1/1 – stored exactly
1/2 – stored exactly
1/3 – not stored exactly
1/4 – stored exactly
1/5 – not stored exactly
1/6 – not stored exactly
1/7 – not stored exactly
1/8 – stored exactly
1/9 – not stored exactly
So with these simple examples, over half are not stored exactly. When you get into more complicated calculations, any one piece of the calculation can throw you off the islands of exact representation. Do you see why the general rule of thumb is that floating point values are not exact? It is incredibly easy to fall into that realm. It is possible to avoid it, but don't count on it.
Some numbers can be represented exactly by a floating point value. Most cannot.

Why are there random trash digits in floating point numbers? [duplicate]

There have been several questions posted to SO about floating-point representation. For example, the decimal number 0.1 doesn't have an exact binary representation, so it's dangerous to use the == operator to compare it to another floating-point number. I understand the principles behind floating-point representation.
What I don't understand is why, from a mathematical perspective, are the numbers to the right of the decimal point any more "special" that the ones to the left?
For example, the number 61.0 has an exact binary representation because the integral portion of any number is always exact. But the number 6.10 is not exact. All I did was move the decimal one place and suddenly I've gone from Exactopia to Inexactville. Mathematically, there should be no intrinsic difference between the two numbers -- they're just numbers.
By contrast, if I move the decimal one place in the other direction to produce the number 610, I'm still in Exactopia. I can keep going in that direction (6100, 610000000, 610000000000000) and they're still exact, exact, exact. But as soon as the decimal crosses some threshold, the numbers are no longer exact.
What's going on?
Edit: to clarify, I want to stay away from discussion about industry-standard representations, such as IEEE, and stick with what I believe is the mathematically "pure" way. In base 10, the positional values are:
... 1000 100 10 1 1/10 1/100 ...
In binary, they would be:
... 8 4 2 1 1/2 1/4 1/8 ...
There are also no arbitrary limits placed on these numbers. The positions increase indefinitely to the left and to the right.
Decimal numbers can be represented exactly, if you have enough space - just not by floating binary point numbers. If you use a floating decimal point type (e.g. System.Decimal in .NET) then plenty of values which can't be represented exactly in binary floating point can be exactly represented.
Let's look at it another way - in base 10 which you're likely to be comfortable with, you can't express 1/3 exactly. It's 0.3333333... (recurring). The reason you can't represent 0.1 as a binary floating point number is for exactly the same reason. You can represent 3, and 9, and 27 exactly - but not 1/3, 1/9 or 1/27.
The problem is that 3 is a prime number which isn't a factor of 10. That's not an issue when you want to multiply a number by 3: you can always multiply by an integer without running into problems. But when you divide by a number which is prime and isn't a factor of your base, you can run into trouble (and will do so if you try to divide 1 by that number).
Although 0.1 is usually used as the simplest example of an exact decimal number which can't be represented exactly in binary floating point, arguably 0.2 is a simpler example as it's 1/5 - and 5 is the prime that causes problems between decimal and binary.
Side note to deal with the problem of finite representations:
Some floating decimal point types have a fixed size like System.Decimal others like java.math.BigDecimal are "arbitrarily large" - but they'll hit a limit at some point, whether it's system memory or the theoretical maximum size of an array. This is an entirely separate point to the main one of this answer, however. Even if you had a genuinely arbitrarily large number of bits to play with, you still couldn't represent decimal 0.1 exactly in a floating binary point representation. Compare that with the other way round: given an arbitrary number of decimal digits, you can exactly represent any number which is exactly representable as a floating binary point.
For example, the number 61.0 has an exact binary representation because the integral portion of any number is always exact. But the number 6.10 is not exact. All I did was move the decimal one place and suddenly I've gone from Exactopia to Inexactville. Mathematically, there should be no intrinsic difference between the two numbers -- they're just numbers.
Let's step away for a moment from the particulars of bases 10 and 2. Let's ask - in base b, what numbers have terminating representations, and what numbers don't? A moment's thought tells us that a number x has a terminating b-representation if and only if there exists an integer n such that x b^n is an integer.
So, for example, x = 11/500 has a terminating 10-representation, because we can pick n = 3 and then x b^n = 22, an integer. However x = 1/3 does not, because whatever n we pick we will not be able to get rid of the 3.
This second example prompts us to think about factors, and we can see that for any rational x = p/q (assumed to be in lowest terms), we can answer the question by comparing the prime factorisations of b and q. If q has any prime factors not in the prime factorisation of b, we will never be able to find a suitable n to get rid of these factors.
Thus for base 10, any p/q where q has prime factors other than 2 or 5 will not have a terminating representation.
So now going back to bases 10 and 2, we see that any rational with a terminating 10-representation will be of the form p/q exactly when q has only 2s and 5s in its prime factorisation; and that same number will have a terminating 2-representatiion exactly when q has only 2s in its prime factorisation.
But one of these cases is a subset of the other! Whenever
q has only 2s in its prime factorisation
it obviously is also true that
q has only 2s and 5s in its prime factorisation
or, put another way, whenever p/q has a terminating 2-representation, p/q has a terminating 10-representation. The converse however does not hold - whenever q has a 5 in its prime factorisation, it will have a terminating 10-representation , but not a terminating 2-representation. This is the 0.1 example mentioned by other answers.
So there we have the answer to your question - because the prime factors of 2 are a subset of the prime factors of 10, all 2-terminating numbers are 10-terminating numbers, but not vice versa. It's not about 61 versus 6.1 - it's about 10 versus 2.
As a closing note, if by some quirk people used (say) base 17 but our computers used base 5, your intuition would never have been led astray by this - there would be no (non-zero, non-integer) numbers which terminated in both cases!
The root (mathematical) reason is that when you are dealing with integers, they are countably infinite.
Which means, even though there are an infinite amount of them, we could "count out" all of the items in the sequence, without skipping any. That means if we want to get the item in the 610000000000000th position in the list, we can figure it out via a formula.
However, real numbers are uncountably infinite. You can't say "give me the real number at position 610000000000000" and get back an answer. The reason is because, even between 0 and 1, there are an infinite number of values, when you are considering floating-point values. The same holds true for any two floating point numbers.
More info:
http://en.wikipedia.org/wiki/Countable_set
http://en.wikipedia.org/wiki/Uncountable_set
Update:
My apologies, I appear to have misinterpreted the question. My response is about why we cannot represent every real value, I hadn't realized that floating point was automatically classified as rational.
To repeat what I said in my comment to Mr. Skeet: we can represent 1/3, 1/9, 1/27, or any rational in decimal notation. We do it by adding an extra symbol. For example, a line over the digits that repeat in the decimal expansion of the number. What we need to represent decimal numbers as a sequence of binary numbers are 1) a sequence of binary numbers, 2) a radix point, and 3) some other symbol to indicate the repeating part of the sequence.
Hehner's quote notation is a way of doing this. He uses a quote symbol to represent the repeating part of the sequence. The article: http://www.cs.toronto.edu/~hehner/ratno.pdf and the Wikipedia entry: http://en.wikipedia.org/wiki/Quote_notation.
There's nothing that says we can't add a symbol to our representation system, so we can represent decimal rationals exactly using binary quote notation, and vice versa.
BCD - Binary-coded Decimal - representations are exact. They are not very space-efficient, but that's a trade-off you have to make for accuracy in this case.
This is a good question.
All your question is based on "how do we represent a number?"
ALL the numbers can be represented with decimal representation or with binary (2's complement) representation. All of them !!
BUT some (most of them) require infinite number of elements ("0" or "1" for the binary position, or "0", "1" to "9" for the decimal representation).
Like 1/3 in decimal representation (1/3 = 0.3333333... <- with an infinite number of "3")
Like 0.1 in binary ( 0.1 = 0.00011001100110011.... <- with an infinite number of "0011")
Everything is in that concept. Since your computer can only consider finite set of digits (decimal or binary), only some numbers can be exactly represented in your computer...
And as said Jon, 3 is a prime number which isn't a factor of 10, so 1/3 cannot be represented with a finite number of elements in base 10.
Even with arithmetic with arbitrary precision, the numbering position system in base 2 is not able to fully describe 6.1, although it can represent 61.
For 6.1, we must use another representation (like decimal representation, or IEEE 854 that allows base 2 or base 10 for the representation of floating-point values)
If you make a big enough number with floating point (as it can do exponents), then you'll end up with inexactness in front of the decimal point, too. So I don't think your question is entirely valid because the premise is wrong; it's not the case that shifting by 10 will always create more precision, because at some point the floating point number will have to use exponents to represent the largeness of the number and will lose some precision that way as well.
It's the same reason you cannot represent 1/3 exactly in base 10, you need to say 0.33333(3). In binary it is the same type of problem but just occurs for different set of numbers.
(Note: I'll append 'b' to indicate binary numbers here. All other numbers are given in decimal)
One way to think about things is in terms of something like scientific notation. We're used to seeing numbers expressed in scientific notation like, 6.022141 * 10^23. Floating point numbers are stored internally using a similar format - mantissa and exponent, but using powers of two instead of ten.
Your 61.0 could be rewritten as 1.90625 * 2^5, or 1.11101b * 2^101b with the mantissa and exponents. To multiply that by ten and (move the decimal point), we can do:
(1.90625 * 2^5) * (1.25 * 2^3) = (2.3828125 * 2^8) = (1.19140625 * 2^9)
or in with the mantissa and exponents in binary:
(1.11101b * 2^101b) * (1.01b * 2^11b) = (10.0110001b * 2^1000b) = (1.00110001b * 2^1001b)
Note what we did there to multiply the numbers. We multiplied the mantissas and added the exponents. Then, since the mantissa ended greater than two, we normalized the result by bumping the exponent. It's just like when we adjust the exponent after doing an operation on numbers in decimal scientific notation. In each case, the values that we worked with had a finite representation in binary, and so the values output by the basic multiplication and addition operations also produced values with a finite representation.
Now, consider how we'd divide 61 by 10. We'd start by dividing the mantissas, 1.90625 and 1.25. In decimal, this gives 1.525, a nice short number. But what is this if we convert it to binary? We'll do it the usual way -- subtracting out the largest power of two whenever possible, just like converting integer decimals to binary, but we'll use negative powers of two:
1.525 - 1*2^0 --> 1
0.525 - 1*2^-1 --> 1
0.025 - 0*2^-2 --> 0
0.025 - 0*2^-3 --> 0
0.025 - 0*2^-4 --> 0
0.025 - 0*2^-5 --> 0
0.025 - 1*2^-6 --> 1
0.009375 - 1*2^-7 --> 1
0.0015625 - 0*2^-8 --> 0
0.0015625 - 0*2^-9 --> 0
0.0015625 - 1*2^-10 --> 1
0.0005859375 - 1*2^-11 --> 1
0.00009765625...
Uh oh. Now we're in trouble. It turns out that 1.90625 / 1.25 = 1.525, is a repeating fraction when expressed in binary: 1.11101b / 1.01b = 1.10000110011...b Our machines only have so many bits to hold that mantissa and so they'll just round the fraction and assume zeroes beyond a certain point. The error you see when you divide 61 by 10 is the difference between:
1.100001100110011001100110011001100110011...b * 2^10b
and, say:
1.100001100110011001100110b * 2^10b
It's this rounding of the mantissa that leads to the loss of precision that we associate with floating point values. Even when the mantissa can be expressed exactly (e.g., when just adding two numbers), we can still get numeric loss if the mantissa needs too many digits to fit after normalizing the exponent.
We actually do this sort of thing all the time when we round decimal numbers to a manageable size and just give the first few digits of it. Because we express the result in decimal it feels natural. But if we rounded a decimal and then converted it to a different base, it'd look just as ugly as the decimals we get due to floating point rounding.
I'm surprised no one has stated this yet: use continued fractions. Any rational number can be represented finitely in binary this way.
Some examples:
1/3 (0.3333...)
0; 3
5/9 (0.5555...)
0; 1, 1, 4
10/43 (0.232558139534883720930...)
0; 4, 3, 3
9093/18478 (0.49209871198181621387596060179673...)
0; 2, 31, 7, 8, 5
From here, there are a variety of known ways to store a sequence of integers in memory.
In addition to storing your number with perfect accuracy, continued fractions also have some other benefits, such as best rational approximation. If you decide to terminate the sequence of numbers in a continued fraction early, the remaining digits (when recombined to a fraction) will give you the best possible fraction. This is how approximations to pi are found:
Pi's continued fraction:
3; 7, 15, 1, 292 ...
Terminating the sequence at 1, this gives the fraction:
355/113
which is an excellent rational approximation.
In the equation
2^x = y ;
x = log(y) / log(2)
Hence, I was just wondering if we could have a logarithmic base system for binary like,
2^1, 2^0, 2^(log(1/2) / log(2)), 2^(log(1/4) / log(2)), 2^(log(1/8) / log(2)),2^(log(1/16) / log(2)) ........
That might be able to solve the problem, so if you wanted to write something like 32.41 in binary, that would be
2^5 + 2^(log(0.4) / log(2)) + 2^(log(0.01) / log(2))
Or
2^5 + 2^(log(0.41) / log(2))
The problem is that you do not really know whether the number actually is exactly 61.0 . Consider this:
float a = 60;
float b = 0.1;
float c = a + b * 10;
What is the value of c? It is not exactly 61, because b is not really .1 because .1 does not have an exact binary representation.
The number 61.0 does indeed have an exact floating-point operation—but that's not true for all integers. If you wrote a loop that added one to both a double-precision floating point number and a 64-bit integer, eventually you'd reach a point where the 64-bit integer perfectly represents a number, but the floating point doesn't—because there aren't enough significant bits.
It's just much easier to reach the point of approximation on the right side of the decimal point. If you started writing out all the numbers in binary floating point, it'd make more sense.
Another way of thinking about it is that when you note that 61.0 is perfectly representable in base 10, and shifting the decimal point around doesn't change that, you're performing multiplication by powers of ten (10^1, 10^-1). In floating point, multiplying by powers of two does not affect the precision of the number. Try taking 61.0 and dividing it by three repeatedly for an illustration of how a perfectly precise number can lose its precise representation.
There's a threshold because the meaning of the digit has gone from integer to non-integer. To represent 61, you have 6*10^1 + 1*10^0; 10^1 and 10^0 are both integers. 6.1 is 6*10^0 + 1*10^-1, but 10^-1 is 1/10, which is definitely not an integer. That's how you end up in Inexactville.
A parallel can be made of fractions and whole numbers. Some fractions eg 1/7 cannot be represented in decimal form without lots and lots of decimals. Because floating point is binary based the special cases change but the same sort of accuracy problems present themselves.
There are an infinite number of rational numbers, and a finite number of bits with which to represent them. See http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems.
you know integer numbers right? each bit represent 2^n
2^4=16
2^3=8
2^2=4
2^1=2
2^0=1
well its the same for floating point(with some distinctions) but the bits represent 2^-n
2^-1=1/2=0.5
2^-2=1/(2*2)=0.25
2^-3=0.125
2^-4=0.0625
Floating point binary representation:
sign Exponent Fraction(i think invisible 1 is appended to the fraction )
B11 B10 B9 B8 B7 B6 B5 B4 B3 B2 B1 B0
The high scoring answer above nailed it.
First you were mixing base 2 and base 10 in your question, then when you put a number on the right side that is not divisible into the base you get problems. Like 1/3 in decimal because 3 doesnt go into a power of 10 or 1/5 in binary which doesnt go into a power of 2.
Another comment though NEVER use equal with floating point numbers, period. Even if it is an exact representation there are some numbers in some floating point systems that can be accurately represented in more than one way (IEEE is bad about this, it is a horrible floating point spec to start with, so expect headaches). No different here 1/3 is not EQUAL to the number on your calculator 0.3333333, no matter how many 3's there are to the right of the decimal point. It is or can be close enough but is not equal. so you would expect something like 2*1/3 to not equal 2/3 depending on the rounding. Never use equal with floating point.
As we have been discussing, in floating point arithmetic, the decimal 0.1 cannot be perfectly represented in binary.
Floating point and integer representations provide grids or lattices for the numbers represented. As arithmetic is done, the results fall off the grid and have to be put back onto the grid by rounding. Example is 1/10 on a binary grid.
If we use binary coded decimal representation as one gentleman suggested, would we be able to keep numbers on the grid?
For a simple answer: The computer doesn't have infinite memory to store fraction (after representing the decimal number as the form of scientific notation). According to IEEE 754 standard for double-precision floating-point numbers, we only have a limit of 53 bits to store fraction.
For more info: http://mathcenter.oxford.emory.edu/site/cs170/ieee754/
I will not bother to repeat what the other 20 answers have already summarized, so I will just answer briefly:
The answer in your content:
Why can't base two numbers represent certain ratios exactly?
For the same reason that decimals are insufficient to represent certain ratios, namely, irreducible fractions with denominators containing prime factors other than two or five which will always have an indefinite string in at least the mantissa of its decimal expansion.
Why can't decimal numbers be represented exactly in binary?
This question at face value is based on a misconception regarding values themselves. No number system is sufficient to represent any quantity or ratio in a manner that the thing itself tells you that it is both a quantity, and at the same time also gives the interpretation in and of itself about the intrinsic value of the representation. As such, all quantitative representations, and models in general, are symbolic and can only be understood a posteriori, namely, after one has been taught how to read and interpret these numbers.
Since models are subjective things that are true insofar as they reflect reality, we do not strictly need to interpret a binary string as sums of negative and positive powers of two. Instead, one may observe that we can create an arbitrary set of symbols that use base two or any other base to represent any number or ratio exactly. Just consider that we can refer to all of infinity using a single word and even a single symbol without "showing infinity" itself.
As an example, I am designing a binary encoding for mixed numbers so that I can have more precision and accuracy than an IEEE 754 float. At the time of writing this, the idea is to have a sign bit, a reciprocal bit, a certain number of bits for a scalar to determine how much to "magnify" the fractional portion, and then the remaining bits are divided evenly between the integer portion of a mixed number, and the latter a fixed-point number which, if the reciprocal bit is set, should be interpreted as one divided by that number. This has the benefit of allowing me to represent numbers with infinite decimal expansions by using their reciprocals which do have terminating decimal expansions, or alternatively, as a fraction directly, potentially as an approximation, depending on my needs.
You can't represent 0.1 exactly in binary for the same reason you can't measure 0.1 inch using a conventional English ruler.
English rulers, like binary fractions, are all about halves. You can measure half an inch, or a quarter of an inch (which is of course half of a half), or an eighth, or a sixteenth, etc.
If you want to measure a tenth of an inch, though, you're out of luck. It's less than an eighth of an inch, but more than a sixteenth. If you try to get more exact, you find that it's a little more than 3/32, but a little less than 7/64. I've never seen an actual ruler that had gradations finer than 64ths, but if you do the math, you'll find that 1/10 is less than 13/128, and it's more than 25/256, and it's more than 51/512. You can keep going finer and finer, to 1024ths and 2048ths and 4096ths and 8192nds, but you will never find an exact marking, even on an infinitely-fine base-2 ruler, that exactly corresponds to 1/10, or 0.1.
You will find something interesting, though. Let's look at all the approximations I've listed, and for each one, record explicitly whether 0.1 is less or greater:
fraction
decimal
0.1 is...
as 0/1
1/2
0.5
less
0
1/4
0.25
less
0
1/8
0.125
less
0
1/16
0.0625
greater
1
3/32
0.09375
greater
1
7/64
0.109375
less
0
13/128
0.1015625
less
0
25/256
0.09765625
greater
1
51/512
0.099609375
greater
1
103/1024
0.1005859375
less
0
205/2048
0.10009765625
less
0
409/4096
0.099853515625
greater
1
819/8192
0.0999755859375
greater
1
Now, if you read down the last column, you get 0001100110011. It's no coincidence that the infinitely-repeating binary fraction for 1/10 is 0.0001100110011...

Parsing decimal values with GMP?

How can I parse a decimal value exactly? That is, I have a string of the value, like "43.879" and I wish to get an exact GMP value. I'm not clear from the docs how, or whether this is actually possible. It doesn't seem to fit in the integer/rational/float value types -- though perhaps twiddling with rational is possible.
My intent is to retain exact precision decimals over operations like addition and subtraction, but to switch to high-precision floating point for operations like division or exponents.
Most libraries give you arbitrarily large precision, GMP included. However even with large precision there are some numbers that cannot be represented exactly in binary format, much same as you cannot represent 1/3 in decimal. For many applications setting the precision to a high number, like 10, doing the calculations, and then rounding the results back to desired precision, like 3 works. Would it not work for you? See this - Is there a C++ equivalent to Java's BigDecimal?
You could also use http://software.intel.com/en-us/articles/intel-decimal-floating-point-math-library
********* EDITS
Exact representation DOES NOT exist in binary floating point for many numbers; the kind that most current floating point libraries offer. A number like 0.1 CANNOT be represented as a binary number, whatever be the precision.
To be able to do what you suggest the library would have to do equivalent of 'hand addition', 'hand division' - the kind you do on pencil and paper to add two decimal numbers. e.g. to store 0.1, the library might elect to represent it as a string itself and then do additions on strings. Needless to say a naive implementation would make the process exceedingly slow - orders of magnitude slow. To add 0.1 + 0.1, it would have to parse string, add 1+1, remember the carries, remember the decimal position etc. That is the thing that the computer micro code does for you in few CPU cycles (or a single instruction). Instead of single instruction, your software library would end up taking like 100 CPU cycles/instructions.
If it tries to convert 0.1 to a number, it is back to square 1 - 0.1 cannot be a number in binary.
However people do recognize the need for representing 0.1 exactly. It is just that binary number representation is NOT going to do it. That is where newer floating point standards come in, and that is where the intel decimal point library is headed.
Repeating my previous example, suppose you had a 10 base computer that could do 10 base numbers. that computer cannot store 1/3 as a 'plain' floating point number. It would have to store the representation that the number is 1/3. The equivalent of how it is written on paper. Try writing 1/3 as a base 10 floating point number on paper.
Also see Why can't decimal numbers be represented exactly in binary?

Subtracting double gives wrong result

I am trying to get the decimal part from the double and this is my code to get the decimal part
double decimalvalue = 23423.1234-23423.0;
0.12340000000040163
But after the subtraction I am expecting decimalvalue to be 0.1234 but I get 0.12340000000040163. Please help me to understand this behavior and if there is any workaround for it.
I suggest you have a look at
What Every Computer Scientist Should Know About Floating-Point Arithmetic
Wikipedia: IEEE 754
There are a finite number of values you can specify in a floating point number, but an infinite number of floating point numbers in the represented range.
Some floating point numbers therefore cannot be represented exactly in any floating/double style data type.
The typical way to handle your specific problem is to avoid a direct equality comparison, but rather do an epsilon test: See if the expected and computed values are within some small number (compared to the values being subtracted), called epsilon, of each other.
Indirectly related is the concept of Machine Epsilon, worth having a look at for a complete understanding
This is a rounding error. In base ten you cannot perfectly represent 1/3 in a given number of digits (say 15). In base 2 there are a lot more things you can not represent, 0.1234 happens to be one of them. The precision depends on the scale, but it's about 15 decimal digits for a double. I would suggest taking a look at http://en.wikipedia.org/wiki/IEEE_floating_point for more details on floating point numbers.
If you are trying to make a base 10 system (like a human used calculator for instance) and you need exact results you should use BCD.

Avoiding erratic tiny numbers when working with floating point numbers

It some times happen when I use floating point numbers in c++ and only use numbers as multiples of, say 0.1, as an increment in a for loop, the actual number which is the loop iterators is not exactly multiples of 0.1 but has unpredictably other added or subtracted tiny numbers of the order of 1^-17. How can I avoid that?
Use integers for the iteration and multiply by the floating-point increment before using.
Alternatively find a decimal math package and use it instead of floating point.
Don't iterate over floating-point numbers.
The problem is that 0.1 can't be exactly represented in floating-point. So instead, you should do something like:
for (int i = 0; i < N; i++)
{
float f = i * 0.1f;
...
}
Here is an excellent article on the subject of working with floats. There is a discussion covering precisely your example - an increment of 0.1:
for (double r=0.0; r!=1.0; r+=0.1) printf("*");
How many stars is it going to print? Ten? Run it and be surprised. The code just keeps on printing the stars until we break it.
Where's the problem? As we already know, doubles are not infinitely precise. The problem we encountered here is the following: In binary, the representation of 0.1 is not finite (as it is in base 10). Decimal 0.1 is equivalent to binary 0.0(0011), where the part in the parentheses is repeated forever. When 0.1 is stored in a double variable, it gets rounded to the closest representable value. Thus if we add it 10 times the result is not exactly equal to one.
I highly recommend reading the whole article if you work a lot with floating point numbers.