What base would be more appropriate for my BigInteger library? - c++

I have developed my own BigInteger library in C++, for didactic purpose, initially I have used base 10 and it works fine for add, subtract and multiply, but for some algorithms such as exponentiation, modular exponentiation and division it appears to be more appropriate use base 2.
I am thinking restart my project from scratch and I would know what base do you think is more adequate and why?. Thanks in advance!

If you look at most BigNum type libraries you will see that they are built on top of the existing "SmallNum" data types. These "SmallNum" data types (short, int, long, float, double, ...) are in binary for too many reasons to count. Rather than a vector of base 10 digits, your code will be much faster (much, much, much faster!) if work with a vector of (for example) unsigned ints.
This is one of those places where performance does count. Suppose you use a BigNum package to solve a problem that could be solved without resorting to BigNums. Even the best BigNum library will be much slower (much, much slower) than will a simplistic, non-BigNum approach. If you try to solve a problem that is beyond the bounds of the standard representations that performance penalty will make things even worse.
The best way to overcome this inherent penalty is to take as much advantage of the builtin types as you possibly can.

A base the same as the word size on your target machine, where you have word x word = doubleword as a primitive. The primitive operations work out neatly in terms of machine instructions.

A base 10 representation makes sense for a BigDecimal type, where you need to need to represent decimal fractions exactly (for financial calculations and the like).
I can't see much benefit to using a base 10 representation for a BigInteger. It makes string conversion really easy, but at the expense of making mathematical operations much more complicated. This doesn't seem like a good tradeoff in most cases.

Related

How raise to power works? Is it worth to use pow(x, 2)?

Is it more efficient to do multiplication than raise to power 2 in c++?
I am trying to do final detailed optimizations. Will the compiler treat
x*x the same as pow(x,2)? If I remember correctly, multiplication was
better for some reason, but maybe it does not matter in c++11.
Thanks
If you're comparing multiplication with the pow() standard library function then yes, multiplication is definitely faster.
I general, you should not worry about pico-optimizations like that unless you have evidence that there is a hot-spot (i.e. unless you've profiled your code under realistic scenarios and have identified a particular chunk of code. Also keep in mind that your clever tricks may actually cause performance regressions in new processors where your assumptions will no longer hold.
Algorithmic changes are where you will get the most bang for your computing buck. Focus on that.
Tinkering with multiplications and doing clever bit-hackery... eh not so much bang there* Because the current generation of optimizing compilers is really quite excellent at their job. That's not to say they can't be beat. They can, but not easily and probably only by a few people like Agner Fog.
* there are, of course, exceptions.
When it comes to performance, always make measurements to back up your assumptions. Never trust theory unless you have a benchmark that proves that theory right.
Also, keep in mind that x ^ 2 does not yield the square of 2 in C++:
#include <iostream>
int main()
{
int x = 4;
std::cout << (x ^ 2); // Prints 6
}
Live example.
The implementation of pow() typically involves logarithms, multiplication and expononentiaton, so it will DEFINITELY take longer than a simple multiplication. Most modern high end processors can do multiplication in a couple of clockcycles for integer values, and a dozen or so cycles for floating point multiply. exponentiation is either done as a complex (microcoded) instructions that takes a few dozen or more cycles, or as a series of multiplication and additions (typically with alternating positive and negative numbers, but not certainly). Exponentiation is a similar process.
On lower range processors (e.g. ARM or older x86 processors), the results are even worse. Hundreds of cycles in one floating point operation, or in some processors, even floating point calculations are a number of integer operations that perform the same steps as the float instructions on more advanced processors, so the time taken for pow() could be thousands of cycles, compared to a dozen or so for a multiplication.
Whichever choice is used, the whole calculation will be significantly longer than a simple multiplication.
The pow() function is useful when the exponent is either large, or not an integer. Even for relatively large exponents, you can do the calculation by squaring or cubing multiple times, and it will be faster than pow().
Of course, sometimes the compiler may be able to figure out what you want to do, and do it as a sequence of multiplications as a optimization. But I wouldn't rely on that.
Finally, as ALWAYS, for performance questions: If it's really important to your code, then measure it - your compiler may be smarter than you thin. If performance isn't important, then perform the calculation that is the makes the code most readable.
pow is a library function, not an operator. Unless the compiler is able to optimize out the call (which it legitimately do by taking advantage of its knowledge of the behavior of the standard library functions), calling pow() will impose the overhead of a function call and of all the extra stuff the pow() function has to do.
The second argument to pow() doesn't have to be an integer; for example pow(x, 1.0/3.0) will give you an approximation of the cube root of x. That's going to require some fairly sophisticated computations. It might fall back to repeated multiplication if the second argument is a small integral value, but then it has to check for that at run time.
If the number you want to square is an integer, pow will involve converting it to double, then converting the result back to an integer type, which is relatively expensive and could cause subtle rounding errors.
Using x * x is very likely to be faster and more reliable than pow(x, 2), and it's simpler. (In most contexts, simplicity and reliability are more important considerations than speed.)
C/C++ does not have a native "power" operator. ^ is the bitwise exclusive or (xor). Thus said, the pow function is probably what you are looking for.
Actually, for squaring an integer number, x*x is the most immediate way, and some compiler might optimize it to machine operation if available.
You should read the following link
Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?
pow(x,2) will most likely be converted to xx. However, higher powers such as pow(x,4) may not be done as optimally as possible. For example pow(x,4) could be done in 3 multiplications xxxx or in two (xx)(x*x) depending on how strict you require the floating point definition to be (by default I think it will use 3 multiplications.
It would be interesting to see what for example pow(x*x,2) produces with and without -ffast-math.
you should look into boost.math's pow function template. it takes the exponent as template parameter and automatically calculate, for example, pow<4>(x) as (x*x)*(x*x).
http://www.boost.org/doc/libs/1_53_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/powers/ct_pow.html

Bit shifting on non contiguous internal number representation

I am writing a c++ arbitrary integer library as an homework. I represented numbers internally as a vector of unsigned int, in base 10^n, where n is as big as possible while fitting into a single unsigned int digit.
I made this choice as a trade-off between space, performance and digit access (it allows me to have much better performances than using base 10, without adding any complexity when converting to a human readable string).
So for example:
base10(441243123294967295) 18 digits
base1000000000(441243123,294967295) 2 digits (comma separated)
Internal representation with uint32
[00011010 01001100 11010101 11110011] [00010001 10010100 11010111 11111111]
To complete the homework I have to implement bit shifting and other bitwise operators.
Does it make sense to implement shift for a number with such an internal representation?
Should I change to a base 2^n so that all bits of the internal representation are significative?
You can, but you do not have to: bit shifting will double the number, no matter what base you use for interpreting it later, because internally these ints are still interpreted as binary by the underlying shift operations. Your implementation will have to decide on the tradeoff there, because your shifting will become harder to implement. On the other hand, the printing in base-10 will remain simpler.
Another solution favoring the decimal system that you may consider is using binary-coded decimals (BCD). Back in the day, hardware used to accelerate these operations (e.g. 6502, the CPU of Apple-2) included special instructions for adding bytes in the BCD interpretation. You would have to implement special correction if you use this representation, but it may be a worthy learning exercise.
Should I change to a base 2^n so that all bits of the internal representation are significative?
Most definitely yes!
Not only that, but modern-day computers in general are all about base2. If this is an excercise, you'll most likely want to learn how to do it well.
All libraries for this kind uses base 2. They do that for a reason: faster processing, possibility for bitwise operations, more compact storage, and more. These advantages outweights the difficulty to covert to decimal. Therefore it's highly recommended that you convert to binary.

Integer division algorithm

I was thinking about an algorithm in division of large numbers: dividing with remainder bigint C by bigint D, where we know the representation of C in base b, and D is of form b^k-1. It's probably the easiest to show it on an example. Let's try dividing C=21979182173 by D=999.
We write the number as sets of three digits: 21 979 182 173
We take sums (modulo 999) of consecutive sets, starting from the left: 21 001 183 356
We add 1 to those sets preceding the ones where we "went over 999": 22 001 183 356
Indeed, 21979182173/999=22001183 and remainder 356.
I've calculated the complexity and, if I'm not mistaken, the algorithm should work in O(n), n being the number of digits of C in base b representation. I've also done a very crude and unoptimized version of the algorithm (only for b=10) in C++, tested it against GMP's general integer division algorithm and it really does seem to fare better than GMP. I couldn't find anything like this implemented anywhere I looked, so I had to resort to testing it against general division.
I found several articles which discuss what seem to be quite similar matters, but none of them concentrate on actual implementations, especially in bases different than 2. I suppose that's because of the way numbers are internally stored, although the mentioned algorithm seems useful for, say, b=10, even taking that into account. I also tried contacting some other people, but, again, to no avail.
Thus, my question would be: is there an article or a book or something where the aforementioned algorithm is described, possibly discussing the implementations? If not, would it make sense for me to try and implement and test such an algorithm in, say, C/C++ or is this algorithm somehow inherently bad?
Also, I'm not a programmer and while I'm reasonably OK at programming, I admittedly don't have much knowledge of computer "internals". Thus, pardon my ignorance - it's highly possible there are one or more very stupid things in this post. Sorry once again.
Thanks a lot!
Further clarification of points raised in the comments/answers:
Thanks, everyone - as I didn't want to comment on all the great answers and advice with the same thing, I'd just like to address one point a lot of you touched on.
I am fully aware that working in bases 2^n is, generally speaking, clearly the most efficient way of doing things. Pretty much all bigint libraries use 2^32 or whatever. However, what if (and, I emphasize, it would be useful only for this particular algorithm!) we implement bigints as an array of digits in base b? Of course, we require b here to be "reasonable": b=10, the most natural case, seems reasonable enough. I know it's more or less inefficient both considering memory and time, taking into account how numbers are internally stored, but I have been able to, if my (basic and possibly somehow flawed) tests are correct, produce results faster than GMP's general division, which would give sense to implementing such an algorithm.
Ninefingers notices I'd have to use in that case an expensive modulo operation. I hope not: I can see if old+new crossed, say, 999, just by looking at the number of digits of old+new+1. If it has 4 digits, we're done. Even more, since old<999 and new<=999, we know that if old+new+1 has 4 digits (it can't have more), then, (old+new)%999 equals deleting the leftmost digit of (old+new+1), which I presume we can do cheaply.
Of course, I'm not disputing obvious limitations of this algorithm nor I claim it can't be improved - it can only divide with a certain class of numbers and we have to a priori know the representation of dividend in base b. However, for b=10, for instance, the latter seems natural.
Now, say we have implemented bignums as I outlined above. Say C=(a_1a_2...a_n) in base b and D=b^k-1. The algorithm (which could be probably much more optimized) would go like this. I hope there aren't many typos.
if k>n, we're obviously done
add a zero (i.e. a_0=0) at the beginning of C (just in case we try to divide, say, 9999 with 99)
l=n%k (mod for "regular" integers - shouldn't be too expensive)
old=(a_0...a_l) (the first set of digits, possibly with less than k digits)
for (i=l+1; i < n; i=i+k) (We will have floor(n/k) or so iterations)
new=(a_i...a_(i+k-1))
new=new+old (this is bigint addition, thus O(k))
aux=new+1 (again, bigint addition - O(k) - which I'm not happy about)
if aux has more than k digits
delete first digit of aux
old=old+1 (bigint addition once again)
fill old with zeroes at the beginning so it has as much digits as it should
(a_(i-k)...a_(i-1))=old (if i=l+1, (a _ 0...a _ l)=old)
new=aux
fill new with zeroes at the beginning so it has as much digits as it should
(a_i...a_(i+k-1)=new
quot=(a_0...a_(n-k+1))
rem=new
There, thanks for discussing this with me - as I said, this does seem to me to be an interesting "special case" algorithm to try to implement, test and discuss, if nobody sees any fatal flaws in it. If it's something not widely discussed so far, even better. Please, let me know what you think. Sorry about the long post.
Also, just a few more personal comments:
#Ninefingers: I actually have some (very basic!) knowledge of how GMP works, what it does and of general bigint division algorithms, so I was able to understand much of your argument. I'm also aware GMP is highly optimized and in a way customizes itself for different platforms, so I'm certainly not trying to "beat it" in general - that seems as much fruitful as attacking a tank with a pointed stick. However, that's not the idea of this algorithm - it works in very special cases (which GMP does not appear to cover). On an unrelated note, are you sure general divisions are done in O(n)? The most I've seen done is M(n). (And that can, if I understand correctly, in practice (Schönhage–Strassen etc.) not reach O(n). Fürer's algorithm, which still doesn't reach O(n), is, if I'm correct, almost purely theoretical.)
#Avi Berger: This doesn't actually seem to be exactly the same as "casting out nines", although the idea is similar. However, the aforementioned algorithm should work all the time, if I'm not mistaken.
Your algorithm is a variation of a base 10 algorithm known as "casting out nines". Your example is using base 1000 and "casting out" 999's (one less than the base). This used to be taught in elementary school as way to do a quick check on hand calculations. I had a high school math teacher who was horrified to learn that it wasn't being taught anymore and filled us in on it.
Casting out 999's in base 1000 won't work as a general division algorithm. It will generate values that are congruent modulo 999 to the actual quotient and remainder - not the actual values. Your algorithm is a bit different and I haven't checked if it works, but it is based on effectively using base 1000 and the divisor being 1 less than the base. If you wanted to try it for dividing by 47, you would have to convert to a base 48 number system first.
Google "casting out nines" for more information.
Edit: I originally read your post a bit too quickly, and you do know of this as a working algorithm. As #Ninefingers and #Karl Bielefeldt have stated more clearly than me in their comments, what you aren't including in your performance estimate is the conversion into a base appropriate for the particular divisor at hand.
I feel the need to add to this based on my comment. This isn't an answer, but an explanation as to the background.
A bignum library uses what are called limbs - search for mp_limb_t in the gmp source, which are usually a fixed-size integer field.
When you do something like addition, one way (albeit inefficient) to approach it is to do this:
doublelimb r = limb_a + limb_b + carryfrompreviousiteration
This double-sized limb catches the overflow of limb_a + limb_b in the case that the sum is bigger than the limb size. So if the total is bigger than 2^32 if we're using uint32_t as our limb size, the overflow can be caught.
Why do we need this? Well, what you typically do is loop through all the limbs - you've done this yourself in dividing your integer up and going through each one - but we do it LSL first (so the smallest limb first) just as you'd do arithmetic by hand.
This might seem inefficient, but this is just the C way of doing things. To really break out the big guns, x86 has adc as an instruction - add with carry. What this does is an arithmetic and on your fields and sets the carry bit if the arithmetic overflows the size of the register. The next time you do add or adc, the processor factors in the carry bit too. In subtraction it's called the borrow flag.
This also applies to shift operations. As such, this feature of the processor is crucial to what makes bignums fast. So the fact is, there's electronic circuitry in the chip for doing this stuff - doing it in software is always going to be slower.
Without going into too much detail, operations are built up from this ability to add, shift, subtract etc. They're crucial. Oh and you use the full width of your processor's register per limb if you're doing it right.
Second point - conversion between bases. You cannot take a value in the middle of a number and change it's base, because you can't account for the overflow from the digit beneath it in your original base, and that number can't account for the overflow from the digit beneath... and so on. In short, every time you want to change base, you need to convert the entire bignum from the original base to your new base back again. So you have to walk the bignum (all the limbs) three times at least. Or, alternatively, detect overflows expensively in all other operations... remember, now you need to do modulo operations to work out if you overflowed, whereas before the processor was doing it for us.
I should also like to add that whilst what you've got is probably quick for this case, bear in mind that as a bignum library gmp does a fair bit of work for you, like memory management. If you're using mpz_ you're using an abstraction above what I've described here, for starters. Finally, gmp uses hand optimised assembly with unrolled loops for just about every platform you've ever heard of, plus more. There's a very good reason it ships with Mathematica, Maple et al.
Now, just for reference, some reading material.
Modern Computer Arithmetic is a Knuth-like work for arbitrary precision libraries.
Donald Knuth, Seminumerical Algorithms (The Art of Computer Programming Volume II).
William Hart's blog on implementing algorithm's for bsdnt in which he discusses various division algorithms. If you're interested in bignum libraries, this is an excellent resource. I considered myself a good programmer until I started following this sort of stuff...
To sum it up for you: division assembly instructions suck, so people generally compute inverses and multiply instead, as you do when defining division in modular arithmetic. The various techniques that exist (see MCA) are mostly O(n).
Edit: Ok, not all of the techniques are O(n). Most of the techniques called div1 (dividing by something not bigger than a limb are O(n). When you go bigger you end up with O(n^2) complexity; this is hard to avoid.
Now, could you implement bigints as an array of digits? Well yes, of course you could. However, consider the idea just under addition
/* you wouldn't do this just before add, it's just to
show you the declaration.
*/
uint32_t* x = malloc(num_limbs*sizeof(uint32_t));
uint32_t* y = malloc(num_limbs*sizeof(uint32_t));
uint32_t* a = malloc(num_limbs*sizeof(uint32_t));
uint32_t m;
for ( i = 0; i < num_limbs; i++ )
{
m = 0;
uint64_t t = x[i] + y[i] + m;
/* now we need to work out if that overflowed at all */
if ( (t/somebase) >= 1 ) /* expensive division */
{
m = t % somebase; /* get the overflow */
}
}
/* frees somewhere */
That's a rough sketch of what you're looking at for addition via your scheme. So you have to run the conversion between bases. So you're going to need a conversion to your representation for the base, then back when you're done, because this form is just really slow everywhere else. We're not talking about the difference between O(n) and O(n^2) here, but we are talking about an expensive division instruction per limb or an expensive conversion every time you want to divide. See this.
Next up, how do you expand your division for general case division? By that, I mean when you want to divide those two numbers x and y from the above code. You can't, is the answer, without resorting to bignum-based facilities, which are expensive. See Knuth. Taking modulo a number greater than your size doesn't work.
Let me explain. Try 21979182173 mod 1099. Let's assume here for simplicity's sake that the biggest size field we can have is three digits. This is a contrived example, but the biggest field size I know if uses 128 bits using gcc extensions. Anyway, the point is, you:
21 979 182 173
Split your number into limbs. Then you take modulo and sum:
21 1000 1182 1355
It doesn't work. This is where Avi is correct, because this is a form of casting out nines, or an adaption thereof, but it doesn't work here because our fields have overflowed for a start - you're using the modulo to ensure each field stays within its limb/field size.
So what's the solution? Split your number up into a series of appropriately sized bignums? And start using bignum functions to calculate everything you need to? This is going to be much slower than any existing way of manipulating the fields directly.
Now perhaps you're only proposing this case for dividing by a limb, not a bignum, in which case it can work, but hensel division and precomputed inverses etc do to without the conversion requirement. I have no idea if this algorithm would be faster than say hensel division; it would be an interesting comparison; the problem comes with a common representation across the bignum library. The representation chosen in existing bignum libraries is for the reasons I've expanded on - it makes sense at the assembly level, where it was first done.
As a side note; you don't have to use uint32_t to represent your limbs. You use a size ideally the size of the registers of the system (say uint64_t) so that you can take advantage of assembly-optimised versions. So on a 64-bit system adc rax, rbx only sets the overflow (CF) if the result overspills 2^64 bits.
tl;dr version: the problem isn't your algorithm or idea; it's the problem of converting between bases, since the representation you need for your algorithm isn't the most efficient way to do it in add/sub/mul etc. To paraphrase knuth: This shows you the difference between mathematical elegance and computational efficiency.
If you need to frequently divide by the same divisor, using it (or a power of it) as your base makes division as cheap as bit-shifting is for base 2 binary integers.
You could use base 999 if you want; there's nothing special about using a power-of-10 base except that it makes conversion to decimal integer very cheap. (You can work one limb at a time instead of having to do a full division over the whole integer. It's like the difference between converting a binary integer to decimal vs. turning every 4 bits into a hex digit. Binary -> hex can start with the most significant bits, but converting to non-power-of-2 bases has to be LSB-first using division.)
For example, to compute the first 1000 decimal digits of Fibonacci(109) for a code-golf question with a performance requirement, my 105 bytes of x86 machine code answer used the same algorithm as this Python answer: the usual a+=b; b+=a Fibonacci iteration, but divide by (a power of) 10 every time a gets too large.
Fibonacci grows faster than carry propagates, so discarding the low decimal digits occasionally doesn't change the high digits long-term. (You keep a few extra beyond the precision you want).
Dividing by a power of 2 doesn't work, unless you keep track of how many powers of 2 you've discarded, because the eventual binary -> decimal conversion at the end would depend on that.
So for this algorithm, you have to do extended-precision addition, and division by 10 (or whatever power of 10 you want).
I stored base-109 limbs in 32-bit integer elements. Dividing by 109 is trivially cheap: just a pointer increment to skip the low limb. Instead of actually doing a memmove, I just offset the pointer used by the next add iteration.
I think division by a power of 10 other than 10^9 would be somewhat cheap, but would require an actual division on each limb, and propagating the remainder to the next limb.
Extended-precision addition is somewhat more expensive this way than with binary limbs, because I have to generate the carry-out manually with a compare: sum[i] = a[i] + b[i]; carry = sum < a; (unsigned comparison). And also manually wrap to 10^9 based on that compare, with a conditional-move instruction. But I was able to use that carry-out as an input to adc (x86 add-with-carry instruction).
You don't need a full modulo to handle the wrapping on addition, because you know you've wrapped at most once.
This wastes a just over 2 bits of each 32-bit limb: 10^9 instead of 2^32 = 4.29... * 10^9. Storing base-10 digits one per byte would be significantly less space efficient, and very much worse for performance, because an 8-bit binary addition costs the same as a 64-bit binary addition on a modern 64-bit CPU.
I was aiming for code-size: for pure performance I would have used 64-bit limbs holding base-10^19 "digits". (2^64 = 1.84... * 10^19, so this wastes less than 1 bit per 64.) This lets you get twice as much work done with each hardware add instruction. Hmm, actually this might be a problem: the sum of two limbs might wrap the 64-bit integer, so just checking for > 10^19 isn't sufficient anymore. You could work in base 5*10^18, or in base 10^18, or do more complicated carry-out detection that checks for binary carry as well as manual carry.
Storing packed BCD with one digit per 4 bit nibble would be even worse for performance, because there isn't hardware support for blocking carry from one nibble to the next within a byte.
Overall, my version ran about 10x faster than the Python extended-precision version on the same hardware (but it had room for significant optimization for speed, by dividing less often). (70 seconds or 80 seconds vs. 12 minutes)
Still, I think for this particular implementation of that algorithm (where I only needed addition and division, and division happened after every few additions), the choice of base-10^9 limbs was very good. There are much more efficient algorithms for the Nth Fibonacci number that don't need to do 1 billion extended-precision additions.

changing float type to short but with same behaviour as float type variable

Is it possible to change the
float *pointer
type that is used in the VS c++ project
to some other type, so that it will still behave as a floating type but with less range?
I know that the floating point values never exceed some fixed value in that project, so I want to optimize the program by memory it uses. It doesn't need 4 bytes for each element of the 'float *pointer', 2 bytes will be enough I think. If I change a float to short and imitate the floating point behaviour, then it will use twice shorter memory. How to do it?
EDIT:
It calculates the probabilities. So there are divisions like
A / B
Where A < B,
And also B (and A) can be from 1 to 10 000.
There is a standard 16-bit floating point format described in IEEE 754-2008 called "binary16". It is specified as a format to store floating point values with reduced precisions. There is almost no compiler support for that yet (I think GCC supports it for certain ARM platforms), but it is quite easy to roll your own routines. This fellow:
http://blog.fpmurphy.com/2008/12/half-precision-floating-point-format_14.html
wrote a bit about it and also presents a routine to convert half-float <-> float.
Also, here seems to be a half-float C++ wrapper class:
half.h:
http://www.koders.com/cpp/fidABD00D95DE84C73BF0218AC621E400E07AA77B53.aspx
half.cpp
http://www.koders.com/cpp/fidF0DD0510FAAED03817A956D251787609BEB5989E.aspx
which supplies "HalfFloat" as a possible drop-in replacement type.
Maybe use fixed-point math? It all depends on value and precision you want to achieve.
http://www.eetimes.com/discussion/other/4024639/Fixed-point-math-in-C
For C there is a lot of code that makes fixed-point easy and I'm pretty sure there are also many C++ classes that make it even easier, but I don't know of any, I'm more into C.
The first, obvious, memory optimization would be to try and get rid of the pointer. If you can store just the float, that may, depending on the larger context, reduce your memory consumption from eight to four bytes already. (On a 64-Bit system, from twelve to four.)
Whether you can get by with a short depends on what your program does with the values. You may be able to use fix point arithmetic using an integral type such as a short, yes but your questions shows way too little context to judge that.
The code you posted and the text in the question do not deal with actual float, but with pointers to float. In all architectures I know of, the size of a pointer is the same regardless of the pointed type, so there would be no improvement in changing that to a short or char pointer.
Now, about the actual pointed elements, what is the range that you expect in your application? What is the precision you need? How many of those elements do you have? What are the memory constraints of your target platform? Unless the range and precision are small and the number of elements huge, just use floats. Also note that if you need floating point operations, storing any other type will require conversions before and after each operation, and you might be impacting performance.
Without greater knowledge of what you are doing, the ranges for short in many architectures are [-32k, 32k), where k stands for 1024. If your data ranges is [-32,32) and you can do with roughly 3 decimal digits you could use fixed point arithmetic with shorts, but there are few such situation.

Why would I use 2's complement to compare two doubles instead of comparing their differences against an epsilon value?

Referenced here and here...Why would I use two's complement over an epsilon method? It seems like the epsilon method would be good enough for most cases.
Update: I'm purely looking for a theoretical reason why you'd use one over the other. I've always used the epsilon method.
Has anyone used the 2's complement comparison successfully? Why? Why Not?
the second link you reference mentions an article that has quite a long description of the issue:
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
but unless you are tweaking performance I would stick with epsilon so people can debug your code
The bits method might be faster. I say might because on modern (multicore, highly pipelined) processors it is often impossible to guess what is really faster.
Code the simplest most obviously correct implementation, then measure, then optomise.
In short, when comparing two floats with unknown origins, picking an epsilon that is valid is almost impossible.
For example:
What is a good epsilon when comparing distance in miles between Atlanta GA, Dallas TX and some place in Ohio?
What is a good epsilon when comparing distance in miles between my left foot, my right foot and the computer under my desk?
EDIT:
Ok, I'm getting a fair number of people not understanding why you wouldn't know what your epsilon is.
Back in the old days of lore, I wrote two programs that worked with NeverWinter Nights (a game made by BioWare). One of the programs took a binary model and converted it to ASCII. The other program took an ASCII model and compiled it into binary. One of the tests I wrote was to take all of BioWare's binary models, decompile them to ASCII and then back to binary. Then I compared my binary version with original one from BioWare. One of the problems during the comparison was dealing with some of the slight variances in floating point values. So instead of coming up with a bunch of different EPSILONS for each type of floating point number (vertex, normal, etc), I wanted to use something such as this twos compliment compare. Thus avoiding the whole multiple EPSILON issue.
The same type of issue can apply to any type of software that processes 3rd party data and then needs to validate their results with the original. In these cases you might not even know what the floating point values represent, you just have to compare them. We ran into this issue with our industrial automation software.
EDIT:
LOL, this has been voted up and down by different people.
I'll boil the problem down to this, given two arbitrary floating point numbers, how do you decide what epsilon to use? You can't.
How can you compare 1e23 and 1.0001e23 with an epsilon and still compare 1e-23 and 5.2e-23 using the same epsilon? Sure, you can do some dynamic epsilon tricks, but that is the whole point to the integer compare (which does NOT require the integers be exact).
The integer compare is able to compare two floats using an epsilon relative to the magnitude of the numbers.
EDIT
Steve, lets look at what you said in the comments:
"But you know what equality means to you... Hence, you should be able to find an appropriate epsilon".
Turn this statement around to say:
"If you know what equality means to you, then you should be able to find an appropriate epsilon."
The whole point to what I am trying to say is that there are applications where we don't know what equality means in the absolute sense, thus we have to resort to a relative compare which is what the integer version is trying to do.
When it comes to speed, follow these rules:
If you're not a very experienced developer, don't optimize.
If you are an experienced developer, don't optimize yet.
Do the easiest method.
Alex
Oskar's right. Don't screw with this unless you really, really need that performance.
And you don't. If you were in the situation that did, you wouldn't have needed to ask the question -- you'd already know. If you think you do, then you don't. Your performance problems lie elsewhere. Just use the readable version.
Using any method that compares bitwise will result in trouble when fractions are represented by approximations. All floating point numbers with fractions that are not denominated in powers of two (1/2, 1/4, 1/8, 1/65536, &c) are approximated. So, of course, are all irrational numbers.
float third = 1/3;
float two=2.0;
float another_two=third*6.0;
if(two != another_two)
print ("Approximation!\n");
The only time comparing bitwise would work is when you derive the floating point numbers exactly the same way or they are exact representations (whole numbers, fraction powers of two). Even then, there can be multiple representations of some numbers, though I have never seen this in a working system.