Bit manipulation for big integer classes? - c++

I'm having a problem coming up with an algorithm for a big integer class in C++. My initial idea was to use arrays/lists, but that's very inefficient. I then discovered things like the following class:
http://www.codeproject.com/KB/cpp/CppIntegerClass.aspx
However, I find that approach really confusing. I don't know how to work with bit manipulation, and I barely understood the code. Could someone please explain how to use bit manipulation, how it works, etc.? Eventually I would like to create my own big integer class, but I'm barely past being a novice programmer and I just learned how to use classes.
Basically my question is:
How do I use bit manipulation to create a big integer class? How does it work??
Thanks!

Start by reading up on binary numbers in general. That page shows how the common arithmetic operations (addition, subtraction etc) work on binary numbers, i.e. how the numbers are manipulated bit by bit to get the desired result.
Mapping that into a programming language such as C++ should be pretty straight-forward once you know why there are bit-manipulating operations being used.
In my experience, the most obvious bit-oriented thing needed when implementing something like this is bit testing, to check for overflow. Let's say you represent your big binary number as an array of uint16_t, i.e. chunks of 16 bits. When implementing addition, you will start at the least significant end of both numbers, and add those. If the sum is larger than 65,535, you need to "carry" one to the next uint16_t, just as when you add decimal numbers one digit at a time.
This can be implemented with a test like so:
const uint16_t *number1;
const uint16_t *number2;
/* assume code goes here to set up the number1 and number2 pointers. */
/* Compute sum of 16 bits. */
uint16_t carry = 0;
uint32_t sum = number1[0] + number2[0];
/* One way of testing for overflow: */
if (sum & (1 << 16))
carry = 1;
Here, the 1 << 16 expression creates a mask by shifting a 1 sixteen steps to the left. The bitwise AND operator & tests the sum against the mask; the result will be non-zero (i.e. true, in C++) if bit 16 is set in sum.
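To show how that carry then feeds into the next chunk, here is a minimal sketch of a full addition loop over n such 16-bit chunks (the little-endian chunk order and the function name add_big are assumptions, not anything from the question):
#include <cstdint>
#include <cstddef>
/* Sketch: add two big numbers stored as n chunks of 16 bits each,
   least significant chunk first. result may alias number1. */
void add_big(const uint16_t *number1, const uint16_t *number2,
             uint16_t *result, std::size_t n)
{
    uint32_t carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        uint32_t sum = uint32_t(number1[i]) + number2[i] + carry;
        result[i] = uint16_t(sum & 0xFFFF); /* keep the low 16 bits */
        carry = sum >> 16;                  /* the overflow bit becomes the next carry */
    }
    /* If carry is still 1 here, the result needs one extra chunk. */
}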

Related

How to fix the position of the binary point in an unsigned N-bit integer?

I am working on developing a fixed point algorithm in C++. I know that, for an N-bit integer, the fixed point binary integer is represented as U(a,b). For example, for an 8-bit integer (i.e. 256 values), if we represent it in the form U(6,2), it means that the binary point is to the left of the 2nd bit starting from the right, of the form:
b5 b4 b3 b2 b1 b0 . b(-1) b(-2)
Thus, it has 6 integer bits and 2 fractional bits. In C++, I know there are some bit shift operators I can use, but they are basically used for shifting the bits of the input stream; my question is, how do I define a binary fixed point integer of the form fix<6,2> or U(6,2)? All the major processing operations will be carried out on the fractional part and I am just looking for a way to do this fix in C++. Any help regarding this would be appreciated. Thanks!
Example: Suppose I have an input discrete signal with 1024 sample points on the x-axis (for now just think of this input signal as coming from some sensor). Each of these sample points has a particular amplitude. Say the sample at time 2 (x-axis) has an amplitude of 3.67 (y-axis). Now I have a variable "int *input;" that takes sample 2, which in binary is 0000 0100. So basically I want to make this 00000.100 by performing the U(5,3) fix on sample 2 in C++, so that I can perform interpolation operations on fractions of the input sampling period or time.
PS - I don't want to create a separate class or use external libraries for this. I just want to take each 8 bits from my input signal, perform the U(a,b) fix on it, and then do the rest of the operations on the fractional part.
Short answer: left shift.
Long answer:
Fixed point numbers are stored as integers, usually int, which is the fastest integer type for a particular platform.
Normal integers without fractional bits are usually called Q0, Q.0 or QX.0, where X is the total number of bits of the underlying storage type (usually int).
To convert between different Q.X formats, left or right shift. For example, to convert 5 in Q0 to 5 in Q4, left shift it 4 bits, or multiply it by 16.
Usually it's useful to find or write a small fixed point library that does basic calculations, like a*b>>q and (a<<q)/b, because you will do Q.X=Q.Y*Q.Z and Q.X=Q.Y/Q.Z a lot and you need to convert formats when doing calculations. As you may have observed, using the normal * operator will give you Q.(X+Y)=Q.X*Q.Y, so in order to fit the result into Q.Z format you need to right shift the result by (X+Y-Z) bits.
Division is similar: you get Q.(X-Y)=Q.X/Q.Y from the standard / operator, and to get the result in Q.Z format you shift the dividend before the division. What's different is that division is an expensive operation, and it's not trivial to write a fast one from scratch.
Be aware of your platform's double-word support; it will make your life a lot easier. With double-word arithmetic, the result of a*b can be twice the size of a or b, so you don't lose range by doing a*b>>c. Without double words, you have to limit the input range of a and b so that a*b doesn't overflow. This is not obvious when you first start, but soon you will find you need more fractional bits or range to get the job done, and you will eventually need to dig into the reference manual of your processor's ISA.
Example:
float a = 0.1f;              // 0.1
int aQ16 = a*65536;          // 0.1 in Q16 format (truncated to 6553)
int bQ16 = 4<<16;            // 4 in Q16 format
int cQ16 = aQ16*bQ16 >> 16;  // result = 26212 = 0.399963378906250 in Q16,
                             // not 26214 = 0.4 in Q16, because of the truncation error
Note that the product 6553*262144 happens to fit in a 32-bit int here; in general you would use a double-word intermediate as described above.
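As a sketch of the kind of small fixed-point helpers mentioned above, for a*b>>q and (a<<q)/b, assuming a 64-bit intermediate is available (the names q_mul and q_div are made up for illustration):
#include <cstdint>
// Q.q multiply: inputs and result share the same Q format.
// A 64-bit intermediate keeps a*b from overflowing (the double-word trick).
int32_t q_mul(int32_t a, int32_t b, int q)
{
    return int32_t((int64_t(a) * b) >> q);  // sketch: assumes arithmetic right shift for negatives
}
// Q.q divide: shift the dividend up first so the quotient keeps q fractional bits.
int32_t q_div(int32_t a, int32_t b, int q)
{
    return int32_t((int64_t(a) << q) / b);  // sketch: ignores rounding; assumes a is non-negative or two's complement shifting
}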
If this is your question:
Q. Should I define my fixed-binary-point integer as a template, U<int a, int b>(int number), or not, U(int a, int b)
I think your answer to that is: "Do you want to define operators that take two fixed-binary-point integers? If so make them a template."
The template is just a little extra complexity if you're not defining operators. So I'd leave it out.
But if you are defining operators, you don't want to be able to add U<4, 4> and U<6, 2>. What would you define your result as? The templates will give you a compile time error should you try to do that.
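As a rough sketch of that template idea (the class name U and its members are purely illustrative, not a standard facility):
#include <cstdint>
// Illustrative sketch of a fixed-point type U<a, b>: a integer bits, b fractional
// bits, stored in a plain integer scaled by 2^b. Not a complete implementation.
template <int a, int b>
struct U {
    int32_t raw;
    static U from_int(int v) { return U{ v << b }; }
    double to_double() const { return raw / double(1 << b); }
    // Only U's with identical layout can be added; mixing U<4,4> and U<6,2>
    // fails to compile, as described above.
    friend U operator+(U lhs, U rhs) { return U{ lhs.raw + rhs.raw }; }
};
// U<6, 2> x = U<6, 2>::from_int(3);   // 3.0 stored as binary 000011.00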

ADT Integer class questions

I am pretty new to programming and I have to do an Abstract Data Type (ADT) for integer numbers.
I've browsed the web for some tips, examples, and tutorials, but I couldn't find anything useful, so I hope I will get some answers here.
I thought a lot about how I should format the ADT that stores my integer, and I'm thinking of something like this:
int length; // stores the number of digits (a limit, since these numbers can grow without bound)
int[] digits; // stores the digits of my number, with the dimension equal to length
Now, I'm confused about how I should tackle the sign representation. Is it OK to hold the sign in a char, something like: char sign?
But then comes the question of what to do when I have to add and multiply two integers, and what about the cases when I have overflow in these operations?
So, if some of you have ideas about how I should represent the number (the format) and how I should do the multiply and add, I would be very grateful. I don't need any code, I am in the learning stage, just some ideas. Thank you.
One good way to do this is to store the sign as a bool (e.g. bool is_neg;). That way it's completely clear what that data means (unlike with a char, where it's not entirely clear).
You might want to store each digit in an unsigned short (or if you want to be precise about size, uint16_t). Then, when you do a multiply of two digits, you can just multiply them as unsigned ints (uint32_t), and then the low 16 bits are your result and the overflow is in the high 16 bits. You can then add this to the result array fairly easily. You know that the multiplication of an n-bit number by a k-bit number is at most n + k bits long, so you can preallocate your array to that size and then worry about removing extra zeros later.
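Here is a minimal sketch of that digit-by-digit multiply, with digits stored least significant first in base 2^16 (the function name and the use of std::vector are assumptions, not anything from the answer):
#include <cstddef>
#include <cstdint>
#include <vector>
// Sketch: multiply two big numbers stored as base-2^16 digits, least significant first.
// Each digit product fits in a uint32_t; its high 16 bits are the carry.
std::vector<uint16_t> multiply(const std::vector<uint16_t>& a,
                               const std::vector<uint16_t>& b)
{
    std::vector<uint16_t> result(a.size() + b.size(), 0); // n-digit * k-digit needs at most n+k digits
    for (std::size_t i = 0; i < a.size(); ++i) {
        uint32_t carry = 0;
        for (std::size_t j = 0; j < b.size(); ++j) {
            uint32_t cur = uint32_t(a[i]) * b[j] + result[i + j] + carry;
            result[i + j] = uint16_t(cur & 0xFFFF); // low 16 bits stay in place
            carry = cur >> 16;                      // high 16 bits carry onward
        }
        result[i + b.size()] = uint16_t(carry);     // this slot is still zero at this point
    }
    while (result.size() > 1 && result.back() == 0) // trim the extra leading zeros
        result.pop_back();
    return result;
}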
Hope this helps, and let me know if you want more tips.
The first design decision you have to make is the choice of a basis.
You seem to lean towards plain decimal. It could be unpacked (one full byte per digit, numeric or ASCII representation) or packed digit pairs (Binary Coded Decimal, two four-bit digits per byte).
Other schemes are more convenient for faster operations: basis being a power of 2 or a power of 10, fitting in a byte, a short, an int...
Powers of 10 have the benefit that conversion to and from base 10 can be done word by word.
Addition is an easy matter: add the words in pairs and handle the carries. Same for subtraction, with borrows.
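For example, a word-by-word addition with carries might be sketched like this, using a base of 10^9 stored in 32-bit words, least significant word first (a power-of-10 base keeps conversion to and from decimal word by word, as noted above; the function name is made up):
#include <cstddef>
#include <cstdint>
#include <vector>
// Sketch: add two numbers stored as base-10^9 words, least significant word first.
std::vector<uint32_t> add(const std::vector<uint32_t>& a,
                          const std::vector<uint32_t>& b)
{
    const uint32_t BASE = 1000000000u;
    std::vector<uint32_t> r;
    uint32_t carry = 0;
    for (std::size_t i = 0; i < a.size() || i < b.size() || carry; ++i) {
        uint64_t sum = carry;
        if (i < a.size()) sum += a[i];
        if (i < b.size()) sum += b[i];
        r.push_back(uint32_t(sum % BASE)); // one word of the result
        carry = uint32_t(sum / BASE);      // 0 or 1
    }
    return r;
}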
Multiplies are a whole different story if you care about efficiency. The method of written computation taught at school can be used, but it requires length1 x length2 operations. For long numbers, more efficient methods are preferred (http://en.wikipedia.org/wiki/Multiplication_algorithm#Karatsuba_multiplication). They are also more complex.

Conversion Big Integer <-> double in C++

I am writing my own long arithmetic library in C++ for fun, and it is already fairly complete; I have even implemented several cryptographic algorithms with it. But one important thing is still missing: I want to convert doubles (and floats/long doubles) into my number type and vice versa. My numbers are represented as a variable sized array of unsigned long ints plus a sign bit.
I tried to find the answer with google, but the problem is that people rarely ever implement such things themselves, so I only find things about how to use Java BigInteger etc.
Conceptually, it is rather easy: I take the mantissa, shift it by the number of bits dictated by the exponent and set the sign. In the other direction I truncate it so that it fits into the mantissa and set the exponent depending on my log2 function.
But I am having a hard time figuring out the details. I could play around with the bit patterns and cast them to a double, but I haven't found an elegant way to do that; or I could "calculate" it by starting with 2 and exponentiating, multiplying, etc., but that doesn't seem very efficient.
I would appreciate a solution that doesn't use any library calls, because I am trying to avoid libraries for my project; otherwise I could just have used GMP. Furthermore, on several other occasions I have kept two implementations, one using inline assembler (which is efficient) and one that is more platform independent, so either kind of answer is useful to me.
edit: I use uint64_t for my parts, but I would like to be able to change it depending on the machine, but I am willing to do some different implementations with some #ifdefs to achieve that.
I'm going to make a non-portable assumption here: namely, that unsigned long long has more bits of precision than double. (This is true on all modern desktop systems that I know of.)
First, convert the most significant integer(s) into an unsigned long long. Then convert that to a double S. Let M be the number of integers below those used in that first step. Multiply S by (1ull << (sizeof(unsigned)*CHAR_BIT*M)). (If shifting by more than 63 bits, you will have to split that into separate shifts and do some arithmetic.) Finally, if the original number was negative, multiply the result by -1.
This rounds a lot, but even with this rounding, due to the above assumption, no digits are lost that wouldn't be lost anyway with the conversion to a double. I think this is a similar process to what Mark Ransom said, but I'm not certain.
For converting from a double to a biginteger, first separate the mantissa into a double M and the exponent into an int E, using frexp. Multiply M by UNSIGNED_MAX, and store that result in an unsigned R. If std::numeric_limits<double>::radix is 2 (I don't know whether it is for x86/x64), you can easily shift R left by E-(sizeof(unsigned)*CHAR_BIT) bits and you're done. Otherwise the result will instead be R*(E**(sizeof(unsigned)*CHAR_BIT)) (where ** means "to the power of").
If performance is a concern, you can add an overload to your bignum class for multiplying by std::integral_constant<unsigned, 10>, which simply returns (LHS<<3)+(LHS<<1). You can similarly optimize other constants if you wish.
This blog post might help you Clarifying and optimizing Integer>>asFloat
Otherwise, you can yet have an idea of algorithm with this SO question Converting from unsigned long long to float with round to nearest even
You don't say explicitly, but I assume your library is integer only and the unsigned longs are 32 bit and binary (not decimal). The conversion to double is simple, so I'll tackle that first.
Start with a multiplier for the current piece; if the number is positive it will be 1.0, if negative it will be -1.0. For each of the unsigned long ints in your bignum, multiply by the current multiplier and add it to the result, then multiply your multiplier by pow(2.0, 32) (4294967296.0) for 32 bits or pow(2.0, 64) (18446744073709551616.0) for 64 bits.
You can optimize this process by working with only the 2 most significant values. You need to use 2 even if the number of bits in your integer type is larger than the precision of a double, since the number of used bits in the most significant value might only be 1. You can generate the multiplier by taking a power of 2 to the number of skipped bits, e.g. pow(2.0, most_significant_count*sizeof(bit_array[0])*8). You can't use a bit shift as given in another answer because it will overflow after the first value.
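A sketch of the straightforward (unoptimized) version of that loop, assuming 32-bit parts stored least significant first (the function name and std::vector storage are assumptions):
#include <cstdint>
#include <vector>
// Sketch: convert sign + 32-bit parts (least significant first) to a double by
// walking from the least to the most significant part, scaling the multiplier by 2^32.
double to_double(bool negative, const std::vector<uint32_t>& parts)
{
    double result = 0.0;
    double multiplier = negative ? -1.0 : 1.0;
    for (uint32_t part : parts) {
        result += part * multiplier;
        multiplier *= 4294967296.0; // pow(2.0, 32)
    }
    return result;
}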
To convert from double, you can get the exponent and mantissa separated from each other with the frexp function. The mantissa will come as a floating point value between 0.5 and 1.0 so you'll want to multiply it by pow(2.0, 32) or pow(2.0, 64) to convert it to an integer, then adjust the exponent by -32 or -64 to compensate.
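The frexp route can be followed literally, but an equivalent sketch simply peels off 32 bits at a time with fmod and floor, which is exact for the 53 mantissa bits a double carries (NaN/infinity handling is omitted and the function name is made up):
#include <cmath>
#include <cstdint>
#include <vector>
// Sketch: convert a double to sign + base-2^32 parts (least significant first),
// truncating the fractional part.
std::vector<uint32_t> from_double(double value, bool& negative)
{
    negative = value < 0.0;
    value = std::floor(std::fabs(value));            // drop the fraction
    std::vector<uint32_t> parts;
    while (value >= 1.0) {
        double low = std::fmod(value, 4294967296.0); // low 32 bits, exact for doubles
        parts.push_back(uint32_t(low));
        value = std::floor(value / 4294967296.0);    // "shift right" by 32 bits
    }
    return parts;
}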
To go from a big integer to a double, just do it the same way you parse numbers. For example, you parse the number "531" as "1 + (3 * 10) + (5 * 100)". Compute each portion using doubles, starting with the least significant portion.
To go from a double to a big integer, do it the same way but in reverse starting with the most significant portion. So, to convert 531, you first see that it's more than 100 but less than 1000. You find the first digit by dividing by 100. Then you subtract to get the remainder of 31. Then find the next digit by dividing by 10. And so on.
Of course, you won't be using tens (unless you store your big integers as digits). Exactly how you break it apart depends on how your big integer class is constructed. For example, if it uses 64-bit units, then you'll use powers of 2^64 instead of powers of 10.

Why to use higher base for implementing BigInt?

I'm trying to implement BigInt and have read some threads and articles regarding it; most of them suggested using higher bases (256, 2^32 or even 2^64).
Why are higher bases good for this purpose?
Another question I have is how I am supposed to convert a string into a higher base (>16). I read there is no standard way, except for base64. And the last question: how do I use those higher bases? Some examples would be great.
The CPU cycles spent multiplying or adding a number that fits in a register tend to be identical. So you will get the least number of iterations, and best performance, by using up the whole register. That is, on a 32-bit architecture, make your base unit 32 bits, and on a 64-bit architecture, make it 64 bits. Otherwise--say, if you only fill up 8 bits of your 32 bit register--you are wasting cycles.
The first answer stated this best. I personally use base 2^16 to keep from overflowing in multiplication: it allows any two digits to be multiplied together without ever overflowing.
Converting to a higher base requires a fast division method, as well as packing the numbers as much as possible (assuming your BigInt library can handle multiple bases).
Consider base 10 -> base 2. The actual conversions would be 10 -> 10000 -> 32768 -> 2. This may seem slower, but converting from base 10 to base 10000 is super fast, converting between base 10000 and base 32768 takes very few iterations because there are few digits to iterate over, and unpacking base 32768 to base 2 is also extremely fast.
So first pack the number into the largest base it can go to. To do this, just combine the digits. This base should be <= 2^16 to prevent overflow.
Next, combine the digits together until they are >= the target base. From here, divide by the target base using a fast division algorithm that would normally overflow in any other scenario.
Some quick pseudocode:
if (!packed) pack()
from = remake() // moves all of the digits on the current BigInt to a new one, O(1)
loop
    addNode()
    remainder = 0
    loop
        remainder = remainder*fromBase + from.digit
        exitwhen remainder >= toBase
        set from = from.prev
        exitwhen from.head
    if (from.head) node.digit = remainder
    else node.digit = from.fastdiv(fromBase, toBase, remainder)
    exitwhen from.head
A look at fast division:
loop
    digit = remainder/divide
    remainder = remainder%divide
    // gather digits to divide again
    loop
        this = prev
        if (head) return remainder
        remainder = remainder*base + digit
        exitwhen remainder >= divide
        digit = 0
return remainder
Finally, unpack if you should unpack
Packing is just combining the digits together.
Example of packing base 10 into base 10000 (four decimal digits per base-10000 digit):
((4th*10 + 3rd)*10 + 2nd)*10 + 1st
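In code, that packing is just a Horner-style loop, grouping four decimal digits at a time from the right (a sketch; the digit order and the function name are assumptions):
#include <cstddef>
#include <cstdint>
#include <vector>
// Sketch: pack base-10 digits (most significant first) into base-10000 digits
// (least significant first).
std::vector<uint16_t> pack_base10000(const std::vector<uint8_t>& digits)
{
    std::vector<uint16_t> packed;
    std::size_t i = digits.size();
    while (i > 0) {
        std::size_t start = i >= 4 ? i - 4 : 0;       // up to four digits per group
        uint16_t group = 0;
        for (std::size_t j = start; j < i; ++j)
            group = uint16_t(group * 10 + digits[j]); // the "*10 + next digit" step
        packed.push_back(group);                      // least significant group first
        i = start;
    }
    return packed;
}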
You should have a Base class that stores alphabet + size for toString. If the Base is invalid, then just display the digits in a comma separated list.
All of your methods should be using the BigInt's current base, not some constant.
That's all?
From there, you'll be able to do things like
BigInt i = BigInt.convertString("1234567", Base["0123456789"])
Where the [] is overloaded and will either create a new base or return the already created base.
You'll also be able to do things like
i.toString()
i.base = Base["0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"]
i.base = 256
i.base = 45000
etc ^_^.
Also, if you are using integers and you want to be able to return BigInt remainders, division can get a bit tricky =P, but it's still pretty easy ^_^.
This is a BigInt library I coded in vjass (yes, for Warcraft 3, lol, don't judge me)
Things like TriggerEvaluate(evalBase) are just to keep the threads from crashing (evil operation limit).
Good luck :D

Are there any good reasons to use bit shifting except for quick math?

I understand bitwise operations and how they might be useful for different purposes, e.g. permissions. However, I don't seem to understand what use the bit shift operators are. I understand how they work, but I can't think of any scenarios where I might want to use them unless I want to do some really quick multiplication or division. Are there any other reasons to use bit-shifting?
There are many reasons, here are some:
Let's say you represent a black and white image as a sequence of bits and you want to set a single pixel in this image generically. For example, your byte offset may be x>>3 and your bit offset may be x & 0x7, and you can set that bit with byte = byte | (1 << (x & 0x7)); (see the sketch after this list).
Implementing data compression algorithms where you deal with variable length bit sequences, e.g. huffman coding.
You're interacting with some hardware, e.g. a serial communication device, and you need to read or set some control bits.
For those and other reasons most processors have bit shift and/or rotation instructions as well as other logic instructions (and/or/xor/not).
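As a minimal sketch of the bitmap case from the first point (the function name set_pixel and the byte-array layout are assumptions):
#include <cstddef>
#include <cstdint>
// Sketch: set or clear pixel x in a 1-bit-per-pixel image stored as an array of bytes.
void set_pixel(uint8_t* image, std::size_t x, bool on)
{
    std::size_t byte_offset = x >> 3;  // divide by 8: which byte holds the pixel
    unsigned bit_offset = x & 0x7;     // remainder: which bit within that byte
    if (on)
        image[byte_offset] |= uint8_t(1u << bit_offset);
    else
        image[byte_offset] &= uint8_t(~(1u << bit_offset));
}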
Historically multiplication and division were significantly slower as they are more complex operations and some CPUs didn't have those at all.
Also see here:
Have you ever had to use bit shifting in real projects?
As you indicate, a left shift is the same thing as a multiplication by two. At least it is when we're talking about unsigned quantities. The meaning of a "left shift" of a signed quantity is ... language dependent.
With modern compilers, there's really no difference between writing "i = x*2;" and "i = x << 1;" The compiler will generate the most efficient code. So in that sense there's no reason to prefer shift over multiply.
Some algorithms work by shifting a quantity left by one bit and then setting the low bit to either 0 or 1. Some simple compression algorithms work this way. For example, if your accumulated value is in the variable x, and the current value (0 or 1) is in y, then it makes more sense to write "x = (x << 1) | y", rather than "x = (x * 2) + y". Both do the same thing, but the first is more notationally correct. You don't have to think, "oh, right, multiply by two is the same as a left shift."
Also, when you're talking about algorithms that shift bits, it's more convenient to shift left or right by a particular number of bits than to figure out what multiple of 2 you want to multiply or divide by.
So, whereas there's typically no performance benefit to shifting rather than multiplying--at least not when working with high level languages--there are times when having the ability to shift makes what you're doing more easily understood.
There are lot of places where bit shift operations are regularly used outside of their usage in numerical computations. For example, Bitboard is a data structure that is commonly used in board games for board representation. Some of the strongest chess engines use this data structure mainly for speed and ease of move generation and evaluation. These programs use bit operations heavily and bit-shift operations specifically are used in a lot of contexts - such as finding bit masks, generating new moves on the board, computing logarithm very quickly, etc. There are even very advanced numerical computations that can be done elegantly by clever use of bit operations. Check out this site for bit twiddling hacks - a lot of those algorithms use shift operators. Bit shift operations are regularly used in device driver programming, codec development, embedded systems programming and so on.
Shifting allows accessing specific bits within a variable. The expression (n >> p) & ((1 << m) - 1) retrieves an m-bit portion of the variable n with an offset of p bits from the right.
This allows your program to use integers that aren't multiples of 8 bits, which is useful for data compression.
For example, I used it in my Netflix Prize programs to pack records (22-bit user ID + 15-bit movie ID + 12-bit date + 3-bit rating) into a uint64_t (with 12 bits to spare).
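A sketch of that kind of packing and field extraction (the field widths follow the Netflix example above; the function names are made up):
#include <cstdint>
// Pack one record into a uint64_t: 22-bit user, 15-bit movie, 12-bit date, 3-bit rating.
uint64_t pack_record(uint32_t user, uint32_t movie, uint32_t date, uint32_t rating)
{
    return (uint64_t(user)  << 30) |  // bits 30..51
           (uint64_t(movie) << 15) |  // bits 15..29
           (uint64_t(date)  <<  3) |  // bits 3..14
            uint64_t(rating);         // bits 0..2
}
// Extract an m-bit field at offset p: the (n >> p) & ((1 << m) - 1) idiom above.
uint32_t field(uint64_t n, unsigned p, unsigned m)
{
    return uint32_t((n >> p) & ((uint64_t(1) << m) - 1));
}
// e.g. field(r, 15, 15) recovers the movie ID and field(r, 0, 3) the rating.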
A very common special case is to pack 8 bool variables into each byte. (Unix file permissions, black-and-white bitmaps, CPU flags registers, etc.)
Also, bit manipulation is used in UTF-8, which is a very popular character encoding. Unicode characters are represented by distributing their bits across 1, 2, 3, or 4 bytes.