How to implement big int in C++ - c++

I'd like to implement a big int class in C++ as a programming exercise—a class that can handle numbers bigger than a long int. I know that there are several open source implementations out there already, but I'd like to write my own. I'm trying to get a feel for what the right approach is.
I understand that the general strategy is get the number as a string, and then break it up into smaller numbers (single digits for example), and place them in an array. At this point it should be relatively simple to implement the various comparison operators. My main concern is how I would implement things like addition and multiplication.
I'm looking for a general approach and advice as opposed to actual working code.

A fun challenge. :)
I assume that you want integers of arbitrary length. I suggest the following approach:
Consider the binary nature of the datatype "int". Think about using simple binary operations to emulate what the circuits in your CPU do when they add things. In case you are interested more in-depth, consider reading this wikipedia article on half-adders and full-adders. You'll be doing something similar to that, but you can go down as low level as that - but being lazy, I thought I'd just forego and find a even simpler solution.
But before going into any algorithmic details about adding, subtracting, multiplying, let's find some data structure. A simple way, is of course, to store things in a std::vector.
template< class BaseType >
class BigInt
{
typedef typename BaseType BT;
protected: std::vector< BaseType > value_;
};
You might want to consider if you want to make the vector of a fixed size and if to preallocate it. Reason being that for diverse operations, you will have to go through each element of the vector - O(n). You might want to know offhand how complex an operation is going to be and a fixed n does just that.
But now to some algorithms on operating on the numbers. You could do it on a logic-level, but we'll use that magic CPU power to calculate results. But what we'll take over from the logic-illustration of Half- and FullAdders is the way it deals with carries. As an example, consider how you'd implement the += operator. For each number in BigInt<>::value_, you'd add those and see if the result produces some form of carry. We won't be doing it bit-wise, but rely on the nature of our BaseType (be it long or int or short or whatever): it overflows.
Surely, if you add two numbers, the result must be greater than the greater one of those numbers, right? If it's not, then the result overflowed.
template< class BaseType >
BigInt< BaseType >& BigInt< BaseType >::operator += (BigInt< BaseType > const& operand)
{
BT count, carry = 0;
for (count = 0; count < std::max(value_.size(), operand.value_.size(); count++)
{
BT op0 = count < value_.size() ? value_.at(count) : 0,
op1 = count < operand.value_.size() ? operand.value_.at(count) : 0;
BT digits_result = op0 + op1 + carry;
if (digits_result-carry < std::max(op0, op1)
{
BT carry_old = carry;
carry = digits_result;
digits_result = (op0 + op1 + carry) >> sizeof(BT)*8; // NOTE [1]
}
else carry = 0;
}
return *this;
}
// NOTE 1: I did not test this code. And I am not sure if this will work; if it does
// not, then you must restrict BaseType to be the second biggest type
// available, i.e. a 32-bit int when you have a 64-bit long. Then use
// a temporary or a cast to the mightier type and retrieve the upper bits.
// Or you do it bitwise. ;-)
The other arithmetic operation go analogous. Heck, you could even use the stl-functors std::plus and std::minus, std::times and std::divides, ..., but mind the carry. :) You can also implement multiplication and division by using your plus and minus operators, but that's very slow, because that would recalculate results you already calculated in prior calls to plus and minus in each iteration. There are a lot of good algorithms out there for this simple task, use wikipedia or the web.
And of course, you should implement standard operators such as operator<< (just shift each value in value_ to the left for n bits, starting at the value_.size()-1... oh and remember the carry :), operator< - you can even optimize a little here, checking the rough number of digits with size() first. And so on. Then make your class useful, by befriendig std::ostream operator<<.
Hope this approach is helpful!

Things to consider for a big int class:
Mathematical operators: +, -, /,
*, % Don't forget that your class may be on either side of the
operator, that the operators can be
chained, that one of the operands
could be an int, float, double, etc.
I/O operators: >>, << This is
where you figure out how to properly
create your class from user input, and how to format it for output as well.
Conversions/Casts: Figure out
what types/classes your big int
class should be convertible to, and
how to properly handle the
conversion. A quick list would
include double and float, and may
include int (with proper bounds
checking) and complex (assuming it
can handle the range).

There's a complete section on this: [The Art of Computer Programming, vol.2: Seminumerical Algorithms, section 4.3 Multiple Precision Arithmetic, pp. 265-318 (ed.3)]. You may find other interesting material in Chapter 4, Arithmetic.
If you really don't want to look at another implementation, have you considered what it is you are out to learn? There are innumerable mistakes to be made and uncovering those is instructive and also dangerous. There are also challenges in identifying important computational economies and having appropriate storage structures for avoiding serious performance problems.
A Challenge Question for you: How do you intend to test your implementation and how do you propose to demonstrate that it's arithmetic is correct?
You might want another implementation to test against (without looking at how it does it), but it will take more than that to be able to generalize without expecting an excrutiating level of testing. Don't forget to consider failure modes (out of memory problems, out of stack, running too long, etc.).
Have fun!

addition would probably have to be done in the standard linear time algorithm
but for multiplication you could try http://en.wikipedia.org/wiki/Karatsuba_algorithm

Once you have the digits of the number in an array, you can do addition and multiplication exactly as you would do them longhand.

Don't forget that you don't need to restrict yourself to 0-9 as digits, i.e. use bytes as digits (0-255) and you can still do long hand arithmetic the same as you would for decimal digits. You could even use an array of long.

I'm not convinced using a string is the right way to go -- though I've never written code myself, I think that using an array of a base numeric type might be a better solution. The idea is that you'd simply extend what you've already got the same way the CPU extends a single bit into an integer.
For example, if you have a structure
typedef struct {
int high, low;
} BiggerInt;
You can then manually perform native operations on each of the "digits" (high and low, in this case), being mindful of overflow conditions:
BiggerInt add( const BiggerInt *lhs, const BiggerInt *rhs ) {
BiggerInt ret;
/* Ideally, you'd want a better way to check for overflow conditions */
if ( rhs->high < INT_MAX - lhs->high ) {
/* With a variable-length (a real) BigInt, you'd allocate some more room here */
}
ret.high = lhs->high + rhs->high;
if ( rhs->low < INT_MAX - lhs->low ) {
/* No overflow */
ret.low = lhs->low + rhs->low;
}
else {
/* Overflow */
ret.high += 1;
ret.low = lhs->low - ( INT_MAX - rhs->low ); /* Right? */
}
return ret;
}
It's a bit of a simplistic example, but it should be fairly obvious how to extend to a structure that had a variable number of whatever base numeric class you're using.

Use the algorithms you learned in 1st through 4th grade.
Start with the ones column, then the tens, and so forth.

Like others said, do it to old fashioned long-hand way, but stay away from doing this all in base 10. I'd suggest doing it all in base 65536, and storing things in an array of longs.

If your target architecture supports BCD (binary coded decimal) representation of numbers, you can get some hardware support for the longhand multiplication/addition that you need to do. Getting the compiler to emit BCD instruction is something you'll have to read up on...
The Motorola 68K series chips had this. Not that I'm bitter or anything.

My start would be to have an arbitrary sized array of integers, using 31 bits and the 32n'd as overflow.
The starter op would be ADD, and then, MAKE-NEGATIVE, using 2's complement. After that, subtraction flows trivially, and once you have add/sub, everything else is doable.
There are probably more sophisticated approaches. But this would be the naive approach from digital logic.

Could try implementing something like this:
http://www.docjar.org/html/api/java/math/BigInteger.java.html
You'd only need 4 bits for a single digit 0 - 9
So an Int Value would allow up to 8 digits each. I decided i'd stick with an array of chars so i use double the memory but for me it's only being used 1 time.
Also when storing all the digits in a single int it over-complicates it and if anything it may even slow it down.
I don't have any speed tests but looking at the java version of BigInteger it seems like it's doing an awful lot of work.
For me I do the below
//Number = 100,000.00, Number Digits = 32, Decimal Digits = 2.
BigDecimal *decimal = new BigDecimal("100000.00", 32, 2);
decimal += "1000.99";
cout << decimal->GetValue(0x1 | 0x2) << endl; //Format and show decimals.
//Prints: 101,000.99

The computer hardware provides facility of storing integers and doing basic arithmetic over them; generally this is limited to integers in a range (e.g. up to 2^{64}-1). But larger integers can be supported via programs; below is one such method.
Using Positional Numeral System (e.g. the popular base-10 numeral system), any arbitrarily large integer can be represented as a sequence of digits in base B. So, such integers can be stored as an array of 32-bit integers, where each array-element is a digit in base B=2^{32}.
We already know how to represent integers using numeral-system with base B=10, and also how to perform basic arithmetic (add, subtract, multiply, divide etc) within this system. The algorithms for doing these operations are sometimes known as Schoolbook algorithms. We can apply (with some adjustments) these Schoolbook algorithms to any base B, and so can implement the same operations for our large integers in base B.
To apply these algorithms for any base B, we will need to understand them further and handle concerns like:
what is the range of various intermediate values produced during these algorithms.
what is the maximum carry produced by the iterative addition and multiplication.
how to estimate the next quotient-digit in long-division.
(Of course, there can be alternate algorithms for doing these operations).
Some algorithm/implementation details can be found here (initial chapters), here (written by me) and here.

subtract 48 from your string of integer and print to get number of large digit.
then perform the basic mathematical operation .
otherwise i will provide complete solution.

Related

Represent Integers with 2000 or more digits [duplicate]

This question already has answers here:
Handling large numbers in C++?
(10 answers)
Closed 7 years ago.
I would like to write a program, which could compute integers having more then 2000 or 20000 digits (for Pi's decimals). I would like to do in C++, without any libraries! (No big integer, boost,...). Can anyone suggest a way of doing it? Here are my thoughts:
using const char*, for holding the integer's digits;
representing the number like
( (1 * 10 + x) * 10 + x )...
The obvious answer works along these lines:
class integer {
bool negative;
std::vector<std::uint64_t> data;
};
Where the number is represented as a sign bit and a (unsigned) base 2**64 value.
This means the absolute value of your number is:
data[0] + (data[1] << 64) + (data[2] << 128) + ....
Or, in other terms you represent your number as a little-endian bitstring with words as large as your target machine can reasonably work with. I chose 64 bit integers, as you can minimize the number of individual word operations this way (on a x64 machine).
To implement Addition, you use a concept you have learned in elementary school:
a b
+ x y
------------------
(a+x+carry) (b+y reduced to one digit length)
The reduction (modulo 2**64) happens automatically, and the carry can only ever be either zero or one. All that remains is to detect a carry, which is simple:
bool next_carry = false;
if(x += y < y) next_carry = true;
if(prev_carry && !++x) next_carry = true;
Subtraction can be implemented similarly using a borrow instead.
Note that getting anywhere close to the performance of e.g. libgmp is... unlikely.
A long integer is usually represented by a sequence of digits (see positional notation). For convenience, use little endian convention: A[0] is the lowest digit, A[n-1] is the highest one. In general case your number is equal to sum(A[i] * base^i) for some value of base.
The simplest value for base is ten, but it is not efficient. If you want to print your answer to user often, you'd better use power-of-ten as base. For instance, you can use base = 10^9 and store all digits in int32 type. If you want maximal speed, then better use power-of-two bases. For instance, base = 2^32 is the best possible base for 32-bit compiler (however, you'll need assembly to make it work optimally).
There are two ways to represent negative integers, The first one is to store integer as sign + digits sequence. In this case you'll have to handle all cases with different signs yourself. The other option is to use complement form. It can be used for both power-of-two and power-of-ten bases.
Since the length of the sequence may be different, you'd better store digit sequence in std::vector. Do not forget to remove leading zeroes in this case. An alternative solution would be to store fixed number of digits always (fixed-size array).
The operations are implemented in pretty straightforward way: just as you did them in school =)
P.S. Alternatively, each integer (of bounded length) can be represented by its reminders for a set of different prime modules, thanks to CRT. Such a representation supports only limited set of operations, and requires nontrivial convertion if you want to print it.

Own Big Float in C++

I want to write my own variable "type" as a homework in C++. It should be an arbitrarily long float. I was thinking of structure like...
Code:
class bigFloat
{
public:
bigFloat(arguments);
~bigFloat();
private:
std::vector<char> before; // numbers before decimal point
std::vector<char> after; // numbers after decimal point
int pos; // position of decimal point
};
Where if i have number like: 3.1415
Before = '3'; after = '1415'; pos = 1;
If that makes sense to you... BUT assignment wants me to save some memory, which I don't because for every number I allocate about it is about 1 byte, which is too much I guess.
Question:
How would you represent those arbitrarily long numbers?
(Sorry for my bad english, I hope the post makes sense)
If you need to preserve memory, all that means is that you need to use memory as efficiently as possible. In other words, given the value you're storing, you shouldn't waste bytes.
Example:
255 doesn't need 32 bits
I think your vector of chars is fine. If you're allowed to use a C++11 compiler, I'd probably change that to a vector of uint8_t and make sure when I'm storing the value that I can store a value from 0 to 255 in a vector of size 1.
However, that's not the end of it. From the sounds of it, what you're after is an arbitrary number of significant digits. However, for a true float representation, you also need to allocate storage for the base and exponent, after deciding what the base will be for your type. There is also the question of whether you want your exponent to be arbitrarily long too. Let's assume so.
So, I'd probably use something like this for members of your class:
//Assuming a base of 10.
static uint8_t const base = 10;
std::vector<uint8_t> digits_before_decimal;
std::vector<uint8_t> digits_after_decimal;
std::vector<uint8_t> exponent;
std::bitset<1> sign;
It is then a matter of implementing the various operators for your type and testing various scenarios to make sure your solution works.
If you really want to be thorough, you could use a simple testing framework to make sure that problems you fix along the way, stay fixed.
In memory, it will essentially look like a binary representation of the number.
For example:
65535 will be: before_decimal =<0xff,0xff>, after_decimal vector is empty
255.255 will be: before_decimal =<0xff>, after_decimal=<0xff>
255255 will be: before_decimal =<0x03,0xe5,0x17>, after_decimal vector is empty
255255.0 will be: before_decimal =<0x03,0xe5,0x17>, after_decimal: <0>
As others have mentioned, you don't really need two vectors for before and after the decimal. However, I'm using two in my answer because it makes it easier to understand and you don't have to keep track of the decimal. The memory requirements of two vs one vector really aren't that different when you're dealing with a long string of digits.
I should also note that using an integer to record position of the decimal point limits your number of digits to 2 billion, which is not an arbitrarily long number.
UPDATE: If this actually is homework, I would check with whoever has given you the homework if you need to support any floating point special cases, the simplest of which would be NaNs. There are other special cases too, but trying to implement all of them will very quickly turn this from a homework assignment into a thesis. Good luck :)
Don't use two separate vectors before and after. You need whole mantissa to make arithmetic operations.
Actually your pos is exponent. Name it accordingly. Exponent is signed btw.
You need sign of mantissa.
I recommend to store mantissa as rational fraction. You need two numbers: numerator and denominator. Then you can make division without round-off.
It's better to store numbers as ints with arbitrary length instead of arrays of digits.
PS. I made such calculator long time ago. To illustrate my answer, I give you declaration of class for number:
class CNumber
{
// ctors, methods....
char cSign; // sign of mantissa
CString strNumer; // numerator of mantissa
CString strDenom; // denominator of mantissa
char cExpSign; // sign of exponent
CString strExp; // exponent
};
I used MFC. CString is standard string there.

int to bool[] in c++

What I want to do is given an argument const int &i, return the bits of the binary representation of i in the form of an array of bool (And back would also be great)... Does anyone know how?
Unless you really need it to be specifically an array of bool, I'd use an std::bitset:
std::bitset bits<32>(i);
You can normally treat that pretty much like an array of bool, testing, setting and flipping individual bits, etc. Of course, if you want portability to something that has a different size of int, you may want to modify it to something like:
#define size (sizeof(int) * CHAR_BIT)
std::bitset bits<size>(i);
Edit: As people much more experienced than me point out, doing this can lead to problems if the number is negative (what happens exactly depends on your compiler). In any case, it would be meaningless to process negative numbers this way unless you also stipulated what kind of arithmetic representation the return value would use (1s complement? 2s complement? prefix sign bit?) so this kind of approach turns out to be practically useless for negative numbers as far as I can tell.
Sorry for diverting attention from more worthy answers.
Original
Well, this comes to mind:
int i = 42; // or whatever
std::vector<bool> vec;
while(i) {
vec.push_back(i & 1);
i >>= 1;
}
std::reverse(vec);
Of course this is not an array, but it's trivial to copy the contents of the vector to an array instead if that's what you want, for example:
bool boolArray[] = new bool[vec.size()];
std::copy(vec.rbegin(), vec.rend(), boolArray);

Converting variable type (or workaround)

The class below is supposed to represent a musical note. I want to be able to store the length of the note (e.g. 1/2 note, 1/4 note, 3/8 note, etc.) using only integers. However, I also want to be able to store the length using a floating point number for the rare case that I deal with notes of irregular lengths.
class note{
string tone;
int length_numerator;
int length_denominator;
public:
set_length(int numerator, int denominator){
length_numerator=numerator;
length_denominator=denominator;
}
set_length(double d){
length_numerator=d; // unfortunately truncates everything past decimal point
length_denominator=1;
}
}
The reason it is important for me to be able to use integers rather than doubles to store the length is that in my past experience with floating point numbers, sometimes the values are unexpectedly inaccurate. For example, a number that is supposed to be 16 occasionally gets mysteriously stored as 16.0000000001 or 15.99999999999 (usually after enduring some operations) with floating point, and this could cause problems when testing for equality (because 16!=15.99999999999).
Is it possible to convert a variable from int to double (the variable, not just its value)? If not, then what else can I do to be able to store the note's length using either an integer or a double, depending on the what I need the type to be?
If your only problem is comparing floats for equality, then I'd say to use floats, but read "Comparing floating point numbers" / Bruce Dawson first. It's not long, and it explains how to compare two floating numbers correctly (by checking the absolute and relative difference).
When you have more time, you should also look at "What Every Computer Scientist Should Know About Floating Point Arithmetic" to understand why 16 occasionally gets "mysteriously" stored as 16.0000000001 or 15.99999999999.
Attempts to use integers for rational numbers (or for fixed point arithmetic) are rarely as simple as they look.
I see several possible solutions: the first is just to use double. It's
true that extended computations may result in inaccurate results, but in
this case, your divisors are normally powers of 2, which will give exact
results (at least on all of the machines I've seen); you only risk
running into problems when dividing by some unusual value (which is the
case where you'll have to use double anyway).
You could also scale the results, e.g. representing the notes as
multiples of, say 64th notes. This will mean that most values will be
small integers, which are guaranteed exact in double (again, at least
in the usual representations). A number that is supposed to be 16 does
not get stored as 16.000000001 or 15.99999999 (but a number that is
supposed to be .16 might get stored as .1600000001 or .1599999999).
Before the appearance of long long, decimal arithmetic classes often
used double as a 52 bit integral type, ensuring at each step that the
actual value was exactly an integer. (Only division might cause a problem.)
Or you could use some sort of class representing rational numbers.
(Boost has one, for example, and I'm sure there are others.) This would
allow any strange values (5th notes, anyone?) to remain exact; it could
also be advantageous for human readable output, e.g. you could test the
denominator, and then output something like "3 quarter notes", or the
like. Even something like "a 3/4 note" would be more readable to a
musician than "a .75 note".
It is not possible to convert a variable from int to double, it is possible to convert a value from int to double. I'm not completely certain which you are asking for but maybe you are looking for a union
union DoubleOrInt
{
double d;
int i;
};
DoubleOrInt length_numerator;
DoubleOrInt length_denominator;
Then you can write
set_length(int numerator, int denominator){
length_numerator.i=numerator;
length_denominator.i=denominator;
}
set_length(double d){
length_numerator.d=d;
length_denominator.d=1.0;
}
The problem with this approach is that you absolutely must keep track of whether you are currently storing ints or doubles in your unions. Bad things will happen if you store an int and then try to access it as a double. Preferrably you would do this inside your class.
This is normal behavior for floating point variables. They are always rounded and the last digits may change valued depending on the operations you do. I suggest reading on floating points somewhere (e.g. http://floating-point-gui.de/) - especially about comparing fp values.
I normally subtract them, take the absolute value and compare this against an epsilon, e.g. if (abs(x-y)
Given you have a set_length(double d), my guess is that you actually need doubles. Note that the conversion from double to a fraction of integer is fragile and complexe, and will most probably not solve your equality problems (is 0.24999999 equal to 1/4 ?). It would be better for you to either choose to always use fractions, or always doubles. Then, just learn how to use them. I must say, for music, it make sense to have fractions as it is even how notes are being described.
If it were me, I would just use an enum. To turn something into a note would be pretty simple using this system also. Here's a way you could do it:
class Note {
public:
enum Type {
// In this case, 16 represents a whole note, but it could be larger
// if demisemiquavers were used or something.
Semiquaver = 1,
Quaver = 2,
Crotchet = 4,
Minim = 8,
Semibreve = 16
};
static float GetNoteLength(const Type &note)
{ return static_cast<float>(note)/16.0f; }
static float TieNotes(const Type &note1, const Type &note2)
{ return GetNoteLength(note1)+GetNoteLength(note2); }
};
int main()
{
// Make a semiquaver
Note::Type sq = Note::Semiquaver;
// Make a quaver
Note::Type q = Note::Quaver;
// Dot it with the semiquaver from before
float dottedQuaver = Note::TieNotes(sq, q);
std::cout << "Semiquaver is equivalent to: " << Note::GetNoteLength(sq) << " beats\n";
std::cout << "Dotted quaver is equivalent to: " << dottedQuaver << " beats\n";
return 0;
}
Those 'Irregular' notes you speak of can be retrieved using TieNotes

Best way to get individual digits from int for radix sort in C/C++

What is the best way to get individual digits from an int with n number of digits for use in a radix sort algorithm? I'm wondering if there is a particularly good way to do it in C/C++, if not what is the general best solution?
edit: just to clarify, i was looking for a solution other than converting it to a string and treating it like an array of digits.
Use digits of size 2^k. To extract the nth digit:
#define BASE (2<<k)
#define MASK (BASE-1)
inline unsigned get_digit(unsigned word, int n) {
return (word >> (n*k)) & MASK;
}
Using the shift and mask (enabled by base being a power of 2) avoids expensive integer-divide instructions.
After that, choosing the best base is an experimental question (time/space tradeoff for your particular hardware). Probably k==3 (base 8) works well and limits the number of buckets, but k==4 (base 16) looks more attractive because it divides the word size. However, there is really nothing wrong with a base that does not divide the word size, and you might find that base 32 or base 64 perform better. It's an experimental question and may likely differ by hardware, according to how the cache behaves and how many elements there are in your array.
Final note: if you are sorting signed integers life is a much bigger pain, because you want to treat the most significant bit as signed. I recommend treating everything as unsigned, and then if you really need signed, in the last step of your radix sort you will swap the buckets, so that buckets with a most significant 1 come before a most significant 0. This problem is definitely easier if k divides the word size.
Don't use base 10, use base 16.
for (int i = 0; i < 8; i++) {
printf("%d\n", (n >> (i*4)) & 0xf);
}
Since integers are stored internally in binary, this will be more efficient than dividing by 10 to determine decimal digits.