big integer addition without carry flag - c++

In assembly languages, there is usually an instruction that adds two operands and a carry. If you want to implement big integer additions, you simply add the lowest integers without a carry and the next integers with a carry. How would I do that efficiently in C or C++ where I don't have access to the carry flag? It should work on several compilers and architectures, so I cannot simply use inline assembly or such.

You can use "nails" (a term from GMP): rather than using all 64 bits of a uint64_t when representing a number, use only 63 of them, with the top bit zero. That way you can detect overflow with a simple bit-shift. You may even want to use fewer than 63.
Or, you can do half-word arithmetic. If you can do 64-bit arithmetic, represent your number as an array of uint32_ts (or equivalently, split 64-bit words into upper and lower 32-bit chunks). Then, when doing arithmetic operations on these 32-bit integers, you can first promote to 64 bits, do the arithmetic there, then convert back. This lets you detect carry, and it's also good for multiplication if you don't have a "multiply hi" instruction.
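A minimal sketch of the half-word approach, assuming both numbers are stored least significant limb first and have the same length. The function name add_limbs32 and the vector-based layout are illustrative choices, not something taken from GMP:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Adds two equally sized little-endian arrays of 32-bit limbs.
// Returns the carry out of the most significant limb (0 or 1).
inline std::uint32_t add_limbs32(const std::vector<std::uint32_t>& a,
                                 const std::vector<std::uint32_t>& b,
                                 std::vector<std::uint32_t>& out) {
    out.resize(a.size());
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Promote to 64 bits so the sum of two limbs plus a carry cannot wrap.
        std::uint64_t wide = std::uint64_t{a[i]} + b[i] + carry;
        out[i] = static_cast<std::uint32_t>(wide); // keep the low 32 bits
        carry = wide >> 32;                        // bit 32 is the carry
    }
    return static_cast<std::uint32_t>(carry);
}
```

The carry never needs a comparison here, because it is simply the upper half of the 64-bit intermediate.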
As the other answer indicates, you can detect overflow in an unsigned addition by:
uint64_t sum = a + b;
uint64_t carry = sum < a;
As an aside, while in practice this will also work in signed arithmetic, you have two issues:
It's more complex
Technically, overflowing a signed integer is undefined behavior
so you're usually better off sticking to unsigned numbers.

You can figure out the carry by virtue of the fact that, if you overflow by adding two numbers, the result will always be less than either of those other two values.
In other words, if a + b is less than a, it overflowed. That's for positive values of a and b of course but that's what you'd almost certainly be using for a bignum library.
Unfortunately, a carry introduces an extra complication in that adding the largest possible value plus a carry of one will give you the same value you started with. Hence, you have to handle that as a special case.
Something like:
carry = 0
for i = 7 to 0:
    if a[i] > b[i]:
        small = b[i], large = a[i]
    else:
        small = a[i], large = b[i]
    if carry is 1 and large is maxvalue:
        c[i] = small
        carry = 1
    else:
        c[i] = large + small + carry
        if c[i] < large:
            carry = 1
        else:
            carry = 0
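One way to sidestep the special case in C++ is to add in two steps and check each step separately, since at most one of them can wrap. The sketch below is a variant of the pseudocode, not a literal translation: it assumes 64-bit words stored least significant first (the pseudocode walks from index 7 down to 0), and the name add_words is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Adds two n-word numbers stored least significant word first.
// Returns the final carry (0 or 1).
inline std::uint64_t add_words(const std::uint64_t* a, const std::uint64_t* b,
                               std::uint64_t* c, std::size_t n) {
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t sum = a[i] + b[i];     // may wrap modulo 2^64
        std::uint64_t carry1 = sum < a[i];   // wrapped if the result shrank
        std::uint64_t total = sum + carry;   // now add the incoming carry
        std::uint64_t carry2 = total < sum;  // wraps only if sum was all ones
        c[i] = total;
        carry = carry1 | carry2;             // both cannot be set at once
    }
    return carry;
}
```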
In reality, you may also want to consider not using all the bits in your array elements.
I've implemented libraries in the past where the maximum "digit" is less than or equal to the square root of the highest value it can hold. So for 8-bit (octet) digits, you store values from 0 through 15. That way, multiplying two digits and adding the maximum carry will always fit within an octet, making overflow detection moot, though at the cost of some storage.
Similarly, 16-bit digits would have the range 0 through 255 so that it won't overflow at 65536.
In fact, I've sometimes limited it further than that, ensuring the artificial wrap value is a power of ten (so an octet would hold 0 through 9, 16-bit digits would be 0 through 99, 32-bit digits 0 through 9999, and so on).
That's a bit more wasteful on space but makes conversion to and from text (such as printing your numbers) incredibly easy.
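To illustrate why a power-of-ten wrap value makes printing easy, here is a rough sketch that converts such a number to text. It assumes 32-bit digits holding 0 through 9999, stored least significant first in a non-empty vector; the name to_decimal is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Converts a non-empty little-endian vector of base-10000 digits to text.
inline std::string to_decimal(const std::vector<std::uint32_t>& digits) {
    // The most significant digit is printed without padding.
    std::string text = std::to_string(digits.back());
    // Every lower digit contributes exactly four decimal characters.
    for (std::size_t i = digits.size() - 1; i-- > 0;) {
        std::string part = std::to_string(digits[i]);
        text += std::string(4 - part.size(), '0') + part;
    }
    return text;
}
```

No division of the whole number is needed: each digit maps directly onto a fixed-width group of characters.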

You can check for a carry with unsigned types by checking whether the result is less than an operand (either operand will do).
Just start with a carry of 0.

If I understand you correctly, you want to write your own addition for your own big integer type.
You can do this with a simple function. No need to worry about the carry flag in the first run. Just go from right to left, add digit by digit and the carry flag (internally in that function), starting with a carry of 0, and set the result to (a+b+carry) %10 and the carry to (a+b+carry) / 10.
This SO question could be relevant:
how to implement big int in c
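A small sketch of the digit-by-digit scheme described in this answer, assuming the numbers are non-negative and stored as decimal strings such as "123". The name add_decimal is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Adds two non-negative decimal numbers given as digit strings.
inline std::string add_decimal(const std::string& a, const std::string& b) {
    std::string result;
    int carry = 0;
    std::size_t i = a.size(), j = b.size();
    // Walk both strings from the rightmost (least significant) digit.
    while (i > 0 || j > 0 || carry != 0) {
        int sum = carry;
        if (i > 0) sum += a[--i] - '0';
        if (j > 0) sum += b[--j] - '0';
        result.push_back(static_cast<char>('0' + sum % 10)); // digit to keep
        carry = sum / 10;                                    // 0 or 1
    }
    if (result.empty()) result = "0"; // both inputs were empty
    std::reverse(result.begin(), result.end());
    return result;
}
```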

Related

How do I multiply an array of ints to result in a single number?

So I have a single int broken up into an array of smaller ints. For example, int num = 136928 becomes int num[3] = {13,69,28}. I need to multiply the array by a certain number. The normal operation would be 136928 * 2 == 273856. But I need to do [13,69,28] * 2 to give the same answer as 136928 * 2 would, in the form of an array again. The result should be:
for(int i : arr) {
    i *= 2;
    //Should multiply everything in the array
    //so that arr now equals {27,38,56}
}
Any help would be appreciated on how to do this (also needs to work with multiplying floating numbers) e.g. arr * 0.5 should half everything in the array.
For those wondering, the number has to be split up into an array because it is too large to store in any standard type (64 bytes). Specifically I am trying to perform a mathematical operation on the result of a sha256 hash. The hash returns an array of the hash as uint8_t[64].
Consider using Boost.Multiprecision instead. Specifically, the cpp_int type, which is a representation of an arbitrary-sized integer value.
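//In your includes...
#include <boost/multiprecision/cpp_int.hpp>
#include <iterator>
#include <vector>
//In your relevant code:
bool msv_first = true;//true: values[0] is the most significant byte. Might need to flip this
uint8_t values[64];
boost::multiprecision::cpp_int value;
//import_bits and export_bits are free functions; 8 is the chunk size in bits
boost::multiprecision::import_bits(
    value,
    std::begin(values),
    std::end(values),
    8,
    msv_first
);
//easy arithmetic to perform
value *= 2;
//export into a growable buffer: only the significant bytes are written, so the
//result can be shorter than 64 bytes, or 65 bytes long if the doubling overflowed
std::vector<uint8_t> result;
boost::multiprecision::export_bits(
    value,
    std::back_inserter(result),
    8,
    msv_first
);
//result now contains the properly multiplied value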
Theoretically this should work with the properly sized type uint512_t, found in the same namespace as cpp_int, but I don't have a C++ compiler to test with right now, so I can't verify. If it does work, you should prefer uint512_t, since it'll probably be faster than an arbitrarily-sized integer.
If you just need multiplying with / dividing by two (2) then you can simply shift the bits in each byte that makes up the value.
So for multiplication you start at the right, at the least significant byte (I'm assuming big endian here). You take the most significant bit of the byte and store it in a temp var (a possible carry bit). Then you shift the other bits to the left. The stored bit becomes the least significant bit of the next byte to the left, after that byte has been shifted. Repeat this until you have processed all bytes. You may be left with a single carry bit, which you can toss away if you're performing operations modulo 2^512 (64 bytes).
Division is similar, but you start at the left, at the most significant byte, and you carry the least significant bit of each byte into the top bit of the next byte to the right. Dropping the final rightmost bit gives you the "floor" of the calculation (i.e. three divided by two will be one, not one-and-a-half or two).
This is useful if you don't want to copy the bytes, or if you just need bit operations otherwise and you don't want to include a multi-precision / big integer library.
Using a big integer library would be recommended for maintainability.
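As a rough sketch of the shifting approach, assuming a big-endian byte array (index 0 is the most significant byte): the carried bit always travels toward the more significant byte when doubling, so that loop runs from the last byte to the first, and the halving loop runs the other way. The names shift_left_one and shift_right_one are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Multiplies a big-endian byte array by two in place.
// Returns the bit shifted out of the most significant byte.
inline unsigned shift_left_one(std::uint8_t* bytes, std::size_t n) {
    unsigned carry = 0;
    for (std::size_t i = n; i-- > 0;) {        // least significant byte first
        unsigned next = bytes[i] >> 7;         // bit that leaves this byte
        bytes[i] = static_cast<std::uint8_t>((bytes[i] << 1) | carry);
        carry = next;
    }
    return carry;
}

// Divides a big-endian byte array by two in place (rounding down).
// Returns the bit shifted out of the least significant byte.
inline unsigned shift_right_one(std::uint8_t* bytes, std::size_t n) {
    unsigned carry = 0;
    for (std::size_t i = 0; i < n; ++i) {      // most significant byte first
        unsigned next = bytes[i] & 1u;         // bit that leaves this byte
        bytes[i] = static_cast<std::uint8_t>((bytes[i] >> 1) | (carry << 7));
        carry = next;
    }
    return carry;
}
```

Ignoring the return value of shift_left_one corresponds to working modulo 2^(8n).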

Represent Integers with 2000 or more digits [duplicate]

I would like to write a program that can compute with integers having more than 2000 or 20000 digits (for Pi's decimals). I would like to do it in C++, without any libraries! (No big integer, boost,...). Can anyone suggest a way of doing it? Here are my thoughts:
using const char*, for holding the integer's digits;
representing the number like
( (1 * 10 + x) * 10 + x )...
The obvious answer works along these lines:
class integer {
    bool negative;
    std::vector<std::uint64_t> data;
};
Where the number is represented as a sign bit and an unsigned base 2**64 value.
This means the absolute value of your number is:
data[0] + (data[1] << 64) + (data[2] << 128) + ....
Or, in other terms you represent your number as a little-endian bitstring with words as large as your target machine can reasonably work with. I chose 64 bit integers, as you can minimize the number of individual word operations this way (on a x64 machine).
To implement Addition, you use a concept you have learned in elementary school:
         a                 b
+        x                 y
----------------------------------------------------
   (a+x+carry)    (b+y reduced to one digit length)
The reduction (modulo 2**64) happens automatically, and the carry can only ever be either zero or one. All that remains is to detect a carry, which is simple:
bool next_carry = false;
if((x += y) < y) next_carry = true;          // x + y wrapped around
if(prev_carry && !++x) next_carry = true;    // adding the carry wrapped to zero
Subtraction can be implemented similarly using a borrow instead.
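A sketch of what the borrow version might look like, assuming little-endian base 2**64 words as above, with a at least as long as b. The name subtract_in_place is made up for illustration, and sign handling is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Computes a -= b on little-endian base-2^64 words, assuming
// a.size() >= b.size(). Returns the final borrow, which is false
// whenever a >= b held on entry.
inline bool subtract_in_place(std::vector<std::uint64_t>& a,
                              const std::vector<std::uint64_t>& b) {
    bool borrow = false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        std::uint64_t y = i < b.size() ? b[i] : 0;
        std::uint64_t diff = a[i] - y;          // wraps modulo 2^64
        bool next_borrow = a[i] < y;            // had to borrow for y
        if (borrow) {
            if (diff == 0) next_borrow = true;  // 0 - 1 wraps as well
            --diff;
        }
        a[i] = diff;
        borrow = next_borrow;
    }
    return borrow;
}
```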
Note that getting anywhere close to the performance of e.g. libgmp is... unlikely.
A long integer is usually represented by a sequence of digits (see positional notation). For convenience, use the little-endian convention: A[0] is the lowest digit, A[n-1] is the highest one. In the general case your number is equal to sum(A[i] * base^i) for some value of base.
The simplest value for base is ten, but it is not efficient. If you often want to print your answer to the user, you'd better use a power of ten as the base. For instance, you can use base = 10^9 and store all digits in an int32 type. If you want maximal speed, then it is better to use a power-of-two base. For instance, base = 2^32 is the best possible base for a 32-bit compiler (however, you'll need assembly to make it work optimally).
There are two ways to represent negative integers. The first one is to store the integer as a sign plus a digit sequence. In this case you'll have to handle all cases with different signs yourself. The other option is to use complement form. It can be used for both power-of-two and power-of-ten bases.
Since the length of the sequence may be different, you'd better store digit sequence in std::vector. Do not forget to remove leading zeroes in this case. An alternative solution would be to store fixed number of digits always (fixed-size array).
The operations are implemented in pretty straightforward way: just as you did them in school =)
P.S. Alternatively, each integer (of bounded length) can be represented by its remainders modulo a set of different primes, thanks to the Chinese Remainder Theorem (CRT). Such a representation supports only a limited set of operations, and requires a nontrivial conversion if you want to print it.

Acting like unsigned int overflow. What is causing it?

I have this function which generates a specified number of so-called 'triangle numbers'. If I print out the deque afterwards, the numbers increase, jump down, then increase again. Triangle numbers should never get lower as i rises, so there must be some kind of overflow happening. I tried to fix it by adding the line if(toPush > INT_MAX) return i - 1; to try to stop the function from generating more numbers (and return the number it generated) if the result is overflowing. That is not working, however: the output continues to be incorrect (increases for a while, jumps down to a lower number, then increases again). The line I added doesn't actually seem to be doing anything at all. The return is not being reached. Does anyone know what's going on here?
#include <iostream>
#include <deque>
#include <climits>
int generateTriangleNumbers(std::deque<unsigned int> &triangleNumbers, unsigned int generateCount) {
    for(unsigned int i = 1; i <= generateCount; i++) {
        unsigned int toPush = (i * (i + 1)) / 2;
        if(toPush > INT_MAX) return i - 1;
        triangleNumbers.push_back(toPush);
    }
    return generateCount;
}
INT_MAX is the maximum value of signed int. It's about half the maximum value of unsigned int (UINT_MAX). Your calculation of toPush may well get much higher than UINT_MAX because you square the value (if it's near INT_MAX, the result will be much larger than the UINT_MAX that your toPush can hold). In this case toPush wraps around and ends up smaller than the previous value.
First of all, your comparison to INT_MAX is flawed since your type is unsigned int, not signed int. Secondly, even a comparison to UINT_MAX would be incorrect, since it implies that toPush (the left operand of the comparison expression) can hold a value above its maximum, and that's not possible. The correct way would be to compare your generated number with the previous one. If it's lower, you know you have got an overflow and you should stop.
Additionally, you may want to use types that can hold a larger range of values (such as unsigned long long).
The 92682nd triangle number is already greater than UINT32_MAX. But the culprit here is much earlier, in the computation of i * (i + 1). There, the calculation overflows for the 65536th triangular number. If we ask Python with its native bignum support:
>>> 2**16 * (2**16+1) > 0xffffffff
True
Oops. Then if you inspect your stored numbers, you will see your sequence dropping back to low values. To attempt to emulate what the Standard says about the behaviour of this case (unsigned arithmetic wraps modulo 2**32 for a 32-bit unsigned int), in Python:
>>> (int(2**16 * (2**16+1)) % 2**32) >> 1
32768
and that is the value you will see for the 65536th triangular number, which is incorrect.
One way to detect overflow here is to ensure that the sequence of numbers you generate is monotonic; that is, check that the Nth triangle number generated is strictly greater than the (N-1)th triangle number.
To avoid overflow, you can use 64-bit variables to both generate & store them, or use a big number library if you need a large amount of triangle numbers.
In Visual C++ int (and of course unsigned int) is 32 bits even on 64-bit computers.
Either use unsigned long long or uint64_t to use a 64-bit value.
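A possible variant of the question's function that does the multiplication in 64 bits and stops as soon as a value no longer fits. This is only a sketch: it assumes unsigned int is narrower than 64 bits (so the widened product cannot wrap), and it changes the return type to unsigned int.

```cpp
#include <cstdint>
#include <deque>
#include <limits>

// Fills triangleNumbers with up to generateCount triangle numbers and
// returns how many were stored before a value stopped fitting.
inline unsigned int generateTriangleNumbers(std::deque<unsigned int>& triangleNumbers,
                                            unsigned int generateCount) {
    for (unsigned int i = 1; i <= generateCount; i++) {
        // Widen before multiplying so i * (i + 1) cannot wrap.
        std::uint64_t wide = (std::uint64_t{i} * (std::uint64_t{i} + 1)) / 2;
        if (wide > std::numeric_limits<unsigned int>::max()) return i - 1;
        triangleNumbers.push_back(static_cast<unsigned int>(wide));
    }
    return generateCount;
}
```

Because the check happens on the 64-bit value before it is narrowed, the early return is actually reachable, unlike the comparison against INT_MAX in the original.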

Basic integer explanation in C++

This is a very basic question. Please don't mind, but I need to ask this. Adding two integers:
int main()
{
    cout<<"Enter a string: ";
    int a,b,c;
    cout<<"Enter a";
    cin>>a;
    cout<<"\nEnter b";
    cin>>b;
    cout<<a<<"\n"<<b<<"\n";
    c= a + b;
    cout <<"\n"<<c ;
    return 0;
}
If I give a = 2147483648 then b automatically takes a value of 4046724 (note that cin does not prompt for it) and the result c is 7433860.
If int is 2^32 and if the first bit is MSB then it becomes 2^31
c= 2^31+2^31
c=2^(31+31)
is this correct?
So how to implement c= a+b for a= 2147483648 and b= 2147483648 and should c be an integer or a double integer?
When you perform any sort of input operation, you must always include an error check! For the stream operator, this could look like this:
int n;
if (!(std::cin >> n)) { std::cerr << "Error!\n"; std::exit(-1); }
// ... rest of program
If you do this, you'll see that your initial extraction of a already fails, so whatever values are read afterwards are not well defined.
The reason the extraction fails is that the literal token "2147483648" does not represent a value of type int on your platform (it is too large), no different from, say, "1z" or "Hello".
The real danger in programming is to assume silently that an input operation succeeds when often it doesn't. Fail as early and as noisily as possible.
The int type is signed and therefore its maximum value is 2^31-1 = 2147483648 - 1 = 2147483647.
Even if you used an unsigned integer, its maximum value is 2^32 - 1 = a + b - 1 for the values of a and b you give.
For the arithmetic you are doing, you would be better off using "long long", which is signed and has a maximum value of 2^63-1, or "unsigned long long", which is unsigned and has a maximum value of 2^64-1.
c= 2^31+2^31
c=2^(31+31)
is this correct?
No, but you're right that the result takes more than 31 bits. In this case the result takes 32 bits (whereas 2^(31+31) would take 62 bits). You're confusing multiplication with addition: 2^31 * 2^31 = 2^(31+31).
Anyway, the basic problem you're asking about dealing with is called overflow. There are a few options. You can detect it and report it as an error, detect it and redo the calculation in such a way as to get the answer, or just use data types that allow you to do the calculation correctly no matter what the input types are.
Signed overflow in C and C++ is technically undefined behavior, so detection consists of figuring out what input values will cause it (because if you do the operation and then look at the result to see if overflow occurred, you may have already triggered undefined behavior and you can't count on anything). Here's a question that goes into some detail on the issue: Detecting signed overflow in C/C++
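As a sketch of the "check the inputs first" approach for addition, the function below tests the operands against INT_MAX and INT_MIN before adding, so the overflowing addition is never executed. The name checked_add is made up for illustration.

```cpp
#include <climits>

// Stores a + b in result and returns true, or returns false without
// touching result when the sum would not fit in an int.
inline bool checked_add(int a, int b, int& result) {
    if (b > 0 && a > INT_MAX - b) return false; // would exceed INT_MAX
    if (b < 0 && a < INT_MIN - b) return false; // would drop below INT_MIN
    result = a + b;                             // safe: no overflow possible
    return true;
}
```

The subtractions INT_MAX - b and INT_MIN - b cannot overflow themselves, because each is only evaluated for the sign of b that keeps it in range.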
Alternatively, you can just perform the operation using a data type that won't overflow for any of the input values. For example, if the inputs are ints then the correct result for any pair of ints can be stored in a wider type such as (depending on your implementation) long or long long.
int a, b;
...
long c = (long)a + (long)b;
If int is 32 bits then it can hold any value in the range [-2^31, 2^31-1]. So the smallest value obtainable would be -2^31 + -2^31 which is -2^32. And the largest value obtainable is 2^31 - 1 + 2^31 - 1 which is 2^32 - 2. So you need a type that can hold these values and every value in between. A single extra bit would be sufficient to hold any possible result of addition (a 33-bit integer would hold any integer from [-2^32,2^32-1]).
Or, since double can probably represent every integer you need (a 64-bit IEEE 754 floating point data type can represent integers up to 53 bits exactly) you could do the addition using doubles as well (though adding doubles may be slower than adding longs).
If you have a library that offers arbitrary precision arithmetic you could use that as well.

Best way to get individual digits from int for radix sort in C/C++

What is the best way to get individual digits from an int with n number of digits for use in a radix sort algorithm? I'm wondering if there is a particularly good way to do it in C/C++, if not what is the general best solution?
edit: just to clarify, i was looking for a solution other than converting it to a string and treating it like an array of digits.
Use digits of size 2^k. To extract the nth digit:
enum { k = 4 };              /* bits per digit; 4 is only an example value */
#define BASE (1<<k)          /* 2^k distinct digit values */
#define MASK (BASE-1)
inline unsigned get_digit(unsigned word, int n) {
    return (word >> (n*k)) & MASK;
}
Using the shift and mask (enabled by base being a power of 2) avoids expensive integer-divide instructions.
After that, choosing the best base is an experimental question (time/space tradeoff for your particular hardware). Probably k==3 (base 8) works well and limits the number of buckets, but k==4 (base 16) looks more attractive because it divides the word size. However, there is really nothing wrong with a base that does not divide the word size, and you might find that base 32 or base 64 perform better. It's an experimental question and may likely differ by hardware, according to how the cache behaves and how many elements there are in your array.
Final note: if you are sorting signed integers life is a much bigger pain, because you want to treat the most significant bit as signed. I recommend treating everything as unsigned, and then if you really need signed, in the last step of your radix sort you will swap the buckets, so that buckets with a most significant 1 come before a most significant 0. This problem is definitely easier if k divides the word size.
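To show the shift-and-mask digit extraction inside an actual sort, here is a sketch of a least-significant-digit radix sort over 32-bit unsigned values. The choice of k = 8 (256 buckets, four passes) is an assumption made for the example, not a recommendation from the answer, and radix_sort is a hypothetical name.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// LSD radix sort of 32-bit unsigned values using k-bit digits.
inline void radix_sort(std::vector<std::uint32_t>& values) {
    constexpr unsigned k = 8;                       // bits per digit
    constexpr std::size_t base = std::size_t{1} << k; // number of buckets
    constexpr std::uint32_t mask = static_cast<std::uint32_t>(base - 1);
    std::vector<std::uint32_t> scratch(values.size());
    for (unsigned shift = 0; shift < 32; shift += k) {
        std::array<std::size_t, base + 1> start{};  // bucket start offsets
        // Count how many values fall into each bucket for this digit.
        for (std::uint32_t v : values) ++start[((v >> shift) & mask) + 1];
        // Prefix sums turn the counts into starting positions.
        for (std::size_t d = 0; d < base; ++d) start[d + 1] += start[d];
        // Stable scatter into the scratch buffer by the current digit.
        for (std::uint32_t v : values) scratch[start[(v >> shift) & mask]++] = v;
        values.swap(scratch);
    }
}
```

With k = 8 the number of passes is even, so the sorted data ends up back in values after the final swap.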
Don't use base 10, use base 16.
for (int i = 0; i < 8; i++) {
printf("%d\n", (n >> (i*4)) & 0xf);
}
Since integers are stored internally in binary, this will be more efficient than dividing by 10 to determine decimal digits.