Division of big numbers


I need some division algorithm which can handle big integers (128-bit).
I've already asked how to do it via bit shifting operators. However, my current implementation seems to call for a better approach.
Basically, I store numbers as two long long unsigned int's in the format
A * 2 ^ 64 + B with B < 2 ^ 64.
This number is divisible by 24 and I want to divide it by 24.
My current approach is to transform it like
(A * 2^64 + B) / 24
  = (A / 24) * 2^64 + B / 24
  = floor(A / 24) * 2^64 + ((A mod 24) / 24.0) * 2^64 + floor(B / 24) + (B mod 24) / 24.0
However, this is buggy.
(Note that floor is A / 24 and that mod is A % 24. The normal divisions are stored in long double, the integers are stored in long long unsigned int.)
Since 24 is equal to 11000 in binary, the second summand shouldn't change something in the range of the fourth summand since it is shifted 64 bits to the left.
So, if A * 2 ^ 64 + B is divisible by 24 and B is not, the bug is easy to see, since the result is some non-integral number.
What is the error in my implementation?

The easiest way I can think of to do this is to treat the 128-bit numbers as four 32-bit numbers:
A_B_C_D = A*2^96 + B*2^64 + C*2^32 + D
And then do long division by 24:
E = A/24 (with remainder Q)
F = Q_B/24 (with remainder R)
G = R_C/24 (with remainder S)
H = S_D/24 (with remainder T)
Where X_Y means X*2^32 + Y.
Then the answer is E_F_G_H with remainder T. At any point you only need division of 64-bit numbers, so this should be doable with integer operations only.
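The long division above can be sketched roughly as follows. This is only an illustration: the names U128 and div_small are made up for this example, and the struct simply assumes the hi * 2^64 + lo layout described in the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value laid out as in the question: hi * 2^64 + lo.
struct U128 {
    std::uint64_t hi;
    std::uint64_t lo;
};

// Schoolbook long division of n by a small divisor d (nonzero, 32-bit),
// working on four 32-bit limbs A, B, C, D from most to least significant.
// Returns the quotient and stores the final remainder in *rem.
U128 div_small(U128 n, std::uint32_t d, std::uint32_t* rem) {
    assert(d != 0);
    const std::uint32_t limbs[4] = {
        static_cast<std::uint32_t>(n.hi >> 32),  // A
        static_cast<std::uint32_t>(n.hi),        // B
        static_cast<std::uint32_t>(n.lo >> 32),  // C
        static_cast<std::uint32_t>(n.lo)         // D
    };
    std::uint64_t q[4];
    std::uint64_t r = 0;  // running remainder, always smaller than d
    for (int i = 0; i < 4; ++i) {
        // X_Y from the text: previous remainder * 2^32 + next limb.
        std::uint64_t cur = (r << 32) | limbs[i];
        q[i] = cur / d;  // fits in 32 bits because r < d
        r = cur % d;
    }
    *rem = static_cast<std::uint32_t>(r);
    return U128{(q[0] << 32) | q[1], (q[2] << 32) | q[3]};
}
```

Calling div_small(value, 24, &rem) on a value that is divisible by 24 should leave rem at 0, which doubles as a cheap sanity check.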

Could this possibly be solved with inverse multiplication? The first thing to note is that 24 == 8 * 3, so
a / 24 == (a >> 3) / 3
Let x = (a >> 3); then the result of the division is x / 3. Now it remains to find the value of x / 3.
Modular arithmetic states that there exists a number n such that n * 3 == 1 (mod 2^128). Since a is divisible by 24, x is an exact multiple of 3, and this gives:
x / 3 = (x * n) / (n * 3) = x * n (mod 2^128)
It remains to find the constant n. There's an explanation on how to do this on wikipedia. You'll also have to implement functionality to multiply two 128-bit numbers.
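A minimal sketch of this idea, assuming the input really is an exact multiple of 24 (the result is meaningless otherwise). The names U128, mul_hi, mul_lo128 and div24_exact are invented for this example. The constant used is n = (2^129 + 1) / 3 = 0xAAAA...AAAB, which satisfies 3 * n == 1 (mod 2^128).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value: hi * 2^64 + lo.
struct U128 {
    std::uint64_t hi;
    std::uint64_t lo;
};

// High 64 bits of the full 128-bit product a * b, built from 32-bit halves.
static std::uint64_t mul_hi(std::uint64_t a, std::uint64_t b) {
    std::uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    std::uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;
    std::uint64_t p0 = a_lo * b_lo;
    std::uint64_t p1 = a_lo * b_hi;
    std::uint64_t p2 = a_hi * b_lo;
    std::uint64_t p3 = a_hi * b_hi;
    std::uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);
    return p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

// Low 128 bits of x * y, i.e. the product modulo 2^128.
static U128 mul_lo128(U128 x, U128 y) {
    U128 r;
    r.lo = x.lo * y.lo;
    r.hi = mul_hi(x.lo, y.lo) + x.hi * y.lo + x.lo * y.hi;  // wraps mod 2^64
    return r;
}

// Exact division by 24 for inputs known to be multiples of 24.
U128 div24_exact(U128 a) {
    assert((a.lo & 7) == 0);  // a multiple of 24 is also a multiple of 8
    // x = a >> 3
    U128 x{a.hi >> 3, (a.lo >> 3) | (a.hi << 61)};
    // n is the inverse of 3 modulo 2^128.
    const U128 n{0xAAAAAAAAAAAAAAAAULL, 0xAAAAAAAAAAAAAAABULL};
    return mul_lo128(x, n);  // equals x / 3 when x is a multiple of 3
}
```

Unlike long division, this variant produces no remainder, so it cannot detect an input that is not divisible by 3.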
Hope this helps.
/A.B.

You shouldn't be using long double for your "normal divisions" but integers there as well. long double doesn't have enough significant figures to get the answer right (and anyway the whole point is to do this with integer operations, correct?).

Since 24 is equal to 11000 in binary, the second summand shouldn't change something in the range of the fourth summand since it is shifted 64 bits to the left.
Your formula is written in real numbers. (A mod 24) / 24 can have an arbitrary number of decimals (1/24 is for instance 0.041666666...) and can therefore interfere with the fourth term in your decomposition, even once multiplied by 2^64.
The property that Y*2^64 does not interfere with the lower weight binary digits in an addition only works when Y is an integer.

Don't.
Go grab a library to do this stuff - you'll be incredibly thankful you chose to when debugging weird errors.
Snippets.org had a C/C++ BigInt library on its site a while ago. Google also turned up the following: http://mattmccutchen.net/bigint/

Related

Why is the result of a bitwise shift unrecoverable if there is a mathematical equivalent of the same operation?

Take for example the number 91. That number in binary is 1011011. If you shift that number to the right by 5 bits, you would get 2 (10 in binary). According to a google search, bit shifting to the left or right by a certain amount of bits is the same as multiplying or dividing the number by 2 to the power of the number of bits to be shifted, respectively. So to get from 91 to 2 by bit shifting, the equation would look like this: 91 / 2^5, which is also 91 / 32. Now, of course if you did that in your calculator, there would be some decimal values, which aren't included when bit shifting. The resulting 2 is actually 2.84375. I'm sure you know that if you do a certain operation on a number and then you do the inverse, the result would be what you had in the first place. So does decimal precision have something to do with this?

There is a mathematical equivalent of shifting to the right... and the mathematical operation is UNRECOVERABLE.
You seem to think that shifting to the right is:
bit shifting to the left or right by a certain amount of bits is the same as multiplying or dividing the number by 2
This is what you will hear people casually say, but it is only half right. It is not the same, only similar.
The correct statement is:
shifting a base-2 number one digit to the right is THE SAME as dividing by two in the integer domain
On an integer calculator, 91/32 gives 2. You will not get ANY decimal places because we are operating in the integer domain.
For real numbers, the equivalent operation is:
FLOOR(91/32)
Which is also unrecoverable because it also results in 2.
The lesson here is to be careful when listening to what people CASUALLY say. Casual speech is often imprecise and assumes the listener is familiar with the subject. You need to dig deeper into what the statement is actually trying to say.
As for why it is unrecoverable? Division of integers gives two results: the quotient (which is the main result) and the remainder. When we divide 91 by 32 we are doing this:
        2
     ____
32 ) 91
     64
     --
     27
So we get the result of 2 and a remainder of 27. The reason you can't get 91 by multiplying 2*32 is because we threw away the remainder.
You can get the result back if you saved the remainder. However, calculating the remainder is not a matter of simple shifts. Here's an example of how to make it reversible in C:
int test () {
    int a = 91;
    int b = 32;
    int result;
    int remainder;
    result = a / b;                  // result will be 2
    remainder = a % b;               // remainder will be 27
    return (result * b) + remainder; // returns 91
}

You can only recover the result of an operation if it has a 1-1 mapping between the inputs and outputs, i.e. it has an inverse function. But not all mathematical functions have an inverse function.
For example, if f(x) = x >> n, where >> is the right-shift operator, then it is equivalent to
f(x) = ⌊x / 2^n⌋
with ⌊ ⌋ being the floor function. Since there are many inputs that lead to the same output, the relationship isn't 1-1 and there can't be an inverse function for it. This function works the same for both signed (arithmetic) and unsigned right shift:
91 >> 5 == floor(91.0/32.0) == 2
-91 >> 5 == floor(-91.0/32.0) == -3
Similarly, for an unsigned left shift function g(x) = x << n, the equivalent is
g(x) = (x * 2^n) mod 2^N
with N being the size in bits of x, because integer math in hardware, C and many other languages always reduces modulo 2^N due to the limit of register size and the use of two's complement. And it's clear that the modulo function also isn't invertible/recoverable. The signed left shift is almost the same with some small modifications.
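To make the "many inputs, one output" point concrete, here is a small sketch for unsigned 32-bit values. The names ShiftResult, shift_right_keep and undo_shift are invented for illustration. For a power-of-two divisor the remainder is just the low n bits, so keeping them alongside the shifted value is enough to undo the shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair: the shifted value plus the bits the shift discarded.
struct ShiftResult {
    std::uint32_t quotient;   // x >> n, i.e. floor(x / 2^n)
    std::uint32_t remainder;  // the low n bits of x, i.e. x mod 2^n
};

// Right shift that also keeps the discarded low bits.
ShiftResult shift_right_keep(std::uint32_t x, unsigned n) {
    assert(n < 32);
    std::uint32_t mask = (std::uint32_t{1} << n) - 1;
    return ShiftResult{x >> n, x & mask};
}

// Rebuilds the original value from the quotient and the saved low bits.
std::uint32_t undo_shift(ShiftResult r, unsigned n) {
    assert(n < 32);
    return (r.quotient << n) | r.remainder;
}
```

Every value from 64 to 95 gives a quotient of 2 when shifted right by 5, which is why the quotient alone cannot identify the input.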

What is the math behind * 1233 >> 12 in this code counting decimal digits

I am a bit confused how this short function from the C++ {fmt} library works.
inline std::uint32_t digits10_clz(std::uint32_t n) {
    std::uint32_t t = (32 - __builtin_clz(n | 1)) * 1233 >> 12;
    return t - (n < powers_of_10_u32[t]) + 1;
}
I understand the logic that you can approximate the log10 using log2(__builtin_clz) and that you need to adjust for exact value, but the multiplication is a mystery to me.

Recall the formula for changing the base of logarithm from b to d is
log_d(x) = log_b(x) / log_b(d)
In our case, b is 2 (binary), and d is 10 (decimal). Hence, you need to divide by log_2(10), which is the same as multiplying by 1/log_2(10), i.e. by 0.30102999566.
Now recall that shifting right by 12 is the same as dividing by 2^12, which is 4096. Dividing 1233 by 4096 yields 0.30102539062, which is a pretty good approximation of the factor 1/log_2(10) in the base change formula.
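The approximation can be tried out with a portable sketch like the one below. It replaces __builtin_clz with a plain loop, and the table kPow10 is an assumption of this example, not the library's actual table: entry t holds 10^t, except entry 0, which is 0 so that n = 0 counts as one digit.

```cpp
#include <cassert>
#include <cstdint>

// Assumed lookup table for this sketch (not taken from the library).
static const std::uint32_t kPow10[10] = {
    0, 10, 100, 1000, 10000, 100000,
    1000000, 10000000, 100000000, 1000000000
};

// Number of significant bits in n (n must be nonzero); stands in for
// 32 - __builtin_clz(n).
unsigned bit_length(std::uint32_t n) {
    assert(n != 0);
    unsigned bits = 0;
    while (n != 0) {
        ++bits;
        n >>= 1;
    }
    return bits;
}

// floor(bits * 1233 / 4096), approximating bits * log10(2).
unsigned approx_log10(unsigned bits) {
    return bits * 1233u >> 12;
}

// Decimal digit count: estimate from the bit length, then correct by one
// if n is below the power of ten the estimate points at.
unsigned digits10(std::uint32_t n) {
    unsigned t = approx_log10(bit_length(n | 1));
    return t - (n < kPow10[t]) + 1;
}
```

The correction step is needed because numbers with the same bit length can have different digit counts, for example 8 and 10 both use 4 bits.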

Modulus operator over int

Is it possible that after performing a modulus (%) of 10^9 + 7 over a number, the number might still be out of range?
I was doing this question on CodeChef http://www.codechef.com/problems/FIRESC and was getting a wrong answer. After looking at the author's solution I changed my final answer type from long long int to int and got a correct answer. Why did that happen?

If you perform multiplications like result = (result * x) % MOD where both result and x can be up to MOD - 1, the intermediate expression result * x can be up to (MOD - 1) squared. And for modulo 10^9 + 7, this surely does not fit into a 32-bit integer type. Thus it is calculated incorrectly: basically, you get not result * x, but the same quantity modulo 2^32.
For example, from a mathematical point of view, (100,001 * 100,001) modulo 10^9 + 7 is 199,931, but when calculated in a 32-bit integer, 100,001 * 100,001 becomes 1,410,265,409, and taking it modulo 10^9 + 7 gives 410,265,402.
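The two outcomes from the example can be reproduced with a short sketch. The names MOD, mul_mod_wrapped and mul_mod are illustrative only, and the code assumes a platform where int is 32 bits wide. Unsigned types are used so that the wrap-around is well defined.

```cpp
#include <cassert>
#include <cstdint>

const std::uint32_t MOD = 1000000007u;  // 10^9 + 7

// Multiplies in 32 bits: the product silently wraps modulo 2^32 before
// the % is applied, so the result is generally wrong.
std::uint32_t mul_mod_wrapped(std::uint32_t a, std::uint32_t b) {
    std::uint32_t product = a * b;
    return product % MOD;
}

// Widens to 64 bits first, so the full product (below 2^60 for reduced
// operands) is available when the % is applied.
std::uint32_t mul_mod(std::uint32_t a, std::uint32_t b) {
    assert(a < MOD && b < MOD);
    std::uint64_t product = static_cast<std::uint64_t>(a) * b;
    return static_cast<std::uint32_t>(product % MOD);
}
```

With a = b = 100001, the first function returns 410265402 and the second returns 199931, matching the numbers above.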

Why perform multiplication in this way?

I've run into this function:
static inline INT32 MPY48SR(INT16 o16, INT32 o32)
{
    UINT32 Temp0;
    INT32 Temp1;
    // A1. get the lower 16 bits of the 32-bit param
    // A2. multiply them with the 16-bit param
    // A3. add 16384 (TODO: why?)
    // A4. bitshift to the right by 15 (TODO: why 15?)
    Temp0 = (((UINT16)o32 * o16) + 0x4000) >> 15;
    // B1. Get the higher 16 bits of the 32-bit param
    // B2. Multiply them with the 16-bit param
    Temp1 = (INT16)(o32 >> 16) * o16;
    // 1. Shift B to the left (TODO: why do this?)
    // 2. Combine with A and return
    return (Temp1 << 1) + Temp0;
}
The inline comments are mine. It seems that all it's doing is multiplying the two arguments. Is this right, or is there more to it? Why would this be done in such a way?

Those parameters don't represent integers. They represent real numbers in fixed-point format with 15 bits to the right of the radix point. For instance, 1.0 is represented by 1 << 15 = 0x8000, 0.5 is 0x4000, -0.5 is 0xC000 (or 0xFFFFC000 in 32 bits).
Adding fixed-point numbers is simple, because you can just add their integer representation. But if you want to multiply, you first have to multiply them as integers, but then you have twice as many bits to the right of the radix point, so you have to discard the excess by shifting. For instance, if you want to multiply 0.5 by itself in 32-bit format, you multiply 0x00004000 (1 << 14) by itself to get 0x10000000 (1 << 28), then shift right by 15 bits to get 0x00002000 (1 << 13). To get better accuracy, when you discard the lowest 15-bits, you want to round to the nearest number, not round down. You can do this by adding 0x4000 = 1 << 14. Then if the discarded 15 bits is less than 0x4000, it gets rounded down, and if it's 0x4000 or more, it gets rounded up.
(0x3FFF + 0x4000) >> 15 = 0x7FFF >> 15 = 0
(0x4000 + 0x4000) >> 15 = 0x8000 >> 15 = 1
To sum up, you can do the multiplication like this:
return (o32 * o16 + 0x4000) >> 15;
But there's a problem. In C++, the result of a multiplication has the same type as its operands. So o16 is promoted to the same size as o32, then they are multiplied to get a 32-bit result. But this throws away the top bits, because the product needs 16 + 32 = 48 bits for accurate representation. One way to do this is to cast the operands to 64 bits and then multiply, but that might be slower, and it's not supported on all machines. So instead it breaks o32 into two 16-bit pieces, then does two multiplications in 32-bits, and combines the results.
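The equivalence between the simple 64-bit form and the split form can be checked with a sketch along these lines. The function names are made up for this example, and the code assumes that right-shifting a negative signed value is an arithmetic shift (true on common compilers, guaranteed only from C++20).

```cpp
#include <cassert>
#include <cstdint>

// Reference version: widen to 64 bits, multiply, add half an output unit
// for rounding, then drop the 15 extra fraction bits.
std::int32_t mul_q15_wide(std::int16_t a, std::int32_t b) {
    std::int64_t product = static_cast<std::int64_t>(a) * b;
    return static_cast<std::int32_t>((product + 0x4000) >> 15);
}

// Split version: two 16x16 -> 32-bit multiplications, as in the question.
std::int32_t mul_q15_split(std::int16_t a, std::int32_t b) {
    std::int32_t low_half = static_cast<std::uint16_t>(b);        // unsigned low 16 bits
    std::int32_t high_half = static_cast<std::int16_t>(b >> 16);  // signed high 16 bits
    // Low part: product carries 15 extra fraction bits, so round and shift.
    std::int32_t low = (low_half * a + 0x4000) >> 15;
    // High part: weight 2^16, so after dividing by 2^15 a factor of 2 remains.
    std::int32_t high = high_half * a;
    // Combine in unsigned arithmetic so the one overflowing corner case wraps.
    std::uint32_t sum = static_cast<std::uint32_t>(high) * 2u
                      + static_cast<std::uint32_t>(low);
    return static_cast<std::int32_t>(sum);
}

// A few spot checks: 0.5 * 0.5 == 0.25 and -0.5 * 0.5 == -0.25 in Q15.
void q15_examples() {
    assert(mul_q15_split(0x4000, 0x4000) == 0x2000);
    assert(mul_q15_split(-0x4000, 0x4000) == -0x2000);
}
```

The split works because the high half times 2^16 is always a multiple of 2^15, so rounding only ever involves the low half.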

This implements multiplication of fixed-point numbers. The numbers are viewed as being in the Q15 format (having 15 bits in the fractional part).
Mathematically, this function calculates (o16 * o32) / 2^15, rounded to nearest integer (hence the 2^14 term, which represents 1/2, added to the number in order to round it). It uses unsigned and signed 16-bit multiplications with 32-bit result, which are presumably supported by the instruction set.
Note that there exists a corner case, where each of the numbers has a minimal value (-2^15 and -2^31); in this case, the result (2^31) is not representable in the output, and gets wrapped over (becomes -2^31 instead). For all other combinations of o16 and o32, the result is correct.

Bit shifts with ABAP

I'm trying to port some Java code, which requires arithmetic and logical bit shifts, to ABAP.
As far as I know, ABAP only supports the bitwise NOT, AND, OR and XOR operations.
Does anyone know another way to implement these kind of shifts with ABAP? Is there perhaps a way to get the same result as the shifts, by using just the NOT, AND, OR and XOR operations?

Disclaimer: I am not specifically familiar with ABAP, hence this answer is given on a more general level.
Assuming that what you said is true (ABAP doesn't support shifts, which I somewhat doubt), you can use multiplications and divisions instead.
Logical shift left (LSHL)
Can be expressed in terms of multiplication:
x LSHL n = x * 2^n
For example given x=9, n=2:
9 LSHL 2 = 9 * 2^2 = 36
Logical shift right (LSHR)
Can be expressed with (truncating) division:
x LSHR n = x / 2^n
Given x=9, n=2:
9 LSHR 2 = 9 / 2^2 = 2.25 -> 2 (truncation)
Arithmetic shift left (here: "ASHL")
If you wish to perform arithmetic shifts (=preserve sign), we need to further refine the expressions to preserve the sign bit.
Assuming we know that we are dealing with a 32-bit signed integer, where the highest bit is used to represent the sign:
x ASHL n = ((x AND (2^31-1)) * 2^n) + (x AND 2^31)
Example: Shifting Integer.MAX_VALUE to left by one in Java
As an example of how this works, let us consider that we want to shift Java's Integer.MAX_VALUE to left by one. Logical shift left can be represented as *2. Consider the following program:
int maxval = (int)(Integer.MAX_VALUE);
System.out.println("max value : 0" + Integer.toBinaryString(maxval));
System.out.println("sign bit : " + Integer.toBinaryString(maxval+1));
System.out.println("max val<<1: " + Integer.toBinaryString(maxval<<1));
System.out.println("max val*2 : " + Integer.toBinaryString(maxval*2));
The program's output:
max value : 01111111111111111111111111111111 (2147483647)
sign bit : 10000000000000000000000000000000 (-2147483648)
max val<<1: 11111111111111111111111111111110 (-2)
max val*2 : 11111111111111111111111111111110 (-2)
The result is negative because the highest bit of an integer is used to represent the sign. We get exactly -2 because of the way negative numbers are represented in Java (for details, see for instance http://www.javabeat.net/qna/30-negative-numbers-and-binary-representation-in/).
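The multiplication and division identities above can be sketched in C++ (not ABAP) for 32-bit values. The helper names are invented for this example, and no shift operator is used anywhere. For the arithmetic right shift, which the text above does not spell out, this sketch assumes the usual definition as floor division by 2^n.

```cpp
#include <cassert>
#include <cstdint>

// 2^n computed by repeated multiplication, for 0 <= n < 32.
std::uint32_t pow2(unsigned n) {
    assert(n < 32);
    std::uint32_t p = 1;
    for (unsigned i = 0; i < n; ++i) {
        p *= 2;
    }
    return p;
}

// Logical shift left: multiply, letting the result wrap modulo 2^32.
std::uint32_t lshl(std::uint32_t x, unsigned n) {
    return x * pow2(n);
}

// Logical shift right: truncating division of the unsigned value.
std::uint32_t lshr(std::uint32_t x, unsigned n) {
    return x / pow2(n);
}

// Arithmetic shift right: division rounded toward negative infinity.
std::int32_t ashr(std::int32_t x, unsigned n) {
    std::int64_t d = pow2(n);
    std::int64_t q = x / d;  // C++ division truncates toward zero
    if (x % d != 0 && x < 0) {
        --q;                 // adjust truncation to floor for negatives
    }
    return static_cast<std::int32_t>(q);
}
```

A port to another language would need to reproduce the modulo 2^32 wrap of lshl explicitly if its integers do not wrap on overflow.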

Edit: the updated code can now be found over here: github gist
