Optimising code for modular arithmetic


I am trying to calculate the expression N!/((N/2)!)^2 for large N.
Since the value of this expression will be very large, I just need its value modulo some prime number. Suppose the value of this expression is x and I choose the prime number 1000000007; I'm looking for x % 1000000007.
Here is my code:
    #include<iostream>
    #define MOD 1000000007
    using namespace std;
    int main()
    {
        unsigned long long A[1001];
        A[2]=2;
        for(int i=4;i<=1000;i+=2)
        {
            A[i]=((4*A[i-2])/i)%MOD;
            A[i]=(A[i]*(i-1))%MOD;
        }
        while(1)
        {
            int N;
            cin>>N;
            cout<<A[N];
        }
    }
But even this much optimisation is failing for large values of N. For example if N is 50, the correct output is 605552882, but this gives me 132924730. How can I optimise it further to get the correct output?
Note : I am only considering N as even.

When you do modular arithmetic, there is no such operation as division. Instead, you take the modular inverse of the denominator and multiply. The modular inverse is computed using the extended Euclidean algorithm, discovered by Etienne Bezout in 1779:
    # return y such that x * y == 1 (mod m)
    function inverse(x, m)
        a, b, u := 0, m, 1
        while x > 0
            q, r := divide(b, x)
            x, a, b, u := b % x, u, x, a - q * u
        if b == 1 return a % m
        error "must be coprime"
The divide function returns both quotient and remainder. All of the assignment operators given above are simultaneous assignment, where all of the right hand sides are computed first, then all of the left hand sides are assigned simultaneously. You can see more about modular arithmetic at my blog.
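As a rough C++ sketch of the same idea applied to the question's recurrence: the division by i is replaced with a multiplication by the modular inverse of i. The names mod_inverse and central_binomial_mod are made up for this illustration, and the code assumes N is even and at least 2, as in the question.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t MOD = 1000000007;  // the prime modulus from the question

// Modular inverse of x modulo m via the extended Euclidean algorithm.
// Assumes x > 0 and gcd(x, m) == 1. (Hypothetical helper name.)
std::int64_t mod_inverse(std::int64_t x, std::int64_t m) {
    std::int64_t r0 = m, r1 = x % m;  // remainders
    std::int64_t t0 = 0, t1 = 1;      // invariant: r_i == t_i * x (mod m)
    while (r1 != 0) {
        const std::int64_t q = r0 / r1;
        const std::int64_t r2 = r0 - q * r1;
        r0 = r1;
        r1 = r2;
        const std::int64_t t2 = t0 - q * t1;
        t0 = t1;
        t1 = t2;
    }
    assert(r0 == 1);            // otherwise x and m are not coprime
    return ((t0 % m) + m) % m;  // bring the coefficient into [0, m)
}

// N! / ((N/2)!)^2 modulo MOD for even N >= 2, using the recurrence
// A[i] = A[i-2] * 4 * (i-1) / i, with "/ i" done as "* inverse(i)".
std::int64_t central_binomial_mod(int n) {
    assert(n >= 2 && n % 2 == 0);
    std::int64_t a = 2;  // value for N = 2
    for (int i = 4; i <= n; i += 2) {
        a = a * 4 % MOD;
        a = a * (i - 1) % MOD;
        a = a * mod_inverse(i, MOD) % MOD;
    }
    return a;
}
```

With this change central_binomial_mod(50) gives 605552882, the value the question expects. Every intermediate product stays below 2^63 because both factors are smaller than MOD.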

For starters, no modular division is needed at all; your formula can be rewritten as follows:
N!/((N/2)!^2)
=(1.2.3...N)/((1.2.3...N/2)*(1.2.3...N/2))
=((N/2+1)...N)/(1.2.3...N/2)
OK, now you are dividing a bigger number by a smaller one,
so you can build up the result by multiplying the dividend and the divisor step by step,
so that both sub-results have similar magnitude.
Any time both numbers are divisible by 2, shift them right;
this keeps them from overflowing.
Once you reach the end of (N/2)!, continue the multiplication only for the rest.
Any time both sub-results are divisible by a common factor, divide both by it
until you are left with division by 1.
After this you can multiply with modular arithmetic till the end as usual.
For a more advanced approach, see this.
N! and (N/2)! can be decomposed much further than it seems at first look.
I solved that some time ago...
here is what I found: Fast exact bigint factorial
In short, your terms N! and ((N/2)!)^2 disappear completely;
only a simple prime decomposition + the 4N <-> 1N correction remains.
solution:
I. (4N!)=((2N!)^2) . mul(i=all primes<=4N) of [i^sum(j=1,2,3,4,5,...4N>=i^j) of [(4N/(i^j))%2]]
II. (4N)!/((4N/2)!^2) = (4N)!/((2N)!^2)
----------------------------------------
I.=II. (4N)!/((2N)!^2)=mul(i=all primes<=4N) of [i^sum(j=1,2,3,4,5,...4N>=i^j) of [(4N/(i^j))%2]]
The only restriction is that N must be divisible by 4 ... hence the 4N in all terms.
If you have N%4!=0, then solve for N-N%4 and correct the result by the missing 1-3 numbers.
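A minimal C++ sketch of the prime-exponent idea, under my reading of the formula above: for even n, the exponent of a prime p in n!/((n/2)!)^2 is the sum over j of floor(n/p^j) % 2 (this follows from Legendre's formula for the exponent of p in a factorial). The function names are hypothetical, and the sketch assumes the modulus m is below 2^32 so the products fit in 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// b^e mod m by square-and-multiply (hypothetical helper; assumes m < 2^32).
std::uint64_t pow_mod(std::uint64_t b, std::uint64_t e, std::uint64_t m) {
    std::uint64_t result = 1 % m;
    b %= m;
    while (e > 0) {
        if (e & 1) result = result * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return result;
}

// n! / ((n/2)!)^2 mod m for even n, built only from prime powers:
// no factorial is ever formed and no division is performed.
std::uint64_t central_binomial_by_primes(int n, std::uint64_t m) {
    assert(n >= 0 && n % 2 == 0);
    std::vector<bool> composite(n + 1, false);  // sieve of Eratosthenes
    std::uint64_t result = 1 % m;
    for (int p = 2; p <= n; ++p) {
        if (composite[p]) continue;
        for (long long q = 1LL * p * p; q <= n; q += p) composite[q] = true;
        // exponent of p = sum over j of floor(n / p^j) % 2
        std::uint64_t exponent = 0;
        for (long long pj = p; pj <= n; pj *= p) exponent += (n / pj) % 2;
        result = result * pow_mod(p, exponent, m) % m;
    }
    return result;
}
```

For n = 50 and m = 1000000007 this returns 605552882, matching the expected value in the question.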
hope it helps

Related

Correctness of multiplication with overflow detection

The following C++ template detects overflows from multiplying two unsigned integers.
    template<typename UInt> UInt safe_multiply(UInt a, UInt b) {
        UInt x = a * b; // x := ab mod n, for n := 2^#bits > 0
        if (a != 0 && x / a != b)
            cerr << "Overflow for " << a << " * " << b << "." << endl;
        return x;
    }
Can you give a proof that this algorithm detects every potential overflow, regardless of how many bits UInt uses?
The case a = 0 cannot result in overflows, so we can consider a > 0.
It seems that the correctness proof boils down to leading [equation image missing] to a contradiction, since x / a actually means ⌊x/a⌋.
When assuming [equation image missing], this leads to the straightforward consequence [equation image missing], thus [equation image missing], which contradicts n > 0.
So it remains to show [equation image missing], or there must be another way.
If the last equation is true, WolframAlpha fails to confirm that (also with exponents).
However, it asserts that the original assumptions have no integer solutions, so the algorithm seems to be correct indeed.
But it doesn't provide an explanation. So why is it correct?
I am looking for the smallest possible explanation that is still mathematically profound, ideally that it fits in a single-line comment. Maybe I am missing something trivial, or the problem is not as easy as it looks.
On a side note, I used Codecogs Equation Editor for the LaTeX markup images, which apparently looks bad in dark mode, so consider switching to light mode or, if you know, please tell me how to use different images depending on the client settings. It is just \bg{white} vs. \bg{black} as part of the image URLs.

To be clear, I'll use the multiplication and division symbols (*, /) mathematically.
Also, for convenience let's name the set N = {0, 1, ..., n - 1}.
Let's clear up what unsigned multiplication is:
Unsigned multiplication for some magnitude, n, is a modular n operation on unsigned-n inputs (inputs that are in N) that results in an unsigned-n output (ie. also in N).
In other words, the result of unsigned multiplication, x, is x = a*b (mod n), and, additionally, we know that x,a,b are in N.
It's important to be able to expand many modular formulas like so: x = a*b - k*n, where k is an integer - but in our case x,a,b are in N so this implies that k is in N.
Now, let's mathematically say what we're trying to prove:
Given positive integers, a,n, and non-negative integers x,b, where x,a,b are in N, and x = a*b (mod n), then a*b >= n (overflow) implies floor(x/a) != b.
Proof:
If overflow (a*b >= n) then x = a*b - k*n >= n - k*n = (1 - k)*n (for k in N),
As x < n then (1 - k)*n < n, so k > 0.
This means x <= a*b - n.
So, floor(x/a) <= floor([a*b - n]/a) = floor(b - n/a) = b - ceil(n/a) <= b - 1 (a is in N, so a < n and n/a > 1),
Abbreviated: floor(x/a) <= b - 1
Therefore floor(x/a) != b
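As a sanity check alongside the proof (not a replacement for it), here is a small C++ sketch that tries every pair of 8-bit operands and compares the division test with the true answer. The names overflow_detected and check_all_8bit_pairs are invented for this illustration; the product is computed in unsigned long long and then truncated, which gives the same value as a wrapping UInt multiplication while avoiding promotion of narrow types to signed int.

```cpp
#include <cassert>
#include <cstdint>

// The same test as in safe_multiply, returning the verdict instead of logging.
template <typename UInt>
bool overflow_detected(UInt a, UInt b) {
    // Truncating the wide unsigned product equals a*b mod 2^#bits.
    const UInt x = static_cast<UInt>(static_cast<unsigned long long>(a) * b);
    return a != 0 && x / a != b;
}

// Exhaustively compare the test against the exact product for n = 2^8.
bool check_all_8bit_pairs() {
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            const bool really_overflows = a * b > 255;  // exact in unsigned int
            const bool detected = overflow_detected<std::uint8_t>(
                static_cast<std::uint8_t>(a), static_cast<std::uint8_t>(b));
            if (really_overflows != detected) return false;
        }
    }
    return true;
}
```

The exhaustive loop only covers 8 bits; for wider types the proof above is what carries the argument.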

The multiplication gives either the mathematically correct result, or it is off by some multiple of 2^64. Since you check for a=0, the division always gives the correct result for its input. But in the case of overflow, the input is off by 2^64 or more, so the test will fail as you hoped.
The last bit is that unsigned operations don’t have undefined behaviour except for division by zero, so your code is fine.

Why is the result of a bitwise shift unrecoverable if there is a mathematical equivalent of the same operation?

Take for example the number 91. That number in binary is 1011011. If you shift that number to the right by 5 bits, you would get 2 (10 in binary). According to a Google search, bit shifting to the left or right by a certain amount of bits is the same as multiplying or dividing the number by 2 to the power of the number of bits to be shifted, respectively. So to get from 91 to 2 by bit shifting, the equation would look like this: 91 / 2^5, which is also 91 / 32. Now, of course, if you did that on your calculator, there would be some decimal places, which aren't included when bit shifting. The exact result is 2.84375, not 2. I'm sure you know that if you do a certain operation on a number and then you do the inverse, the result would be what you had in the first place. So does decimal precision have something to do with this?

There is a mathematical equivalent of shifting to the right... and the mathematical operation is UNRECOVERABLE.
You seem to think that shifting to the right is:
bit shifting to the left or right by a certain amount of bits is the same as multiplying or dividing the number by 2
This is what you will hear people casually say, but it is only half right. It is not the same, only similar.
The correct statement is:
shifting a base-2 number one digit to the right is THE SAME as dividing by two in the integer domain
If you have an integer calculator, if you did 91/32 you will get 2. You will not get ANY decimal point because we are operating in the integer domain.
For real numbers, the equivalent operation is:
FLOOR(91/32)
Which is also unrecoverable because it also results in 2.
The lesson here is be careful when listening to what people CASUALLY say. Casual speech is often imprecise and assumes the listener is familiar with the subject. You need to dig deeper what the statement is actually trying to say.
As for why it is unrecoverable? Division of integers gives two results: the quotient (which is the main result) and the remainder. When we divide 91 by 32 we are doing this:
           2
         ___
    32 ) 91
         64
         --
         27
So we get the result of 2 and a remainder of 27. The reason you can't get 91 by multiplying 2*32 is because we threw away the remainder.
You can get the result back if you saved the remainder. However, calculating the remainder is not a matter of simple shifts. Here's an example of how to make it reversible in C:
    int test () {
        int a = 91;
        int b = 32;
        int result;
        int remainder;
        result = a / b; // result will be 2
        remainder = a % b; // remainder will be 27
        return (result * b) + remainder; // returns 91
    }
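For a power-of-two divisor specifically, the remainder is just the low bits that the shift pushes out, so it can be kept with a bit mask. A small C++ sketch of that variant (the struct and function names are my own, and it assumes unsigned 32-bit values with a shift count below 32):

```cpp
#include <cassert>
#include <cstdint>

struct ShiftParts {
    std::uint32_t quotient;   // x >> n, i.e. floor(x / 2^n)
    std::uint32_t remainder;  // the n low bits that the shift discards
};

// Right shift that also keeps the discarded bits. Assumes n < 32.
ShiftParts shift_keep_remainder(std::uint32_t x, unsigned n) {
    assert(n < 32);
    const std::uint32_t mask = (std::uint32_t{1} << n) - 1;
    return {x >> n, x & mask};
}

// Inverse operation: only possible because the remainder was saved.
std::uint32_t undo_shift(ShiftParts parts, unsigned n) {
    assert(n < 32);
    return (parts.quotient << n) | parts.remainder;
}
```

For 91 and a shift of 5 this yields quotient 2 and remainder 27, and undo_shift puts them back together into 91.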

You can only recover the result of an operation if it has a 1-1 mapping between the inputs and outputs, i.e. it has an inverse function. But not all mathematical functions have an inverse function
For example if f(x) = x >> n with >> is the shift operator then it'll be equivalent to
f(x) = ⌊x/2^n⌋
with ⌊ ⌋ being the floor function. Since there are many inputs that lead to the same output, the relationship isn't 1-1 and there can't be an inverse function for it. This function works the same for both signed and unsigned right shift:
91 >> 5 == floor(91.0/32.0) == 2
-91 >> 5 == floor(-91.0/32.0) == -3
Similarly for an unsigned left shift function g(x) = x << n then the equivalent is
g(x) = (x * 2^n) mod 2^N
with N being the size in bits of x, because integer math in hardware, C and many other languages always reduces modulo 2^N due to the limit of register size and the use of two's complement. And it's clear that the modulo function also isn't invertible/recoverable. The signed left shift is almost the same with some small modifications.

Modular Exponentiation over a Power of 2

So I've been doing some work recently with the modpow function. One of the forms I needed was modular exponentiation when the modulus is a power of 2. So I got the code up and running. Great, no problems. Then I read that one trick to make it faster is, instead of using the exponent as given, to take it modulo the totient of the modulus.
Now when the modulus is a power of two, the totient is simply the next lower power of 2. Well, that's simple enough. So I coded it, and it worked..... sometimes.
For some reason there are some values that aren't working, and I just can't figure out what it is.
    uint32 modpow2x(uint32 B, uint32 X, uint32 M)
    {
        uint32 D;
        M--;
        B &= M;
        X &= (M >> 1);
        D = 1;
        if ((X & 1) == 1)
        {
            D = B;
        }
        while ((X >>= 1) != 0)
        {
            B = (B * B) & M;
            if ((X & 1) == 1)
            {
                D = (D * B) & M;
            }
        }
        return D;
    }
And this is one set of numbers that it doesn't work for.
Base = 593803430
Exponent = 3448538912
Modulus = 8
And no, there is no check in this function to determine if the modulus is a power of 2. The reason is that this is an internal function and I already know that only powers of 2 will be passed to it. However, I have already double checked to make sure that no non-powers of 2 are getting through.
Thanks for any help you guys can give!

It's true that if x is relatively prime to n (x and n have no common factors), then x^a = x^(a mod phi(n)) (mod n), where phi is Euler's totient function. That's because then x belongs to the multiplicative group of (Z/nZ), which has order phi(n).
But, for x not relatively prime to n, this is no longer true. In your example, the base does have a common factor with your modulus, namely 2. So the trick will not work here. If you wanted to, though, you could write some extra code to deal with this case -- maybe find the largest power of 2 that x is divisible by, say 2^k. Then divide x by 2^k, run your original code, shift its output left by k*e, where e is your exponent, and reduce modulo M. Of course, if k isn't zero, this would usually result in an answer of zero.
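Here is one way the workaround described above might look in C++. It is only a sketch: the function name modpow2x_any_base is made up, std::uint32_t stands in for the question's uint32, and M is assumed to be a power of two no larger than 2^31.

```cpp
#include <cassert>
#include <cstdint>

// base^exp mod M for M a power of two, also correct for even bases.
std::uint32_t modpow2x_any_base(std::uint32_t base, std::uint32_t exp,
                                std::uint32_t M) {
    assert(M != 0 && (M & (M - 1)) == 0);  // power of two
    if (M == 1) return 0;
    if (exp == 0) return 1;
    const std::uint32_t mask = M - 1;
    base &= mask;
    if (base == 0) return 0;

    unsigned bits = 0;  // log2(M)
    while ((M >> bits) > 1) ++bits;

    // Split base = odd * 2^k.
    unsigned k = 0;
    while ((base & 1u) == 0) {
        base >>= 1;
        ++k;
    }

    // base^exp = odd^exp * 2^(k*exp); the power of two alone may reach M.
    const std::uint64_t shift = static_cast<std::uint64_t>(k) * exp;
    if (shift >= bits) return 0;

    // The odd part is coprime to M, so its exponent may be reduced
    // modulo phi(M) = M / 2.
    std::uint32_t e = exp & (mask >> 1);
    std::uint32_t result = 1;
    while (e != 0) {
        if (e & 1u) result = (result * base) & mask;
        base = (base * base) & mask;
        e >>= 1;
    }
    return (result << shift) & mask;
}
```

For the failing example (base 593803430, exponent 3448538912, modulus 8) the base is even and the exponent is large, so the early return gives 0, whereas the code in the question returns 1 because the reduced exponent is 0.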

Calculating Probability C++ Bernoulli Trials

The program asks the user for the number of times to flip a coin (n; the number of trials).
A success is considered a heads.
This part works flawlessly: the program generates a random number, either 0 or 1, and a 0 counts as heads (a success).
Then, the program is supposed to output the expected values of getting x amount of heads. For example if the coin was flipped 4 times, what are the following probabilities using the formula
nCk * p^k * (1-p)^(n-k)
Expected 0 heads with n flips: xxx
Expected 1 heads with n flips: xxx
...
Expected n heads with n flips: xxx
When doing this with "larger" numbers, the numbers come out to weird values. It happens if 15 or 20 is put into the input. I have been getting 0's and negative values for the value that should be xxx.
While debugging, I noticed that nCk comes out negative and incorrect towards the upper values, and I believe this is the issue. I use this formula for my combination:
    double combo = fact(n)/fact(r)/fact(n-r);
Here is the pseudocode for my fact function:
    long fact(int x)
    {
        int e; // local counter
        factor = 1;
        for (e = x; e != 0; e--)
        {
            factor = factor * e;
        }
        return factor;
    }
Any thoughts? My guess is my factorial or combo functions are exceeding the max values or something.

You haven't mentioned how factor is declared. I think you are getting integer overflows. I suggest you use double: since you are calculating expected values and probabilities, a small loss of precision shouldn't matter much.
Try changing your fact function to:
    double fact(double x)
    {
        int e; // local counter
        double factor = 1;
        for (e = x; e != 0; e--)
        {
            factor = factor * e;
        }
        return factor;
    }
EDIT:
Also to calculate nCk, you need not calculate factorials 3 times. You can simply calculate this value in the following way.
if k > n/2, k = n-k.

           n(n-1)(n-2)...(n-k+1)
    nCk = -----------------------
               factorial(k)
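A short C++ sketch of that formula, interleaving the multiplications and divisions so no full factorial is ever formed. The function names choose and binomial_pmf are hypothetical, and the use of double follows the suggestion above.

```cpp
#include <cassert>
#include <cmath>

// nCk computed as n(n-1)...(n-k+1) / k!, one factor at a time.
double choose(int n, int k) {
    assert(0 <= k && k <= n);
    if (k > n / 2) k = n - k;  // symmetry: nCk == nC(n-k)
    double result = 1.0;
    for (int i = 1; i <= k; ++i) {
        // After step i, result holds C(n-k+i, i), which is an integer,
        // so the division is exact while values stay below 2^53.
        result = result * (n - k + i) / i;
    }
    return result;
}

// Probability of exactly k successes in n trials: nCk * p^k * (1-p)^(n-k).
double binomial_pmf(int n, int k, double p) {
    return choose(n, k) * std::pow(p, k) * std::pow(1.0 - p, n - k);
}
```

For 4 flips of a fair coin, binomial_pmf(4, 2, 0.5) evaluates to 0.375, and choose(20, 10) is 184756 with no overflow along the way.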

You're exceeding the maximum value of a long. Factorial grows so quickly that you need the right type of number--what type that is will depend on what values you need.
Long is a signed integer. If it is 32 bits wide on your platform, the value will typically become negative as soon as you pass 2^31 - 1 (it's using 2's complement math). 13! = 6227020800 is already past that limit.
Using an unsigned long will buy you a little time (one more bit), but for factorial, it's probably not worth it. If your compiler supports long long, then try an "unsigned long long". That will (usually, depends on compiler and CPU) double the number of bits you're using.
You can also try switching to use double. The problem you'll face there is that you'll lose accuracy as the numbers increase. A double is a floating point number, so you'll have a fixed number of significant digits. If your end result is an approximation, this may work okay, but if you need exact values, it won't work.
If none of these solutions will work for you, you may need to resort to using an "infinite precision" (arbitrary-precision) math package, which you should be able to search for. You didn't say if you were using C or C++; this is going to be a lot more pleasant with C++, as a library can provide a class that acts like a number and supports the standard arithmetic operators.

How does the noise function actually work?

I've looked into the libnoise sources and found the ValueNoise3D function:
    double noise::ValueNoise3D (int x, int y, int z, int seed)
    {
        return 1.0 - ((double)IntValueNoise3D (x, y, z, seed) / 1073741824.0);
    }
    int noise::IntValueNoise3D (int x, int y, int z, int seed)
    {
        // All constants are primes and must remain prime in order for this noise
        // function to work correctly.
        int n = (
            X_NOISE_GEN * x
            + Y_NOISE_GEN * y
            + Z_NOISE_GEN * z
            + SEED_NOISE_GEN * seed)
            & 0x7fffffff;
        n = (n >> 13) ^ n;
        return (n * (n * n * 60493 + 19990303) + 1376312589) & 0x7fffffff;
    }
But when I look at this, it seems like magic to me. How does this actually work? I mean, why did the person who wrote this pick those prime numbers instead of others? Why such equations? How did they decide to use those equations instead of others? Just... how to understand this?

The libnoise Web site has a good explanation of the mathematics behind this noise function. In particular, with regards to the prime numbers:
These large integers are primes. These integers may be modified as long as they remain prime; non-prime numbers may introduce discernible patterns to the output.
noise::IntValueNoise3D actually operates in two steps: the first step converts the (x, y, z) coordinates to a single integer, and the second step puts this integer through an integer noise function. In the code shown, the final & 0x7fffffff clears the sign bit, so the returned value lies between 0 and 2147483647. noise::ValueNoise3D then converts that integer to a floating-point value between -1 and 1 by computing 1.0 - n / 1073741824.0.
As for why noise::IntValueNoise3D performs all those convoluted operations, it basically boils down to the fact that this particular sequence of operations produces a nice, noisy result with no clear pattern visible. This is not the only sequence of operations that could have been used; anything that produces a sufficiently noisy result would have worked.
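To make the two steps concrete, here is a C++ sketch of the same structure. It is not the libnoise source: the function names are invented, the four coordinate multipliers are placeholder primes chosen for this illustration (the question does not show the values behind X_NOISE_GEN and friends), and the arithmetic is done in unsigned 32-bit integers so that wrap-around is well defined.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder primes (an assumption of this sketch, not taken from the question).
constexpr std::uint32_t X_PRIME = 1619;
constexpr std::uint32_t Y_PRIME = 31337;
constexpr std::uint32_t Z_PRIME = 6971;
constexpr std::uint32_t SEED_PRIME = 1013;

std::uint32_t int_value_noise_3d(int x, int y, int z, int seed) {
    // Step 1: fold the coordinates and the seed into one 31-bit integer.
    std::uint32_t n = (X_PRIME * static_cast<std::uint32_t>(x)
                     + Y_PRIME * static_cast<std::uint32_t>(y)
                     + Z_PRIME * static_cast<std::uint32_t>(z)
                     + SEED_PRIME * static_cast<std::uint32_t>(seed))
                     & 0x7fffffffu;
    // Step 2: scramble the bits so neighbouring inputs give unrelated outputs.
    n = (n >> 13) ^ n;
    return (n * (n * n * 60493u + 19990303u) + 1376312589u) & 0x7fffffffu;
}

// Map the 31-bit integer onto the floating-point range (-1, 1].
double value_noise_3d(int x, int y, int z, int seed) {
    return 1.0 - static_cast<double>(int_value_noise_3d(x, y, z, seed))
                     / 1073741824.0;
}
```

The function is deterministic: the same (x, y, z, seed) always gives the same value, which is what lets a noise library rebuild the same terrain or texture from a seed.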

There is an art to randomness. There are many things that make a pseudorandom number "look good." For a lot of 3D functions, the most important thing in making it "look good" is having a proper-looking frequency distribution. Anything which ensures a good frequency distribution modulo 2^32 will yield very good-looking numbers. Multiplying by a large prime number yields good frequency distributions because it shares no factors with 2^32.
