Inverse of A*X MOD (2^N)-1 - c++

Given a function y = f(A,X):
unsigned long F(unsigned long A, unsigned long x) {
return ((unsigned long long)A*X)%4294967295;
}
How would I find the inverse function x = g(A,y) such that x = g(A, f(A,x)) for all values of 'x'?
If f() isn't invertible for all values of 'x', what's the closest to an inverse?
(F is an obsolete PRNG, and I'm trying to understand how one inverts such a function).
Updated
If A is relatively prime to (2^N)-1, then g(A,Y) is just f(A-1, y).
If A isn't relatively prime, then the range of y is constrained...
Does g( ) still exist if restricted to that range?

You need the Extended Euclidean algorithm. This gives you R and S such that
gcd(A,2^N-1) = R * A + S * (2^N-1).
If the gcd is 1 then R is the multiplicative inverse of A. Then the inverse function is
g(A,y) = R*y mod (2^N-1).
Ok, here is an update for the case where the G = Gcd(A, 2^N-1) is not 1. In that case
R*y mod (2^N-1) = R*A*x mod (2^N-1) = G*x mod (2^N-1).
If y was computed by the function f then y is divisible by G. Hence we can divide the equation above by G and get an equation modulo (2^N-1)/G. Thus the set of solutions is
x = R*y/G + k*(2^N-1)/G, where k is an arbitrary integer.

The solution is given here (http://en.wikipedia.org/wiki/Linear_congruence_theorem), and includes a demonstration of how the extended Euclidean algorithm is used to find the solutions.
The modulus function in general does not have an inverse function, but you can sometimes find a set of x's that map to the given y.

Accipitridae, Glenn, and Jeff Moser have the answer between them, but it's worth explaining a little more why not every number has an inverse under mod 4294967295. The reason is that 4294967295 is not a prime number; it is the product of five factors: 3 x 5 x 17 x 257 x 65537. A number x has a mutiplicative inverse under mod m if and only if x and m are coprime, so any number that is a multiple of those factors cannot have an inverse in your function.
This is why the modulus chosen for such PRNGs is usually prime.

You need to compute the inverse of A mod ((2^N) - 1), but you might not always have an inverse given your modulus. See this on Wolfram Alpha.
Note that
A = 12343 has an inverse (A^-1 = 876879007 mod 4294967295)
but 12345 does not have an inverse.
Thus, if A is relatively prime with (2^n)-1, then you can easily create an inverse function using the Extended Euclidean Algorithm where
g(A,y) = F(A^-1, y),
otherwise you're out of luck.
UPDATE: In response to your updated question, you still can't get a unique inverse in the restricted range. Even if you use CookieOfFortune's brute force solution, you'll have problems like
G(12345, F(12345, 4294967294)) == 286331152 != 4294967294.

Eh... here's one that will work:
unsigned long G(unsigned long A, unsigned long y)
{
for(unsigned int i = 0; i < 4294967295; i++)
{
if(y == F(A, i)) return i);
}
}

Related

How can you calculate a factor if you have the other factor and the product with overflows?

a * x = b
I have a seemingly rather complicated multiplication / imul problem: if I have a and I have b, how can I calculate x if they're all 32-bit dwords (e.g. 0-1 = FFFFFFFF, FFFFFFFF+1 = 0)?
For example:
0xcb9102df * x = 0x4d243a5d
In that case, x is 0x1908c643. I found a similar question but the premises were different and I'm hoping there's a simpler solution than those given.
Numbers have a modular multiplicative inverse modulo a power of two precisely iff they are odd. Everything else is a bit-shifted odd number (even zero, which might be anything, with all bits shifted out). So there are a couple of cases:
Given a * x = b
tzcnt(a) > tzcnt(b) no solution
tzcnt(a) <= tzcnt(b) solvable, with 2tzcnt(a) solutions
The second case has a special case with 1 solution, for odd a, namely x = inverse(a) * b
More generally, x = inverse(a >> tzcnt(a)) * (b >> tzcnt(a)) is a solution, because you write a as (a >> tzcnt(a)) * (1 << tzcnt(a)), so we cancel the left factor with its inverse, we leave the right factor as part of the result (cannot be cancelled anyway) and then multiply by what remains to get it up to b. Still only works in the second case, obviously. If you wanted, you could enumerate all solutions by filling in all possibilities for the top tzcnt(a) bits.
The only thing that remains is getting the inverse, you've probably seen it in the other answer, whatever it was, but for completeness you can compute it as follows: (not tested)
; input x
dword y = (x * x) + x - 1;
dword t = y * x;
y *= 2 - t;
t = y * x;
y *= 2 - t;
t = y * x;
y *= 2 - t;
; result y

Linear Programming - variable that equals the sign of an expression

I am trying to write a linear program and need a variable z that equals the sign of x-c, where x is another variable, and c is a constant.
I considered z = (x-c)/|x-c|. Unfortunately, if x=c, then this creates division by 0.
I cannot use z=x-c, because I don't want to weight it by the magnitude of the difference between x and c.
Does anyone know of a good way to express z so that it is the sign of x-c?
Thank you for any help and suggestions!
You can't model z = sign(x-c) exactly with a linear program (because the constraints in an LP are restricted to linear combinations of variables).
However, you can model sign if you are willing to convert your linear program into a mixed integer program, you can model this with the following two constraints:
L*b <= x - c <= U*(1-b)
z = 1 - 2*b
Where b is a binary variable, and L and U are lower and upper bounds on the quantity x-c. If b = 0, we have 0 <= x - c <= U and z = 1. If b = 1, we have L <= x - c <= 0 and z = 1 - 2*1 = -1.
You can use a solver like Gurobi to solve mixed integer programs.
For k » 1 this is a smooth approximation of the sign function:
Also
when ε → 0
These two approximations haven't the division by 0 issue but now you must tune a parameter.
In some languages (e.g. C++ / C) you can simply write something like this:
double sgn(double x)
{
return (x > 0.0) - (x < 0.0);
}
Anyway consider that many environments / languages already have a sign function, e.g.
Sign[x] in Mathematica
sign(x) in Matlab
Math.signum(x) in Java
sign(1, x) in Fortran
sign(x) in R
Pay close attention to what happens when x is equal to 0 (e.g. the Fortran function will return 1, with other languages you'll get 0).

How to calculate (n!)%1000000009

I need to find n!%1000000009.
n is of type 2^k for k in range 1 to 20.
The function I'm using is:
#define llu unsigned long long
#define MOD 1000000009
llu mulmod(llu a,llu b) // This function calculates (a*b)%MOD caring about overflows
{
llu x=0,y=a%MOD;
while(b > 0)
{
if(b%2 == 1)
{
x = (x+y)%MOD;
}
y = (y*2)%MOD;
b /= 2;
}
return (x%MOD);
}
llu fun(int n) // This function returns answer to my query ie. n!%MOD
{
llu ans=1;
for(int j=1; j<=n; j++)
{
ans=mulmod(ans,j);
}
return ans;
}
My demand is such that I need to call the function 'fun', n/2 times. My code runs too slow for values of k around 15. Is there a way to go faster?
EDIT:
In actual I'm calculating 2*[(i-1)C(2^(k-1)-1)]*[((2^(k-1))!)^2] for all i in range 2^(k-1) to 2^k. My program demands (nCr)%MOD caring about overflows.
EDIT: I need an efficient way to find nCr%MOD for large n.
The mulmod routine can be speeded up by a large factor K.
1) '%' is overkill, since (a + b) are both less than N.
- It's enough to evaluate c = a+b; if (c>=N) c-=N;
2) Multiple bits can be processed at once; see optimization to "Russian peasant's algorithm"
3) a * b is actually small enough to fit 64-bit unsigned long long without overflow
Since the actual problem is about nCr mod M, the high level optimization requires using the recurrence
(n+1)Cr mod M = (n+1)nCr / (n+1-r) mod M.
Because the left side of the formula ((nCr) mod M)*(n+1) is not divisible by (n+1-r), the division needs to be implemented as multiplication with the modular inverse: (n+r-1)^(-1). The modular inverse b^(-1) is b^(M-1), for M being prime. (Otherwise it's b^(phi(M)), where phi is Euler's Totient function.)
The modular exponentiation is most commonly implemented with repeated squaring, which requires in this case ~45 modular multiplications per divisor.
If you can use the recurrence
nC(r+1) mod M = nCr * (n-r) / (r+1) mod M
It's only necessary to calculate (r+1)^(M-1) mod M once.
Since you are looking for nCr for multiple sequential values of n you can make use of the following:
(n+1)Cr = (n+1)! / ((r!)*(n+1-r)!)
(n+1)Cr = n!*(n+1) / ((r!)*(n-r)!*(n+1-r))
(n+1)Cr = n! / ((r!)*(n-r)!) * (n+1)/(n+1-r)
(n+1)Cr = nCr * (n+1)/(n+1-r)
This saves you from explicitly calling the factorial function for each i.
Furthermore, to save that first call to nCr you can use:
nC(n-1) = n //where n in your case is 2^(k-1).
EDIT:
As Aki Suihkonen pointed out, (a/b) % m != a%m / b%m. So the method above so the method above won't work right out of the box. There are two different solutions to this:
1000000009 is prime, this means that a/b % m == a*c % m where c is the inverse of b modulo m. You can find an explanation of how to calculate it here and follow the link to the Extended Euclidean Algorithm for more on how to calculate it.
The other option which might be easier is to recognize that since nCr * (n+1)/(n+1-r) must give an integer, it must be possible to write n+1-r == a*b where a | nCr and b | n+1 (the | here means divides, you can rewrite that as nCr % a == 0 if you like). Without loss of generality, let a = gcd(n+1-r,nCr) and then let b = (n+1-r) / a. This gives (n+1)Cr == (nCr / a) * ((n+1) / b) % MOD. Now your divisions are guaranteed to be exact, so you just calculate them and then proceed with the multiplication as before. EDIT As per the comments, I don't believe this method will work.
Another thing I might try is in your llu mulmod(llu a,llu b)
llu mulmod(llu a,llu b)
{
llu q = a * b;
if(q < a || q < b) // Overflow!
{
llu x=0,y=a%MOD;
while(b > 0)
{
if(b%2 == 1)
{
x = (x+y)%MOD;
}
y = (y*2)%MOD;
b /= 2;
}
return (x%MOD);
}
else
{
return q % MOD;
}
}
That could also save some precious time.

Finding solution set of a Linear equation?

I need to find all possible solutions for this equation:
x+2y = N, x<100000 and y<100000.
given N=10, say.
I'm doing it like this in python:
for x in range(1,100000):
for y in range(1,100000):
if x + 2*y == 10:
print x, y
How should I optimize this for speed? What should I do?
Essentially this is a Language-Agnostic question. A C/C++ answer would also help.
if x+2y = N, then y = (N-x)/2 (supposing N-x is even). You don't need to iterate all over range(1,100000)
like this (for a given N)
if (N % 2): x0 = 1
else: x0 = 0
for x in range(x0, min(x,100000), 2):
print x, (N-x)/2
EDIT:
you have to take care that N-x does not turn negative. That's what min is supposed to do
The answer of Leftris is actually better than mine because these special cases are taken care of in an elegant way
we can iterate over the domain of y and calculate x. Also taking into account that x also has a limited range, we further limit the domain of y as [1, N/2] (as anything over N/2 for y will give negative value for x)
x=N;
for y in range(1,N/2-1):
x = x-2
print x, y
This just loops N/2 times (instead of 50000)
It doesn't even do those expensive multiplications and divisions
This runs in quadratic time. You can reduce it to linear time by rearranging your equation to the form y = .... This allows you to loop over x only, calculate y, and check whether it's an integer.
Lefteris E 's answer is the way to go,
but I do feel y should be in the range [1,N/2] instead of [1,2*N]
Explanation:
x+2*y = N
//replace x with N-2*y
N-2*(y) + 2*y = N
N-2*(N/2) + 2*y = N
2*y = N
//therefore, when x=0, y is maximum, and y = N/2
y = N/2
So now you can do:
for y in range(1,int(N/2)):
x = N - (y<<1)
print x, y
You may try to only examine even numbers for x given N =10;
the reason is that: 2y must be even, therefore, x must be even. This should reduce the total running time to half of examining all x.
If you also require that the answer is natural number, so negative numbers are ruled out. you can then only need to examine numbers that are even between [0,10] for x, since both x and 2y must be not larger than 10 alone.

Probability density function from a paper, implemented using C++, not working as intended

So i'm implementing a heuristic algorithm, and i've come across this function.
I have an array of 1 to n (0 to n-1 on C, w/e). I want to choose a number of elements i'll copy to another array. Given a parameter y, (0 < y <= 1), i want to have a distribution of numbers whose average is (y * n). That means that whenever i call this function, it gives me a number, between 0 and n, and the average of these numbers is y*n.
According to the author, "l" is a random number: 0 < l < n . On my test code its currently generating 0 <= l <= n. And i had the right code, but i'm messing with this for hours now, and i'm lazy to code it back.
So i coded the first part of the function, for y <= 0.5
I set y to 0.2, and n to 100. That means it had to return a number between 0 and 99, with average 20.
And the results aren't between 0 and n, but some floats. And the bigger n is, smaller this float is.
This is the C test code. "x" is the "l" parameter.
//hate how code tag works, it's not even working now
int n = 100;
float y = 0.2;
float n_copy;
for(int i = 0 ; i < 20 ; i++)
{
float x = (float) (rand()/(float)RAND_MAX); // 0 <= x <= 1
x = x * n; // 0 <= x <= n
float p1 = (1 - y) / (n*y);
float p2 = (1 - ( x / n ));
float exp = (1 - (2*y)) / y;
p2 = pow(p2, exp);
n_copy = p1 * p2;
printf("%.5f\n", n_copy);
}
And here are some results (5 decimals truncated):
0.03354
0.00484
0.00003
0.00029
0.00020
0.00028
0.00263
0.01619
0.00032
0.00000
0.03598
0.03975
0.00704
0.00176
0.00001
0.01333
0.03396
0.02795
0.00005
0.00860
The article is:
http://www.scribd.com/doc/3097936/cAS-The-Cunning-Ant-System
pages 6 and 7.
or search "cAS: cunning ant system" on google.
So what am i doing wrong? i don't believe the author is wrong, because there are more than 5 papers describing this same function.
all my internets to whoever helps me. This is important to my work.
Thanks :)
You may misunderstand what is expected of you.
Given a (properly normalized) PDF, and wanting to throw a random distribution consistent with it, you form the Cumulative Probability Distribution (CDF) by integrating the PDF, then invert the CDF, and use a uniform random predicate as the argument of the inverted function.
A little more detail.
f_s(l) is the PDF, and has been normalized on [0,n).
Now you integrate it to form the CDF
g_s(l') = \int_0^{l'} dl f_s(l)
Note that this is a definite integral to an unspecified endpoint which I have called l'. The CDF is accordingly a function of l'. Assuming we have the normalization right, g_s(N) = 1.0. If this is not so we apply a simple coefficient to fix it.
Next invert the CDF and call the result G^{-1}(x). For this you'll probably want to choose a particular value of gamma.
Then throw uniform random number on [0,n), and use those as the argument, x, to G^{-1}. The result should lie between [0,1), and should be distributed according to f_s.
Like Justin said, you can use a computer algebra system for the math.
dmckee is actually correct, but I thought that I would elaborate more and try to explain away some of the confusion here. I could definitely fail. f_s(l), the function you have in your pretty formula above, is the probability distribution function. It tells you, for a given input l between 0 and n, the probability that l is the segment length. The sum (integral) for all values between 0 and n should be equal to 1.
The graph at the top of page 7 confuses this point. It plots l vs. f_s(l), but you have to watch out for the stray factors it puts on the side. You notice that the values on the bottom go from 0 to 1, but there is a factor of x n on the side, which means that the l values actually go from 0 to n. Also, on the y-axis there is a x 1/n which means these values don't actually go up to about 3, they go to 3/n.
So what do you do now? Well, you need to solve for the cumulative distribution function by integrating the probability distribution function over l which actually turns out to be not too bad (I did it with the Wolfram Mathematica Online Integrator by using x for l and using only the equation for y <= .5). That however was using an indefinite integral and you are really integration along x from 0 to l. If we set the resulting equation equal to some variable (z for instance), the goal now is to solve for l as a function of z. z here is a random number between 0 and 1. You can try using a symbolic solver for this part if you would like (I would). Then you have not only achieved your goal of being able to pick random ls from this distribution, you have also achieved nirvana.
A little more work done
I'll help a little bit more. I tried doing what I said about for y <= .5, but the symbolic algebra system I was using wasn't able to do the inversion (some other system might be able to). However, then I decided to try using the equation for .5 < y <= 1. This turns out to be much easier. If I change l to x in f_s(l) I get
y / n / (1 - y) * (x / n)^((2 * y - 1) / (1 - y))
Integrating this over x from 0 to l I got (using Mathematica's Online Integrator):
(l / n)^(y / (1 - y))
It doesn't get much nicer than that with this sort of thing. If I set this equal to z and solve for l I get:
l = n * z^(1 / y - 1) for .5 < y <= 1
One quick check is for y = 1. In this case, we get l = n no matter what z is. So far so good. Now, you just generate z (a random number between 0 and 1) and you get an l that is distributed as you desired for .5 < y <= 1. But wait, looking at the graph on page 7 you notice that the probability distribution function is symmetric. That means that we can use the above result to find the value for 0 < y <= .5. We just change l -> n-l and y -> 1-y and get
n - l = n * z^(1 / (1 - y) - 1)
l = n * (1 - z^(1 / (1 - y) - 1)) for 0 < y <= .5
Anyway, that should solve your problem unless I made some error somewhere. Good luck.
Given that for any values l, y, n as described, the terms you call p1 and p2 are both in [0,1) and exp is in [1,..) making pow(p2, exp) also in [0,1) thus I don't see how you'd ever get an output with the range [0,n)