How to linearize the product of an integer and a binary variable? - linear-programming

I am implementing an algorithm from "An optimization-based approach to network inference" and have some trouble linearizing the product of an integer and a binary variable. The author of that paper states the following:
Suppose that a bilinear term has the form ib, where b is a binary variable and i is an integer variable lower bounded by 0 and upper bounded by I. The product term ib can be linearized as follows. Replace the term ib with a new integer variable z and add four constraints: z <= Ib, z <= i, z >= i-(1-b)I, and z >= i.
I am using PuLP to solve this problem. Should I just drop the last constraint, z >= i?
I am a new learner in linear programming.
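For context, here is a quick case check of that linearization written out in LaTeX. It assumes the usual form of this construction, in which the fourth constraint is z >= 0 rather than z >= i (with z >= i as transcribed, the case b = 0 would require z >= i and z <= 0 at the same time):

\begin{align*}
b = 0:&\quad z \le I\cdot 0 = 0,\; z \le i,\; z \ge i - I,\; z \ge 0 &&\Rightarrow\; z = 0 = ib,\\
b = 1:&\quad z \le I,\; z \le i,\; z \ge i - (1-1)I = i,\; z \ge 0 &&\Rightarrow\; z = i = ib.
\end{align*}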

Related

How can I find out the contents of the exp() built-in function of the C numerics library <cmath>?

I recently decided to build a simple calculator program, but when it came to exponents I was lost. OK, you can use <cmath>, but I'd rather know how it solves the problem of that function, other than an impossible amount of if statements, e.g.
if (y == 2) {
    x = x * x;
}
else if (y == 3) {
    x = x * x * x;
}
And so on... So, how did <cmath>'s exp() do it, and how can I find out?
From An algorithm for calculating exp(x) or e^x:
An algorithm for calculating exp(x) or e^x
This algorithm makes it possible for exp(x) or e^x to be calculated
using only the operations of addition, subtraction, multiplication and
division. The basic idea is to use a polynomial approximation in step 3 to calculate e^x. But because this approximation is only accurate for small arguments x, we must take steps 1 and 2 to reduce x to a smaller value.
1. Split up x: Write x = n + r, where n is the nearest integer to x and r is a real number between −½ and +½. Then e^x = e^n · e^r.
2. Evaluate e^n: Multiply the number e by itself n times. To 14 digits, e = 2.7182818284590. The multiplication can be done quite efficiently. For example, e^8 can be evaluated with just 3 multiplications if it is written as ((e^2)^2)^2. To further increase efficiency, various integer powers of e can be calculated once and stored in a lookup table.
3. Evaluate e^r using the polynomial: EXP(r) = e^r = 1 + r + (r^2)/2 + (r^3)/6 + (r^4)/24 + (r^5)/120
For r between −½ and +½ this polynomial is accurate to within
±0.00003.
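This is not the answerer's code, just a minimal C++ sketch of those three steps using the constant and polynomial quoted above (the name my_exp is purely for illustration):

#include <cmath>     // std::round, and std::exp for comparison only; the
                     // approximation itself uses just +, -, * and /
#include <iostream>

// Sketch of the three steps above: split x = n + r, evaluate e^n by repeated
// multiplication, evaluate e^r with the degree-5 polynomial.
double my_exp(double x) {
    const double e = 2.7182818284590;          // e to 14 digits, as quoted above

    // Step 1: n is the nearest integer to x, so r lies in [-1/2, +1/2].
    int n = static_cast<int>(std::round(x));
    double r = x - n;

    // Step 2: e^n by repeated multiplication (a lookup table or repeated
    // squaring would be faster; this is the simplest version).
    double en = 1.0;
    int count = (n < 0) ? -n : n;
    for (int i = 0; i < count; ++i) en *= e;
    if (n < 0) en = 1.0 / en;

    // Step 3: polynomial approximation of e^r, accurate to about +/-0.00003.
    double er = 1.0 + r + r*r/2.0 + r*r*r/6.0 + r*r*r*r/24.0 + r*r*r*r*r/120.0;

    return en * er;
}

int main() {
    std::cout << my_exp(1.0)  << " vs " << std::exp(1.0)  << '\n';
    std::cout << my_exp(-2.5) << " vs " << std::exp(-2.5) << '\n';
}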
EDIT:
If you are interested in the original implementation in the GNU libc library then you can download the sources from here.

Calculating Probability C++ Bernoulli Trials

The program asks the user for the number of times to flip a coin (n; the number of trials).
A success is considered a heads.
Flawlessly, the program creates a random number between 0 and 1. 0's are considered heads and success.
Then, the program is supposed to output the expected values of getting x amount of heads. For example if the coin was flipped 4 times, what are the following probabilities using the formula
nCk * p^k * (1-p)^(n-k)
Expected 0 heads with n flips: xxx
Expected 1 heads with n flips: xxx
...
Expected n heads with n flips: xxx
When doing this with "larger" numbers, the results come out as weird values. It happens if 15 or 20 is put into the input. I have been getting 0's and negative values where xxx should be.
Debugging, I have noticed that nCk comes out negative and incorrect for the upper values, and I believe this is the issue. I use this formula for my combination:
double combo = fact(n)/fact(r)/fact(n-r);
Here is the pseudocode for my fact function:
long fact(int x)
{
int e; // local counter
factor = 1;
for (e = x; e != 0; e--)
{
factor = factor * e;
}
return factor;
}
Any thoughts? My guess is my factorial or combo functions are exceeding the max values or something.
You haven't mentioned how factor is declared. I think you are getting integer overflows. I suggest you use double; since you are calculating expected values and probabilities, you shouldn't be too concerned about precision.
Try changing your fact function to:
double fact(double x)
{
    int e;             // local counter
    double factor = 1; // declared as a double so large factorials don't overflow
    for (e = x; e != 0; e--)
    {
        factor = factor * e;
    }
    return factor;
}
EDIT:
Also to calculate nCk, you need not calculate factorials 3 times. You can simply calculate this value in the following way.
if k > n/2, set k = n - k (using the symmetry nCk = nC(n-k)).
nCk = n(n-1)(n-2)...(n-k+1) / factorial(k)
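A hedged C++ sketch of that idea (my own code, not the answerer's): multiply and divide alternately so the intermediate value stays near the final binomial coefficient instead of growing like a full factorial.

#include <iostream>

// C(n, k) computed without full factorials: after step i the intermediate
// value equals C(n - k + i, i), which never exceeds the final result.
double combo(int n, int k) {
    if (k > n - k) k = n - k;          // use the symmetry C(n, k) == C(n, n - k)
    double result = 1.0;
    for (int i = 1; i <= k; ++i) {
        result *= (n - k + i);         // one numerator factor
        result /= i;                   // one denominator factor
    }
    return result;
}

int main() {
    std::cout << combo(20, 10) << '\n';   // 184756
    std::cout << combo(50, 25) << '\n';   // about 1.26411e+14
}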
You're exceeding the maximum value of a long. Factorial grows so quickly that you need the right type of number--what type that is will depend on what values you need.
Long is a signed integer, and as soon as you pass 2^31, the value will become negative (it's using 2's complement math).
Using an unsigned long will buy you a little time (one more bit), but for factorial, it's probably not worth it. If your compiler supports long long, then try an "unsigned long long". That will (usually, depends on compiler and CPU) double the number of bits you're using.
You can also try switching to use double. The problem you'll face there is that you'll lose accuracy as the numbers increase. A double is a floating point number, so you'll have a fixed number of significant digits. If your end result is an approximation, this may work okay, but if you need exact values, it won't work.
If none of these solutions will work for you, you may need to resort to using an "infinite precision" math package, which you should be able to search for. You didn't say if you were using C or C++; this is going to be a lot more pleasant with C++ as it will provide a class that acts like a number and that would use standard arithmetic operators.

Optimising code for modular arithmetic

I am trying to calculate the expression N! / ((N/2)!)^2 for large numbers.
Since the value of this expression will be very large, I just need its value modulo some prime number. Suppose the value of this expression is x and I choose the prime number 1000000007; I'm looking for x % 1000000007.
Here is my code.
#include<iostream>
#define MOD 1000000007
using namespace std;
int main()
{
    unsigned long long A[1001];
    A[2]=2;
    for(int i=4;i<=1000;i+=2)
    {
        A[i]=((4*A[i-2])/i)%MOD;
        A[i]=(A[i]*(i-1))%MOD;
    }
    while(1)
    {
        int N;
        cin>>N;
        cout<<A[N];
    }
}
But even this much optimisation is failing for large values of N. For example if N is 50, the correct output is 605552882, but this gives me 132924730. How can I optimise it further to get the correct output?
Note : I am only considering N as even.
When you do modular arithmetic, there is no such operation as division. Instead, you take the modular inverse of the denominator and multiply. The modular inverse is computed using the extended Euclidean algorithm, discovered by Etienne Bezout in 1779:
# return y such that x * y == 1 (mod m)
function inverse(x, m)
a, b, u := 0, m, 1
while x > 0
q, r := divide(b, x)
x, a, b, u := b % x, u, x, a - q * u
if b == 1 return a % m
error "must be coprime"
The divide function returns both quotient and remainder. All of the assignment operators given above are simultaneous assignment, where all of the right hand sides are computed first, then all of the left hand sides are assigned simultaneously. You can see more about modular arithmetic at my blog.
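Here is a minimal C++ sketch of that fix applied to the code in the question (my own translation, not the answerer's implementation): inverse() follows the pseudocode above, and since the modulus is prime, every i in the loop is coprime to it, so the coprimality check is omitted.

#include <cstdint>
#include <iostream>

const int64_t MOD = 1000000007;

// Extended-Euclid modular inverse: returns y with (x * y) % m == 1.
// Assumes x and m are coprime (true here because MOD is prime and 0 < x < MOD).
int64_t inverse(int64_t x, int64_t m) {
    int64_t a = 0, b = m, u = 1;
    while (x > 0) {
        int64_t q = b / x;
        int64_t r = b % x;
        int64_t t = a - q * u;
        b = x; a = u; x = r; u = t;   // emulates the simultaneous update in the pseudocode
    }
    return ((a % m) + m) % m;          // normalize into [0, m)
}

int main() {
    // Same recurrence as in the question, but the division by i is replaced
    // by multiplication with its modular inverse.
    static int64_t A[1001];
    A[2] = 2;
    for (int i = 4; i <= 1000; i += 2) {
        A[i] = (4 * A[i - 2]) % MOD;
        A[i] = (A[i] * inverse(i, MOD)) % MOD;
        A[i] = (A[i] * (i - 1)) % MOD;
    }
    std::cout << A[50] << '\n';   // prints 605552882, the value quoted in the question
    return 0;
}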
For starters, no modular division is needed at all; your formula can be rewritten as follows:
N!/((N/2)!^2)
= (1.2.3...N) / ((1.2.3...N/2) * (1.2.3...N/2))
= ((N/2+1)...N) / (1.2.3...N/2)
So now you are dividing the bigger number by the smaller one,
and you can build the result iteratively, multiplying divisor and dividend in step,
so that both sub-results keep a similar magnitude.
Any time both numbers are divisible by 2, shift them right;
this will ensure that they do not overflow.
If you reach the end of (N/2)!, then continue the multiplication only with the remaining terms of the numerator.
Any time both sub-results are divisible by anything, divide them,
until you are left with a division by 1.
After this you can multiply with modular arithmetic till the end normally.
For a more advanced approach see this.
N! and (N/2)! can be decomposed much further than it seems at first look.
I had worked on this for some time; here is what I found: Fast exact bigint factorial.
In short, your terms N! and ((N/2)!)^2 will disappear completely;
only a simple prime decomposition plus a 4N <-> 1N correction will remain.
Solution:
I.  (4N)! = ((2N)!)^2 . mul(i = all primes <= 4N) of [ i ^ sum(j = 1,2,3,... while i^j <= 4N) of [ (4N/(i^j)) % 2 ] ]
II. (4N)! / ((4N/2)!^2) = (4N)! / ((2N)!^2)
I.=II. (4N)! / ((2N)!^2) = mul(i = all primes <= 4N) of [ i ^ sum(j = 1,2,3,... while i^j <= 4N) of [ (4N/(i^j)) % 2 ] ]
The only catch is that N must be divisible by 4, hence the 4N in all terms.
If you have N % 4 != 0, then solve for N - N%4 and correct the result by the missing 1-3 numbers.
Hope it helps.
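As a hedged illustration of the prime-decomposition idea, here is a C++ sketch (my own, simpler variant rather than the 4N formulation above) that computes N! / ((N/2)!)^2 mod 1000000007 from the exponent of each prime, using Legendre's formula for the exponent of a prime in a factorial:

#include <cstdint>
#include <iostream>
#include <vector>

const int64_t MOD = 1000000007;

// Modular exponentiation: base^exp % MOD.
int64_t pow_mod(int64_t base, int64_t exp) {
    int64_t result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % MOD;
        base = (base * base) % MOD;
        exp >>= 1;
    }
    return result;
}

// Legendre's formula: the exponent of the prime q in n! is the sum of floor(n / q^j).
int64_t legendre(int64_t n, int64_t q) {
    int64_t e = 0;
    for (int64_t pk = q; pk <= n; pk *= q) e += n / pk;
    return e;
}

// C(N, N/2) % MOD for even N, via the prime factorization of the binomial coefficient.
int64_t central_binom_mod(int64_t N) {
    std::vector<bool> composite(N + 1, false);
    int64_t result = 1;
    for (int64_t q = 2; q <= N; ++q) {               // simple sieve over primes up to N
        if (composite[q]) continue;
        for (int64_t m = q * q; m <= N; m += q) composite[m] = true;
        int64_t e = legendre(N, q) - 2 * legendre(N / 2, q);   // exponent of q in N!/((N/2)!)^2
        result = (result * pow_mod(q, e)) % MOD;
    }
    return result;
}

int main() {
    std::cout << central_binom_mod(50) << '\n';   // 605552882, matching the question
    return 0;
}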

knuth multiplicative hash

Is this a correct implementation of the Knuth multiplicative hash?
int hash(int v)
{
v *= 2654435761;
return v >> 32;
}
Does overflow in the multiplication affect the algorithm?
How to improve the performance of this method?
The Knuth multiplicative hash is used to compute a hash value in {0, 1, 2, ..., 2^p - 1} from an integer k.
Suppose that p is between 0 and 32; the algorithm goes like this:
Compute alpha as the closest integer to 2^32 * (-1 + sqrt(5)) / 2. We get alpha = 2 654 435 769.
Compute k * alpha and reduce the result modulo 2^32:
k * alpha = n0 * 2^32 + n1 with 0 <= n1 < 2^32
Keep the highest p bits of n1:
n1 = m1 * 2^(32-p) + m2 with 0 <= m2 < 2^(32 - p)
So, a correct implementation of Knuth multiplicative algorithm in C++ is:
std::uint32_t knuth(int x, int p) {
assert(p >= 0 && p <= 32);
const std::uint32_t knuth = 2654435769;
const std::uint32_t y = x;
return (y * knuth) >> (32 - p);
}
Forgetting to shift the result by (32 - p) is a major mistake, as you would lose all the good properties of the hash. It would transform an even sequence into an even sequence, which would be very bad as all the odd slots would stay unoccupied. That's like taking a good wine and mixing it with Coke. By the way, the web is full of people misquoting Knuth and using a multiplication by 2 654 435 761 without taking the higher bits. I just opened the Knuth and he never said such a thing. It looks like some guy who thought he was "smart" decided to take a prime number close to 2 654 435 769.
Bear in mind that most hash table implementations don't allow this kind of signature in their interface, as they only allow
uint32_t hash(int x);
and reduce hash(x) modulo 2^p to compute the hash value for x. Those hash tables cannot accept the Knuth multiplicative hash. This might be a reason why so many people completely ruined the algorithm by forgetting to take the higher p bits.
So you can't use the Knuth multiplicative hash with std::unordered_map or std::unordered_set. But I think that those hash tables use a prime number as a size, so the Knuth multiplicative hash is not useful in this case. Using hash(x) = x would be a good fit for those tables.
Source: "Introduction to Algorithms, third edition", Cormen et al., 13.3.2 p:263
Source: "The Art of Computer Programming, Volume 3, Sorting and Searching", D.E. Knuth, 6.4 p:516
Ok, I looked it up in TAOCP volume 3 (2nd edition), section 6.4, page 516.
This implementation is not correct, though as I mentioned in the comments it may give the correct result anyway.
A correct way (I think - feel free to read the relevant chapter of TAOCP and verify this) is something like this: (important: yes, you must shift the result right to reduce it, not use bitwise AND. However, that is not the responsibility of this function - range reduction is not properly part of hashing itself)
uint32_t hash(uint32_t v)
{
return v * UINT32_C(2654435761);
// do not comment about the lack of right shift. I'm not ignoring it. read on.
}
Note the uint32_t's (as opposed to int's) - they make sure the multiplication overflows modulo 2^32, as it is supposed to do if you choose 32 as the word size. There is also no right shift by k here, because there is no reason to give responsibility for range-reduction to the basic hashing function and it is actually more useful to get the full result. The constant 2654435761 is from the question, the actual suggested constant is 2654435769, but that's a small difference that as far as I know does not affect the quality of the hash.
Other valid implementations shift the result right by some amount (not the full word size though, as that doesn't make sense and C++ doesn't like it), depending on how many bits of hash you need. Or they may use another constant (subject to certain conditions) or another word size. Reducing the hash modulo something is not a valid implementation, but it is a common mistake and likely a de-facto standard way to do range reduction on a hash. The bottom bits of a multiplicative hash are the worst-quality bits (they depend on less of the input); you only want to use them if you really need more bits, while reducing the hash modulo a power of two would return only the worst bits. Indeed that is equivalent to throwing away most of the input bits too. Reducing modulo a non-power-of-two is not so bad since it does mix in the higher bits, but it's not how the multiplicative hash was defined.
So to be clear, yes there is a right shift, but that is range reduction not hashing and can only be the responsibility of the hash table, since it depends on its internal size.
The type should be unsigned, otherwise the overflow is unspecified (thus possibly wrong, not just on non-2's-complement architectures but also on overly clever compilers) and the optional right shift would be a signed shift (wrong).
On the page I mention at the top, there is this formula (the multiplication method):
h(K) = floor( M * ( (A * K / w) mod 1 ) )
Here we have A = 2654435761 (or 2654435769), w = 2^32 and M = 2^32. Calculating A*K/w gives a fixed-point result with the format Q32.32; the mod 1 step takes only the 32 fraction bits. But that's just the same thing as doing a modular multiplication and then saying that the result is the fraction bits. Of course, when multiplied by M, all the fraction bits become integer bits because of how M was chosen, and so it simplifies to just a plain old modular multiplication. When M is a lower power of two, that just right-shifts the result, as mentioned.
Might be late, but here's a Java implementation of Knuth's method.
For a hash table of size N:
public long hash(int key) {
    long l = 2654435769L;
    return (key * l >> 32) % N;
}
If the input argument is a pointer then I use this
#include <inttypes.h>
uint32_t knuth_mul_hash(void* k) {
ptrdiff_t v = (ptrdiff_t)k * UINT32_C(2654435761);
v >>= ((sizeof(ptrdiff_t) - sizeof(uint32_t)) * 8); // Right-shift v by the size difference between a pointer and a 32-bit integer (0 for x86, 32 for x64)
return (uint32_t)(v & UINT32_MAX);
}
I usually use this as the default fallback hashing function in hashmap implementations, dictionaries, sets, etc...

how to determine base of a number?

Given an integer number and its representation in some arbitrary number system, the goal is to find the base of that number system. For example, if the number is 10 and the representation is 000010, then the base should be 10. Another example: the number 21 with representation 0010101 gives base 2. One more example: the number 6 with representation 10100 gives base sqrt(2). Does anyone have any idea how to solve such a problem?
number = sum( digit[i] * base ^ i )
You know number, you know all digit[i], you just have to find out base.
Whether solving this equation is simple or complex is left as an exercise.
I do not think that an answer can be given for every case. And I actually have a reason to think so! =)
Given a number x, with representation a_6 a_5 a_4 a_3 a_2 a_1 in base b, finding the base means solving
a_6 b^5 + a_5 b^4 + a_4 b^3 + a_3 b^2 + a_2 b^1 + a_1 = x.
This cannot be solved in radicals in general, as shown by Abel and Ruffini. You might be luckier with shorter numbers, but once more than four digits are involved, the formulas get increasingly ugly.
There are quite a lot of good approximation algorithms, though. See here.
For integers only, it's not that difficult (we can enumerate).
Let's look at 21 and its representation 10101.
1 * base^4 <= 21 < (1+1) * base^4
Let's generate the numbers for some bases:
base low high
2 16 32
3 81 162
More generally, we have N represented as ∑ a_i * base^i. Taking I to be the maximum power for which a_I is non-zero, we have:
a[I] * base^I <= N < (a[I] + 1) * base^I # does not matter if not representable
# Isolate base term
N / (a[I] + 1) < base^I <= N / a[I]
# Ith root
Ithroot( N / (a[I] + 1) ) < base <= Ithroot( N / a[I] )
# Or as a range
base in ] Ithroot(N / (a[I] + 1)), Ithroot( N / a[I] ) ]
In the case of an integer base, or if you have a list of known possible bases, I doubt they'll be many possibilities, so we can just try them out.
Note that it may be faster to actually take the Ithroot of N / (a[I] + 1) and iterate from here instead of computing the second one (which should be close enough)... but I'd need math review on that gut feeling.
If you really don't have any idea (trying to find a floating base)... well it's a bit more difficult I guess, but you can always refine the inequality (including one or two more terms) following the same property.
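A small C++ sketch of that bound for the integer case (names and digit layout are my own, for illustration; it assumes the representation has at least two digits and a non-zero leading digit):

#include <cmath>
#include <iostream>
#include <vector>

// digits are given most-significant first; prints the candidate range for the
// base, derived from a[I] * base^I <= N < (a[I] + 1) * base^I.
void base_bounds(long long N, const std::vector<int>& digits) {
    int I = static_cast<int>(digits.size()) - 1;   // highest power present
    int aI = digits.front();                       // its digit, assumed non-zero
    double lo = std::pow(static_cast<double>(N) / (aI + 1), 1.0 / I);  // exclusive lower bound
    double hi = std::pow(static_cast<double>(N) / aI, 1.0 / I);        // inclusive upper bound
    std::cout << "base in ( " << lo << " , " << hi << " ]\n";
}

int main() {
    base_bounds(21, {1, 0, 1, 0, 1});   // 10101 for N = 21: the printed range contains 2
}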
An algorithm like this should find the base if it is an integer, and should at least narrow down the choices for a non-integer base:
Let N be your integer and R be its representation in the mystery base.
Find the largest digit in R and call it r.
You know that your base is at least r + 1.
For base == (r+1, r+2, ...), let I represent R interpreted in base base
If I equals N, then base is your mystery base.
If I is less than N, try the next base.
If I is greater than N, then your base is somewhere between base - 1 and base.
It's a brute-force method, but it should work. You may also be able to speed it up a bit by incrementing base by more than one if I is significantly smaller than N.
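A hedged C++ sketch of that brute-force search (my own names; it assumes the representation has a non-zero digit above the ones place, and returns -1 when the interpreted value overshoots N, i.e. when the base is not an integer):

#include <iostream>
#include <vector>

// Interpret the digit sequence (most-significant first) in the given base.
long long eval_in_base(const std::vector<int>& digits, long long base) {
    long long value = 0;
    for (int d : digits) value = value * base + d;
    return value;
}

// Start just above the largest digit and stop once the interpreted value
// reaches or exceeds N, as described in the steps above.
long long find_base(long long N, const std::vector<int>& digits) {
    int largest = 0;
    for (int d : digits) if (d > largest) largest = d;
    for (long long base = largest + 1; ; ++base) {
        long long value = eval_in_base(digits, base);
        if (value == N) return base;   // exact integer base found
        if (value > N) return -1;      // the base lies between base - 1 and base
    }
}

int main() {
    std::cout << find_base(21, {1, 0, 1, 0, 1}) << '\n';      // 2
    std::cout << find_base(10, {0, 0, 0, 0, 1, 0}) << '\n';   // 10
}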
Something else that might help speed things up, particularly in the case of a non-integer base: Remember that as several people have mentioned, a number in an arbitrary base can be expanded as a polynomial like
x = a[n]*base^n + a[n-1]*base^(n-1) + ... + a[2]*base^2 + a[1]*base + a[0]
When evaluating potential bases, you don't need to convert the entire number. Start by converting only the largest term, a[n]*base^n. If this is larger than x, then you already know your base is too big. Otherwise, add one term at a time (moving from most-significant to least-significant). That way, you don't waste time computing terms after you know your base is wrong.
Also, there is another quick way to eliminate a potential base. Notice that you can re-arrange the above polynomial expression and get
(x - a[0]) = a[n]*base^n + a[n-1]*base^(n-1) + ... + a[2]*base^2 + a[1]*base
or
(x - a[0]) = (a[n]*base^(n-1) + a[n-1]*base^(n-2) + ... + a[2]*base + a[1])*base
You know the values of x and a[0] (the "ones" digit, which you can interpret regardless of base). This gives you the extra condition that (x - a[0]) must be evenly divisible by base (since all your a[] values are integers). If you calculate (x - a[0]) % base and get a non-zero result, then base cannot be the correct base.
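That observation can be turned into a cheap filter before doing a full conversion. A minimal sketch (a0 is the least significant digit, which is readable without knowing the base):

// Every term except a[0] carries at least one factor of base, so for an
// integer base, (x - a[0]) must be divisible by it. Candidates that fail this
// test can be rejected without evaluating the whole polynomial.
bool base_possible(long long x, int a0, long long base) {
    return (x - a0) % base == 0;
}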
I'm not sure if this is efficiently solvable. I would just pick a base, interpret the representation in that base, and see whether the result is smaller than, larger than, or equal to the number. If it's smaller, pick a larger base; if it's larger, pick a smaller base; otherwise you have the correct base.
This should give you a starting point:
Create an equation from the number and representation; the number 42 and representation "0010203" become:
1 * base ^ 4 + 2 * base ^ 2 + 3 = 42
Now you solve the equation to get the value of base.
I'm thinking you will need to try and check different bases. To be efficient, your starting base could be max(digit) + 1, as you know it won't be less than that. If that's too small, double it until you exceed the target, and then use binary search to narrow it down. This way your algorithm should run in O(log n) in normal situations.
Several of the other posts suggest that the solution might be found by finding the roots of the polynomial the number represents. These will, of course, generally work, though they will have a tendency to produce negative and complex bases as well as positive integers.
Another approach would be to cast this as an integer programming problem and solve using branch-and-bound.
But I suspect that the suggestion of guessing-and-testing will be quicker than any of the cleverer proposals.