Mixed Integer Linear Programming: how to find second greatest value in array

Mixed Integer Linear Programming: how to find second greatest value in array - linear-programming

I am trying to determine the logic necessary to find the second greatest value in an array of values in a MILP.
Finding the maximum value is easy:
forall(val in Array):
maxVal >= val
However the second greatest value is much more difficult (if impossible?).
I have tried the following:
forall(val in Array | val != maxVal):
secondMaxVal >= val
This logic fails in the MILP (I am programming in Mosel - the FICO Xpress language), perhaps because my Array actually has two dimensions and contains decision variables. Because of the two dimensions, the logic in my program looks more like:
forall(i in Index1):
maxVal >= sum(j in Index2) Array(i, j)
forall(i in Index1 | sum(j in Index2) Array(i, j) != maxVal):
secondMaxVal >= sum(j in Index2) Array(i, j)
Note that | is interpreted as "where". The program balks at the != statement, calling it invalid logic - probably because Array contains decision variables, i.e. the values are determined by the objective function, which effectively minimizes the values in Array, where all values in Array are > 0.
Is it possible to define the second greatest value in an array of decision variables in a MILP?

Related

Proving that a two-pointer approach works (pair sum)

I was trying to solve the pair sum problem, i.e., given a sorted array, we need to if there exist two indices i and j such that i!=j and a[i]+a[j] == k for some k.
One of the approaches to do the same problem is running two nested for loops, resulting in a complexity of O(n*n).
Another way to solve it is using a two-pointer technique. I wasn't able to solve the problem using the two-pointer method and therefore looked it up, but I couldn't understand why it works. How do I prove that it works?
#define lli long long
//n is size of array
bool f(lli sum) {
int l = 0, r = n - 1;
while ( l < r ) {
if ( A[l] + A[r] == sum ) return 1;
else if ( A[l] + A[r] > sum ) r--;
else l++;
}
return 0;
}

Well, think of it this way:
You have a sorted array (you didn't mention that the array is sorted, but for this problem, that is generally the case):
{ -1,4,8,12 }
The algorithm starts by choosing the first element in the array and the last element, adding them together and comparing them to the sum you are after.
If our initial sum matches the sum we are looking for, great!! If not, well, we need to continue looking at possible sums either greater than or less than the sum we started with. By starting with the smallest and the largest value in the array for our initial sum, we can eliminate one of those elements as being part of a possible solution.
Let's say we are looking for the sum 3. We see that 3 < 11. Since our big number (12) is paired with the smallest possible number (-1), the fact that our sum is too large means that 12 cannot be part of any possible solution, since any other sum using 12 would have to be larger than 11 (12 + 4 > 12 - 1, 12 + 8 > 12 - 1).
So we know we cannot possibly make a sum of 3 using 12 + one other number in the array; they would all be too big. So we can eliminate 12 from our search by moving down to the next largest number, 8. We do the same thing here. We see 8 + -1 is still too big, so we move down to the next number, 4, and voila! We find a match.
The same logic applies if the sum we get is too small. We can eliminate our small number, because any sum we can get using our current smallest number has to be less than or equal to the sum we get when it is paired with our current largest number.
We keep doing this until we find a match, or until the indices cross each other, since, after they cross, we are simply adding up pairs of numbers we have already checked (i.e. 4 + 8 = 8 + 4).
This may not be a mathematical proof, but hopefully it illustrates how the algorithm works.

Stephen Docy made a great job tracing the program's execution and explaining the rationale behind its decisions. Maybe making the answer closer to a mathematical proof of the algorithm's correctness could make it easier to generalize to problems like the one mentioned by zzzzzzz in the comments.
We are given a sorted array A of length n and an integer sum. We need to find if there are any two indices i and j such that i != j and A[i] + A[j] == sum.
The solutions (i, j) and (j, i) are equivalent, so we can assume that i < j without loss of generality. In the program, the current guess at i is called l and the current guess at j is called r.
We iteratively slice the array till we find a slice that has the two summands that sum to sum at its boundary, or we find there is no such slice. The slice starts at index l and ends at index r and I will write it as (l, r).
Initially, the slice is the whole array. In each iteration, the length of the slice is decreased by 1: either the left boundary index l increases or the right boundary index r decreases. When the slice length decreases to 1 (l == r), there are no pairs of different indexes inside the slice, so false is returned. This means that the algorithm halts for any input. The O(n) complexity is also immediately clear. The correctness remains to be proven.
We can assume there is a solution; if there is none, the analysis in the above paragraph applies and the branch returning true can never be executed.
The loop has an invariant (statement that holds true regardless of how many iterations have been done yet): When a solution exists, it is either (l, r) itself or its sub-slice. Mathematically, such an invariant is a lemma -- something that is not very useful by itself but makes a stepping stone in the overall proof. We get the overall correctness by initially making (l, r) the whole array and observing that as each iteration makes the slice shorter, the invariant ensures that we will eventually find the solution. Now, we just need to prove the invariant.
We will prove the invariant by induction. The induction base is trivial -- the initial slice (l, r) either is the solution, or contains it as a sub-slice. The hard part is the induction step, i.e. proving that when (l, r) contains the solution, either it is the solution itself or the slice for the next iteration contains the solution as a sub-slice.
When A[l] + A[r] == sum, (l, r) is the solution itself; the first condition in the loop is triggered, true is returned, and everyone is happy.
When A[l] + A[r] > sum, the slice for the next iteration is (l, r - 1), which still contains the solution. Let's prove that by contradiction, assuming (l, r - 1) does not contain the solution. How could that happen, when (l, r) contained the solution (by induction hypothesis)? The only way would be that the solution (i, j) has j == r (r is the only index we removed from the slice). Because by definition A[i] + A[j] == sum, we get A[i] + A[r] == sum < A[l] + A[r] in this branch. When we subtract A[r] from both sides of the inequality, we get A[i] < A[l]. But A[l] is the smallest value in the (l, r) slice (the array is sorted), so this is a contradiction.
When A[l] + A[r] < sum, the slice for the next iteration is (l + 1, r). The argument is symmetric to the previous case.
∎
The algorithm may be easily rewritten as recursive, which simplifies the analysis at the expense of actual performance. This is the functional programming approach.
#define lli long long
//n is size of array
bool f(lli sum) {
return g(sum, 0, n - 1);
}
bool g(lli sum, int l, int r) {
if ( l >= r ) return 0;
else if ( A[l] + A[r] == sum ) return 1;
else if ( A[l] + A[r] > sum ) return g(sum, l, r - 1);
else return g(sum, l + 1, r);
}
The f function still contains the initialization, but it calls the new g function, which implements the original loop. Instead of keeping the state in local variables, it uses its parameters. Each call of the g function corresponds to a single iteration of the original loop.
The g function is a solution to a more general problem than the original one: Given a sorted array A, are there any two indices i and j such that i != j and A[i] + A[j] == sum and both i and j are between l and r (inclusive)?
This makes reading the analysis even simpler. The loop invariant is actually the proof of correctness of g and the structure of g guides the proof.

[Competitive Programming]:How do I optimise this brute force method? [duplicate]

If n numbers are given, how would I find the total number of possible triangles? Is there any method that does this in less than O(n^3) time?
I am considering a+b>c, b+c>a and a+c>b conditions for being a triangle.

Assume there is no equal numbers in given n and it's allowed to use one number more than once. For example, we given a numbers {1,2,3}, so we can create 7 triangles:
1 1 1
1 2 2
1 3 3
2 2 2
2 2 3
2 3 3
3 3 3
If any of those assumptions isn't true, it's easy to modify algorithm.
Here I present algorithm which takes O(n^2) time in worst case:
Sort numbers (ascending order).
We will take triples ai <= aj <= ak, such that i <= j <= k.
For each i, j you need to find largest k that satisfy ak <= ai + aj. Then all triples (ai,aj,al) j <= l <= k is triangle (because ak >= aj >= ai we can only violate ak < a i+ aj).
Consider two pairs (i, j1) and (i, j2) j1 <= j2. It's easy to see that k2 (found on step 2 for (i, j2)) >= k1 (found one step 2 for (i, j1)). It means that if you iterate for j, and you only need to check numbers starting from previous k. So it gives you O(n) time complexity for each particular i, which implies O(n^2) for whole algorithm.
C++ source code:
int Solve(int* a, int n)
{
int answer = 0;
std::sort(a, a + n);
for (int i = 0; i < n; ++i)
{
int k = i;
for (int j = i; j < n; ++j)
{
while (n > k && a[i] + a[j] > a[k])
++k;
answer += k - j;
}
}
return answer;
}
Update for downvoters:
This definitely is O(n^2)! Please read carefully "An Introduction of Algorithms" by Thomas H. Cormen chapter about Amortized Analysis (17.2 in second edition).
Finding complexity by counting nested loops is completely wrong sometimes.
Here I try to explain it as simple as I could. Let's fix i variable. Then for that i we must iterate j from i to n (it means O(n) operation) and internal while loop iterate k from i to n (it also means O(n) operation). Note: I don't start while loop from the beginning for each j. We also need to do it for each i from 0 to n. So it gives us n * (O(n) + O(n)) = O(n^2).

There is a simple algorithm in O(n^2*logn).
Assume you want all triangles as triples (a, b, c) where a <= b <= c.
There are 3 triangle inequalities but only a + b > c suffices (others then hold trivially).
And now:
Sort the sequence in O(n * logn), e.g. by merge-sort.
For each pair (a, b), a <= b the remaining value c needs to be at least b and less than a + b.
So you need to count the number of items in the interval [b, a+b).
This can be simply done by binary-searching a+b (O(logn)) and counting the number of items in [b,a+b) for every possibility which is b-a.
All together O(n * logn + n^2 * logn) which is O(n^2 * logn). Hope this helps.

If you use a binary sort, that's O(n-log(n)), right? Keep your binary tree handy, and for each pair (a,b) where a b and c < (a+b).

Let a, b and c be three sides. The below condition must hold for a triangle (Sum of two sides is greater than the third side)
i) a + b > c
ii) b + c > a
iii) a + c > b
Following are steps to count triangle.
Sort the array in non-decreasing order.
Initialize two pointers ‘i’ and ‘j’ to first and second elements respectively, and initialize count of triangles as 0.
Fix ‘i’ and ‘j’ and find the rightmost index ‘k’ (or largest ‘arr[k]‘) such that ‘arr[i] + arr[j] > arr[k]‘. The number of triangles that can be formed with ‘arr[i]‘ and ‘arr[j]‘ as two sides is ‘k – j’. Add ‘k – j’ to count of triangles.
Let us consider ‘arr[i]‘ as ‘a’, ‘arr[j]‘ as b and all elements between ‘arr[j+1]‘ and ‘arr[k]‘ as ‘c’. The above mentioned conditions (ii) and (iii) are satisfied because ‘arr[i] < arr[j] < arr[k]'. And we check for condition (i) when we pick 'k'
4.Increment ‘j’ to fix the second element again.
Note that in step 3, we can use the previous value of ‘k’. The reason is simple, if we know that the value of ‘arr[i] + arr[j-1]‘ is greater than ‘arr[k]‘, then we can say ‘arr[i] + arr[j]‘ will also be greater than ‘arr[k]‘, because the array is sorted in increasing order.
5.If ‘j’ has reached end, then increment ‘i’. Initialize ‘j’ as ‘i + 1′, ‘k’ as ‘i+2′ and repeat the steps 3 and 4.
Time Complexity: O(n^2).
The time complexity looks more because of 3 nested loops. If we take a closer look at the algorithm, we observe that k is initialized only once in the outermost loop. The innermost loop executes at most O(n) time for every iteration of outer most loop, because k starts from i+2 and goes upto n for all values of j. Therefore, the time complexity is O(n^2).

I have worked out an algorithm that runs in O(n^2 lgn) time. I think its correct...
The code is wtitten in C++...
int Search_Closest(A,p,q,n) /*Returns the index of the element closest to n in array
A[p..q]*/
{
if(p<q)
{
int r = (p+q)/2;
if(n==A[r])
return r;
if(p==r)
return r;
if(n<A[r])
Search_Closest(A,p,r,n);
else
Search_Closest(A,r,q,n);
}
else
return p;
}
int no_of_triangles(A,p,q) /*Returns the no of triangles possible in A[p..q]*/
{
int sum = 0;
Quicksort(A,p,q); //Sorts the array A[p..q] in O(nlgn) expected case time
for(int i=p;i<=q;i++)
for(int j =i+1;j<=q;j++)
{
int c = A[i]+A[j];
int k = Search_Closest(A,j,q,c);
/* no of triangles formed with A[i] and A[j] as two sides is (k+1)-2 if A[k] is small or equal to c else its (k+1)-3. As index starts from zero we need to add 1 to the value*/
if(A[k]>c)
sum+=k-2;
else
sum+=k-1;
}
return sum;
}
Hope it helps........

possible answer
Although we can use binary search to find the value of 'k' hence improve time complexity!

N0,N1,N2,...Nn-1
sort
X0,X1,X2,...Xn-1 as X0>=X1>=X2>=...>=Xn-1
choice X0(to Xn-3) and choice form rest two item x1...
choice case of (X0,X1,X2)
check(X0<X1+X2)
OK is find and continue
NG is skip choice rest

It seems there is no algorithm better than O(n^3). In the worst case, the result set itself has O(n^3) elements.
For Example, if n equal numbers are given, the algorithm has to return n*(n-1)*(n-2) results.

Whats the efficient way to sum up the elements of an array in following way?

Suppose you are given an n sized array A and a integer k
Now you have to follow this function:
long long sum(int k)
{
long long sum=0;
for(int i=0;i<n;i++){
sum+=min(A[i],k);
}
return sum;
}
what is the most efficient way to find sum?
EDIT: if I am given m(<=100000) queries, and given a different k every time, it becomes very time consuming.

If set of queries changes with each k then you can't do better than in O(n). Your only options for optimizing is to use multiple threads (each thread sums some region of array) or at least ensure that your loop is properly vectorized by compiler (or write vectorized version manually using intrinsics).
But if set of queries is fixed and only k is changed, then you may do in O(log n) by using following optimization.
Preprocess array. This is done only once for all ks:
Sort elements
Make another array of the same length which contains partial sums
For example:
inputArray: 5 1 3 8 7
sortedArray: 1 3 5 7 8
partialSums: 1 4 9 16 24
Now, when new k is given, you need to perform following steps:
Make binary search for given k in sortedArray -- returns index of maximal element <= k
Result is partialSums[i] + (partialSums.length - i) * k

You can do way better than that if you can sort the array A[i] and have a secondary array prepared once.
The idea is:
Count how many items are less than k, and just compute the equivalent sum by the formula: count*k
Prepare an helper array which will give you the sum of the items superior to k directly
Preparation
Step 1: sort the array
std::sort(begin(A), end(A));
Step 2: prepare an helper array
std::vector<long long> p_sums(A.size());
std::partial_sum(rbegin(A), rend(A), begin(p_sums));
Query
long long query(int k) {
// first skip all items whose value is below k strictly
auto it = std::lower_bound(begin(A), end(A), k);
// compute the distance (number of items skipped)
auto index = std::distance(begin(A), it);
// do the sum
long long result = index*k + p_sums[index];
return result;
}
The complexity of the query is: O(log(N)) where N is the length of the array A.
The complexity of the preparation is: O(N*log(N)). We could go down to O(N) with a radix sort but I don't think it is useful in your case.
References
std::sort()
std::partial_sum()
std::lower_bound()

What you do seems absolutely fine. Unless this is really absolutely time critical (that is customers complain that your app is too slow and you measured it, and this function is the problem, in which case you can try some non-portable vector instructions, for example).
Often you can do things more efficiently by looking at them from a higher level. For example, if I write
for (n = 0; n < 1000000; ++n)
printf ("%lld\n", sum (100));
then this will take an awful long time (half a trillion additions) and can be done a lot quicker. Same if you change one element of the array A at a time and recalculate sum each time.

Suppose there are x elements of array A which are no larger than k and set B contains those elements which are larger than k and belongs to A.
Then the result of function sum(k) equals
k * x + sum_b
,where sum_b is the sum of elements belonging to B.
You can firstly sort the the array A, and calculate the array pre_A, where
pre_A[i] = pre_A[i - 1] + A[i] (i > 0),
or 0 (i = 0);
Then for each query k, use binary search on A to find the largest element u which is no larger than k. Assume the index of u is index_u, then sum(k) equals
k * index_u + pre_A[n] - pre_A[index_u]
. The time complex for each query is log(n).
In case array A may be dynamically changed, you can use BST to handle it.

How can I find number of consecutive sequences of various lengths satisfy a particular property?

I am given a array A[] having N elements which are positive integers
.I have to find the number of sequences of lengths 1,2,3,..,N that satisfy a particular property?
I have built an interval tree with O(nlogn) complexity.Now I want to count the number of sequences that satisfy a certain property ?
All the properties required for the problem are related to sum of the sequences
Note an array will have N*(N+1)/2 sequences. How can I iterate over all of them in O(nlogn) or O(n) ?

If we let k be the moving index from 0 to N(elements), we will run an algorithm that is essentially looking for the MIN R that satisfies the condition (lets say I), then every other subset for L = k also is satisfied for R >= I (this is your short circuit). After you find I, simply return an output for (L=k, R>=I). This of course assumes that all numerics in your set are >= 0.
To find I, for every k, begin at element k + (N-k)/2. Figure out if this defined subset from (L=k, R=k+(N-k)/2) satisfies your condition. If it does, then decrement R until your condition is NOT met, then R=1 is your MIN (your could choose to print these results as you go, but they results in these cases would be essentially printed backwards). If (L=k, R=k+(N-k)/2) does not satisfy your condition, then INCREMENT R until it does, and this becomes your MIN for that L=k. This degrades your search space for each L=k by a factor of 2. As k increases and approaches N, your search space continuously decreases.
// This declaration wont work unless N is either a constant or MACRO defined above
unsigned int myVals[N];
unsigned int Ndiv2 = N / 2;
unsigned int R;
for(unsigned int k; k < N; k++){
if(TRUE == TESTVALS(myVals, k, Ndiv2)){ // It Passes
for(I = NDiv2; I>=k; I--){
if(FALSE == TESTVALS(myVals, k, I)){
I++;
break;
}
}
}else{ // It Didnt Pass
for(I = NDiv2; I>=k; I++){
if(TRUE == TESTVALS(myVals, k, I)){
break;
}
}
}
// PRINT ALL PAIRS from L=k, from R=I to R=N-1
if((k & 0x00000001) == 0) Ndiv2++;
} // END --> for(unsigned int k; k < N; k++)
The complexity of the algorithm above is O(N^2). This is because for each k in N(i.e. N iterations / tests) there is no greater than N/2 values for each that need testing. Big O notation isnt concerned about the N/2 nor the fact that truly N gets smaller as k grows, it is concerned with really only the gross magnitude. Thus it would say N tests for every N values thus O(N^2)
There is an Alternative approach which would be FASTER. That approach would be to whenever you wish to move within the secondary (inner) for loops, you could perform a move have the distance algorithm. This would get you to your O(nlogn) set of steps. For each k in N (which would all have to be tested), you run this half distance approach to find your MIN R value in logN time. As an example, lets say you have a 1000 element array. when k = 0, we essentially begin the search for MIN R at index 500. If the test passes, instead of linearly moving downward from 500 to 0, we test 250. Lets say the actual MIN R for k = 0 is 300. Then the tests to find MIN R would look as follows:
R=500
R=250
R=375
R=312
R=280
R=296
R=304
R=300
While this is oversimplified, your are most likely going to have to optimize, and test 301 as well 299 to make sure youre in the sweet spot. Another not is to be careful when dividing by 2 when you have to move in the same direction more than once in a row.

#user1907531: First of all , if you are participating in an online contest of such importance at national level , you should refrain from doing this cheap tricks and methodologies to get ahead of other deserving guys. Second, a cheater like you is always a cheater but all this hampers the hard work of those who have put in making the questions and the competitors who are unlike you. Thirdly, if #trumetlicks asks you why haven't you tagged the ques as homework , you tell another lie there.And finally, I don't know how could so many people answer this question this cheater asked without knowing the origin/website/source of this question. This surely can't be given by a teacher for homework in any Indian school. To tell everyone this cheater has asked you the complete solution of a running collegiate contest in India 6 hours before the contest ended and he has surely got a lot of direct helps and top of that invited 100's others to cheat from the answers given here. So, good luck to all these cheaters .

Given an array of N numbers,find the number of sequences of all lengths having the range of R?

This is a follow up question to Given a sequence of N numbers ,extract number of sequences of length K having range less than R?
I basically need a vector v as an answer of size N such that V[i] denotes number of sequences of length i which have range <=R.

Traditionally, in recursive solutions, you would compute the solution for K = 0, K = 1, and then find some kind of recurrence relation between subsequent elements to avoid recomputing the solution from scratch each time.
However here I believe that maybe attacking the problem from the other side would be interesting, because of the property of the spread:
Given a sequence of spread R (or less), any subsequence has a spread inferior to R as well
Therefore, I would first establish a list of the longest subsequences of spread R beginning at each index. Let's call this list M, and have M[i] = j where j is the higher index in S (the original sequence) for which S[j] - S[i] <= R. This is going to be O(N).
Now, for any i, the number of sequences of length K starting at i is either 0 or 1, and this depends whether K is greater than M[i] - i or not. A simple linear pass over M (from 0 to N-K) gives us the answer. This is once again O(N).
So, if we call V the resulting vector, with V[k] denoting the number of subsequences of length K in S with spread inferior to R, then we can do it in a single iteration over M:
for i in [0, len(M)]:
for k in [0, M[i] - i]:
++V[k]
The algorithm is simple, however the number of updates can be rather daunting. In the worst case, supposing than M[i] - i equals N - i, it is O(N*N) complexity. You would need a better data structure (probably an adaptation of a Fenwick Tree) to use this algorithm an lower the cost of computing those numbers.

If you are looking for contiguous sequences, try doing it recursively : The K-length subsequences set having a range inferior than R are included in the (K-1)-length subsequences set.
At K=0, you have N solutions.
Each time you increase K, you append (resp. prepend) the next (resp.previous) element, check if it the range is inferior to R, and either store it in a set (look for duplicates !) or discard it depending on the result.
If think the complexity of this algorithm is O(n*n) in the worst-case scenario, though it may be better on average.

I think Matthieu has the right answer when looking for all sequences with spread R.
As you are only looking for sequences of length K, you can do a little better.
Instead of looking at the maximum sequence starting at i, just look at the sequence of length K starting at i, and see if it has range R or not. Do this for every i, and you have all sequences of length K with spread R.
You don't need to go through the whole list, as the latest start point for a sequence of length K is n-K+1. So complexity is something like (n-K+1)*K = n*K - K*K + K. For K=1 this is n,
and for K=n it is n. For K=n/2 it is n*n/2 - n*n/4 + n/2 = n*n/2 + n/2, which I think is the maximum. So while this is still O(n*n), for most values of K you get a little better.

Start with a simpler problem: count the maximal length of sequences, starting at each index and having the range, equal to R.
To do this, let first pointer point to the first element of the array. Increase second pointer (also starting from the first element of the array) while sequence between pointers has the range, less or equal to R. Push every array element, passed by second pointer, to min-max-queue, made of a pair of mix-max-stacks, described in this answer. When difference between max and min values, reported by min-max-queue exceeds R, stop increasing second pointer, increment V[ptr2-ptr1], increment first pointer (removing element, pointed by it, from min-max-queue), and continue increasing second pointer (keeping range under control).
When second pointer leaves bounds of the array, increment V[N-ptr1] for all remaining ptr1 (corresponding ranges may be less or equal to R). To add all other ranges, that are less than R, compute cumulative sum of array V[], starting from its end.
Both time and space complexities are O(N).
Pseudo-code:
p1 = p2 = 0;
do {
do {
min_max_queue.push(a[p2]);
++p2;
} while (p2 < N && min_max_queue.range() <= R);
if (p2 < N) {
++v[p2 - p1 - 1];
min_max_queue.pop();
++p1;
}
} while (p2 < N);
for (i = 1; i <= N-p1; ++i) {
++v[i];
}
sum = 0;
for (j = N; j > 0; --j) {
value = v[j];
v[j] += sum;
sum += value;
}

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js