Finding the smallest window - c++

Given two arrays A[n] and B[m], how can I find the smallest window in A that contains all the elements of B?
I am trying to solve this problem in O(n) time, but I am having trouble doing it. Is there any well-known algorithm or procedure for solving it?

If m > n, A cannot contain all the elements of B (and hence we have an O(1) solution).
Otherwise:
Create a hash table mapping elements of B to the sequence {0..m-1} (this is O(n) since m <= n).
Create an array C[m] to count occurrences of the members of B in the current window (initialise to 0).
Create a variable z to count the number of 0 elements of C (initialise to m).
Create variables s and e to denote the start and end of the current window.
while e < n:
If z is nonzero, increment e and update C and z. O(1)
else consider this window as a possible solution (i.e. if it's the min so far, store it), then increment s and update C and z. O(1)
The while loop can be shown to have no more than 2n iterations. So the whole thing is O(n), I think.
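Here is a minimal sketch of that procedure in C++ (assuming the elements of B are distinct, as the mapping to {0..m-1} suggests; the function name and return convention are illustrative, not from the post):
#include <unordered_map>
#include <utility>
#include <vector>

// Returns {s, e} for the smallest window A[s..e] containing all of B,
// or {-1, -1} if no such window exists.
std::pair<int, int> smallestWindow(const std::vector<int>& A, const std::vector<int>& B) {
    int n = A.size(), m = B.size();
    if (m == 0 || m > n) return {-1, -1};          // the O(1) case
    std::unordered_map<int, int> index;            // elements of B -> 0..m-1
    for (int k = 0; k < m; ++k) index[B[k]] = k;
    std::vector<int> C(m, 0);                      // occurrence counts in the window
    int z = m;                                     // number of zero entries of C
    int s = 0, e = 0;                              // current window is A[s..e-1]
    std::pair<int, int> best{-1, -1};
    while (e < n || z == 0) {
        if (z != 0) {                              // window incomplete: grow it
            auto it = index.find(A[e]);
            if (it != index.end() && C[it->second]++ == 0) --z;
            ++e;
        } else {                                   // window complete: record, then shrink
            if (best.first < 0 || e - s < best.second - best.first + 1)
                best = {s, e - 1};
            auto it = index.find(A[s]);
            if (it != index.end() && --C[it->second] == 0) ++z;
            ++s;
        }
    }
    return best;
}
Every iteration advances either s or e, which is where the 2n iteration bound comes from.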

Let's call a window 'minimal' if it can't be reduced, i.e., after increasing its left border or decreasing its right border it's no longer a valid window (doesn't contain all elements from B). There are three in your example: [0, 2], [2, 6], [6, 7].
Let's say that you already found the leftmost minimal window [left, right] ([0, 2] in your example). Now we'll just slide it to the right.
// count[x] tells how many times number 'x'
// occurs between 'left' and 'right' in 'A'
while (right < N - 1) {
    // move right border by 1
    ++right;
    if (A[right] is in B) {
        ++count[A[right]];
    }
    // check if we can move the left border now: the window stays
    // valid while A[left] is not in B, or occurs more than once
    while (A[left] is not in B || count[A[left]] > 1) {
        if (A[left] is in B) {
            --count[A[left]];
        }
        ++left;
    }
    // check if the current window is optimal
    if (right - left + 1 < currentMin) {
        currentMin = right - left + 1;
    }
}
This sliding works because different 'minimal' windows can't contain one another.


Intuition behind initializing both the pointers at the beginning versus one at the beginning and other at the ending

I solved a problem a few days ago:
Given an unsorted array A containing N integers and an integer B, find if there exists a pair of elements in the array whose difference is B. Return true if any such pair exists, else return false. For A = [2, 3, 5, 10, 50, 80] and B = 40, it should return true.
as:
int Solution::solve(vector<int> &A, int B) {
    if (A.size() == 1) return false;
    int i = 0, j = 0; // note: both initialized at the beginning
    sort(begin(A), end(A));
    while (i < A.size() && j < A.size()) {
        if (A[j] - A[i] == B && i != j) return true;
        if (A[j] - A[i] < B) j++;
        else i++;
    }
    return false;
}
While solving this problem, the mistake I had committed earlier was initializing i=0 and j=A.size()-1. Due to this, decrementing j and incrementing i both decreased the difference, and so valid differences were missed. On initializing both at the beginning as above, I was able to solve the problem.
Now I am solving a follow-up 3sum problem:
Given an integer array nums, return all the triplets [nums[i], nums[j], nums[k]] such that i != j, i != k, and j != k, and nums[i] + nums[j] + nums[k] == 0. Notice that the solution set must not contain duplicate triplets. If nums = [-1,0,1,2,-1,-4], output should be: [[-1,-1,2],[-1,0,1]] (any order works).
A solution to this problem is given as:
vector<vector<int>> threeSum(vector<int>& nums) {
    sort(nums.begin(), nums.end());
    vector<vector<int>> res;
    for (unsigned int i = 0; i < nums.size(); i++) {
        if ((i > 0) && (nums[i] == nums[i-1]))
            continue;
        int l = i + 1, r = nums.size() - 1; // note: unlike `l`, `r` points to the end
        while (l < r) {
            int s = nums[i] + nums[l] + nums[r];
            if (s > 0) r--;
            else if (s < 0) l++;
            else {
                res.push_back(vector<int> {nums[i], nums[l], nums[r]});
                // bounds checks added so the duplicate-skipping
                // cannot read past the ends of the array
                while (l < r && nums[l] == nums[l+1]) l++;
                while (l < r && nums[r] == nums[r-1]) r--;
                l++; r--;
            }
        }
    }
    return res;
}
The logic is pretty straightforward: each nums[i] (from the outer loop) is the 'target' that we search for in the inner while loop, using a two-pointer approach like in the first code at the top.
What I don't follow is the logic behind initializing r=nums.size()-1 and working backwards - how are valid differences (in this case, the 'sum's actually) not being missed?
Edit1: Both problems contain negative and positive numbers, as well as zeroes.
Edit2: I understand how both snippets work. My question specifically is the reasoning behind r=nums.size()-1 in code #2: as we see in code #1 above it, starting r from the end misses some valid pairs (http://cpp.sh/36y27 - the valid pair (10,50) is missed); so why do we not miss valid pair(s) in the second code?
Reformulating the problem
The difference between the two algorithms boils down to addition and subtraction, not 3 vs 2 sums.
Your 3-sum variant asks for the sum of 3 numbers matching a target. When you fix one number in the outer loop, the inner loop reduces to a genuine 2-sum (i.e. addition). The "2-sum" variant in your top code is really a 2-difference (i.e. subtraction).
You're comparing 2-sum (A[i] + A[j] == B s.t. i != j) to a 2-difference (A[i] - A[j] == B s.t. i != j). I'll use those terms going forward, and forget about the outer loop in 3-sum as a red herring.
2-sum
Why L = 0, R = length - 1 works for 2-sum
For 2-sum, you probably already see the intuition of starting at the ends and working towards the middle, but it's worth making the logic explicit.
At any iteration of the loop, if A[L] + A[R] > B, then we have no choice but to decrement the right pointer to a lower index. Incrementing the left pointer is guaranteed to increase our sum or leave it the same, taking us further from the target and potentially closing off the chance to find the solution pair, which may well still include A[L].
On the other hand, if A[L] + A[R] < B, then you must increase your sum by moving the left pointer forward to a larger number. There's a chance A[R] is still part of that sum -- we can't guarantee it's not a part of the sum until A[L] + A[R] > B.
The key takeaway is that there is no decision to be made at each step: either the answer was found or one of the two numbers at either index can be definitively eliminated from further consideration.
Why L = 0, R = 0 doesn't work for 2-sum
This explains why starting both numbers at 0 won't help for 2-sum. What rule would you use to increment the pointers to find a solution? There's no way to know which pointer needs to move forward and which should wait. Both moves increase the sum at best and neither move decreases the sum (the start is the minimum sum, A[0] + A[0]). Moving the wrong one could prohibit finding the solution later on, and there's no way to definitively eliminate either number.
You're back to keeping left at 0 and moving the right pointer forward to the first element that causes A[R] + A[L] > B, then running the tried-and-true original two-pointer logic. You might as well just start R at length - 1.
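For concreteness, here is the standard ends-inward scan that this logic justifies (a quick sketch; the function name is mine, not from the question):
#include <algorithm>
#include <vector>

// 2-sum on a sorted array: is there i != j with A[i] + A[j] == B?
bool twoSum(std::vector<int> A, int B) {
    std::sort(A.begin(), A.end());
    int L = 0, R = (int)A.size() - 1;
    while (L < R) {
        long long s = (long long)A[L] + A[R];
        if (s == B) return true;
        if (s > B) --R;   // A[R] is too big to pair with anything >= A[L]
        else ++L;         // A[L] is too small to pair with anything <= A[R]
    }
    return false;
}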
2-difference
Why L = 0, R = length - 1 doesn't work for 2-difference
Now that we understand how 2-sum works, let's look at 2-difference. Why is it that the same approach starting from both ends and working towards the middle won't work?
The reason is that when you're subtracting two numbers, you lose the all-important guarantee from 2-sum that moving the left pointer forward will always increase the result and that moving the right pointer backwards will always decrease it.
In subtraction between two numbers in a sorted array, A[R] - A[L] s.t. R > L, regardless of whether you move L forward or R backwards, the difference will decrease, even in an array of only positive numbers. This means that at a given index, there's no way to know which pointer needs to move to find the correct pair later on, breaking the algorithm for the same reason as 2-sum with both pointers starting at 0.
Why L = 0, R = 0 works for 2-difference
Finally, why does starting both pointers at 0 work on 2-difference? The reason is that you're back to the 2-sum guarantee that moving one pointer increases the difference while the other decreases it. Specifically, R++ is guaranteed to increase (or keep) the difference A[R] - A[L], while L++ is guaranteed to decrease it (or keep it); so if A[R] - A[L] < B, advance R, and if it overshoots B, advance L.
We're back in business: there is no choice or magical oracle necessary to decide which index to move. We can systematically eliminate values that are either too large or too small and hone in on the target. The logic works for the same reasons L = 0, R = length - 1 works on 2-sum.
As an aside, the first solution is suboptimal: O(n log(n)) instead of O(n) time, using a single pass and O(n) extra space. You can use an unordered set to keep track of the items seen so far, then perform a lookup for every item in the array: if A[i] - B or A[i] + B for some i is already in the set, you found your pair.
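A sketch of that single-pass idea (hasPairWithDifference is an illustrative name; assumes B >= 0 as in the examples):
#include <unordered_set>
#include <vector>

bool hasPairWithDifference(const std::vector<int>& A, int B) {
    std::unordered_set<int> seen;
    for (int x : A) {
        // an earlier element y pairs with x if x - y == B or y - x == B
        if (seen.count(x - B) || seen.count(x + B)) return true;
        seen.insert(x);
    }
    return false;
}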
Consider this:
A = {2, 3, 5, 10, 50, 80}
B = 40
i = 0, j = 5;
When you have something like
while (i < j) {
    if (A[j] - A[i] == B && i != j) return true;
    if (A[j] - A[i] > B) j--;
    else i++;
}
consider the case when if(A[j]-A[i]==B && i!=j) is not true. Your code makes an incorrect assumption that if the difference of the two endpoints is > B then one should decrement j. Given a sorted array, you don't know whether decrementing j and then taking the difference would give you the target difference, or whether incrementing i and then taking the difference would, since it can go both ways. In your example, when A[5] - A[0] != 40, you could have gone both ways: A[4] - A[0] (which is what you do) or A[5] - A[1]. Both would still give you a difference greater than the target. In short, the presumption in your algorithm is incorrect, and hence this isn't the right way to go about it.
In the second approach, that's not the case. When the triplet nums[i]+nums[l]+nums[r] doesn't hit the target, you know that, since the array is sorted, if the sum was more than 0 then nums[r] needs to be decremented, because incrementing l would only increase the sum further, as nums[l + 1] >= nums[l].
Your question boils down to the following:
For a sorted array in ascending order A, why is it that we perform a different two-pointer search for t for the problem A[i] + A[j] == t versus A[i] - A[j] == t, where j > i?
It's more intuitive why for the first problem, we can fix i and j to be at opposite ends and decrease the j or increase i, so I'll focus on the second problem.
With array problems it's sometimes easiest to draw out the solution space, then come up with the algorithm from there. First, let's draw out the solution space B, where B[i][j] = -(A[i] - A[j]) (defined only for j > i):
B, for A of length N
j ---------------------------->
i B[0][0] B[0][1] ... B[0][N - 1]
| B[1][0] B[1][1] ... B[1][N - 1]
| . . .
| . . .
| . . .
v B[N - 1][0] B[N - 1][1] ... B[N - 1][N - 1]
(the first index i goes down the rows; the second index j goes across the columns)
---
In terms of A:
X -(A[0] - A[1]) -(A[0] - A[2]) ... -(A[0] - A[N - 2]) -(A[0] - A[N - 1])
X X -(A[1] - A[2]) ... -(A[1] - A[N - 2]) -(A[1] - A[N - 1])
. . . . .
. . . . .
. . . . .
X X X ... X -(A[N - 2] - A[N - 1])
X X X ... X X
Notice that B[i][j] = A[j] - A[i], so the rows of B are in ascending order and the columns of B are in descending order. Let's compute B for A = [2, 3, 5, 10, 50, 80].
B = [
j------------------------>
i X 1 3 8 48 78
| X X 2 7 47 77
| X X X 5 45 75
| X X X X 40 70
| X X X X X 30
v X X X X X X
]
Now the equivalent problem is searching for t = 40 in B. Note that if we start with i = 0 and j = N - 1 = 5 there's no good/guaranteed way to reach 40. However, if we start in a position where we can always increment/decrement our current element in B in small steps, we can guarantee that we'll get as close to t as possible.
In this case, the small steps we take involve traversing either right/downwards in the matrix, starting from the top left (could equivalently traverse left/upwards from the bottom right), which corresponds to incrementing both i and j in the original question in A.
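In code, that traversal is just the earlier two-pointer loop viewed as a walk over the implicit matrix; a sketch, assuming A is sorted and t >= 0 (the function name is mine):
#include <vector>

// Walks B[i][j] = A[j] - A[i] from the top-left, moving right (j++) to grow
// the difference and down (i++) to shrink it.
bool twoDifference(const std::vector<int>& A, int t) {
    int i = 0, j = 1;
    while (j < (int)A.size()) {
        if (i == j) { ++j; continue; }      // stay strictly above the diagonal
        int d = A[j] - A[i];                // current cell of B
        if (d == t) return true;
        if (d < t) ++j;                     // move right: difference grows
        else ++i;                           // move down: difference shrinks
    }
    return false;
}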

Counting inversion after swapping two elements of array

You are given a permutation p1,p2,...,pn of numbers from 1 to n.
A permutation is a sequence of integers from 1 to n of length n containing each number exactly once.
You are given q queries where each query consists of two integers a and b. In response to each query you need to return the number of inversions of the permutation after swapping the elements at indices a and b. Every query is independent, i.e. after each query the permutation is restored to its initial state.
An inversion in a permutation p is a pair of indices (i, j) such that i > j and pi < pj. For example, a permutation [4, 1, 3, 2] contains 4 inversions: (2, 1), (3, 1), (4, 1), (4, 3).
Input: The first line contains n,q.
The second line contains the space-separated permutation p1,p2,...,pn.
Each line of the next q lines contains two integers a,b.
Output: For each query, print an integer denoting the number of inversions on a new line.
Sample input:
5 5
1 2 3 4 5
1 2
1 3
2 5
2 4
3 3
Output:
1
3
5
3
0
Constraints:
2<=n<=1000
1<=q<=200000
My approach: I am counting the number of inversions using a BIT (https://www.geeksforgeeks.org/count-inversions-array-set-3-using-bit/) for each query after swapping the elements at positions a and b, and then swapping them back so that my array remains unchanged. But this solution gives TLE for large test cases. Is there any better approach for this problem?
You are getting TLE probably because the number of computations in this approach is q * n * log(n) = 2 * 10^5 * 10^3 * log(10^3) ≈ 2 * 10^9, which is more than the generally accepted limit of ~10^8.
I can think of the following solution. Please note that I have not coded / verified it:
Denote ri = the number of indices j such that i > j && pi < pj. E.g., for [2, 3, 1, 4], r3 = 2. Basically, it is the number of inversions whose farther index is i. (Please note that I am using 1-based indexing as per the question. Also, a < b as per the question.)
Thus we have: Sum of ri == #invs (number of inversions)
We can calculate the initial total #invs in O(n^2).
When a and b are swapped, we can observe that:
a) ri remains constant for i < a.
b) ri remains constant for i > b.
Only ri changes for a <= i <= b, and only under the following conditions. I am considering the case pa < pb; the exact opposite case will need to be considered when pa > pb.
a) Since pa < pb, the swap itself causes #invs = #invs + 1.
b) If (pi < pa && pi < pb) || (pi > pa && pi > pb), the swap does not change ri. Eg: [2,....10,....5]; swapping 2 and 5 does not change the r value for 10.
c) If pa < pi < pb, the swap will increment ri by 1, and the new rb by 1. Eg: [2,....3,....4]; when 2 and 4 are swapped, we have [4,....3,....2]: the r value of 3 increases by 1 (because of 4), and the r value of 2 increases by 1 (because of 3). Please note that the increment because of 4 > 2 was already counted in step (a), and needs to be counted once only.
d) We need to find all indices i where pa < pi < pb, as we started with above. Let us call this count f(a, b). Then the total change is delta = (2 * f(a, b)) + 1, and the answer will be #original_invs + delta.
As I mentioned, all the exact opposite steps need to be done for the case pa > pb. The delta will be negative in that case.
Now, the only thing that remains is: given a, b, find f(a, b) efficiently. For this, we can pre-process and store it for all pairs of indices. This will take O(N^2) space and O(N^2 * log(N)) time, using a balanced binary search tree (BST). Again, I am showing the pre-processing steps for the case pa < pb only; another set of pre-processing steps needs to be done for the other case:
We will use a self-balancing BST, in which each node also contains the following fields:
a) field_1: This denotes the size of the left subtree. This value will be updated on every insert operation, if the size of the left subtree changes.
b) field_2: This denotes the number of elements < node.value that the tree contained when this node was inserted. This value is initialized once when the node is inserted and does not change thereafter. I have added a small explanation of how this is achieved in Addendum-A. This field is basically our pre-processing, which will determine f(a, b).
With all of this now, for each index i, where 0 <= i < n, do the following: create a new tree, and insert the values pj into it one by one, where (i < j < n) && (pi < pj). (Please note we are not inserting values where pi > pj.) The method given in Addendum-A will make sure we find f(i, j) while inserting.
There will be n such pre-processed trees, one for every index. For finding f(a, b): we need to look into the a-th tree, and search for node.value = pb. This node's field_2 = f(a, b).
The complexity of insertion is O(logN). So the total pre-processing computation is O(N^2 * logN). Search is O(logN), so the query complexity is O(q * logN). Total complexity = O(N^2) + O(N^2 * logN) + O(q * logN), which turns out to be ~10^7.
==============================================================================
Addendum A: How to populate field_2 while inserting a node with key k:
i) Insert the node, and balance the tree. Update field_1 as required.
ii) Initialize ans = 0. Traverse the BST from the root, searching for your node.
iii) At each node on the path: if node.value < k, then ans += node.left_subtree_size + 1 and descend right; otherwise descend left. Stop when the inserted node is reached.
iv) field_2 = ans. (Only nodes strictly smaller than k contribute, so no correction is needed at the end.)
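As an aside, if you compile with g++, the libstdc++ policy-tree extension (__gnu_pbds) already maintains exactly this kind of order statistic, so the balanced BST does not have to be hand-rolled:
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace __gnu_pbds;

// An ordered set supporting order_of_key(x): the number of stored
// elements strictly less than x, in O(log n).
typedef tree<int, null_type, std::less<int>,
             rb_tree_tag, tree_order_statistics_node_update> ordered_set;

// ordered_set t;
// t.insert(p[j]);
// int rank = t.order_of_key(p[b]);  // plays the role of field_2 above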
We can solve this in O(n log n) space and O(n log n + Q * log^2(n)) time with a merge-sort tree. The merge-sort tree allows us to find the number of elements inside a subarray that are greater than or lower than an input number in O(log^2(n)) time and O(n log n) space.
First we record the total number of inversions in O(n log n) time, for which there are known methods. To query the effect of a swap, call the value at the lower index left and the value at the higher index right, and consider the subarray strictly between the two positions:
subtract the number of elements greater than right in the subarray (those will no longer be inversions)
subtract the number of elements smaller than left in the subarray (those will no longer be inversions)
add the number of elements greater than left in the subarray (those will be new inversions)
add the number of elements smaller than right in the subarray (those will be new inversions)
if right > left, add 1
if left > right, subtract 1
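A sketch of that query in C++ (class and function names are mine; positions are 0-indexed; left/right are the values at the two swapped positions; a query's answer is the precomputed total plus the returned delta):
#include <algorithm>
#include <vector>
using namespace std;

struct MergeSortTree {
    int n;
    vector<vector<int>> t;  // t[node] = sorted copy of that node's segment
    MergeSortTree(const vector<int>& p) : n(p.size()), t(4 * p.size()) {
        build(1, 0, n - 1, p);
    }
    void build(int node, int lo, int hi, const vector<int>& p) {
        if (lo == hi) { t[node] = {p[lo]}; return; }
        int mid = (lo + hi) / 2;
        build(2 * node, lo, mid, p);
        build(2 * node + 1, mid + 1, hi, p);
        merge(t[2 * node].begin(), t[2 * node].end(),
              t[2 * node + 1].begin(), t[2 * node + 1].end(),
              back_inserter(t[node]));
    }
    // number of elements < x in p[l..r], O(log^2 n)
    int countLess(int node, int lo, int hi, int l, int r, int x) const {
        if (r < lo || hi < l) return 0;
        if (l <= lo && hi <= r)
            return lower_bound(t[node].begin(), t[node].end(), x) - t[node].begin();
        int mid = (lo + hi) / 2;
        return countLess(2 * node, lo, mid, l, r, x)
             + countLess(2 * node + 1, mid + 1, hi, l, r, x);
    }
    int countLess(int l, int r, int x) const {
        return l > r ? 0 : countLess(1, 0, n - 1, l, r, x);
    }
    int countGreater(int l, int r, int x) const {
        // the values are distinct (a permutation), so #(> x) = length - #(< x + 1)
        return l > r ? 0 : (r - l + 1) - countLess(l, r, x + 1);
    }
};

// Change in the inversion count if p[a] and p[b] are swapped (a, b 0-indexed).
long long swapDelta(const MergeSortTree& mst, const vector<int>& p, int a, int b) {
    if (a > b) swap(a, b);
    if (a == b) return 0;
    int left = p[a], right = p[b];
    long long d = 0;
    d -= mst.countGreater(a + 1, b - 1, right);  // no longer inversions
    d -= mst.countLess(a + 1, b - 1, left);      // no longer inversions
    d += mst.countGreater(a + 1, b - 1, left);   // new inversions
    d += mst.countLess(a + 1, b - 1, right);     // new inversions
    d += (right > left) ? 1 : -1;                // the swapped pair itself
    return d;
}
Each query is then total + swapDelta(mst, p, a - 1, b - 1) for the 1-indexed a, b of the input.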

Intuition behind incrementing the iteration variable?

I am solving a question on LeetCode.com:
Given an array with n objects colored red, white or blue, sort them in-place so that objects of the same color are adjacent, with the colors in the order red, white and blue. Here, they use the integers 0, 1, and 2 to represent the color red, white, and blue respectively. [The trivial counting sort cannot be used].
For the input: [2,0,2,1,1,0]; the output expected is: [0,0,1,1,2,2].
One of the highly upvoted solutions goes like this:
void sortColors(vector<int>& A) {
    if (A.empty() || A.size() < 2) return;
    int low = 0;
    int high = A.size() - 1;
    for (int i = low; i <= high;) {
        if (A[i] == 0) {
            // swap A[i] and A[low]; i, low both ++
            int temp = A[i];
            A[i] = A[low];
            A[low] = temp;
            i++; low++;
        } else if (A[i] == 2) {
            // swap A[i] and A[high]; high--
            int temp = A[i];
            A[i] = A[high];
            A[high] = temp;
            high--;
        } else {
            i++;
        }
    }
}
My question is, why is i incremented when A[i]==0 or A[i]==1, but not when A[i]==2? Using pen and paper, the algorithm just works to give me the answer; but could you please provide some intuition?
Thanks!
This steps through the array and maintains the constraint that the elements 0..i are sorted, and all either 0 or 1. (The 2's that were there get swapped to the end of the array.)
When A[i]==0, you're swapping the element at i (which we just said was 0) with the element at low, which is the first 1-element (if any) in the range 0..i. Hence, after the swap, A[i]==1 which is OK (the constraint is still valid). We can safely move forward in the array now. The same is true if A[i]==1 originally, in which case no swap is performed.
When A[i]==2, you're essentially moving element i (which we just said was 2) to the end of the array. But you're also moving something from the end of the array into element i's place, and we don't know what that element is (because we haven't processed it before, unlike the A[i]==0 case). Hence, we cannot safely move i forward, because the new element at A[i] might not be in the right place yet. We need another iteration to process the new A[i].
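For instance, tracing the sample input [2,0,2,1,1,0]:
i=0, low=0, high=5: A[i]==2, swap with A[5] -> [0,0,2,1,1,2], high=4 (i stays put)
i=0: A[i]==0, swap with A[low] (itself) -> i=1, low=1
i=1: A[i]==0, swap with A[low] (itself) -> i=2, low=2
i=2: A[i]==2, swap with A[4] -> [0,0,1,1,2,2], high=3 (i stays put)
i=2: A[i]==1 -> i=3
i=3: A[i]==1 -> i=4 > high, done
Both times a 2 is swapped out, the element that arrives at position i (first a 0, then a 1) still has to be examined, which is exactly why i does not advance there.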
That is because for 0s and 1s, only items to the left of the current item are handled, and those have already been reviewed/sorted. Only for 2s are items from the right end of the array handled, which haven't been looked at yet.
To be more specific, only three different states are handled:
the current item being reviewed equals 0: in this case this sorting algorithm just puts this item at the end of all the zeros which have already been sorted (aka A[low]). Also, the item which was at A[low] before can only be a 0 or 1 (since that region has already been sorted), which means you can just swap it with the current item and not break the sequence. Now the interesting part: up until now, every item from A[0] through A[low] to A[i] has already been sorted, so the next item which has to be reviewed will be A[i + 1], hence the i++
the current item equals 1: in this case, no swapping has to be done, since all 0s and 1s have already been put in A[0] to A[i - 1] and all 2s have already been put at the end of the array. That means the next item to be reviewed is A[i + 1], hence the i++
the current item equals 2: in this case, the current item will be put at the end of the array, next to (i.e., to the left of) all the other already sorted 2s (A[high]). The item which is swapped from A[high] to A[i] has not been sorted yet and therefore has to be reviewed in the next step, hence i stays unchanged

Largest rectangles in histogram

I'm working on the below algorithm puzzle and here is the detailed problem statement.
Find the largest rectangle of the histogram; for example, given histogram = [2,1,5,6,2,3], the algorithm should return 10.
I am working on the below version of the code. My question is: I think i-nextTop-1 could be replaced by i-top, but in some test cases (e.g. [2,1,2]) they produce different results (i-nextTop-1 always produces the correct result). I think logically they should be the same, and I'm wondering in what situations i-nextTop-1 is not equal to i-top.
class Solution {
public:
    int largestRectangleArea(vector<int>& height) {
        height.push_back(0);
        int result = 0;
        stack<int> indexStack;
        for (int i = 0; i < height.size(); i++) {
            while (!indexStack.empty() && height[i] < height[indexStack.top()]) {
                int top = indexStack.top();
                indexStack.pop();
                int nextTop = indexStack.size() == 0 ? -1 : indexStack.top();
                result = max((i - nextTop - 1) * height[top], result);
            }
            indexStack.push(i);
        }
        return result;
    }
};
The situations where i-nextTop-1 != i-top occur are when the following is true:
nextTop != top-1
This can be seen by simply rearranging terms in the inequality i-nextTop-1 != i-top.
The key to understanding when this occurs lies in the following line within your code, in which you define the value of nextTop:
int nextTop = indexStack.size() == 0 ? -1 : indexStack.top();
Here, you are saying that if indexStack is empty (following the pop() on the previous line of code), then set nextTop to -1; otherwise set nextTop to the current indexStack.top().
So the only times when nextTop == top-1 are when
indexStack is empty and top == 0, or
indexStack.top() == top - 1.
In those cases, the two methods will always agree. In all other situations, they will not agree, and will produce different results.
You can see what is happening by printing the values of i, nextTop, (i - top), (i - nextTop - 1), and result for each iteration at the bottom of the while loop. The vector {5, 4, 3, 2, 1} works fine, but { 1, 2, 3, 4, 5} does not, when replacing i-nextTop-1 with i-top.
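Tracing the question's {2, 1, 2} (with the appended 0 sentinel) makes the disagreement concrete:
i=1 (height 1): pop top=0, nextTop=-1: i-nextTop-1 = 1 and i-top = 1 agree; area = 1*2 = 2
i=3 (sentinel): pop top=2, nextTop=1: i-nextTop-1 = 1 and i-top = 1 agree; area = 1*2 = 2
                pop top=1, nextTop=-1: i-nextTop-1 = 3 but i-top = 2 disagree
The last pop covers the whole histogram (width 3, height 1, area 3, the correct answer); i-top misreports the width as 2 because index 0 was popped earlier, so nextTop != top-1.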
Theory of the Algorithm
The outer for loop iterates through the histogram elements one at a time. Elements are pushed onto the stack from left to right, and upon entry to the while loop the top of stack contains the element just prior to (or just to the left of) the current element. (This is because the current element is pushed onto the stack at the bottom of the for loop, right before looping back to the top.)
An element is popped off the stack within the while loop, when the algorithm has determined that the best possible solution that includes that element has already been considered.
The inner while loop will keep iterating as long as height[i] < height[indexStack.top()], that is, as long as the height of the current element is less than the height of the element on the top of the stack.
At the start of each iteration of the while loop, the elements on the stack represent all of the contiguous elements to the immediate left of the current element that are larger than the current element.
This allows the algorithm to calculate the area of the largest rectangle to the left of and including the current element. This calculation is done in the following two lines of code:
int nextTop = indexStack.size() == 0 ? -1 : indexStack.top();
result = max((i - nextTop - 1) * height[top], result);
Variable i is the index of the current histogram element, and represents the rightmost edge of the rectangle being currently calculated.
Variable nextTop represents the index of the leftmost edge of the rectangle.
The expression (i - nextTop - 1) represents the horizontal width of the rectangle. height[top] is the vertical height of the rectangle, so the result is the product of these two terms.
Each new result is the larger of the new calculation and the previous value for result.

Algorithm for finding the maximum number of non-overlapping lines on the x axis

I'm not exactly sure how to ask this, but I'll try to be as specific as possible.
Imagine a tetris screen with only rectangles, of different shapes, falling to the bottom.
I want to compute the maximum number of rectangles that I can fit one next to the other without any overlapping ones. I've named them lines in the title because I'm actually only interested in the length of the rectangle when computing, or the line parallel to the x axis that it's falling towards.
So basically I have a custom type with a start and end, both integers between 0 and 100. Say we have a list of these rectangles ranging from 1 to n. rectangle_n.start (unless it's the rectangle closest to the origin) has to be > rectangle_(n-1).end so that they will never overlap.
I'm reading the rectangle coordinates (both are x axis coordinates) from a file with random numbers.
As an example:
consider this list of rectangle type objects
rectangle_list {start, end} = {{1,2}, {3,5}, {4,7}, {9,12}}
We can observe that the 3rd object has its start coordinate 4 < the previous rectangle's end coordinate which is 5. So in sorting this list, I would have to remove the 2nd or the 3rd object so that they don't overlap.
I'm not sure if there is a name for this kind of problem, so I didn't know how else to title it. I'm interested in an algorithm that can be applied to a list of such objects and would sort them out accordingly.
I've tagged this with c++ because the code I'm writing is c++ but any language would do for the algorithm.
You are essentially solving the following problem. Suppose we have n intervals {[x_1,y_1),[x_2,y_2),...,[x_n,y_n)} with x_1<=x_2<=...<=x_n. We want to find a maximal subset of these intervals such that there are no overlaps between any intervals in the subset.
The naive solution is dynamic programming. It is guaranteed to find the best solution. Let f(i), 0 <= i <= n, be the size of the maximal subset up to interval [x_i, y_i). We have the recurrence (in LaTeX):
f(i) = \max_{0 \le j < i} \left( f(j) + d(i,j) \right)
where d(i,j) = 1 if and only if [x_i, y_i) and [x_j, y_j) have no overlap; otherwise d(i,j) is zero. You can iteratively compute f(i), starting from f(0) = 0. f(n) gives the size of the maximal subset. To get the actual subset, you need to keep a separate array s(i) = \arg\max_{0 \le j < i} (f(j) + d(i,j)), and backtrack along it to recover the 'path'.
This is an O(n^2) algorithm: you need to compute each f(i), and for each f(i) you need i tests. I think there should be an O(n log n) algorithm, but I am not so sure.
EDIT: an implementation in Lua:
function find_max(list)
    local f, b = {}, {}
    f[0], b[0] = 0, 0
    table.sort(list, function(a, b) return a[1] < b[1] end)
    -- dynamic programming
    for i, x in ipairs(list) do
        local max, max_j = 0, -1
        for j = 0, i - 1 do
            local e = j > 0 and list[j][2] or 0
            local score = e <= x[1] and 1 or 0
            if f[j] + score > max then
                max, max_j = f[j] + score, j
            end
        end
        f[i], b[i] = max, max_j
    end
    -- backtrack
    local max, max_i = 0, -1
    for i = 1, #list do
        if f[i] > max then -- don't use >= here
            max, max_i = f[i], i
        end
    end
    local i, ret = max_i, {}
    while true do
        table.insert(ret, list[i])
        i = b[i]
        if i == 0 then break end
    end
    return ret
end

local l = find_max({{1,2}, {4,7}, {3,5}, {8,11}, {9,12}})
for _, x in ipairs(l) do
    print(x[1], x[2])
end
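Incidentally, the O(n log n) algorithm conjectured above does exist: this is the classic activity-selection (interval-scheduling) problem, solved greedily by sorting on the end coordinate. A sketch in C++ (type and function names are mine):
#include <algorithm>
#include <climits>
#include <vector>

struct Rect { int start, end; };

// Keep every rectangle that starts strictly after the last kept end,
// scanning in order of increasing end coordinate.
std::vector<Rect> maxNonOverlapping(std::vector<Rect> v) {
    std::sort(v.begin(), v.end(),
              [](const Rect& a, const Rect& b) { return a.end < b.end; });
    std::vector<Rect> kept;
    int lastEnd = INT_MIN;
    for (const Rect& r : v) {
        if (r.start > lastEnd) {  // '>' because touching endpoints count as overlap here
            kept.push_back(r);
            lastEnd = r.end;
        }
    }
    return kept;
}
On the question's example {{1,2}, {3,5}, {4,7}, {9,12}} this keeps {1,2}, {3,5}, and {9,12}.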
The name of this problem is bin packing; it is usually considered a hard problem but can be computed reasonably well for a small number of bins.
Here is a video explaining common approaches to this problem
EDIT: By hard problem, I mean that some kind of brute force has to be employed. You will have to evaluate a lot of solutions and reject most of them, so usually you need some kind of evaluation mechanism. You need to be able to compare solutions, such as "This solution packs 4 rectangles with an area of 15" is better than "This solution packs 3 rectangles with an area of 16".
I can't think of a shortcut, so you may have to enumerate the power set in descending order of size and stop on the first match.
The straightforward way to do this is to enumerate combinations of decreasing size. You could do something like this in C++11:
template <typename I>
std::set<Span> find_largest_non_overlapping_subset(I start, I finish) {
    std::set<Span> result;
    for (size_t n = std::distance(start, finish); n-- && result.empty();) {
        enumerate_combinations(start, finish, n, [&](I begin, I end) {
            if (!has_overlaps(begin, end)) {
                result.insert(begin, end);
                return false;
            }
            return true;
        });
    }
    return result;
}
The implementation of enumerate_combinations is left as an exercise. I assume you already have has_overlaps.