For the Longest Increasing Subsequence problem, I envisioned keeping a DP array that is always in order, with the maximum value at the far end. Something that would look like this:
{1, 1, 2, 3, 3, 4, 5, 6, 6, 6}
The thought process I followed to produce my first incorrect solution was: we want to look at the array starting with only the first element, calculate the LIS, then incrementally append a value to the end of our array. While doing this, we incrementally update the LIS in our DP array from the LIS of the old subarray plus the new element we added on. This means that at index i of the dp array lives the LIS of the prefix ending at index i.
More clearly put
array => {5, 6, 7, 1, 2, 3, 4}
dp => {1, 2, 3, 3, 3, 3, 4}
This way the very last entry of the DP array will be the LIS of the current array. This would act as our invariant, so when we get to the end, we can be assured that the last value is the only one we need. It then dawned on me that while we're traversing the array with a DP kind of feel, the next value does not depend on any of the previously tabulated values in the array, so this method is the same as maintaining a maxLIS variable - a pattern I've seen in many O(n) solutions. So my closest-to-correct solution is as follows:
1.) Save a copy of the input array/vector as old
2.) Sort the original input array
3.) Traverse the sorted array, incrementing a variable longest by one every time the next value (which should be larger than the current) appears after the current in the original array.
4.) Return longest
The code would be ~this:
int lengthOfLIS(vector<int>& seq) {
    if (seq.empty()) return 0;
    vector<int> old = seq;
    sort(seq.begin(), seq.end());
    int longest = 1;
    for (int i = 1; i < (int)seq.size(); ++i) {
        // does the next sorted value appear after the previous one in the original array?
        if (seq[i] > seq[i-1] &&
            find(old.begin(), old.end(), seq[i]) - old.begin() >
            find(old.begin(), old.end(), seq[i-1]) - old.begin())
            longest++;
    }
    return longest;
}
Where we have the find method (I'm assuming a linear operation), we could make it a constant operation by making a data structure that stores the original index of the value along with the value itself, so we don't have to do any traversing to find the index of an element in the original array (old). I believe this would be an O(nlog(n)) solution; however, it fails with this input array: [1,3,6,7,9,4,10,5,6].
Finally I did some research and I found that all solution guides I have read sneak in the fact that their solution keeps the values of their DP array not in order, but instead like this: A value in the DP array represents the length of an increasing subsequence with the last value of the subsequence being the value of originalArray[index].
More clearly put,
array => {5, 6, 7, 1, 2, 3}
dp => {1, 2, 3, 1, 2, 3}
Here, where 5 is the last value of an increasing subsequence, no values come before it, so it must be of length 1. If 6 is the last value of an increasing subsequence, we must look at all values before it to determine how long a subsequence ending with 6 can be. Only 5 can come before it, making the longest increasing subsequence ending there of length 2. This continues, and you return the maximum value in the DP array. Time complexity for this solution is O(n^2) - the standard naive solution.
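A minimal sketch of that standard O(n^2) DP (the function name is my own, not any guide's exact code):

```cpp
#include <algorithm>
#include <vector>

// dp[i] is the length of the longest increasing subsequence that ends
// with nums[i]; the answer is the maximum over all ending positions.
int lisNaive(const std::vector<int>& nums) {
    if (nums.empty()) return 0;
    std::vector<int> dp(nums.size(), 1);
    for (std::size_t i = 1; i < nums.size(); ++i)
        for (std::size_t j = 0; j < i; ++j)
            if (nums[j] < nums[i])
                dp[i] = std::max(dp[i], dp[j] + 1);
    return *std::max_element(dp.begin(), dp.end());
}
```

For the example above, {5, 6, 7, 1, 2, 3} produces dp = {1, 2, 3, 1, 2, 3} and returns 3.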
Questions:
I'm curious as to how I can think about this problem correctly. I want to fine-tune my thought process so that I can come up with an optimal solution from scratch (that's the goal at least) so I'd like to know
1.) What property of this problem should've triggered me to use a DP array differently than how I would've used it? In hindsight, my original way was simply equivalent to keeping a max variable, but even then I struggle to see a property of this problem that would trigger the thought "Hey, the value of an entry in my DP array at index i should be the length of the increasing subsequence ending with originalArray[i]." I'm struggling to see how I should've come up with that.
2.) Is it possible to get my proposed O(nlog(n)) solution to work? I know an O(nlog(n)) solution exists, but since I can't get mine working I think I need a nudge in the right direction.
I admit, it is an interesting question and I do not have an exact answer to it, but I guess I can give you a nudge in the right direction. So here it goes:
When facing such a dilemma, I would usually turn to the basics. In your case, go through the definition of Dynamic Programming. It has two properties:
Overlapping Subproblems
Optimal Substructure.
You can easily find these properties reflected in the standard solution but not in yours. You can read about them in Cormen (CLRS) or just google them in the context of DP.
In my opinion your solution is not DP; you just found some pattern and you are trying to solve based on this pattern. If you are not getting the solution, it means that either your pattern is wrong or your solution is overlooking something. In scenarios like this, try to prove, mathematically, that the pattern you are observing is correct, and prove that the solution should also work.
Give me some more time while I work through your solution; meanwhile, you can also try to develop a proof for your solution.
Related
Suppose I have an unsorted list such as the one below:
[1, 2, 3, 1, 1, 5, 2, 1]
and I want to return the number of minimum elements (in this case, min = 1), which is 4.
A quick solution is to just find the minimum using some built in min() function, and then iterate over the list again and compare values, then count them up. O(2n) time.
But I'm wondering if it's possible to do it in strictly O(n) time - only make one pass through the list. Is there a way to do so?
Remember that big-O notation talks about the way in which a runtime scales, not the absolute runtime. In that sense, an algorithm that makes two passes over an array that each take time O(n) also has runtime O(n) - the runtime will scale linearly as the input size increases. So your two-pass algorithm will work just fine.
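A sketch of that two-pass version (assuming a non-empty list, since the minimum of an empty list is undefined):

```cpp
#include <algorithm>
#include <vector>

// Two passes, each O(n): find the minimum, then count its occurrences.
// Two linear passes are still O(n) overall.
int countMin(const std::vector<int>& v) {
    int m = *std::min_element(v.begin(), v.end());
    return (int)std::count(v.begin(), v.end(), m);
}
```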
A stronger requirement is that you have a one-pass algorithm, in which you get to see all the elements once. In that case, you can do this by tracking the smallest number you've seen so far and all the positions where you've seen it. Whenever you see a value,
if that value is bigger than the smallest you've seen, ignore it;
if that value equals the smallest you've seen, add it to the list of positions; and
if that value is smaller than the smallest you've seen, discard your list of all the smallest elements (they weren't actually the smallest) and reset it to a list of just the current position.
This also takes time O(n), but does so in a single pass.
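A sketch of the single-pass version, returning the positions of the minimum (the count asked for is just the size of the returned list):

```cpp
#include <vector>

// One pass: track the smallest value seen so far and the positions
// where it occurs; reset the position list when a smaller value appears.
std::vector<int> minPositions(const std::vector<int>& v) {
    std::vector<int> pos;
    int best = 0;                              // valid once pos is non-empty
    for (int i = 0; i < (int)v.size(); ++i) {
        if (pos.empty() || v[i] < best) {      // new smallest: discard old positions
            best = v[i];
            pos.assign(1, i);
        } else if (v[i] == best) {             // ties the smallest: record position
            pos.push_back(i);
        }                                      // bigger than smallest: ignore
    }
    return pos;
}
```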
Question: Given a sorted array A, find all possible differences of elements from A, where each element is an integer in the range [1, ..., n]. Also, you can assume there are no duplicates, so the max size of the array will be <= n.
Note: Total possible differences will be in the range of [1, ..., n-1] because of the above constraints.
Example (for N=12):
Input: 1, 6, 10, 12
Output: 2, 4, 5, 6, 9, 11
The question is similar to this one, except n is the no. of elements in that question, not the upper bound of the element.
There's also an answer in the same question, this one: https://stackoverflow.com/a/8455336/2109808
This guy claims it can really be done in O(nlogn) using fft and self convolution, but I don't get it, and it also seems to be incorrect when I try it on online convolution calculators (like this one).
So, does anyone know how this can be achieved in O(nlogn)?
Thank you in advance :)
The answer linked by the OP suggests the following steps:
Assume an array with non-repeating elements in the range [0, n-1].*
Create an array of length n, where elements whose index matches an element of the input are set to 1, and other elements are set to 0. This can be created in O(n). For example, given the input array [1,4,5], we create the array [0,1,0,0,1,1].
Compute the autocorrelation function. This can be computed by taking the FFT, squaring its magnitude, and then taking the IFFT. This is O(n log n).
The output is non-zero for indices corresponding to a difference present in the input. The element at index 0 is always non-zero, and should be ignored. Finding and printing these elements is O(n).
Note that this process is not correct, because the auto-correlation function as computed through the FFT is circular. That is, given an input array with two values, 0 and n-1, the output will have a non-zero element at index 1 as well as at index n-1. To avoid this, it would be necessary to make the array in step #2 of length 2n, leaving half of it set to 0. The second half of the output array should then be ignored. Doubling the array size doesn't change the computational complexity of the algorithm, which is still O(n log n).
* I changed the range from that given by OP for simplicity, it is trivial to change this range by adding an offset to all indices.
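To make the indicator-array idea concrete, here is a sketch that substitutes a naive O(n^2) linear autocorrelation for the FFT step (a plain linear correlation also sidesteps the circular-wraparound issue noted above). Swapping in an FFT-based autocorrelation is what gets this to O(n log n). The function name and the 0-based value range (per the footnote) are my own choices for illustration:

```cpp
#include <vector>

// Returns all pairwise differences of the input values, which are
// assumed to be distinct integers in [0, n-1].
std::vector<int> allDifferences(const std::vector<int>& a, int n) {
    std::vector<int> ind(n, 0);               // indicator array
    for (int x : a) ind[x] = 1;
    std::vector<int> diffs;
    for (int d = 1; d < n; ++d) {             // autocorrelation at lag d
        long long c = 0;
        for (int i = 0; i + d < n; ++i)
            c += ind[i] * ind[i + d];
        if (c > 0) diffs.push_back(d);        // some pair differs by exactly d
    }
    return diffs;
}
```

With the question's example, allDifferences({1, 6, 10, 12}, 13) yields {2, 4, 5, 6, 9, 11}.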
I am currently working on a greedy algorithm, which is similar to the Activity Selection Problem. I've got a vector of pairs (natural numbers), sorted by second value. For each pair I take the closest possible pair (by closest I mean (p2.first - p1.second) is minimal and p1.second < p2.first). Now I do some calculations with those values (doesn't matter), "increase" the range of the first pair from (p1.first, p1.second) to (p1.first, p2.second), and erase the second pair. The algorithm will then look for the next closest pair to the new pair it just created. My question is, what is the best (fastest) way to find such pairs without iterating over the list for each element? Also, how should I erase these pairs after the calculations? I am using an iterator to iterate over the list, and when I remove these pairs it goes crazy, so my workaround is to fill them with (-1,-1) values, but that is unacceptable because this algorithm is meant to go on an Online Judge and it is way too slow.
Below is the example. Each column is the index of a pair, and each row is the range of a pair. For example, pairs[0] = [0,3]. After the first iteration, pairs[0] should be transformed into [0,9] and the second column should be deleted.
It is really hard to say what the "fastest way" to do anything is. Even "fast enough" is problematic without knowing your constraints exactly. Therefore, I'm going to give you a few tips to get a (probably) "faster" program; whether it will be fast enough is up to you to decide.
First of all, you probably want to change your sort criterion. Instead of sorting on the second component (the end of the interval, I assume,) you need to sort on the first component and then on the second. This way, an interval that starts sooner will be earlier in the array, and among intervals with the same start, the one that is shortest will be first.
Secondly, you might want to have a helper data structure: a naturally-sorted array of pairs, where the first component of each pair is any number X and the second is the index of the first pair in the (sorted) original array that starts at X. For example, this array for the image in your question will be {{0, 0}, {4, 1}, {9, 2}}. It shouldn't be hard to see how to construct this array in O(n) and how to use it to accelerate your search over the original array to an amortized O(1).
Thirdly, to iterate over an std::vector and remove its elements without problems, you can use indexes instead of iterators. However, this is not particularly efficient, because each erase must shift quite a few elements backwards and might even reallocate the vector and copy/move all of its elements. Instead, do what you are doing now and mark those elements that you want removed with distinctive values, and after your algorithm is done, just go over the array one more time and remove them all. The following is pseudo code:
displacement = 0
for all elements in the array, do:
    if current element should be removed, then:
        increment "displacement"
    else:
        move current element back "displacement" places
delete last "displacement" elements
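That compaction pass is exactly what the erase-remove idiom gives you in C++; a sketch, assuming (-1, -1) is the "removed" marker from the question:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Erase-remove idiom: remove_if shifts the kept pairs to the front in
// one pass, then a single erase drops the tail. Overall O(n).
void compact(std::vector<std::pair<int, int>>& v) {
    v.erase(std::remove_if(v.begin(), v.end(),
                           [](const std::pair<int, int>& p) {
                               return p.first == -1 && p.second == -1;
                           }),
            v.end());
}
```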
EDIT: After reading your comment, you don't need any of this stuff. Just sort the array of pairs the way I wrote above (i.e. lexicographically), and then construct another array of pairs from it like this:
let original vector be A, and the new vector of pairs be B
t0 = -1
t1 = -1
for all elements in A, do:
    if start time of current element is greater than "t1", then:
        append the pair (t0, t1) to B
        t0 = start time of current element
        t1 = end time of current element
    else:
        t1 = max(t1, end time of current element)
append the pair (t0, t1) to B
remove first element of B (because it is (-1, -1))
(My way is not exactly elegant, but it gets the job done.)
Then run your cost calculation logic on this new array, B. This new array will be shorter, and there will be no overlap between its elements.
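A sketch of that merge in C++, assuming intervals are stored as pair<int,int> (start, end); the sort is done inside for self-containment:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Merge overlapping intervals after a lexicographic sort, as in the
// pseudo code above. Returns the new, non-overlapping vector B.
std::vector<std::pair<int, int>>
mergeIntervals(std::vector<std::pair<int, int>> a) {
    std::sort(a.begin(), a.end());
    std::vector<std::pair<int, int>> b;
    for (const auto& p : a) {
        if (b.empty() || p.first > b.back().second)
            b.push_back(p);                    // starts after the current run ends
        else
            b.back().second = std::max(b.back().second, p.second);
    }
    return b;
}
```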
Given an integer array like
int numbers[8]={1, 3, 5, 7, 8, 6, 4, 2};
The front half of the array contains odd numbers, and the rest (an equal amount)
are even. The odd numbers are in ascending order and the even part is in descending order. After the sorting, the relative order of the numbers can't be changed.
How can I sort them alternatively with time complexity less than O(n^2) and space complexity O(1)?
For this example, the result would be: {1,8,3,6,5,4,7,2};
I can't use external array storage but temporary variables are acceptable.
I have tried to use two pointers (oddPtr, evenPtr) to point at the odd and even numbers separately, and move evenPtr to insert the even values into the middle of the odd numbers (like insertion sort).
But it takes O(n^2).
UPDATED
As per Dukeling's comment I realized that the solution I proposed is in fact not linear, but linearithmic, and even worse - you can't control whether it takes extra memory or not. On second thought I realized you know a lot about the array, enough to implement a more specific, but probably easier, solution.
I will make an assumption that all values in the array are positive. I need this so that I can use negative values as a kind of "already processed" flag. My idea is the following - iterate over the array from left to right. For each element, if it is already processed (i.e. its value is negative), simply continue with the next one. Otherwise, you have a constant-time formula for the position where this element should go:
If the value is odd and its index is i it should move to i*2
If the value is even and its index is i it should move to (i - n/2)*2 + 1
Store this value in a temporary and set the value at the current index of the array to 0. Now, as long as the position where the value we "have at hand" should go does not hold zero, swap it with the value sitting at that position (computed by the formula above). Also, when you place the value at hand, negate it to mark it as processed. Now we have a new value "at hand", and again we calculate where it should go according to the formula above. We continue moving values until the value we "have at hand" should go to the position holding 0. With a little thought you can prove that you will never have a negative ("processed") value at hand, and that eventually you will end up at the empty spot of the array.
After you process all the values, iterate once over the array to negate all values, and you will have the array you need. The complexity of the algorithm I describe is linear - each value is "at hand" no more than once, and you iterate over the array only a constant number of times.
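A sketch of the cycle-following idea in C++, assuming (as above) that all values are positive so negation can serve as the "processed" flag, and that the array has even length with the odd/even-half layout from the question. This variant negates as it places values rather than using a 0 sentinel, which amounts to the same bookkeeping:

```cpp
#include <vector>

// In-place interleave: odd half goes to even indices, even half to odd
// indices. Follows permutation cycles, negating placed values as the
// "already processed" flag. O(n) time, O(1) extra space.
void interleave(std::vector<int>& a) {
    int n = (int)a.size();
    auto target = [n](int i) {                 // destination index formulas above
        return i < n / 2 ? 2 * i : (i - n / 2) * 2 + 1;
    };
    for (int i = 0; i < n; ++i) {
        if (a[i] < 0) continue;                // already placed in an earlier cycle
        int cur = i, val = a[i];
        do {                                   // walk the cycle starting at i
            int t = target(cur);
            int next = a[t];
            a[t] = -val;                       // place and mark as processed
            val = next;
            cur = t;
        } while (cur != i);
    }
    for (int& x : a) x = -x;                   // clear the flags
}
```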
I've been assigned to do a problem that goes something like this:
My program should derive a list of integers A[1...N], where A[j] represents the jth integer in the list.
To derive it, my program will be inputted 5 lists, each of N integers (the same exact ones as in A[1...N], although scrambled). Each of these lists will be generated this way:
The list is put into order, just like A[1...N]. The list is then scrambled, which is done by removing zero or more integers from the list and placing them back into any position in said list. In each of the 5 lists, each number is moved at most one time (although a number could end up at a different index as a result of other numbers shifting around).
FOR EXAMPLE
Assume N is 5, and the correct sequence A is {1, 2, 3, 4, 5}
The program would be entered these 5 sequences:
1,2,3,4,5
2,1,3,4,5
3,1,2,4,5
4,1,2,3,5
5,1,2,3,4
How would it be able to determine that the target/original sequence was {1,2,3,4,5}?
Could anyone point me in the right direction? (This is a homework problem)
Tell me if you need me to clarify the problem more.
Thanks!
I would create an array of size N and use it as an index into the other arrays. For instance, if you created an integer array index[N], you could manipulate it and use its values as indices for the other arrays, i.e. array1[index[i]]. Depending on how you manipulated this index array, you could use it for either scrambling or sorting.
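A small sketch of that index-array idea (function and variable names are made up for illustration): the permutation lives in index, and the data array is never moved.

```cpp
#include <vector>

// View `data` through the permutation stored in `index`: element j of
// the result is data[index[j]]. Reordering `index` reorders the view
// without touching `data` itself.
std::vector<int> viewInOrder(const std::vector<int>& data,
                             const std::vector<int>& index) {
    std::vector<int> out;
    out.reserve(index.size());
    for (int idx : index) out.push_back(data[idx]);
    return out;
}
```

For example, viewing the second scrambled list {2, 1, 3, 4, 5} through the permutation {1, 0, 2, 3, 4} recovers {1, 2, 3, 4, 5}.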