Number of Increasing Subsequences of length k - c++

I am trying to understand the algorithm that gives me the number of increasing subsequences of length K in an array in time O(n*k*log(n)). I know how to solve this very same problem using the O(k*n^2) algorithm. I have looked it up and found out that this solution uses a BIT (Fenwick tree) and DP. I have also found some code, but I have not been able to understand it.
Here are some links I've visited that have been helpful.
Here in SO
Topcoder forum
Random webpage
I would really appreciate it if someone could help me understand this algorithm.

I am reproducing my algorithm from here, where its logic is explained:
dp[i, j] = same as before
num[i] = how many subsequences that end with i (element, not index this time) have a certain length

for i = 1 to n do dp[i, 1] = 1

for p = 2 to k do // for each length this time
    num = {0}
    for i = 2 to n do
        // note: dp[1, p > 1] = 0
        // how many that end with the previous element
        // have length p - 1
        num[ array[i - 1] ] += dp[i - 1, p - 1]   *1*
        // append the current element to all those smaller than it
        // that end an increasing subsequence of length p - 1,
        // creating an increasing subsequence of length p
        for j = 1 to array[i] - 1 do   *2*
            dp[i, p] += num[j]
You can optimize *1* and *2* by using segment trees or binary indexed trees. These will be used to efficiently process the following operations on the num array:
Given (x, v) add v to num[x] (relevant for *1*);
Given x, find the sum num[1] + num[2] + ... + num[x] (relevant for *2*).
These are trivial problems for both data structures.
Note: This will have complexity O(n*k*log S), where S is the upper bound on the values in your array. This may or may not be good enough. To make it O(n*k*log n), you need to normalize the values of your array prior to running the above algorithm. Normalization means converting all of your array values into values lower than or equal to n. So this:
5235 223 1000 40 40
Becomes:
4 2 3 1 1
This can be accomplished with a sort (keep the original indexes).
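The steps above can be sketched in Python as follows. This is only a minimal sketch, assuming strictly increasing subsequences; the function name count_increasing_subsequences and the rolling dp list (one length at a time instead of the full dp[i, p] table) are illustrative choices, not part of the original answer.

```python
from bisect import bisect_left


def count_increasing_subsequences(array, k):
    """Count strictly increasing subsequences of length k in O(n*k*log n)."""
    n = len(array)
    if n == 0 or k <= 0:
        return 0

    # Normalization: map every value to its rank in 1..m (equal values share a rank).
    sorted_vals = sorted(set(array))
    rank = [bisect_left(sorted_vals, v) + 1 for v in array]
    m = len(sorted_vals)

    # dp[i] = number of increasing subsequences of the current length ending at index i.
    dp = [1] * n  # length 1
    for _ in range(2, k + 1):
        tree = [0] * (m + 1)  # Fenwick tree playing the role of num[], reset per length
        new_dp = [0] * n
        for i in range(n):
            # Operation *2*: prefix sum num[1] + ... + num[rank[i] - 1]
            x = rank[i] - 1
            total = 0
            while x > 0:
                total += tree[x]
                x -= x & -x
            new_dp[i] = total
            # Operation *1*: add dp[i] (previous length) to num[rank[i]]
            x = rank[i]
            while x <= m:
                tree[x] += dp[i]
                x += x & -x
        dp = new_dp
    return sum(dp)


print(count_increasing_subsequences([1, 2, 3, 4], 2))
```

Updating the tree after the query at each index has the same effect as the pseudocode's "add the previous element first, then query" order, because only elements to the left of i are in the tree when i is queried.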

Related

Numbers of common distinct difference

Given two arrays A and B, the task is to find the number of distinct pairwise absolute differences that the two arrays have in common.
Example :
A=[3,6,8]
B=[1,6,10]
so we get the difference set for A:
differenceSetA=[abs(3-6),abs(6-8),abs(8-3)]=[3,2,5]
similarly
differenceSetB=[abs(1-6),abs(1-10),abs(6-10)]=[5,9,4]
Number of common elements=Intersection :{differenceSetA,differenceSetB}={5}
Answer= 1
My approach O(N^2)
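#include <cstdlib>
#include <unordered_set>
#include <vector>
using namespace std;

// Counts the distinct absolute differences that occur in both A and B; O(N^2).
int commonDifference(const vector<int>& A, const vector<int>& B) {
    int n = A.size();
    int m = B.size();
    unordered_set<int> differenceSetA;
    unordered_set<int> differenceSetB;
    // all pairwise absolute differences of A
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            differenceSetA.insert(abs(A[i] - A[j]));
        }
    }
    // all pairwise absolute differences of B
    for (int i = 0; i < m; i++) {
        for (int j = i + 1; j < m; j++) {
            differenceSetB.insert(abs(B[i] - B[j]));
        }
    }
    // size of the intersection of the two sets
    int count = 0;
    for (int d : differenceSetA) {
        if (differenceSetB.count(d)) {
            count++;
        }
    }
    return count;
}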
Please provide suggestions for optimizing the approach in O(N log N)
If n is the range of values of an input array (n = Vmax - Vmin + 1), then the set of all differences of that array can be obtained in O(n log n), as explained in this SO post: find all differences in a array
Here is a brief recall of the method, with a few additional practical implementation details:
1. Create an array Posi of length 2*n = 2*range = 2*(Vmax - Vmin + 1), where elements whose index matches an element of the input (shifted by Vmin) are set to 1 and all other elements are set to 0. Filling in the ones takes O(m), where m is the size of the input array.
For example, given the input array [1,4,5] of size m = 3, we create the array [1,0,0,1,1], followed by zero padding up to length 2*n.
Initialisation: Posi[i] = 0 for all i (i = 0 to 2*n - 1)
Posi[A[i] - Vmin] = 1 (i = 0 to m - 1)
2. Calculate the autocorrelation function of array Posi[]. This is classically performed in three sub-steps:
2.1 Calculate the FFT (size 2*n) of the Posi[] array: Y[] = FFT(Posi)
2.2 Calculate the squared amplitude of the result: Y2[k] = Y[k] * conj(Y[k])
2.3 Calculate the inverse FFT of the result: Diff[] = IFFT(Y2[])
A few details are worth mentioning here:
The reason why a size of 2*n was selected, and not a size of n, is that if d is a valid difference, then -d is also a valid difference. The results corresponding to negative differences are available at positions i >= n.
If you find it easier to perform the FFT with a power-of-two size, then you can replace the size 2*n with a value n2k = 2^k, with n2k >= 2*n.
The non-null differences correspond to non-null values in the array Diff[]:
`d` is a difference if `Diff[d] > 0`
Another important detail is that if a classical FFT is used (floating-point calculations), you will encounter small rounding errors. To take this into account, it is important to replace the IFFT output Diff[] with the real part rounded to the nearest integer.
All of that concerns one array only. As you want to calculate the number of common differences, you have to calculate the arrays Diff_A[] and Diff_B[] for both sets A and B, and then:
count = 0;
for each d >= 1 that is in range for both arrays: // Diff[0] is always non-zero, so start at d = 1
    if (Diff_A[d] != 0) and (Diff_B[d] != 0) then count++;
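As an illustration only, the procedure above could look like this in Python. To stay self-contained the sketch uses a small recursive radix-2 FFT written with the standard cmath module rather than a library FFT; the function names fft, difference_set and common_differences are hypothetical. It assumes non-empty integer arrays and counts only differences d >= 1, so a difference of 0 caused by duplicate values is ignored.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. The inverse is unnormalized."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def difference_set(values):
    """Return the set of positive pairwise differences via autocorrelation."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin + 1
    size = 1
    while size < 2 * span:  # power of two with size >= 2 * range
        size *= 2
    posi = [0] * size
    for v in values:
        posi[v - vmin] = 1
    spectrum = fft(posi)                                 # step 2.1
    power = [y * y.conjugate() for y in spectrum]        # step 2.2
    diff = fft(power, invert=True)                       # step 2.3 (still scaled by size)
    # Round the real part to remove floating-point noise before testing for > 0.
    return {d for d in range(1, span) if round(diff[d].real / size) > 0}


def common_differences(a, b):
    return len(difference_set(a) & difference_set(b))


print(common_differences([3, 6, 8], [1, 6, 10]))
```

In practice a library FFT would normally replace the hand-written one; the rest of the logic stays the same.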
A little Bonus
In order not to merely repeat the mentioned post, here is an additional explanation of how the FFT yields the differences of one set.
The input array A = {3, 6, 8} can mathematically be represented by the following z-transform:
A(z) = z^3 + z^6 + z^8
Then the corresponding z-transform of the difference array is equal to the polynomial product (on the unit circle, 1/z = z*):
D(z) = A(z) * A(1/z) = (z^3 + z^6 + z^8) (z^(-3) + z^(-6) + z^(-8))
= z^(-5) + z^(-3) + z^(-2) + 3 + z^2 + z^3 + z^5
Then, we can note that A(z) is equal to an FFT of size N of the sequence [0 0 0 1 0 0 1 0 1] by taking:
z = exp (-i * 2 PI/ N), with i = sqrt(-1)
Note that here we consider the classical FFT in C, the complex field.
It is certainly possible to perform the calculation in a Galois field, with no rounding errors, as is done for example to implement "classical" multiplications (with z = 10) for numbers with a large number of digits. This seems like overkill here.

Unable to understand algorithm to find maximum sum of Subarray

I am looking at the algorithm used to obtain the maximum sum of a subarray within an array and am unable to understand the logic behind the code. Specifically, this line max_ending = max(0, max_ending + number). I don't understand what is being done here. Also, would this algorithm have a complexity of O(n) or O(n^2)?:
#include <vector>
#include <algorithm>
using namespace std;

template <typename T>
T max_sub_array(vector<T> const & numbers)
{
    T max_ending = 0, max_so_far = 0;
    for (auto & number : numbers)
    {
        // max<T> so that the literal 0 converts to T for non-int element types
        max_ending = max<T>(0, max_ending + number);
        max_so_far = max(max_so_far, max_ending);
    }
    return max_so_far;
}
Thank You
The algorithm you presented seems to be the one attributed (in Wikipedia) to Jay Kadane. The line, max_ending = max(0, max_ending + number), means we are looking only at non-negative sums; in other words, if adding one more element to the current subarray would result in a negative sum, then make the maximum sum ending at this index zero (i.e., the subarray is empty again). The line relies on the idea that the only time we need to reset the examined subarray window is if it drops below zero — even if large positive elements could be added later, a larger subarray sum would be attained without a drop to negative in the middle. Let's look at an example (max_ending means the maximum sum for a subarray ending at the current index):
array       {1,  2, 23, -4,  3, -10}
max_ending   1   3  26  22  25   15
max_so_far   1   3  26  26  26   26
Time complexity for this algorithm is assessed as O(n) since each array element needs to be visited once, and the number of iterations is dependent on the array size in a linear fashion. O(n^2) would mean that for each array element, the number of iterations would be on the order of the array size; so as the array size increases, the number of iterations would increase quadratically.
This is taken from exercise 4.1-5 of the book Introduction to Algorithms.
Since the exercise and its answer address exactly this, I think it may be helpful.
"Use the following ideas to develop a nonrecursive, linear-time algorithm for the maximum-subarray problem. Start at the left end of the array, and progress toward the right,keeping track of the maximum subarray seen so far. Knowing a maximum subarray of A[1.. j] , extend the answer to find a maximum subarray ending at index j + 1 by using the following observation: a maximum subarray of A[1.. j + 1] is either a maximum subarray of A[1.. j] or a subarray A[i .. j +1] , for some 1 <= i<= j + 1. Determine a maximum subarray of the form A[i... j+1] in constant time based on knowing a maximum subarray ending at index j."
Answer
First we need to figure out the maximum subarray ending at index j + 1, which could be just A[j + 1] or the maximum subarray ending at j plus A[j + 1]; therefore we take the max of these two.
Once we have the maximum subarray ending at index j + 1, we find the maximum subarray of A[1..j+1] by, again, taking the max between the maximum subarray ending at index j + 1 and the maximum subarray of A[1..j].
So basically the idea is to get the maximum subarray ending at the current index in each iteration and take the max between this and the max from the previous iteration.
Also, I think this line is incorrect:
max_ending = max(0, max_ending + number);
It should be:
max_ending = max(number, max_ending + number);
Edit: This depends on the definition. If the array is not guaranteed to contain at least one positive number, the change is needed; otherwise yours is OK.
Also, max_ending and max_so_far should start at numbers[0] and the loop at index 1 if you follow my change.
The complexity for this algorithm is O(n)
Additional note:
In your version there is no need to get the max between number and max_ending + number because max_ending >= 0 so number <= number + max_ending
https://app.codility.com/programmers/lessons/9-maximum_slice_problem/max_slice_sum/
The coding test would help to understand more about whether to choose 0 or something else for max_ending. As @גלעד ברקן said, it's an important decision point whether to discard the slice so far.
Let's say sum[i] is the maximum sum of a sequence ending with numbers[i]; then sum[i] = max(numbers[i], numbers[i] + sum[i-1]), and the answer is max(sum[i] for i from 0 to numbers.size() - 1)
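To make the difference between the two variants discussed above concrete, here is a small Python sketch. The function names are illustrative only; the first variant mirrors the code in the question (an empty subarray with sum 0 is allowed), and the second follows the recurrence sum[i] = max(numbers[i], numbers[i] + sum[i-1]) and assumes a non-empty input.

```python
def max_subarray_allow_empty(numbers):
    """Variant from the question: resets to 0, so the result is never negative."""
    max_ending = max_so_far = 0
    for number in numbers:
        max_ending = max(0, max_ending + number)
        max_so_far = max(max_so_far, max_ending)
    return max_so_far


def max_subarray_non_empty(numbers):
    """Variant that requires at least one element; assumes numbers is non-empty."""
    max_ending = max_so_far = numbers[0]
    for number in numbers[1:]:
        max_ending = max(number, max_ending + number)
        max_so_far = max(max_so_far, max_ending)
    return max_so_far


print(max_subarray_allow_empty([1, 2, 23, -4, 3, -10]))
print(max_subarray_non_empty([-3, -1, -2]))
```

The two functions agree whenever the array contains at least one positive number and differ only on all-negative input. Both make a single pass, so both are O(n).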
See Maximum subarray problem

longest contiguous subsequence such that twice the number of zeroes is less than equal to thrice the number of ones

I was trying to solve this problem from HackerRank. I tried the brute-force solution, but it doesn't seem to work. Can someone give me an idea of how to solve this problem efficiently?
https://www.hackerrank.com/contests/sep13/challenges/sherlock-puzzle
Given a binary string (S) which contains ‘0’s and ‘1’s and an integer K,
find the length (L) of the longest contiguous subsequence of (S * K) such that twice the number of zeroes is <= thrice the number of ones (2 * #0s <= 3 * #1s) in that sequence.
S * K is defined as follows: S * 1 = S
S * K = S + S * (K - 1)
Input Format
The first (and only) line contains an integer K and the binary string S separated by a single space.
Constraints
1 <= |S| <= 1,000,000
1 <= K <= 1,000,000
Output Format
A single integer L - the answer to the test case
Here's a hint:
Let's first suppose K = 1 and that S looks like (using a dot for 0):
..1...11...11.....111111....111....
e f b a c d
The key is to note that if the longest acceptable sequence contains a 1 it will also contain any adjacent ones. For example, if the longest sequence contains the 1 at a, it will also contain all of the ones between b and c (inclusive).
So you only have to analyze the sequence at the points where the blocks of ones are.
The main question is: if you start at a certain block of ones, can you make it to the next block of ones? For instance, if you start at e you can make it to the block at f but not to b. If you start at b you can make it to the block at d, etc.
Then generalize the analysis for K > 1.
Brute force obviously won't work since it's O((n * k) ** 2). I will use Python-style list comprehensions in this answer. You'll need an array t = [3 if el == "1" else -2 for el in S]. Now if you use the prefix-sum array p[i] = t[0] + ... + t[i], you can see that in the k == 1 case you are basically looking for a pair (i, j), i < j, such that p[j] - (p[i - 1] if i != 0 else 0) >= 0 is true and j - i is maximal among these pairs. For each i in 0..n-1 you have to find its partner j such that j - i is maximal. This can be done in O(log n) for a specific i, so this gives an O(n log n) solution for the k == 1 case. This can be extended to an O(n log n) solution for the general case (there is a trick to find the largest block that can be covered). There is also an O(n) solution to this problem, but you need to examine the p sequence further for that. I don't suggest writing a solution in a scripting language, though. Even the O(n) solution times out in Python...
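A possible sketch of the k == 1 case only, in Python for readability despite the speed caveat above. It is one way to get O(n log n), not necessarily the answerer's exact method: instead of searching for the best j per i, it searches for the earliest valid start per end position, using a binary search over running prefix minima. The function name longest_valid_k1 is hypothetical, and the generalization to K > 1 is not covered.

```python
from bisect import bisect_left


def longest_valid_k1(s):
    """Longest substring of s with 2 * (#zeros) <= 3 * (#ones), for K == 1."""
    # prefix[j] = sum of t[0..j-1], where t is +3 for '1' and -2 for '0'
    prefix = [0]
    for ch in s:
        prefix.append(prefix[-1] + (3 if ch == "1" else -2))

    # neg_min[i] = -min(prefix[0..i]); non-decreasing, so it can be binary searched
    neg_min = []
    current_min = prefix[0]
    for p in prefix:
        current_min = min(current_min, p)
        neg_min.append(-current_min)

    best = 0
    for j, p in enumerate(prefix):
        # Earliest i with min(prefix[0..i]) <= prefix[j]; at that i, prefix[i] <= prefix[j],
        # so the substring s[i:j] has a non-negative weighted sum.
        i = bisect_left(neg_min, -p)
        if i < j:
            best = max(best, j - i)
    return best


print(longest_valid_k1("00100"))
```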

To find the min and max after addition and subtraction from a range of numbers

I have an algorithm question in which the numbers from 1 to N are given, a number of operations are performed on them, and then the min/max has to be found among them.
There are two operations: addition and subtraction.
Operations are in the form a b c d, where a is the operation to be performed, b is the starting number, c is the ending number, and d is the number to be added/subtracted.
for example
suppose numbers are 1 to N
and
N =5
1 2 3 4 5
We perform operations as
1 2 4 5
2 1 3 4
1 4 5 6
By these operations we will have numbers from 1 to N as
1 7 8 9 5
-3 3 4 9 5
-3 3 4 15 11
So the maximum is 15 and min is -3
My Approach:
I stored the numbers between the lower and upper limits (1 and 5 in this case) in an array, applied the operations to it, and then found the minimum and maximum.
Could there be any better approach?
I will assume that all update (addition/subtraction) operations happen before finding the max/min. I don't have a good solution for update and min/max operations mixed together.
You can use a plain array, where the value at index i of the array is the difference between index i and index (i - 1) of the original array. This makes the sum from index 0 to index i of our array equal to the value at index i of the original array.
Subtraction is addition of the negated number, so the two can be treated the same way. When we need to add k to the original array from index i to index j, we add k to index i of our array and subtract k from index (j + 1) of our array. This takes O(1) time per update.
You can find the min/max of the original array by accumulating the values and recording the max/min of the running sum. This takes O(n) time per operation. I assume this is done once for the whole array.
Pseudocode:
a[N]     // Original array
d[N + 1] // Difference array (one extra slot so that d[to_idx + 1] is always valid)

// Initialization
d[0] = a[0]
for (i = 1 to N - 1)
    d[i] = a[i] - a[i - 1]
d[N] = 0

// Addition (subtraction is similar)
add(from_idx, to_idx, amount) {
    d[from_idx] += amount
    d[to_idx + 1] -= amount
}

// Find max/min for the WHOLE array after add/subtract
current = max = min = d[0];
for (i = 1 to N - 1) {
    current += d[i]; // Sum from d[0] to d[i] is a[i]
    max = MAX(max, current);
    min = MIN(min, current);
}
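The pseudocode above can be tried on the example from the question with a short Python sketch. The function name apply_operations and the tuple layout (op, start, end, amount) are assumptions made for illustration; following the question's example, op 1 is taken to mean addition and op 2 subtraction, with 1-based inclusive ranges.

```python
from itertools import accumulate


def apply_operations(n, operations):
    """Return (values, minimum, maximum) after range updates on the numbers 1..n."""
    # Difference array of [1, 2, ..., n]: first entry 1, every later gap is 1.
    # One extra slot keeps diff[end] valid when a range ends at n.
    diff = [1] * n + [0]
    for op, start, end, amount in operations:
        delta = amount if op == 1 else -amount  # assumed: 1 = add, 2 = subtract
        diff[start - 1] += delta                # 1-based start to 0-based index
        diff[end] -= delta                      # position just past the 1-based end
    values = list(accumulate(diff[:n]))         # prefix sums restore the real values
    return values, min(values), max(values)


values, lowest, highest = apply_operations(5, [(1, 2, 4, 5), (2, 1, 3, 4), (1, 4, 5, 6)])
print(values, lowest, highest)
```

Each update costs O(1) and the final scan costs O(n), matching the analysis above.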
Generally there is no "best way" to find the min/max from a performance point of view, because it depends on how the application will be used.
- Finding the max and min in a list takes O(n) time, so if you want to run many operations (many relative to the size of the input), your approach of finding the min/max after all the operations have taken place is fine.
- But if the list holds many elements and you don't want to run that many operations, you had better check after each operation whether its result is a new max/min and update if necessary.

All possible combinations of length 8 in a 2d array

I've been trying to solve a combinations problem. I have a 6x6 matrix and I'm trying to find all combinations of length 8 in the matrix.
I have to move from neighbor to neighbor, starting from each (row, column) position, and I wrote a recursive program which generates the combinations. The problem is that it generates a lot of duplicates as well and hence is inefficient. I would like to know how I could avoid calculating duplicates and save time.
int a[6][6] = {{1,2,3,4,5,6},
               {8,9,1,2,3,4},
               {5,6,7,8,9,1},
               {2,3,4,5,6,7},
               {8,9,1,2,3,4},
               {5,6,7,8,9,1}};

void genSeq(int row, int col, int length, int combi)
{
    if (length == 8)
    {
        printf("%d\n", combi);
        return;
    }
    combi = (combi * 10) + a[row][col];
    if ((row-1) >= 0)
        genSeq(row-1, col, length+1, combi);
    if ((col-1) >= 0)
        genSeq(row, col-1, length+1, combi);
    if ((row+1) < 6)
        genSeq(row+1, col, length+1, combi);
    if ((col+1) < 6)
        genSeq(row, col+1, length+1, combi);
    if ((row+1) < 6 && (col+1) < 6)
        genSeq(row+1, col+1, length+1, combi);
    if ((row-1) >= 0 && (col+1) < 6)
        genSeq(row-1, col+1, length+1, combi);
    if ((row+1) < 6 && (col-1) >= 0)
        genSeq(row+1, col-1, length+1, combi);
    if ((row-1) >= 0 && (col-1) >= 0)
        genSeq(row-1, col-1, length+1, combi);
}
I was also thinking of writing a dynamic program, basically recursion with memoization. Is that a better choice? If yes, then I'm not clear on how to implement it with recursion. Have I really hit a dead end with this approach?
Thank you
Edit
Eg result
12121212,12121218,12121219,12121211,12121213.
The restrictions are that you have to move to a neighbor from any point, and you have to start from each position in the matrix, i.e. each (row, col). You can move one step at a time, i.e. right, left, up, down, and along both diagonals. Check the if conditions.
i.e.
if you're in (0,0) you can move to either (1,0) or (1,1) or (0,1), i.e. three neighbors.
if you're in (2,2) you can move to eight neighbors.
and so on...
To eliminate duplicates you can convert the 8-digit sequences into 8-digit integers and put them in a hash table.
Memoization might be a good idea. You can memoize for each cell in the matrix all possible combinations of length 2-7 that can be achieved from it. Going backwards: first generate for each cell all sequences of 2 digits. Then based on that of 3 digits etc.
UPDATE: code in Python
# original matrix
lst = [
    [1,2,3,4,5,6],
    [8,9,1,2,3,4],
    [5,6,7,8,9,1],
    [2,3,4,5,6,7],
    [8,9,1,2,3,4],
    [5,6,7,8,9,1]]
# working matrix; wrk[i][j] contains the set of all paths (as integers) of the current length that start at lst[i][j]
wrk = [[set() for j in range(6)] for i in range(6)]
# for the first (0th) iteration initialize with single-cell paths
for i in range(0, 6):
    for j in range(0, 6):
        wrk[i][j].add(lst[i][j])
# offsets of the eight neighbors (the question allows diagonal moves as well)
offsets = [(-1,-1), (-1,0), (-1,1), (0,-1), (0,1), (1,-1), (1,0), (1,1)]
# run iterations 1 through 7
for k in range(1, 8):
    # create new empty wrk matrix for the next iteration
    nw = [[set() for j in range(6)] for i in range(6)]
    for i in range(0, 6):
        for j in range(0, 6):
            # the next gen. wrk[i][j] is based on the current wrk paths of its neighbors
            for di, dj in offsets:
                ni, nj = i + di, j + dj
                if 0 <= ni < 6 and 0 <= nj < 6:
                    for p in wrk[ni][nj]:
                        # prepend the digit of cell (i, j) to the shorter path p
                        nw[i][j].add(10**k * lst[i][j] + p)
    wrk = nw
# now build final set to eliminate duplicates
result = set()
for i in range(0, 6):
    for j in range(0, 6):
        result |= wrk[i][j]
print(len(result))
print(result)
There are LOTS of ways to do this. Going through every combination is a perfectly reasonable first approach. It all depends on your requirements. If your matrix is small, and this operation isn't time sensitive, then there's no problem.
I'm not really an algorithms guy, but I'm sure there are really clever ways of doing this that someone will post after me.
Also, in Java when using CamelCase, method names should start with a lowercase character.
int a={{1,2,3,4,5,6},
{8,9,1,2,3,4},
{5,6,7,8,9,1},
{2,3,4,5,6,7},
{8,9,1,2,3,4},
{5,6,7,8,9,1},
}
By length, do you mean that the elements of a combination sum to 8, i.e. elements that sum to 8 within a row and together with elements of other rows? From row 1 = { {2,6}, {3,5} }, and then row 1 elements with row 2, and so on. Is that what you are expecting?
You can think of your matrix as a one-dimensional array (place the rows one after another); it makes no difference here. For a one-dimensional array you can write a function like this (assuming you should print the combinations):
f(i, n) prints all combinations of length n using elements a[i] ... a[last].
It should skip the elements from a[i] to a[i + k - 1] (for all possible k), print a[i + k] and make a recursive call f(i + k + 1, n - 1).