Given a permutation of 1...n, for example 5 3 4 1 2,
how do you find all ascending subsequences of length 3 in linear time?
Is it possible to find all ascending subsequences of some other length X?
I have no idea how to solve it in linear time.
Do you need the actual ascending sequences? Or just the number of ascending subsequences?
It isn't possible to generate them all in less than the time it takes to list them, which, as has been pointed out, is O(N^X / (X-1)!). (There is a possibly unexpected factor of X because it takes time O(X) to list a data structure of size X.) The obvious recursive search for them scales not far from that.
However, counting them can be done in time O(X * N^2) if you use dynamic programming. Here is Python for that.
perm = [3, 5, 1, 2, 4, 6]  # the example permutation discussed below
X = 3                      # target subsequence length

counts = []
answer = 0
for i in range(len(perm)):
    # inner_counts[k] counts ascending subsequences of length k+1 ending at i
    inner_counts = [0 for k in range(X)]
    inner_counts[0] = 1
    for j in range(i):
        if perm[j] < perm[i]:
            for k in range(1, X):
                inner_counts[k] += counts[j][k-1]
    counts.append(inner_counts)
    answer += inner_counts[-1]
For your example 3 5 1 2 4 6 and X = 3 you will wind up with:
counts = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 3, 1],
    [1, 5, 5],
]
answer = 6
(You only found 5 above, the missing one is 2 4 6.)
It isn't hard to extend this answer to create a data structure that makes it easy to list them directly, to find a random one, etc.
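For instance, here is a minimal sketch of one such extension (my own code, not part of the original answer): it reuses perm, X, and the counts table from the snippet above, and only recurses where the table says a completion exists.

def list_all(perm, counts, X):
    # List every ascending subsequence of length X by walking the DP
    # table backwards; counts[j][k-1] > 0 means some ascending
    # subsequence of length k ends at position j.
    results = []
    def build(i, k, suffix):
        # suffix holds, in reverse, an ascending run ending at perm[i];
        # k more elements are still needed before position i
        if k == 0:
            results.append(suffix[::-1])
            return
        for j in range(i):
            if perm[j] < perm[i] and counts[j][k - 1] > 0:
                build(j, k - 1, suffix + [perm[j]])
    for i in range(len(perm)):
        if counts[i][X - 1] > 0:
            build(i, X - 1, [perm[i]])
    return results

print(list_all(perm, counts, X))
# 6 subsequences, including [2, 4, 6]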
You can't find all ascending subsequences in linear time because there may be many more subsequences than that.
For instance, in a sorted original sequence every subset is an increasing subsequence, so a sorted sequence of length N (1, 2, ..., N) has C(N, k) = N!/((N-k)! k!) increasing subsequences of length k. For example, with N = 100 and k = 3 that is already 161,700 subsequences.
Related
We are given an array with numbers ranging from 1 to n (no duplicates), where n = size of the array.
We are allowed to do the following operation :
arr[i] = arr[arr[i]-1] , 0 <= i < n
Now, one iteration means performing the above operation on the entire array at once (every new value is computed from the previous array).
Our task is to find the number of iterations after which we encounter a previously seen sequence.
Constraints :
a) Array has no duplicates
b) 1 <= arr[i] <= n , 0 <= i < n
c) 1 <= n <= 10^6
Ex 1:
n = 5
arr[] = {5, 4, 2, 1, 3}
After 1st iteration array becomes : {3, 1, 4, 5, 2}
After 2nd iteration array becomes : {4, 3, 5, 2, 1}
After 3rd iteration array becomes : {2, 5, 1, 3, 4}
After 4th iteration array becomes : {5, 4, 2, 1, 3}
In the 4th iteration, the sequence obtained has already been seen before.
So the expected output is 4.
This question was asked in one of the job hiring tests, so I don't have a link to it.
There were 2 sample test cases given, of which I remember the one above. I would really appreciate any help on this question.
P.S.
I was able to code the brute-force solution, wherein I stored all the results in a Set and then kept advancing to the next permutation. But it gave TLE.
First, note that an array of length n containing 1, 2, ..., n with no duplicates is a permutation.
Next, observe that arr[i] := arr[arr[i] - 1] is squaring the permutation.
That is, consider permutations as elements of the symmetric group S_n, where multiplication is composition of permutations.
Then the above operation is arr := arr * arr.
So, in terms of permutations and their composition, the question is as follows:
You are given a permutation p (= arr).
Consider permutations p, p^2, p^4, p^8, p^16, ...
What is the number of distinct elements among them?
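To make the squaring step concrete, here is a minimal sketch (the helper name square is my own):

def square(p):
    # One iteration of the operation: arr[i] = arr[arr[i] - 1] applied
    # to every index at once (p composed with itself; values are 1-based)
    return [p[p[i] - 1] for i in range(len(p))]

p = [5, 4, 2, 1, 3]
print(square(p))          # [3, 1, 4, 5, 2], the 1st iteration from Ex 1
print(square(square(p)))  # [4, 3, 5, 2, 1], the 2nd iteration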
Now, to solve it, consider the cycle notation of the permutation.
Every permutation is a product of disjoint cycles.
For example, 6 1 4 3 5 2 is the product of the following cycles: (1 6 2) (3 4) (5).
In other words, every application of this permutation:
moves elements at positions 1, 6, 2 along the cycle;
moves elements at positions 4, 3 along the cycle;
leaves the element at position 5 in place.
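Here is a small sketch of how such a decomposition can be computed (the function name cycles is my own):

def cycles(p):
    # Decompose a permutation given as a list of 1-based values into
    # disjoint cycles by following p from each unvisited position
    n = len(p)
    seen = [False] * n
    result = []
    for s in range(n):
        if not seen[s]:
            cyc, i = [], s
            while not seen[i]:
                seen[i] = True
                cyc.append(i + 1)
                i = p[i] - 1
            result.append(tuple(cyc))
    return result

print(cycles([6, 1, 4, 3, 5, 2]))  # [(1, 6, 2), (3, 4), (5,)]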
So, when we consider p^k (take an identity permutation and apply the permutation p to it k times), we actually process three independent actions:
move elements at positions 1, 6, 2 along the cycle, k times;
move elements at positions 4, 3 along the cycle, k times;
leave element at position 5 in place, k times.
Now, take into account that, after d applications of a cycle of length d, it just returns all the respective elements to their initial places.
So, we can actually formulate p^k as:
move elements at positions 1, 6, 2 along the cycle, (k mod 3) times;
move elements at positions 4, 3 along the cycle, (k mod 2) times;
leave element at position 5 in place.
We can now prove (using the Chinese Remainder Theorem, or just general knowledge of group theory) that the permutations p, p^2, p^3, p^4, p^5, ... are all distinct up to p^m, where m is the least common multiple of all cycle lengths.
In our example with p = 6 1 4 3 5 2, we have p, p^2, p^3, p^4, p^5, and p^6 all distinct.
But p^6 is the identity permutation: moving six times along a cycle of length 2 or 3 results in the items at their initial places.
So p^7 is the same as p^1, p^8 is the same as p^2, and so on.
Our question however is harder: we want to know the number of distinct permutations not among p, p^2, p^3, p^4, p^5, ..., but among p, p^2, p^4, p^8, p^16, ...: p to the power of a power of two.
To do that, consider all cycle lengths c_1, c_2, ..., c_r in our permutation.
For each c_i, find the pre-period and period of 2^k mod c_i:
For example, c_1 = 3, and 2^k mod 3 looks like 1, 2, 1, 2, 1, 2, ..., which is (1, 2) with pre-period 0 and period 2.
As another example, c_2 = 2, and 2^k mod 2 looks like 1, 0, 0, 0, ..., which is 1, (0) with pre-period 1 and period 1.
In this problem, this part can be done naively, by just marking visited numbers mod c_i in some array.
By Chinese Remainder Theorem again, after all pre-periods are considered, the period of the whole system of cycles will be the least common multiple of all individual periods.
What remains is to consider pre-periods.
These can be processed with your naive solution anyway, as the length of each pre-period here is at most log_2 n.
The answer is the least common multiple of all individual periods, calculated as above, plus the length of the longest pre-period.
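Putting it all together, here is a minimal Python sketch of the whole approach (my own assembly of the steps above, with the naive pre-period/period search; the function name iterations_until_repeat is assumed, not from the original answer):

from math import gcd

def iterations_until_repeat(arr):
    # Find the cycle lengths of the permutation (1-based values)
    n = len(arr)
    seen = [False] * n
    cycle_lengths = set()
    for start in range(n):
        if not seen[start]:
            length, i = 0, start
            while not seen[i]:
                seen[i] = True
                i = arr[i] - 1
                length += 1
            cycle_lengths.add(length)

    lcm, longest_pre = 1, 0
    for c in cycle_lengths:
        # Pre-period and period of 2^k mod c, found naively by walking
        # the sequence 1, 2, 4, ... mod c until a value repeats
        first_seen = {}
        v, k = 1 % c, 0
        while v not in first_seen:
            first_seen[v] = k
            v = v * 2 % c
            k += 1
        pre = first_seen[v]
        period = k - pre
        lcm = lcm * period // gcd(lcm, period)
        longest_pre = max(longest_pre, pre)

    # LCM of all individual periods plus the longest pre-period
    return lcm + longest_pre

print(iterations_until_repeat([5, 4, 2, 1, 3]))  # 4, matching Ex 1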
I am given this algorithmic problem, and need to find a way to return the count of entries in a list S and in another list L that fall between some variable x and some variable y, inclusive, with a query that runs in O(1) time:
I've issued a challenge against Jack. He will submit a list of his favorite years (from 0 to 2020). If Jack really likes a year,
he may list it multiple times. Since Jack comes up with this list on the fly, it is in no
particular order. Specifically, the list is not sorted, nor do years that appear in the list
multiple times appear next to each other in the list.
I will also submit such a list of years.
I then will ask Jack to pick a random year between 0 and 2020. Suppose Jack picks the year x.
At the same time, I will also then pick a random year between 0 and 2020. Suppose I
pick the year y. Without loss of generality, suppose that x ≤ y.
Once x and y are picked, Jack and I get a very short amount of time (perhaps 5
seconds) to decide if we want to re-do the process of selecting x and y.
If no one asks for a re-do, then we count the number of entries in Jack's list that are
between x and y inclusively and the number of entries in my list that are between x and
y inclusively.
More technically, here is the situation. You are given lists S and L of m and n integers,
respectively, in the range [0, k], representing the collections of years selected by Jack and
me. You may preprocess S and L in O(m+n+k) time. You must then give an algorithm
that runs in O(1) time – so that I can decide if I need to ask for a re-do – that solves the
following problem:
Input: Two integers, x as a member of [0,k] and y as a member of [0,k]
Output: the number of entries in S in the range [x, y], and the number of entries in L in [x, y].
For example, suppose S = {3, 1, 9, 2, 2, 3, 4}. Given x = 2 and y = 3, the returned count
would be 4.
I would prefer pseudocode; it helps me understand the problem more easily.
Implementing the approach of user3386109, taking care of the edge case x = 0.
user3386109: Make a histogram, and then compute the accumulated sum for each entry in the histogram. Suppose S = {3,1,9,2,2,3,4} and k is 9. The histogram is H = {0,1,2,2,1,0,0,0,0,1}. After accumulating, H = {0,1,3,5,6,6,6,6,6,7}. Given x = 2 and y = 3, the count is H[y] - H[x-1] = H[3] - H[1] = 5 - 1 = 4. Of course, x = 0 is a corner case that has to be handled.
# INPUT
S = [3, 1, 9, 2, 2, 3, 4]
L = [2, 9, 4, 6, 8, 5, 3]
k = 9
x = 2
y = 3

# Histogram for S
S_hist = [0] * (k + 1)
for element in S:
    S_hist[element] = S_hist[element] + 1

# Storing prefix sums in S_hist
total = S_hist[0]
for index in range(1, k + 1):
    total = total + S_hist[index]
    S_hist[index] = total

# Similar approach for L
# Histogram for L
L_hist = [0] * (k + 1)
for element in L:
    L_hist[element] = L_hist[element] + 1

# Storing prefix sums in L_hist
total = L_hist[0]
for index in range(1, k + 1):
    total = total + L_hist[index]
    L_hist[index] = total

# Finding number of elements between x and y (inclusive) in S
print("number of elements between x and y (inclusive) in S:")
if x == 0:
    print(S_hist[y])
else:
    print(S_hist[y] - S_hist[x - 1])

# Finding number of elements between x and y (inclusive) in L
print("number of elements between x and y (inclusive) in L:")
if x == 0:
    print(L_hist[y])
else:
    print(L_hist[y] - L_hist[x - 1])
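To match the O(1)-query formulation more directly, the final lookups can be packaged as a function (a small refactor of the same logic, not part of the original code):

def count_in_range(hist, x, y):
    # hist is an accumulated histogram; counts entries in [x, y] in O(1)
    return hist[y] - (hist[x - 1] if x > 0 else 0)

print(count_in_range(S_hist, 2, 3))  # 4
print(count_in_range(L_hist, 2, 3))  # 2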
E.g. the given array [1,2,1,3,1,2,1,5]
should return:
1 -> 2
2 -> 4
3 -> 0
5 -> 0
There is a solution I can think of, but it is O(n^2).
Suggest something better.
In one linear scan, transform your array into a hashmap of arrays indexed by value, each containing the indices where that value was found. For your example this would be:
{
    1: [0, 2, 4, 6],
    2: [1, 5],
    3: [3],
    5: [7],
}
Then for each entry l in the hashmap output 0 if len(l) <= 1, and otherwise output l[1] - l[0]. If you also have to check that the period is consistent, check that l[i] - l[i-1] == l[1] - l[0] for all i >= 2.
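A minimal Python sketch of that approach (the function name periods is my own):

from collections import defaultdict

def periods(arr):
    # One linear scan: record the indices at which each value occurs
    positions = defaultdict(list)
    for i, v in enumerate(arr):
        positions[v].append(i)
    # Report l[1] - l[0] for values seen at least twice, otherwise 0
    return {v: (l[1] - l[0] if len(l) > 1 else 0)
            for v, l in positions.items()}

print(periods([1, 2, 1, 3, 1, 2, 1, 5]))
# {1: 2, 2: 4, 3: 0, 5: 0}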
How does heapq.heapify() work?
I am trying to find the median using heap.
heapify seems to return the list in sorted order,
but when I add an element using heapq.heappush(), it is simply inserted into the list.
When I call heapify again, the list is still not sorted.
import heapq
l=[5,15,1,3]
heapq.heapify(l)
print(l)
This gives me [1, 3, 5, 15]
But when I add an element with heapq.heappush(l, 2),
the list becomes
[1, 2, 5, 15, 3]
When I call heapq.heapify(l) again,
it still gives me the same:
[1, 2, 5, 15, 3]
How can we find the median using a heap? Does the list need to be sorted?
if you have a look at the theory section of heapq you will find that it does not sort your list, but instead arranges it to satisfy the heap invariant:
lst[k] <= lst[2*k+1] and lst[k] <= lst[2*k+2]
this is satisfied for your list; if you look at it in 'binary tree' form:
       1
   2       5
 15   3
2 is smaller than 15 and 3, which satisfies the condition; 5 is compared to non-existing children (which are considered to be infinite, therefore the condition holds).
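a quick way to convince yourself of the invariant (my own check, not part of the heapq API):

lst = [1, 2, 5, 15, 3]
# verify the heap invariant for every parent/child pair that exists
ok = all(lst[k] <= lst[c]
         for k in range(len(lst))
         for c in (2 * k + 1, 2 * k + 2)
         if c < len(lst))
print(ok)  # True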
in order to sort your list, you are best off using sorted:
lst = sorted(lst)
# [1, 3, 5, 15]
and to then efficiently insert in an already sorted list the bisect module:
from bisect import insort_left
insort_left(lst, 2)
# [1, 2, 3, 5, 15]
the median is now at lst[len(lst)//2].
print(f"median = {lst[len(lst)//2]}")
# median = 3
or, depending on your convention (here the one used in statistics.median):
def median(lst):
    ln = len(lst)
    if ln % 2 != 0:
        return lst[ln // 2]
    else:
        return (lst[ln // 2 - 1] + lst[ln // 2]) / 2
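for example, with the sorted list from above:

print(median([1, 2, 3, 5, 15]))  # 3
print(median([1, 2, 3, 5]))      # 2.5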
If you want the sorted list after adding elements each time, try appending those elements to the list, then heapify the list as you did; it would give you the sorted list each time. :-)
np.argpartition does not sort the entire array. It only guarantees that the kth element is in its sorted position and that all smaller elements are moved before it. Thus, the first k elements will be the k smallest.
>>> num = 3
>>> myBigArray=np.array([[1,3,2,5,7,0],[14,15,6,5,7,0],[17,8,9,5,7,0]])
>>> top = np.argpartition(myBigArray, num, axis=1)[:, :num]
>>> print(top)
[[5 0 2]
[3 5 2]
[5 3 4]]
>>> myBigArray[np.arange(myBigArray.shape[0])[:, None], top]
[[0 1 2]
[5 0 6]
[0 5 7]]
This returns the k smallest values of each column. Note that these may not be in sorted order. I use this method because getting the top k elements in sorted order this way takes O(n + k log k) time.
I want to get the k smallest values of each column in sorted order, without increasing the time complexity.
Any suggestions?
To use np.argpartition and maintain the sorted order, we need to pass in a range of positions, range(num), instead of feeding in just the scalar kth param -
idx = np.argpartition(myBigArray, range(num), axis=1)[:, :num]
out = myBigArray[np.arange(idx.shape[0])[:,None], idx]
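With the question's myBigArray and num = 3, this yields the three smallest values of each row, now in sorted order (output shown under that assumption):

print(out)
# [[0 1 2]
#  [0 5 6]
#  [0 5 7]]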
You can use the exact same trick that you used in the case of rows; combining with #Divakar's trick for sorting, this becomes
In [42]: num = 2
In [43]: myBigArray[np.argpartition(myBigArray, range(num), axis=0)[:num, :], np.arange(myBigArray.shape[1])[None, :]]
Out[43]:
array([[ 1, 3, 2, 5, 7, 0],
[14, 8, 6, 5, 7, 0]])
A bit of indirect indexing does the trick. Please note that I worked on rows, since you started off on rows.
fdim = np.arange(3)[:, None]
so = np.argsort(myBigArray[fdim, top], axis=-1)
tops = top[fdim, so]
myBigArray[fdim, tops]
# array([[0, 1, 2],
#        [0, 5, 6],
#        [0, 5, 7]])
A note on argpartition with a range argument: I strongly suspect that it is not O(n + k log k); in any case, it is typically several-fold slower than a manual argpartition + argsort, see here.