SPOJ DP LSORT approach - C++

It is a problem on SPOJ: http://www.spoj.com/problems/LSORT/
It states that:
You are given a permutation of n numbers between 1 and n, with no duplicates.
The task is to sort that permutation in ascending order. There is another array Q into which we insert elements from the given permutation P.
You have to implement N steps to sort P. In the i-th step, P has N-i+1 remaining elements, Q has i-1 elements and you have to choose some x-th element (from the N-i+1 available elements) of P and put it to the left or to the right of Q. The cost of this step is equal to x * i. The total cost is the sum of costs of individual steps. After N steps, Q must be an ascending sequence. Your task is to minimize the total cost.
Input
The first line of the input file is T (T ≤ 10), the number of test cases. Then descriptions of T test cases follow. The description of each test case consists of two lines. The first line contains a single integer N (1 ≤ N ≤ 1000). The second line contains N distinct integers from the set {1, 2, .., N}, the N-element permutation P.
Output
For each test case your program should write one line, containing a single integer - the minimum total cost of sorting.
Now I have figured out the DP.
My recurrence relation states that to get the optimal value for elements with values i to j, I have to insert either i at the front or j at the back.
Cost of inserting i at the front = dp[i+1][j] + cost of adding element i at the front
Cost of inserting j at the back = dp[i][j-1] + cost of adding element j at the back
and I have to take the minimum of these. The answer would be dp[1][n].
for (l = 1; l <= n; l++)            // length of current permutation Q
{
    for (i = 1; i <= n-l+1; i++)    // starting value of permutation Q
    {
        j = i+l-1;                  // ending value of permutation Q
        dp[i][j] = min(dp[i+1][j] + l*xi, dp[i][j-1] + l*xj);  // choosing whether to insert i at start or j at end
    }
}
Here xi = index of element i from the start of permutation P,
and xj = index of element j from the start of permutation P.
The answer would be dp[1][n].
But I am unable to figure out how to compute xi and xj.
Please help

You can try re-thinking your DP state.
For me, I would use dp[startQ][endQ], where dp[startQ][endQ] means the cost I have incurred so far to 'sort' values startQ to endQ in the array Q.
If you know what is in the array Q (integers startQ to endQ inclusive), one can easily re-construct the array P by just removing/ignoring all the integers within startQ and endQ.
For each state, dp[startQ][endQ], since one can only add to the front or the back of Q,
dp[startQ][endQ] can only be:
dp[startQ][endQ-1] + cost of adding endQ
dp[startQ+1][endQ] + cost of adding startQ
with the base cases being
dp[i][i] = 0;
These states can be computed, and the answer can be found at dp[1][n] (assuming it is one-indexed).
However, I haven't thought of an efficient way to compute x if it were to be coded in a top-down manner, whereas the whole computation can be performed in O(N^2 log N) using bottom-up DP with a data structure to compute x at every state.
I will leave the final details for you to code out :) but I can help more if required.
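For reference, here is one possible bottom-up sketch of this DP in C++. It is an assumption, not the answer's exact method: instead of a Fenwick tree, it precomputes an O(N^2) prefix-count table cnt[v][p] = how many values u <= v sit at positions <= p in P, which yields each x in O(1).

#include <bits/stdc++.h>
using namespace std;

int main() {
    int T;
    scanf("%d", &T);
    while (T--) {
        int n;
        scanf("%d", &n);
        vector<int> pos(n + 1);              // pos[v] = 1-based index of value v in P
        for (int p = 1; p <= n; ++p) {
            int v;
            scanf("%d", &v);
            pos[v] = p;
        }
        // 2D prefix sums: cnt[v][p] = #{u <= v with pos[u] <= p}
        vector<vector<int>> cnt(n + 1, vector<int>(n + 1, 0));
        for (int v = 1; v <= n; ++v)
            for (int p = 1; p <= n; ++p)
                cnt[v][p] = cnt[v-1][p] + cnt[v][p-1] - cnt[v-1][p-1] + (pos[v] == p);
        vector<vector<long long>> dp(n + 2, vector<long long>(n + 1, 0));
        for (int i = 1; i <= n; ++i)
            dp[i][i] = pos[i];               // first insertion: step 1, x = pos[i]
        for (int l = 2; l <= n; ++l) {
            for (int i = 1; i + l - 1 <= n; ++i) {
                int j = i + l - 1;
                // x = position of the chosen value in P, minus how many
                // already-placed values (the rest of [i..j]) sat before it
                long long xi = pos[i] - (cnt[j][pos[i]] - cnt[i][pos[i]]);
                long long xj = pos[j] - (cnt[j-1][pos[j]] - cnt[i-1][pos[j]]);
                dp[i][j] = min(dp[i+1][j] + (long long)l * xi,
                               dp[i][j-1] + (long long)l * xj);
            }
        }
        printf("%lld\n", dp[1][n]);
    }
    return 0;
}

The table costs O(N^2) memory, which is fine for N <= 1000; the data-structure variant described above avoids the quadratic table.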

Related

Counting inversions after swapping two elements of an array

You are given a permutation p1,p2,...,pn of numbers from 1 to n.
A permutation is a sequence of integers from 1 to n of length n containing each number exactly once.
You are given q queries, each consisting of two integers a and b. In response to each query you need to return the number of inversions of the permutation after swapping the elements at indices a and b. Every query is independent, i.e. after each query the permutation is restored to its initial state.
An inversion in a permutation p is a pair of indices (i, j) such that i > j and pi < pj. For example, a permutation [4, 1, 3, 2] contains 4 inversions: (2, 1), (3, 1), (4, 1), (4, 3).
Input: The first line contains n,q.
The second line contains the space-separated permutation p1,p2,...,pn.
Each line of the next q lines contains two integers a,b.
Output: For each query print an integer denoting the number of inversions on a new line.
Sample input:
5 5
1 2 3 4 5
1 2
1 3
2 5
2 4
3 3
Output:
1
3
5
3
0
Constraints:
2<=n<=1000
1<=q<=200000
My approach: I count the number of inversions using a BIT (https://www.geeksforgeeks.org/count-inversions-array-set-3-using-bit/) for each query after swapping the elements at positions a and b, and then swap them back so that my array remains unchanged. But this solution gives TLE for large test cases. Is there a better approach for this problem?
You are getting TLE probably because the number of computations in this approach is q * (n * log(n)) = 2 * 10^5 * 10^3 * log(1000) = ~10^9, which is more than the generally accepted ~10^8.
I can think of the following solution. Please note that I have not coded / verified it:
Denote ri == the number of indices j such that i > j && pi < pj. E.g. for [2, 3, 1, 4], r3 = 2. Basically, it is the number of inversions whose farther index is i. (Please note that I am using 1-based indexing as per the question. Also, a < b as per the question.)
Thus we have: Sum of ri == #invs (number of inversions)
We can calculate initial total #invs in O(n^2)
When a and b are swapped, we can observe that:
a) ri remains constant, where i < a .
b) ri remains constant, where i > b.
Only ri changes where a <= i <= b, and only under the following conditions. I am considering the case pa < pb; the exact opposite case needs to be considered when pa > pb.
a) Since pa < pb, this swap causes #invs = #invs + 1.
b) If (pi < pa && pi < pb) || (pi > pa && pi > pb), this swap does not change ri. Eg: [2,....10,....5]. Here swapping 2 and 5 does not change the r value for 10.
c) If pa < pi < pb, the swap increments ri by 1, and the new rb by 1. Eg: [2,....3,.....4]; when 2 and 4 are swapped, we have [4,....3,....2]: the r value of 3 increases by 1 (because of 4), and the r value of 2 increases by 1 (because of 3). Please note that the increment due to the pair (4, 2) itself was already counted in step (a), and needs to be counted once only.
d) We need to find all indices i where pa < pi < pb, as described above. Let us call this count f(a, b). Then the total change in #invs is delta = (2 * f(a, b)) + 1, and the answer will be #original_invs + delta.
As I mentioned, all the exact opposite steps need to be done for the case pa > pb. The delta will be negative in that case.
Now, the only thing remaining is: given a, b, find f(a, b) efficiently. For this, we can pre-process and store it for all pairs of indices. This will take O(N^2) space, and O(N^2 * log(N)) time, using a balanced binary search tree (BST). Again, I am showing the pre-processing steps for the case pa < pb only; another set of pre-processing steps needs to be done for the other case:
We will use self-balancing BST, in which each node also contains the following fields:
a) field_1: This denotes the size of the left sub-tree. This value will be updated on every insert operation, if size of left-sub-tree changes.
b) field_2: This denotes the number of elements < node.value that this tree has. This value is initialized once when the node is inserted and does not change thereafter. I have added a small explanation of how it will be achieved in Addendum-A. This field is basically our pre-processing, which will determine f(a, b).
With all of this now, for each index i, where 0 <= i < n, do the following: create a new tree, and insert the values pj into it one by one, where (i < j < n) && (pi < pj). (Please note we are not inserting values where pi > pj.) The method given in Addendum-A will make sure we find f(i, j) while inserting.
There will be n such pre-processed trees, one for every index. For finding f(a, b): We need to look into ath tree, and search node.value = pb. This node's field_2 = f(a, b).
The complexity of insertion is O(logN). So, the total pre-processing computation = O(N * N(logN)). Search is O(logN), so the query complexity is O(q * logN). Total complexity = O(N^2) + O(N * N (logN)) + O(q * logN) which will turn out ~10^7
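To make the case analysis concrete, here is a hypothetical helper that applies steps (a)-(d) with a plain O(b - a) scan per query, instead of the pre-processed BST lookup described above (0-based indices; a sketch of the formula only, not the full solution):

#include <algorithm>
#include <vector>
using namespace std;

// invs = inversion count of p before swapping positions a and b.
long long inversionsAfterSwap(const vector<int>& p, long long invs, int a, int b) {
    if (a > b) swap(a, b);
    if (a == b) return invs;                  // nothing changes
    int lo = min(p[a], p[b]), hi = max(p[a], p[b]);
    long long f = 0;                          // f(a, b): values strictly between
    for (int i = a + 1; i < b; ++i)
        if (lo < p[i] && p[i] < hi) ++f;      // step (d); step (b) values are skipped
    long long delta = 2 * f + 1;              // steps (a) and (c)
    return p[a] < p[b] ? invs + delta : invs - delta;  // opposite sign when pa > pb
}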
==============================================================================
Addendum A: How to populate field_2 while inserting node:
i) Insert the node and balance the tree. Update field_1 as required.
ii) Initialize ans = 0. Traverse the BST from the root, searching for your node.
iii) do {
         if (node.value < search_key_b) ans += node.left_subtree_size + 1;  // and descend right; otherwise descend left
     } while (!node.found);
iv) ans -= 1
We can solve this in O(n log n) space and O(n log n + Q * log^2(n)) time with a merge-sort tree. The merge-sort tree allows us to find the number of elements inside a subarray that are greater than or lower than an input number in O(log^2(n)) time and O(n log n) space.
First we record the total number of inversions in O(n log n) time, for which there are known methods. To query the effect of a swap bounded by positions left and right, consider the subarray strictly between them (below, 'left' and 'right' denote the values at those two positions):
subtract the number of elements greater than right in the subarray (those will no longer be inversions)
subtract the number of elements smaller than left in the subarray (those will no longer be inversions)
add the number of elements greater than left in the subarray (those will be new inversions)
add the number of elements smaller than right in the subarray (those will be new inversions)
if right > left, add 1
if left > right, subtract 1
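A rough C++ sketch of this approach (untested; assumes 0-based positions and distinct values, and that invs, the total inversion count, was already computed by one of the known O(n log n) methods):

#include <bits/stdc++.h>
using namespace std;

// Merge-sort tree: t[node] stores a sorted copy of its segment of the array.
struct MergeSortTree {
    int n;
    vector<vector<int>> t;
    MergeSortTree(const vector<int>& a) : n(a.size()), t(4 * a.size()) {
        build(1, 0, n - 1, a);
    }
    void build(int node, int l, int r, const vector<int>& a) {
        if (l == r) { t[node] = {a[l]}; return; }
        int m = (l + r) / 2;
        build(2*node, l, m, a);
        build(2*node + 1, m + 1, r, a);
        merge(t[2*node].begin(), t[2*node].end(),
              t[2*node + 1].begin(), t[2*node + 1].end(),
              back_inserter(t[node]));
    }
    int rec(int node, int l, int r, int ql, int qr, int x) const {  // # of values < x in [ql, qr]
        if (qr < l || r < ql) return 0;
        if (ql <= l && r <= qr)
            return lower_bound(t[node].begin(), t[node].end(), x) - t[node].begin();
        int m = (l + r) / 2;
        return rec(2*node, l, m, ql, qr, x) + rec(2*node + 1, m + 1, r, ql, qr, x);
    }
    int countLess(int l, int r, int x) const { return rec(1, 0, n - 1, l, r, x); }
    int countGreater(int l, int r, int x) const {                   // distinct values
        return (r - l + 1) - countLess(l, r, x + 1);
    }
};

// Apply the add/subtract rules above for a swap of positions l and r.
long long afterSwap(const MergeSortTree& mst, const vector<int>& p,
                    long long invs, int l, int r) {
    if (l > r) swap(l, r);
    if (l == r) return invs;
    long long d = (p[r] > p[l]) ? 1 : -1;            // the pair (l, r) itself
    if (r - l > 1) {
        d -= mst.countGreater(l + 1, r - 1, p[r]);   // inversions that vanish
        d -= mst.countLess(l + 1, r - 1, p[l]);      // inversions that vanish
        d += mst.countGreater(l + 1, r - 1, p[l]);   // new inversions
        d += mst.countLess(l + 1, r - 1, p[r]);      // new inversions
    }
    return invs + d;
}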

Count Divisors of Product from L to R

I have been solving a problem but got stuck on one of its subparts, which is as follows:
Given an array of N elements whose ith element is A[i], we are given Q queries of the type [L, R].
For each query, output the number of divisors of the product of the elements from the Lth to the Rth.
More formally, for each query let's define P as P = A[L] * A[L+1] * A[L+2] * ... * A[R].
Output the number of divisors of P modulo 998244353.
Constraints : 1<= N,Q <= 100000, 1<= A[i] <= 1000000.
My approach:
For each index i, I have defined a map<int, int> which stores each prime divisor and its count in the product over [1, i].
I extract the prime divisors of a number in O(log N) using a sieve.
Then for each query (say {L, R}), I iterate through the map of the (L-1)th element and subtract the count of each key from the map of the Rth element.
And then I am answering the query using the result:
if N = a^p * b^q * c^r ...(a,b,c being primes)
the number of divisors = (p+1)(q+1)(r+1)..
The time complexity of the above solution is O(ND + QD), where D = the number of distinct primes up to 1000000. In the worst case D = 78498.
Is there more efficient solution than this?
There is a more efficient solution for this, but it is slightly complicated. Here are the steps to build the necessary data structure.
1. Define a data type prime_factor that is a struct containing a prime and a count.
2. Define a data type prime_factorization that is a vector of the first data type, in ascending order of the primes. This can store the factorization of a number.
3. Write a function that takes a number and turns its prime factorization into a prime_factorization.
4. Write a function that takes 2 prime_factorization vectors and merges them into the factorization of the product of the two.
5. For each number in your array, compute its prime factorization. That gets stored in an array.
6. For each pair in your array, compute the prime factorization of the product. We will only need half of them: elements 0, 1 go into one factorization, 2, 3 into the next, and so on.
7. Repeat step 6 O(log(N)) times. So you have a vector of the factorization of each number, pairs, fours, eights, and so on. This results in approximately 2N precomputed factorization vectors. Most vectors are small, though a few can be up to O(D) in size (where D is the number of distinct primes). Most of the merges should be very, very fast.
And now you have all of your data prepared. It can't take more than O(log(N)) times the space that storing the prime factors required by itself. (Less than that normally, though, because repeats among the small primes get gathered together in one prime_factor.)
Any range is the union of at most O(log(N)) of these computed vectors. For example the range 10..25 can be broken up into 10..11, 12..15, 16..23, 24..25. Arrange these intervals from smallest to largest and merge them. Then compute your answer from the result.
An exact analysis is complicated. But I assure you that query time is bounded above by O(Q * D * log(N)) and normally is much less than that.
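As a concrete illustration of steps 1-4, here is a minimal sketch (the type and function names follow the description above; the trial-division factorize is an assumption, any factorization method works):

#include <cstdint>
#include <vector>

struct prime_factor { int64_t prime; int count; };                // step 1
using prime_factorization = std::vector<prime_factor>;            // step 2

prime_factorization factorize(int64_t n) {                        // step 3
    prime_factorization f;
    for (int64_t p = 2; p * p <= n; ++p)
        if (n % p == 0) {
            int c = 0;
            while (n % p == 0) { n /= p; ++c; }
            f.push_back({p, c});
        }
    if (n > 1) f.push_back({n, 1});
    return f;
}

prime_factorization merge_factorizations(const prime_factorization& a,   // step 4
                                         const prime_factorization& b) {
    prime_factorization out;
    size_t i = 0, j = 0;
    while (i < a.size() || j < b.size()) {
        if (j == b.size() || (i < a.size() && a[i].prime < b[j].prime))
            out.push_back(a[i++]);
        else if (i == a.size() || b[j].prime < a[i].prime)
            out.push_back(b[j++]);
        else {                                    // same prime: add the exponents
            out.push_back({a[i].prime, a[i].count + b[j].count});
            ++i; ++j;
        }
    }
    return out;
}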
UPDATE:
How do you find those intervals?
The answer is that you need to identify the number divisible by the highest power of 2 in the range, and then fill out both sides from there. You figure that out by repeatedly dividing both endpoints by 2 (rounding down), and then scaling the top boundary back up by the same power of 2 to find that mid-point.
For example if your range was 35-53 you would divide by 2 repeatedly to get 17-26, 8-13, 4-6, 2-3. That was 2^4 we divided by, so our power-of-2 mid-point is 3*2^4 = 48. Our intervals above that midpoint are then 48-51, 52-53. Our intervals below are 40-47, 36-39, 35-35. And each of them has a length that is a power of 2 and starts at a number divisible by that power of 2.
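Here is a sketch of that decomposition as a small C++ function. It grows each aligned block greedily from the left end rather than locating the mid-point first, which is an equivalent way to produce such intervals:

#include <cstdio>
#include <utility>
#include <vector>

// Split [lo, hi] (inclusive) into blocks whose length is a power of two
// and whose start is divisible by that length.
std::vector<std::pair<int, int>> split(int lo, int hi) {
    std::vector<std::pair<int, int>> blocks;
    while (lo <= hi) {
        int len = 1;
        // grow the block while it stays aligned and inside the range
        while (lo % (2 * len) == 0 && lo + 2 * len - 1 <= hi)
            len *= 2;
        blocks.push_back({lo, lo + len - 1});
        lo += len;
    }
    return blocks;
}

int main() {
    for (auto [a, b] : split(35, 53))
        std::printf("%d-%d ", a, b);   // prints: 35-35 36-39 40-47 48-51 52-53
    return 0;
}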

Big 0 notation for duplicate function, C++

What is the Big O notation for the function described in the screenshot?
It would take O(n) to go through all the numbers, but once it finds the numbers and removes them, what would that be? Would the removed part be a constant a? And then would the function have to iterate through the numbers again?
This is what I am thinking for Big O:
T(n) = n + a + (n-a), or something involving having to iterate through (n-a) steps after the first duplicate is found. Would big O then be O(n)?
Big O notation considers the worst case. Let's say we need to remove all duplicates from the array A = [1..n]. The algorithm will start with the first element and check every remaining element - there are n-1 of them. Since all values happen to be different, it won't remove any from the array.
Next, the algorithm selects the second element and checks the remaining n-2 elements in the array. And so on.
When the algorithm arrives at the final element, it is done. The total number of comparisons is the sum (n-1) + (n-2) + ... + 2 + 1 + 0. Through the power of maths, this sum becomes (n-1)*n/2, and the dominating term is n^2, so the algorithm is O(n^2).
This algorithm is O(n^2), because for each element in the array you are iterating over the array and counting the occurrences of that element.
foreach item in array
    count = 0
    foreach other in array
        if item == other
            count += 1
    if count > 1
        remove item
As you can see, there are two nested loops in this algorithm, which results in O(n*n).
Removed items don't affect the worst case. Consider an array containing only unique elements: no element is removed from it.
Note: A naive implementation of this algorithm could result in O(n^3) complexity.
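Since the screenshot is not included here, assume the function looks something like the following reconstruction. The two nested scans give the O(n^2) comparisons; note that erase itself is O(n), and a variant that erases inside the inner scan is how you would land at the O(n^3) mentioned in the note:

#include <vector>

// Assumed shape of the function under discussion: count each element's
// occurrences with a full inner scan, then remove it if it is a duplicate.
void removeDuplicates(std::vector<int>& v) {
    for (int i = 0; i < (int)v.size(); ++i) {
        int count = 0;
        for (int j = 0; j < (int)v.size(); ++j)   // O(n) scan per element
            if (v[j] == v[i])
                ++count;
        if (count > 1)
            v.erase(v.begin() + i--);             // erase shifts left; recheck index i
    }
}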
You start with the first element and go through all the other elements in the vector; that's n-1 comparisons. Doing that for every element gives (n * (n-1))/2 comparisons for the worst case. n comparisons is the best case (all elements equal, e.g. all 4).

Algorithm to find isomorphic set of permutations

I have an array of sets of permutations, and I want to remove isomorphic permutations.
We have S sets of permutations, where each set contains K permutations, and each permutation is represented as an array of N elements. I'm currently saving it as an array int pset[S][K][N], where S, K and N are fixed, and N is larger than K.
Two sets of permutations, A and B, are isomorphic, if there exists a permutation P, that converts elements from A to B (for example, if a is an element of set A, then P(a) is an element of set B). In this case we can say that P makes A and B isomorphic.
My current algorithm is:
We choose all pairs s1 = pset[i] and s2 = pset[j], such that i < j
Each element from the chosen sets (s1 and s2) is numbered from 1 to K. That means that each element can be represented as s1[i] or s2[i], where 0 < i < K+1.
For every permutation T of K elements, we do the following:
Find the permutation R, such that R(s1[1]) = s2[1]
Check if R is a permutation that makes s1 and T(s2) isomorphic, where T(s2) is a rearrangement of the elements (permutations) of the set s2; basically we just check whether R(s1[i]) = s2[T[i]], where 0 < i < K+1.
If not, then we go to the next permutation T.
This algorithm works really slowly: O(S^2) for the first step, O(K!) to loop through each permutation T, O(N^2) to find R, and O(K*N) to check if R is the permutation that makes s1 and s2 isomorphic - so it is O(S^2 * K! * N^2).
Question: Can we make it faster?
You can sort and compare:
// 1 - sort each set of permutations
for i = 0 to S-1
    sort(pset[i])

// 2 - sort the array of permutations itself
sort(pset)

// 3 - compare
for i = 1 to S-1 {
    if (areEqual(pset[i], pset[i-1]))
        // pset[i] and pset[i-1] are isomorphic
}
A concrete example:
0: [[1,2,3],[3,2,1]]
1: [[2,3,1],[1,3,2]]
2: [[1,2,3],[2,3,1]]
3: [[3,2,1],[1,2,3]]
After 1:
0: [[1,2,3],[3,2,1]]
1: [[1,3,2],[2,3,1]] // order changed
2: [[1,2,3],[2,3,1]]
3: [[1,2,3],[3,2,1]] // order changed
After 2:
2: [[1,2,3],[2,3,1]]
0: [[1,2,3],[3,2,1]]
3: [[1,2,3],[3,2,1]]
1: [[1,3,2],[2,3,1]]
After 3:
(2, 0) not isomorphic
(0, 3) isomorphic
(3, 1) not isomorphic
What about the complexity?
1 is O(S * (K * N) * log(K * N))
2 is O(S * K * N * log(S * K * N))
3 is O(S * K * N)
So the overall complexity is O(S * K * N log(S * K * N))
There is a very simple solution for this: transposition.
If two sets are isomorphic, it means a one-to-one mapping exists, where the set of all the numbers at index i in set S1 equals the set of all the numbers at some index k in set S2. My conjecture is that no two non-isomorphic sets have this property.
(1) Jean Logeart's example:
0: [[1,2,3],[3,2,1]]
1: [[2,3,1],[1,3,2]]
2: [[1,2,3],[2,3,1]]
3: [[3,2,1],[1,2,3]]
Perform ONE pass:
Transpose, O(n):
0: [[1,3],[2,2],[3,1]]
Sort both within and between groups, O(something log something):
0: [[1,3],[1,3],[2,2]]
Hash:
"131322" -> 0
...
"121233" -> 1
"121323" -> 2
"131322" -> already hashed.
0 and 3 are isomorphic.
(2) vsoftco's counter-example in his comment to Jean Logeart's answer:
A = [ [0, 1, 2], [2, 0, 1] ]
B = [ [1, 0, 2], [0, 2, 1] ]
"010212" -> A
"010212" -> already hashed.
A and B are isomorphic.
You can turn each set into a transposed-sorted string or hash or whatever compressed object for linear-time comparison. Note that this algorithm considers all three sets A, B and C as isomorphic even if one P converts A to B and another P converts A to C. Clearly, in this case, there are Ps to convert any one of these three sets to the others, since all we are doing is moving each i in one set to a specific k in the other. If, as you stated, your goal is to "remove isomorphic permutations," you will still get a list of sets to remove.
Explanation:
Assume that along with our sorted hash, we kept a record of which permutation each i came from. vsoftco's counter-example:
010212 // hash for A and B
100110 // origin permutation, set A
100110 // origin permutation, set B
In order to confirm isomorphism, we need to show that the i's grouped in each index from the first set moved to some index in the second set, which index does not matter. Sorting the groups of i's does not invalidate the solution, rather it serves to confirm movement/permutation between sets.
Now by definition, each number in a hash and each number in each group in the hash is represented in an origin permutation exactly one time for each set. However we choose to arrange the numbers in each group of i's in the hash, we are guaranteed that each number in that group is representing a different permutation in the set; and the moment we theoretically assign that number, we are guaranteed it is "reserved" for that permutation and index only. For a given number, say 2, in the two hashes, we are guaranteed that it comes from one index and permutation in set A, and in the second hash corresponds to one index and permutation in set B. That is all we really need to show - that the number in one index for each permutation in one set (a group of distinct i's) went to one index only in the other set (a group of distinct k's). Which permutation and index the number belongs to is irrelevant.
Remember that any set S2, isomorphic to set S1, can be derived from S1 using one permutation function or various combinations of different permutation functions applied to S1's members. What the sorting, or reordering, of our numbers and groups actually represents is the permutation we are choosing to assign as the solution to the isomorphism rather than an actual assignment of which number came from which index and permutation. Here is vsoftco's counter-example again, this time we will add the origin indexes of our hashes:
110022 // origin index set A
001122 // origin index set B
Therefore our permutation, a solution to the isomorphism, is: 1 -> 0, 0 -> 1, 2 -> 2.
Or, in order: 0 -> 1, 1 -> 0, 2 -> 2.
(Notice that in Jean Logeart's example there is more than one solution to the isomorphism.)
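A sketch of this transpose-sort-hash canonical form in C++ (the function name and the comma-separated string encoding are illustrative assumptions):

#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// One set of K permutations, each of length N, in; canonical string out.
// Two sets are isomorphic (per the conjecture above) iff their strings match.
string canonicalForm(const vector<vector<int>>& perms) {
    size_t K = perms.size(), N = perms[0].size();
    vector<vector<int>> cols(N, vector<int>(K));
    for (size_t i = 0; i < K; ++i)        // transpose: group values by index
        for (size_t j = 0; j < N; ++j)
            cols[j][i] = perms[i][j];
    for (auto& c : cols)
        sort(c.begin(), c.end());         // sort within each group
    sort(cols.begin(), cols.end());       // sort between groups
    string key;
    for (const auto& c : cols)
        for (int x : c) { key += to_string(x); key += ','; }
    return key;
}

Bucketing the S canonical strings, e.g. in an unordered_map<string, vector<int>>, then reports every group of isomorphic sets in one pass.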
Suppose that two elements s1, s2 of S are isomorphic. Then if p1 and p2 are permutations, s1 is isomorphic to s2 iff p1(s1) is isomorphic to p2(s2), where pi(si) is the set of permutations obtained by applying pi to every element in si.
For each i in 1...s and j in 1...k, choose the j-th member of si, and find the permutation that maps it to the identity. Apply it to all the elements of si. Hash each of the k permutations to a number, obtaining k numbers, for any choice of i and j, at cost nk.
Comparing the hashed sets for two different values of i and j is k^2 < nk. Thus, you can find the set of candidate matches at cost s^2 k^3 n. If the actual number of matches is low, the overall complexity is far beneath what you specified in your question.
Take a0 in A. Then find its inverse (fast, O(N)); call it a0inv. Then choose some i in B and define P_i = b_i * a0inv, and check that P_i * a generates B when varying a over A. Do this for every i in B. If you don't find any i for which the relation holds, then the sets are not isomorphic. If you find such an i, then the sets are isomorphic. The runtime is O(K^2) for each pair of sets it checks, and you'd need to check O(S^2) sets, so you end up with O(S^2 * K^2 * N).
PS: I assumed here that by "maps A to B" you mean mapping under permutation composition, so P(a) is actually the permutation P composed with the permutation a, and I've used the fact that if P is a permutation, then there must exist an i for which Pa = b_i for some a.
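A sketch of this check in C++ (0-based values; the std::set membership test adds a log K factor on top of the estimate above):

#include <algorithm>
#include <set>
#include <vector>
using namespace std;

using Perm = vector<int>;                        // p[i] = image of i

Perm inverse(const Perm& p) {                    // O(N)
    Perm inv(p.size());
    for (int i = 0; i < (int)p.size(); ++i)
        inv[p[i]] = i;
    return inv;
}

Perm compose(const Perm& f, const Perm& g) {     // (f * g)[i] = f[g[i]]
    Perm r(f.size());
    for (int i = 0; i < (int)f.size(); ++i)
        r[i] = f[g[i]];
    return r;
}

// Try P = b * inverse(a0) for every b in B; A and B are isomorphic iff
// some candidate P maps every element of A into B.
bool isomorphic(const vector<Perm>& A, const vector<Perm>& B) {
    set<Perm> bset(B.begin(), B.end());
    Perm a0inv = inverse(A[0]);
    for (const Perm& b : B) {
        Perm P = compose(b, a0inv);              // guarantees P * a0 == b
        bool ok = all_of(A.begin(), A.end(), [&](const Perm& a) {
            return bset.count(compose(P, a)) > 0;
        });
        if (ok) return true;
    }
    return false;
}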
EDIT: I decided to undelete my answer, as I am not convinced the previous one (by Jean Logeart), based on searching, is correct. If it is, I'll gladly delete mine, as it performs worse, but I think I have a counterexample - see the comments below Jean's answer.
To check if two sets S₁ and S₂ are isomorphic you can do a much shorter search.
If they are isomorphic then there is a permutation t that maps each element of S₁ to an element of S₂; to find t you can just pick any fixed element p of S₁ and consider the permutations
t₁ = (1/p) q₁
t₂ = (1/p) q₂
t₃ = (1/p) q₃
...
for all elements q of S₂. For, if a valid t exists then it must map the element p to an element of S₂, so only permutations mapping p to an element of S₂ are possible candidates.
Moreover, given a candidate t, to check whether the two sets of permutations S₁t and S₂ are equal, you could use a hash computed as the XOR of a hash code for each element, doing the full check of all the permutations only if the hashes match.

Given an array of N numbers,find the number of sequences of all lengths having the range of R?

This is a follow-up question to: Given a sequence of N numbers, extract the number of sequences of length K having range less than R?
I basically need a vector V of size N as the answer, such that V[i] denotes the number of sequences of length i which have range <= R.
Traditionally, in recursive solutions, you would compute the solution for K = 0, K = 1, and then find some kind of recurrence relation between subsequent elements to avoid recomputing the solution from scratch each time.
However here I believe that maybe attacking the problem from the other side would be interesting, because of the property of the spread:
Given a sequence of spread R (or less), any subsequence has a spread of at most R as well.
Therefore, I would first establish a list of the longest subsequences of spread at most R beginning at each index. Let's call this list M, and have M[i] = j where j is the highest index in S (the original sequence) for which S[j] - S[i] <= R. This is going to be O(N).
Now, for any i, the number of sequences of length K starting at i is either 0 or 1, and this depends on whether K is greater than M[i] - i or not. A simple linear pass over M (from 0 to N-K) gives us the answer. This is once again O(N).
So, if we call V the resulting vector, with V[k] denoting the number of subsequences of length K in S with spread at most R, then we can compute it in a single iteration over M:
for i in [0, len(M)]:
    for k in [0, M[i] - i]:
        ++V[k]
The algorithm is simple; however, the number of updates can be rather daunting. In the worst case, supposing that M[i] - i equals N - i, it is O(N*N) complexity. You would need a better data structure (probably an adaptation of a Fenwick tree) to lower the cost of computing those numbers, as sketched below.
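One way to realize that improvement: each update above is a range increment of V[0 .. M[i]-i], so a plain difference array (named here as an alternative to the Fenwick adaptation) brings the counting step down to O(N). A sketch, assuming M has already been computed:

#include <vector>

// Turn each range increment into two point updates on a difference array,
// then recover V with a single prefix-sum pass: O(N) overall.
std::vector<long long> countByLength(const std::vector<int>& M) {
    int n = M.size();
    std::vector<long long> diff(n + 1, 0);
    for (int i = 0; i < n; ++i) {
        diff[0] += 1;                 // ++V[k] for every k in [0, M[i] - i] ...
        diff[M[i] - i + 1] -= 1;      // ... expressed as two point updates
    }
    std::vector<long long> V(n, 0);
    long long run = 0;
    for (int k = 0; k < n; ++k) {
        run += diff[k];
        V[k] = run;
    }
    return V;
}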
If you are looking for contiguous sequences, try doing it recursively: the K-length subsequences having a range less than R are extensions of the (K-1)-length ones.
At K=0, you have N solutions.
Each time you increase K, you append (resp. prepend) the next (resp. previous) element, check if the range is still less than R, and either store it in a set (watch out for duplicates!) or discard it, depending on the result.
I think the complexity of this algorithm is O(n*n) in the worst-case scenario, though it may be better on average.
I think Matthieu has the right answer when looking for all sequences with spread R.
As you are only looking for sequences of length K, you can do a little better.
Instead of looking at the maximum sequence starting at i, just look at the sequence of length K starting at i, and see if it has range R or not. Do this for every i, and you have all sequences of length K with spread R.
You don't need to go through the whole list, as the latest start point for a sequence of length K is n-K+1. So the complexity is something like (n-K+1)*K = n*K - K*K + K. For K=1 this is n,
and for K=n it is n. For K=n/2 it is n*n/2 - n*n/4 + n/2 = n*n/4 + n/2, which I think is the maximum. So while this is still O(n*n), for most values of K you get a little better.
Start with a simpler problem: count the maximal length of the sequence starting at each index and having a range of at most R.
To do this, let the first pointer point to the first element of the array. Increase the second pointer (also starting from the first element of the array) while the sequence between the pointers has a range less than or equal to R. Push every array element passed by the second pointer to a min-max queue, made of a pair of min-max stacks, as described in this answer. When the difference between the max and min values reported by the min-max queue exceeds R, stop increasing the second pointer, increment V[ptr2-ptr1], increment the first pointer (removing the element pointed to by it from the min-max queue), and continue increasing the second pointer (keeping the range under control).
When the second pointer leaves the bounds of the array, increment V[N-ptr1] for all remaining ptr1 (the corresponding ranges may be less than or equal to R). To add all other ranges that are less than R, compute the cumulative sum of the array V[], starting from its end.
Both time and space complexities are O(N).
Pseudo-code:
p1 = p2 = 0;
do {
    do {
        min_max_queue.push(a[p2]);
        ++p2;
    } while (p2 < N && min_max_queue.range() <= R);
    if (p2 < N) {
        ++v[p2 - p1 - 1];
        min_max_queue.pop();
        ++p1;
    }
} while (p2 < N);

for (i = 1; i <= N - p1; ++i) {
    ++v[i];
}

sum = 0;
for (j = N; j > 0; --j) {
    value = v[j];
    v[j] += sum;
    sum += value;
}