In a max heap H of n elements, the top 2^k - 1 elements are in the first k levels. How to prove this is false? - heap

Prove that the following statement is false.
n and k are positive integers such that 2^k - 1 <= n.
In a max heap of n elements, the top 2^k - 1 elements are in the first k levels.

I think you mean:
The top 2^(k-1) elements are in the top k levels.
We can disprove this with a counterexample. Here is a max heap of 15 items (the items are hexadecimal digits, so F > E > ... > A > 9 > ... > 1).
               F
           /       \
       E               7
     /   \           /   \
   D       A       6       5
  / \     / \     / \     / \
 C   B   9   8   4   3   2   1
This is a valid heap. Every item is smaller than its parent.
So let's assume your assertion is correct, that the top 2^(k-1) elements are in the top k levels. With k = 3, the top 4 (2^(3-1)) elements should be in the top 3 levels. In this case, those would be F, E, D, and C. But C is in the 4th level. That contradicts the assertion.
Or do you mean that the top 2^k - 1 items must be in the top k levels? In that case, with k = 2, the top 3 (2^2 - 1) must be in the top 2 levels. But the item D, which is the 3rd largest, is in the 3rd level. Again, that contradicts the assertion.
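The counterexample can also be checked mechanically. The following is a minimal sketch, assuming the heap is stored level by level in an array with the hex digits mapped to integers (F = 15, ..., A = 10); the function names levelOf, deepestLevelOfTop and exampleHeap are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Level (1-based) of the node stored at 0-based index i of an array-backed heap.
int levelOf(std::size_t i) {
    int level = 1;
    while (i > 0) {
        i = (i - 1) / 2;  // move to the parent
        ++level;
    }
    return level;
}

// Deepest level occupied by any of the `count` largest values in `heap`.
// Assumes all values are distinct.
int deepestLevelOfTop(const std::vector<int>& heap, std::size_t count) {
    assert(count <= heap.size());
    std::vector<int> sorted(heap);
    std::sort(sorted.begin(), sorted.end(), std::greater<int>());
    int deepest = 0;
    for (std::size_t r = 0; r < count; ++r) {
        const std::size_t idx = static_cast<std::size_t>(
            std::find(heap.begin(), heap.end(), sorted[r]) - heap.begin());
        deepest = std::max(deepest, levelOf(idx));
    }
    return deepest;
}

// The 15-item heap from the answer, level by level (F=15, E=14, ..., A=10).
std::vector<int> exampleHeap() {
    return {15, 14, 7, 13, 10, 6, 5, 12, 11, 9, 8, 4, 3, 2, 1};
}
```

With this heap, deepestLevelOfTop(h, 3) is larger than 2 and deepestLevelOfTop(h, 4) is larger than 3, which is exactly what refutes both readings of the claim.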


Counting inversion after swapping two elements of array

You are given a permutation p_1, p_2, ..., p_n of the numbers from 1 to n.
A permutation is a sequence of length n that contains each integer from 1 to n exactly once.
You are given q queries, where each query consists of two integers a and b. In response to each query you need to return the number of inversions of the permutation after swapping the elements at indices a and b. Every query is independent, i.e. after each query the permutation is restored to its initial state.
An inversion in a permutation p is a pair of indices (i, j) such that i > j and p_i < p_j. For example, the permutation [4, 1, 3, 2] contains 4 inversions: (2, 1), (3, 1), (4, 1), (4, 3).
Input: The first line contains n,q.
The second line contains the space-separated permutation p_1, p_2, ..., p_n.
Each line of the next q lines contains two integers a,b.
Output: For each query, print an integer denoting the number of inversions on a new line.
Sample input:
5 5
1 2 3 4 5
1 2
1 3
2 5
2 4
3 3
Output:
1
3
5
3
0
Constraints:
2<=n<=1000
1<=q<=200000
My approach: For each query I swap the elements at positions a and b, count the number of inversions using a BIT (https://www.geeksforgeeks.org/count-inversions-array-set-3-using-bit/), and then swap them back so that my array remains unchanged. But this solution gives TLE for large test cases. Is there any better approach for this problem?
You are probably getting TLE because the number of computations in this approach is q * (n * log(n)) = 2 * 10^5 * 10^3 * log(1000), which is on the order of 10^9, more than the generally accepted ~10^8 computations.
I can think of the following solution. Please note that I have not coded / verified it:
Let r_i be the number of indices j such that j < i and p_i < p_j. E.g. for [2, 3, 1, 4], r_3 = 2. In other words, r_i is the number of inversions whose later index is i. (Please note that I am using 1-based indices as in the question, and I assume a < b.)
Thus we have: the sum of all r_i == #invs (the number of inversions).
We can calculate the initial total #invs in O(n^2).
When a and b are swapped, we can observe that:
a) r_i remains constant, where i < a.
b) r_i remains constant, where i > b.
Only r_i with a <= i <= b can change, and only under the following conditions. I am considering the case p_a < p_b. The exact opposite case needs to be considered when p_a > p_b.
a) Since p_a < p_b, the swap turns the pair (a, b) itself into an inversion, so #invs = #invs + 1.
b) If (p_i < p_a && p_i < p_b) || (p_i > p_a && p_i > p_b), this swap does not change r_i. E.g. [2, ..., 10, ..., 5]: swapping 2 and 5 does not change the r value for 10.
c) If p_a < p_i < p_b, the swap increments r_i by 1 and the new r_b by 1. E.g. [2, ..., 3, ..., 4]: when 2 and 4 are swapped, we have [4, ..., 3, ..., 2], so the r value of 3 increases by 1 (because of 4), and the r value of 2 also increases by 1 (because of 3). Please note that the increment due to 4 > 2 was already counted in step (a) and needs to be counted only once.
d) We need to find all indices i with a < i < b where p_a < p_i < p_b, as above. Let us call their number f(a, b). Then the total change in #invs is delta = (2 * f(a, b)) + 1, and the answer will be #original_invs + delta.
As I mentioned, all the exact opposite steps need to be done for the case p_a > p_b. The delta will be negative in that case.
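The rule above can be sketched directly, before any preprocessing is added. This is only an illustration under my own assumptions: indices are 0-based, f(a, b) is found by a plain scan (so each query costs O(n) instead of the O(log N) lookup described next), and the names countInversions and swapDelta are invented here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Reference count in O(n^2): pairs (j, i) with j < i and p[j] > p[i].
long long countInversions(const std::vector<int>& p) {
    long long inversions = 0;
    for (std::size_t i = 0; i < p.size(); ++i)
        for (std::size_t j = 0; j < i; ++j)
            if (p[j] > p[i]) ++inversions;
    return inversions;
}

// Change in the inversion count when 0-based positions a and b are swapped.
long long swapDelta(const std::vector<int>& p, std::size_t a, std::size_t b) {
    assert(a < p.size() && b < p.size());
    if (a == b) return 0;
    if (a > b) std::swap(a, b);
    const int lo = std::min(p[a], p[b]);
    const int hi = std::max(p[a], p[b]);
    long long f = 0;  // f(a, b): values between the swapped positions that lie strictly between lo and hi
    for (std::size_t i = a + 1; i < b; ++i)
        if (lo < p[i] && p[i] < hi) ++f;
    const long long magnitude = 2 * f + 1;
    // Positive when the smaller value was on the left, negative otherwise.
    return p[a] < p[b] ? magnitude : -magnitude;
}
```

The answer to a query is then countInversions(p) + swapDelta(p, a, b), with the first term computed once.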
Now, the only thing remaining is to solve: given a, b, find f(a, b) efficiently. For this, we can pre-process and store it for all pairs of indices. This will take O(N^2) space and O(N^2 * log(N)) time, using a balanced binary search tree (BST). Again, I show the pre-processing steps for the case p_a < p_b only. Another set of pre-processing steps needs to be done for the other case:
We will use a self-balancing BST, in which each node also contains the following fields:
a) field_1: This denotes the size of the left sub-tree. This value will be updated on every insert operation, if the size of the left sub-tree changes.
b) field_2: This denotes the number of elements < node.value that the tree has at the moment the node is inserted. This value is initialized once when the node is inserted and does not change thereafter. I have added a small explanation of how this is achieved in Addendum A. This field is basically our pre-processing, which will determine f(a, b).
With all of this, for each index i, where 1 <= i <= n, do the following: create a new tree. Insert the p_j values into the tree one by one, where (i < j <= n) && (p_i < p_j). (Please note we are not inserting values where p_i > p_j.) The method given in Addendum A will make sure we find f(i, j) while inserting.
There will be n such pre-processed trees, one for every index. For finding f(a, b): we need to look into the a-th tree and search for node.value = p_b. This node's field_2 = f(a, b).
The complexity of insertion is O(logN). So, the total pre-processing computation = O(N * N(logN)). Search is O(logN), so the query complexity is O(q * logN). Total complexity = O(N^2) + O(N * N (logN)) + O(q * logN), which turns out to be ~10^7.
==============================================================================
Addendum A: How to populate field_2 while inserting a node:
i) Insert the node, and balance the tree. Update field_1 as required.
ii) Initialize ans = 0. Traverse the BST from the root, searching for your node.
iii) do {
If node.value <= search_key_b, ans += node.left_subtree_size + 1, then move right (unless this is the node being searched for); otherwise move left
} while(!node.found)
iv) ans -= 1 (the found node counted itself in step iii)
We can solve this in O(n log n) space and O(n log n + Q * log^2(n)) time with a merge-sort tree. The merge-sort tree allows us to find the number of elements inside a subarray that are greater than or lower than an input number in O(log^2(n)) time and O(n log n) space.
First we record the total number of inversions in O(n log n) time, for which there are known methods. To query the effect of a swap bound by left and right, consider the subarray between:
subtract the number of elements greater than right in the subarray (those will no longer be inversions)
subtract the number of elements smaller than left in the subarray (those will no longer be inversions)
add the number of elements greater than left in the subarray (those will be new inversions)
add the number of elements smaller than right in the subarray (those will be new inversions)
if right > left, add 1
if left > right, subtract 1
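The steps above can be sketched as follows. This is a minimal merge-sort tree, not a tuned implementation; the names MergeSortTree, countLess and inversionsAfterSwap are my own, positions are 0-based, left and right are taken to be the values at the two swapped positions, and the original inversion count base is assumed to be computed once by the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Merge-sort tree: node v stores the sorted values of its segment.
struct MergeSortTree {
    int n;
    std::vector<std::vector<int>> node;

    explicit MergeSortTree(const std::vector<int>& a)
        : n(static_cast<int>(a.size())), node(4 * std::max<std::size_t>(a.size(), 1)) {
        if (n > 0) build(1, 0, n - 1, a);
    }

    void build(int v, int lo, int hi, const std::vector<int>& a) {
        if (lo == hi) { node[v] = {a[lo]}; return; }
        const int mid = (lo + hi) / 2;
        build(2 * v, lo, mid, a);
        build(2 * v + 1, mid + 1, hi, a);
        std::merge(node[2 * v].begin(), node[2 * v].end(),
                   node[2 * v + 1].begin(), node[2 * v + 1].end(),
                   std::back_inserter(node[v]));
    }

    // Number of values strictly less than x at positions l..r (inclusive).
    int countLess(int l, int r, int x) const {
        return l > r ? 0 : query(1, 0, n - 1, l, r, x);
    }

    int query(int v, int lo, int hi, int l, int r, int x) const {
        if (r < lo || hi < l) return 0;
        if (l <= lo && hi <= r)
            return static_cast<int>(
                std::lower_bound(node[v].begin(), node[v].end(), x) - node[v].begin());
        const int mid = (lo + hi) / 2;
        return query(2 * v, lo, mid, l, r, x) + query(2 * v + 1, mid + 1, hi, l, r, x);
    }
};

// Inversion count after swapping positions a and b; base is the original count.
long long inversionsAfterSwap(const std::vector<int>& p, const MergeSortTree& t,
                              long long base, int a, int b) {
    assert(0 <= a && a < t.n && 0 <= b && b < t.n);
    if (a == b) return base;
    if (a > b) std::swap(a, b);
    const int left = p[a], right = p[b];
    const int len = b - a - 1;  // size of the subarray strictly between a and b
    const int lessLeft = t.countLess(a + 1, b - 1, left);
    const int lessRight = t.countLess(a + 1, b - 1, right);
    // Values are distinct, so "greater" is everything that is not "less".
    const int greaterLeft = len - lessLeft;
    const int greaterRight = len - lessRight;
    long long delta = -greaterRight - lessLeft + greaterLeft + lessRight;
    delta += (right > left) ? 1 : -1;
    return base + delta;
}
```

Each query makes two countLess calls, each of which binary-searches O(log n) nodes, which is where the O(log^2(n)) per query comes from.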

how do I get all combinations of elements in a list?

Given a list of X elements, how can I get all doubles, triples, ... (Y-element) combinations of these elements?
Y is the size of the required combinations. Ex: if Y = 2, I need to get all of the possible pairs.
I must not give the same combination twice (ex: [a, b] and [b, a] are the same combination).
Take a copy of the list.
If the list is empty, there are no combinations.
To get all combinations of size one, look at each element in turn.
To get all combinations of size n+1, first remove the first element. Then get all combinations of size n of the rest of the list, plus that first element. Then get all combinations of size n+1 of the rest of the list, and don't add the first element.
And then you are done.
You can get fancy and merely pretend to copy/remove elements for optimization sake.
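A minimal sketch of this recursion, under a few assumptions of mine: elements are ints, the "pretend to remove" optimization is done with a start index instead of copying, and the base case is size zero rather than size one (the empty combination), which keeps the code shorter. The names combine and combinations are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends to `out` every combination of k elements taken from items[start..],
// each one prefixed by the elements already chosen in `current`.
void combine(const std::vector<int>& items, std::size_t start, std::size_t k,
             std::vector<int>& current, std::vector<std::vector<int>>& out) {
    assert(start <= items.size());
    if (k == 0) { out.push_back(current); return; }
    if (items.size() - start < k) return;  // not enough elements left
    // Branch 1: take the first remaining element, then choose k-1 from the rest.
    current.push_back(items[start]);
    combine(items, start + 1, k - 1, current, out);
    current.pop_back();
    // Branch 2: skip the first remaining element, still choose k from the rest.
    combine(items, start + 1, k, current, out);
}

// All combinations of exactly k elements of `items`.
std::vector<std::vector<int>> combinations(const std::vector<int>& items, std::size_t k) {
    std::vector<std::vector<int>> out;
    std::vector<int> current;
    combine(items, 0, k, current, out);
    return out;
}
```

Because elements are only ever taken in list order, [a, b] and [b, a] can never both appear.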
You can iterate t from 2 to Y, and create an array A of size X filled with X-t 0s at the front and t 1s at the back, then run the code below:
do{
    //1s in array A now correspond to a valid combination
}while(std::next_permutation(A,A+X));
The loop will stop when all combinations of size t have been iterated.
next_permutation is in the header <algorithm>. It reorders the array into the next lexicographically greater permutation, or returns false if the array is already the lexicographically greatest permutation. Its complexity is O(n), and since you also need to iterate through the array once to read the combination, it is not a problem. The total complexity of the whole process is bounded by O(2^n*n).
So here is some example pseudocode:
D[X] = {1,2,3,4} Y = 3 //the input
For t = 2,3,..,Y
    A[X] = {0,...,0,1,...,1} // X - t 0s and t 1s
    Do
        For j = 0,1,...,X-1
            if A[j] == 1
                output D[j]
            end if
        end for
        output newline
    While next_permutation(A,A+X)
end for
The output will look like
3 4
2 4
2 3
1 4
1 3
1 2
2 3 4
1 3 4
1 2 4
1 2 3
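For reference, the pseudocode above could be written in C++ roughly as follows. This is a sketch that collects the combinations of one size t in a vector instead of printing them; the function name combinationsOfSize is my own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// All combinations of exactly t elements of d, in the order in which
// std::next_permutation visits the 0/1 masks.
std::vector<std::vector<int>> combinationsOfSize(const std::vector<int>& d, std::size_t t) {
    assert(t <= d.size());
    // Mask with X - t zeros at the front and t ones at the back.
    std::vector<int> mask(d.size(), 0);
    std::fill(mask.end() - static_cast<std::ptrdiff_t>(t), mask.end(), 1);
    std::vector<std::vector<int>> result;
    do {
        std::vector<int> combo;
        for (std::size_t j = 0; j < d.size(); ++j)
            if (mask[j] == 1) combo.push_back(d[j]);  // a 1 selects D[j]
        result.push_back(combo);
    } while (std::next_permutation(mask.begin(), mask.end()));
    return result;
}
```

Calling it for t = 2 and then t = 3 on {1, 2, 3, 4} yields the combinations in the same order as the listing above, starting with 3 4.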

Intuition behind using the Cartesian product for finding number of unique BSTs

I am solving a LeetCode question. The question is:
Given n, how many structurally unique BSTs can be generated, that store the values from 1...n? For e.g., for n=3, a total of 5 unique BSTs can be generated as follows:
   1         3     3      2      1
    \       /     /      / \      \
     3     2     1      1   3      2
    /     /       \                 \
   2     1         2                 3
The maximum upvoted solution makes use of DP and the following recursive formula:
G(n) = G(0) * G(n-1) + G(1) * G(n-2) + … + G(n-1) * G(0)
where G(n) represents the number of unique BSTs that can be generated for n. The code is as follows:
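#include <vector>
using std::vector;
class Solution {
public:
    int numTrees(int n) {
        vector<int> G(n+1);  // G[i] = number of unique BSTs with i nodes (assumes n >= 1)
        G[0]=G[1]=1;
        for(int i=2; i<=n; i++)
            for(int j=1; j<=i; j++)       // j is the value placed at the root
                G[i]+=G[j-1]*G[i-j];      // j-1 nodes on the left, i-j on the right
        return G[n];
    }
};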
While I more-or-less do understand what is going on, I didn't understand why we take a Cartesian product (instead of simple addition, which is more intuitive). As per my understanding:
G[i] += G[j-1] * G[i-j];
should instead be:
G[i] += G[j-1] + G[i-j]; //replaced '*' with a '+'
This is so because, I think the number of unique BSTs possible with i as the current root should be the sum(?) of the number of BSTs for its left and right subtrees. I did try a few examples but somehow the numbers get multiplied magically in the original solution (with a *) and the final answer appears in G[n].
Could someone please provide an intuitive explanation for using Cartesian product instead of sum?
Note: The original question is here and the solution is here. Also, the original code is in Java while I have posted the C++ variation that I wrote above.
You can go by mathematical induction and then apply it to the sub-problems to get the result. Or simply just check for small values and then go for higher values.
For example:-
No. of nodes      BST representation
1 -->               [1]
2 -->               [2]            [1]
                    /                \
                  [1]                [2]
3 -->               [1]
                      \
                      [2]
                        \
                        [3]
                    [2]
                   /   \
                 [1]   [3]
                    [3]
                    /
                  [2]
                  /
                [1]
4 -->
                    [1]
                   /   \
              NUM{}     NUM{2,3,4}      (NUM of keys with 3 values)
                    [2]
                   /   \
             NUM{1}     NUM{3,4}
                    [3]
                   /   \
           NUM{1,2}     NUM{4}
                    [4]
                   /   \
         NUM{1,2,3}     NUM{}
From the 4th case you can clearly see that, for each choice of root, we simply multiply the number of possible ways to arrange the left subtree by the number of ways to arrange the right subtree. Then, for a given number of values, we add these products over all choices of root. That's why the Cartesian product is being used.
The product basically gives us all possible arrangements the whole tree can have.
For example:
G[i] += G[j-1] * G[i-j]; Here j-1 nodes are in the left sub-tree (we can assume this without loss of generality) and i-j nodes are in the right sub-tree. You can arrange the left sub-tree in G[j-1] ways and, similarly, the right sub-tree in G[i-j] ways. Now think: in how many ways can you arrange the original tree, which has this left and right subtree? The counts multiply, because each combination of a left and a right subtree gives rise to a unique tree representation.
This also explains why we define G[0]=1: it conforms to the way we do things here. The arrangement with no values is also an arrangement, so it is counted as 1.
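One way to convince yourself is to enumerate the trees and compare the count with the DP. The sketch below is only an illustration: allTrees is a name I made up, trees are serialized as strings of the form "(left root right)", and numTrees is rewritten as a free function for comparison.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Every BST over the keys lo..hi, serialized as "(left root right)".
// The empty tree is serialized as "".
std::vector<std::string> allTrees(int lo, int hi) {
    if (lo > hi) return std::vector<std::string>(1, "");  // exactly one empty tree: G[0] = 1
    std::vector<std::string> trees;
    for (int root = lo; root <= hi; ++root) {
        const std::vector<std::string> lefts = allTrees(lo, root - 1);
        const std::vector<std::string> rights = allTrees(root + 1, hi);
        // Cartesian product: every left shape pairs with every right shape,
        // so this root contributes lefts.size() * rights.size() trees.
        for (const std::string& l : lefts)
            for (const std::string& r : rights)
                trees.push_back("(" + l + " " + std::to_string(root) + " " + r + ")");
    }
    return trees;
}

// The DP from the question, for comparison.
int numTrees(int n) {
    assert(n >= 0);
    std::vector<int> G(n + 1, 0);
    G[0] = 1;
    for (int i = 1; i <= n; ++i)
        for (int j = 1; j <= i; ++j)
            G[i] += G[j - 1] * G[i - j];
    return G[n];
}
```

The nested loop over lefts and rights is the product; the outer loop over root is the sum. Replacing the product with a sum would count left shapes and right shapes separately instead of counting whole trees.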

Count the number of paths in DAG with length K

I have a DAG with 2^N nodes, with values from 0 to 2^N-1. There is edge from x to y if x < y and x (xor) y = 2^p, x and y being the node values and p a non-negative integer.
Since N can be as large as 100000, generating the graph and then proceeding with the counting would take too much computation time. Is there any way to count the paths of a certain length K (K being the number of edges between two nodes)? Stated differently, is there an equation of some sort for this kind of counting?
Thanks in advance
Michael's got some good insights, but I'm not sure I follow his entire argument. Here's my solution.
Let's say N=4, K=2. So the nodes range from 0 (0000₂) to 15 (1111₂).
Now let's consider node 2 (0010₂). There's an edge from 2 to 3 (0011₂) because 2 < 3 and xor(2,3) = 1 = 2^0. There's also an edge from 2 to 6 because 2 < 6 and xor(2,6) = 4 = 2^2. And there's an edge from 2 to 10 because 2 < 10 and xor(2,10) = 8 = 2^3.
To generalize: for any x, consider all of the 0 bits in x. By flipping any of the 0 bits to 1, you get a number y that's larger than x and differs from x by one bit. So there's an edge from x to that y.
The number of 1 bits in x is typically called the population count of x. I'll use pop(x) to mean the population count of x.
We're dealing with N-bit numbers (when we include leading zeroes), so the number of 0 bits in x is N - pop(x).
Let's use the term “j-path” to mean a path of length j. We want to count the number of K-paths.
Every node x has N - pop(x) outgoing edges. Each of these edges is a 1-path.
Let's consider node 5 (0101₂). Node 5 has an edge to 7 (0111₂), and node 7 has an edge to 15 (1111₂). Node 5 also has an edge to 13 (1101₂), and node 13 has an edge to 15 (1111₂). So there are two 2-paths out of node 5: 5-7-15 and 5-13-15.
Next let's look at node 2 (0010₂) again. Node 2 has an edge to 3 (0011₂), which has edges to 7 (0111₂) and 11 (1011₂). Node 2 also has an edge to node 6 (0110₂), which has edges to 7 (0111₂) and 14 (1110₂). Finally, node 2 has an edge to node 10 (1010₂), which has edges to 11 (1011₂) and 14 (1110₂). In all, there are six 2-paths out of node 2: 2-3-7, 2-3-11, 2-6-7, 2-6-14, 2-10-11, and 2-10-14.
The pattern is that, for any node x with z bits set to zero, where z ≥ K, there are some K-paths out of x. To find a K-path out of x, you pick any K of the zero bits. Flipping those bits to 1, one by one, gives you the path. You can flip the bits in any order you want; each order gives a different path.
When you want to pick k items, in a specific order, from a set of n items, that's called an ordered sample without replacement, and there are n! / (n-k)! ways to do it. This is often written nPk, but it's easier to type P(n,k) here.
So, the nodes that have exactly 2 zero bits have P(2,2) = 2! / (2-2)! = 2 2-paths out of them. (Note that 0! = 1.) The nodes that have exactly 3 zero bits have P(3,2) = 3! / 1! = 6 2-paths out of them. The node that has exactly 4 zero bits has P(4,2)= 4! / 2! = 12 2-paths out of it. (Since I'm using N=4 for the example, there is only one node with exactly 4 zero bits, which is node 0.)
But then we need to know, how many nodes have exactly 2 zero bits? Well, when there are n items to choose from, and we want to choose k of them, and we don't care about the order of the chosen items, that's called an unordered sample without replacement, and there are n! / (k! (n-k)!) ways to do it. This is called “n choose k”, and it's usually written in a way that I can't reproduce on stack overflow, so I'll write it as C(n,k).
For our example with N=4, there are C(4,2) = 6 nodes with exactly 2 bits set to zero. These nodes are 3 (0011₂), 5 (0101₂), 6 (0110₂), 9 (1001₂), 10 (1010₂), and 12 (1100₂). Each of these nodes has P(2,2) 2-paths out of it, so that means there are C(4,2) * P(2,2) = 6 * 2 = 12 2-paths out of nodes with exactly two 0 bits.
Then there are C(4,3) = 4 nodes with exactly 3 bits set to zero. These nodes are 1 (0001₂), 2 (0010₂), 4 (0100₂), and 8 (1000₂). Each of these nodes has P(3,2) 2-paths out of it, so there are C(4,3) * P(3,2) = 4 * 6 = 24 2-paths out of nodes with exactly three 0 bits.
Finally, there is C(4,4) = 1 node with exactly 4 bits set to zero. This node has P(4,2) = 12 2-paths out of it.
So the total number of 2-paths when N=4 is C(4,2)*P(2,2) + C(4,3)*P(3,2) + C(4,4)*P(4,2) = 12 + 24 + 12 = 48.
For general N and K (where K ≤ N), the number of K-paths is the sum of C(N,z) * P(z,K) for K ≤ z ≤ N.
I can type that into Wolfram Alpha (or Mathematica) like this:
Sum[n!/(z! (n - z)!) z!/(z - k)!, {z, k, n}]
And it simplifies it to this:
2^(n-k) n! / (n-k)!
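For small N the closed form can be compared against a direct count. The sketch below makes several assumptions of its own: the function names are invented, the brute force is only meant for N up to about 20, and results are kept in 64-bit integers, which overflow long before N reaches 100000 (the question does not say how such large counts should be reported).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts K-paths by dynamic programming over path length:
// paths[x] = number of paths with `step` edges that start at node x.
std::uint64_t countPathsBruteForce(int n, int k) {
    assert(0 <= n && n <= 20 && k >= 0);
    const std::size_t nodes = std::size_t{1} << n;
    std::vector<std::uint64_t> paths(nodes, 1);  // one 0-path starts at every node
    for (int step = 0; step < k; ++step) {
        std::vector<std::uint64_t> next(nodes, 0);
        for (std::size_t x = 0; x < nodes; ++x)
            for (int p = 0; p < n; ++p)
                if (((x >> p) & 1) == 0)  // flipping a 0 bit gives an edge x -> y
                    next[x] += paths[x | (std::size_t{1} << p)];
        paths.swap(next);
    }
    std::uint64_t total = 0;
    for (std::uint64_t c : paths) total += c;
    return total;
}

// Closed form from the answer: 2^(n-k) * n! / (n-k)!, for 0 <= k <= n.
std::uint64_t countPathsFormula(int n, int k) {
    assert(0 <= k && k <= n && n - k < 64);
    std::uint64_t result = std::uint64_t{1} << (n - k);
    for (int i = n - k + 1; i <= n; ++i)  // n! / (n-k)! = (n-k+1) * ... * n
        result *= static_cast<std::uint64_t>(i);
    return result;
}
```

For N=4 and K=2 both functions give 48, matching the worked example.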
The stated problem seems to be equivalent to this one:
Consider the set of all possible binary strings of length N. Consider operation Fi that flips i-th bit from 0 to 1. For strings x & y denote |x| the number of set bits, x
It's easy to see that one can obtain y from x by a series of exactly K operations Fi if and only if (x,y) is K-admissible. Moreover, if we fix x and sum up over all y such that (x,y) is K-admissible, counting every order in which the K bits can be flipped, we get (N-|x|)!/(N-|x|-K)! paths.
Finally, we need to sum up over all x with |x|<=(N-K). For a given choice of |x| we have N!/((N-|x|)!|x|!) possible choices of x. Combine with the above and you get that for the given |x| there are N!/(|x|!(N-|x|-K)!) possible paths.
Denote |x|=M, with M from 0 to N-K, and your answer is the sum over all M of N!/(M!(N-M-K)!). For N=4, K=2 this gives 12 + 24 + 12 = 48.

To find the min and max after addition and subtraction from a range of numbers

I have an algorithm question in which the numbers from 1 to N are given, a number of operations are performed on them, and then the min/max among them has to be found.
There are two operations: addition and subtraction.
Operations have the form a b c d, where a is the operation to be performed, b is the starting number, c is the ending number, and d is the number to be added/subtracted.
for example
suppose numbers are 1 to N
and
N =5
1 2 3 4 5
We perform operations as
1 2 4 5
2 1 3 4
1 4 5 6
By these operations we will have numbers from 1 to N as
1 7 8 9 5
-3 3 4 9 5
-3 3 4 15 11
So the maximum is 15 and min is -3
My Approach:
I stored the numbers from the lower limit to the upper limit (1 to 5 in this case) in an array, applied the operations to it, and then found the minimum and maximum.
Could there be any better approach?
I will assume that all update (addition/subtraction) operations happen before finding max/min. I don't have a good solution for when update and min/max operations are mixed together.
You can use a plain array, where the value at index i is the difference between the values at index i and index (i - 1) of the original array. This makes the sum from index 0 to index i of our array equal to the value at index i of the original array.
Subtraction is addition of the negated number, so the two can be treated the same way. When we need to add k to the original array from index i to index j, we add k at index i of our array and subtract k at index (j + 1) of our array. This takes O(1) time per update.
You can find the min/max of the original array by accumulating the values of our array and recording the max/min of the running sum. This takes O(n) time per min/max operation. I assume this is done once for the whole array.
Pseudocode:
a[N]     // Original array
d[N + 1] // Difference array (one extra slot so that d[to_idx + 1] is always valid)
// Initialization
d[0] = a[0]
for (i = 1 to N-1)
    d[i] = a[i] - a[i - 1]
d[N] = 0
// Addition (subtraction is similar)
add(from_idx, to_idx, amount) {
    d[from_idx] += amount
    d[to_idx + 1] -= amount
}
// Find max/min for the WHOLE array after add/subtract
current = max = min = d[0];
for (i = 1 to N - 1) {
    current += d[i]; // Sum from d[0] to d[i] is a[i]
    max = MAX(max, current);
    min = MIN(min, current);
}
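The pseudocode can be turned into a small C++ function and run on the example from the question. This is a sketch under some assumptions of mine: the struct RangeUpdate and the function minMaxAfterUpdates are invented names, ranges are 1-based and inclusive, and operation code 1 means add while 2 means subtract, which is inferred from the example rather than stated in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct RangeUpdate {
    int op;            // 1 = add, 2 = subtract (encoding inferred from the example)
    int from, to;      // 1-based inclusive range
    long long amount;
};

// Returns {min, max} of the numbers 1..n after applying all updates.
std::pair<long long, long long> minMaxAfterUpdates(int n, const std::vector<RangeUpdate>& updates) {
    assert(n >= 1);
    // d[i] = a[i] - a[i-1] with a[0] = 0; one extra slot so d[to + 1] is valid.
    std::vector<long long> d(static_cast<std::size_t>(n) + 2, 0);
    for (int i = 1; i <= n; ++i) d[i] = 1;  // a[i] = i, so every difference is 1
    for (const RangeUpdate& u : updates) {
        assert(1 <= u.from && u.from <= u.to && u.to <= n);
        const long long delta = (u.op == 1) ? u.amount : -u.amount;
        d[u.from] += delta;     // O(1) per update
        d[u.to + 1] -= delta;
    }
    long long current = 0, lo = 0, hi = 0;
    for (int i = 1; i <= n; ++i) {
        current += d[i];  // prefix sum of d is the current value of number i
        lo = (i == 1) ? current : std::min(lo, current);
        hi = (i == 1) ? current : std::max(hi, current);
    }
    return {lo, hi};
}
```

For N = 5 and the three operations 1 2 4 5, 2 1 3 4, 1 4 5 6 this returns {-3, 15}, the min and max from the question.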
Generally there is no "best way" to find the min/max from a performance point of view, because it depends on how the application will be used.
-Finding the max and min in a list needs O(n) time, so if you want to run many operations (many relative to the size of the input), your approach of finding the min/max after all the operations have taken place is fine.
-But if the list holds many elements and you don't want to run that many operations, you are better off checking after each operation whether its result is a new max/min and updating if necessary.