Is dilation/erosion with a fixed kernel for a number of iterations the same as dilating/eroding with an equivalent bigger kernel? - c++

While going through the OpenCV source code, I noticed that for more than one iteration it just creates a bigger kernel and does a single iteration.
So my question is: if we take a SQUARE structuring element of size 3x3 and dilate/erode with it for three iterations, will that be the same as dilating/eroding once with a 9x9 kernel?
if( iterations > 1 && countNonZero(kernel) == kernel.rows*kernel.cols )
{
    anchor = Point(anchor.x*iterations, anchor.y*iterations);
    kernel = getStructuringElement(MORPH_RECT,
                                   Size(ksize.width + (iterations-1)*(ksize.width-1),
                                        ksize.height + (iterations-1)*(ksize.height-1)),
                                   anchor);
    iterations = 1;
}
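For reference, a quick way to check this empirically (a sketch assuming the OpenCV C++ API; the test image is arbitrary) is to compare three iterations of the 3x3 MORPH_RECT kernel against a single pass with the kernel the code above would construct for iterations = 3, i.e. Size(3 + 2*(3-1), 3 + 2*(3-1)) = 7x7:

// Sketch (not from the OpenCV docs): compare 3 iterations of a 3x3 rect kernel
// against a single pass with the collapsed 7x7 kernel built by the code above.
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    cv::Mat src = cv::Mat::zeros(64, 64, CV_8U);
    cv::circle(src, cv::Point(32, 32), 5, cv::Scalar(255), -1);   // arbitrary test blob

    cv::Mat k3 = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
    cv::Mat k7 = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(7, 7));

    cv::Mat iterated, collapsed, diff;
    cv::dilate(src, iterated, k3, cv::Point(-1, -1), 3);   // three 3x3 iterations
    cv::dilate(src, collapsed, k7);                        // one 7x7 pass
    cv::absdiff(iterated, collapsed, diff);
    std::cout << "differing pixels: " << cv::countNonZero(diff) << std::endl;   // expected: 0
}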

Referring to Jordi's answer:
[Quoted] ... Note, however, that this does not hold for all structuring elements...
In fact, it does hold, in the following way (just not with the kernel used in Jordi's example):
First step: calculate the equivalent 5x5 kernel by dilating twice with the 3x3 kernel on a 5x5 source image containing a single centre point:
00000          00000          00100
00000   010    00100   010    01110
00100 + 111 -> 01110 + 111 -> 11111   ===> this is the equivalent 5x5 kernel for 2x 3x3 dilation
00000   010    00100   010    01110
00000          00000          00100
Then applying the original 3x3 dilation kernel twice is equivalent to applying this 5x5 dilation kernel once on a bigger image. For example:
0000000000                 0000000000   00100
0000000000   010   010     0000000000   01110
0011100000 + 111 + 111 === 0011100000 + 11111
0000001000   010   010     0000001000   01110
0000000000                 0000000000   00100
0000000000                 0000000000
This does not directly answer your question, though. However, I cannot just post a comment, as it is very hard (if not impossible) to format all these equations/explanations there.
In fact, a proof for binary images (images where each pixel is either 0 or 1) that the larger combined kernel works for dilation is easy:
Let's define the binary operator + to be the dilation operator, where the first operand is the kernel and the second operand is the image to be dilated. So, if we want to dilate image I with kernel K, we write dilated-image = K + I.
Let's define the binary operator U to be the union operator, or, in other words, the per-pixel binary 'OR' operator, where the two operands of U must be binary images of the same dimensions. For example, A U B means doing OR on each pair of corresponding pixels of A and B:
A = 0 0 1    B = 0 1 1
    1 0 1        1 1 1
    1 1 0        0 1 0
Then
A U B = 0 1 1
        1 1 1
        1 1 0
We also define U A(i), i=1, ..., n to be A(1) U A(2) U ... U A(n).
Let's define K^n to be the dilation-style larger kernel obtained by applying kernel K n times to a single centre point image.
Note that any image I can be decomposed into a union of single-point images. For example,
    0 1 0       0 1 0     0 0 0     0 0 0
I = 0 0 0  ===  0 0 0  U  0 0 0  U  0 0 0
    1 0 1       0 0 0     1 0 0     0 0 1
Now it's time to prove it:
For any image I, we define D(i), i = 1, ..., n to be the single point decomposition of I,
and thus I = U D(i), i = 1, ..., n
By definition of the binary dilation, K + I == K + (U D(i)) == U (K+D(i)).
(Remember that dilation amounts to placing kernel K on every 1-pixel of I and marking all the corresponding 1's.)
Now, let's see what K + (K + I) is:
K + (K + I) == K + U (K + D(i))
            == U (K + (K + D(i)))    (Note: this step is tricky; see Theorem 1 below)
            == U (K^2 + D(i))        (by definition of K^2)
            == K^2 + U D(i)          (by definition of the dilation)
            == K^2 + I               (since I = U D(i))
Now we know K + (K + I) == K^2 + I, and it's easy to apply mathematical induction to prove that K + K + ... + K + I == K^n + I (note: apply right associativity, as I have dropped the parentheses).
Theorem 1: proof of the step from K + U (K + D(i)) to U (K + (K + D(i)))
It suffices to prove that, for any two binary images A and B of the same dimensions,
K + (A U B) = (K + A) U (K + B)
It's quite easy to see that if we decompose images A and B and apply kernel K to the decomposed single-point images, the common points (i.e. the intersection of A and B, the pixels where both are 1) contribute exactly the same resulting points after applying kernel K. And by the definition of dilation, we take the union of all points contributed by each decomposed image of A and B. Thus Theorem 1 holds.
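As a quick numerical spot-check of Theorem 1 (a sketch assuming the OpenCV C++ API, with random 0/1 images A and B; this is not a substitute for the argument above):

// Sketch: check K + (A U B) == (K + A) U (K + B) for random binary images.
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    cv::Mat A(50, 50, CV_8U), B(50, 50, CV_8U);
    cv::randu(A, cv::Scalar::all(0), cv::Scalar::all(2));   // random 0/1 pixels
    cv::randu(B, cv::Scalar::all(0), cv::Scalar::all(2));
    cv::Mat K = cv::getStructuringElement(cv::MORPH_CROSS, cv::Size(3, 3));

    cv::Mat AorB, lhs, dA, dB, rhs, diff;
    cv::bitwise_or(A, B, AorB);
    cv::dilate(AorB, lhs, K);        // K + (A U B)
    cv::dilate(A, dA, K);
    cv::dilate(B, dB, K);
    cv::bitwise_or(dA, dB, rhs);     // (K + A) U (K + B)
    cv::absdiff(lhs, rhs, diff);
    std::cout << "differing pixels: " << cv::countNonZero(diff) << std::endl;   // expected: 0
}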
=== UPDATE ===
Regarding kid.abr's comment "27 operations compared to 7x7 kernel with 49 operations":
Generally speaking, it is not 27 operations; it depends. For example, take a 100x100-pixel source image with 20 singular points (1's) sparsely distributed. Applying a 3x3 solid kernel (i.e. all 1's) to it 3 times requires the following steps for each of the 20 singular points:
Loop 1: 9 operations, and generate 9 points.
Loop 2: For each of the 9 points generated, it needs 9 operations => 9 x 9 = 81 steps. And it generates 25 points
Loop 3: For each of the 25 points generated, it needs 9 operations => 25 x 9 = 225 steps.
Total: 9 + 81 + 225 = 315 steps.
Please note that when we visit a pixel with value 0 in the source image, we don't need to apply the kernel at that point, right?
So for the same case, applying the larger kernel requires 7x7 = 49 steps per singular point.
Yet, if the source image has a large solid area of 1's, the 3-step method wins.

Short answer: with a square structuring element, yes.
Long answer: you need to consider what the erosion/dilation operations do. Dilation, for instance, moves the kernel over the image and sets its centre to 1 whenever any of its grid positions are 1 (I'm assuming binary images, it works the same for greyscale). Increasing the distance between the centre of the structuring element and its edges is then the same as increasing the size of the kernel.
Note, however, that this does not hold for all structuring elements. Suppose you take a plus-shaped structuring element, whose larger version is just a stretched plus: obviously dilating twice with size 3 is not the same as dilating once with size 5:
00000          00000          00100
00000   010    00100   010    01110
00100 + 111 -> 01110 + 111 -> 11111
00000   010    00100   010    01110
00000          00000          00100

00000   00100    00100
00000   00100    00100
00100 + 11111 -> 11111
00000   00100    00100
00000   00100    00100
Of course, this does work if we define the scaled version of plus as a square without its corners (as it usually would be). I think that in general this shortcut works when the kernel of size k+1 is the dilated version of the kernel of size k, but I have no proof for this.
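This is easy to try directly in OpenCV as well (a sketch using standard cv:: calls; the image size and centre point are arbitrary): two passes of the 3x3 MORPH_CROSS give a diamond, while one pass of the 5x5 MORPH_CROSS gives a plus, so the outputs differ.

// Sketch: two 3x3 cross dilations (a diamond) vs. one 5x5 cross dilation (a plus).
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    cv::Mat src = cv::Mat::zeros(11, 11, CV_8U);
    src.at<uchar>(5, 5) = 1;                                // single centre point

    cv::Mat cross3 = cv::getStructuringElement(cv::MORPH_CROSS, cv::Size(3, 3));
    cv::Mat cross5 = cv::getStructuringElement(cv::MORPH_CROSS, cv::Size(5, 5));

    cv::Mat twice, once, diff;
    cv::dilate(src, twice, cross3, cv::Point(-1, -1), 2);   // 13 pixels set (diamond)
    cv::dilate(src, once, cross5);                          // 9 pixels set (plus)
    cv::absdiff(twice, once, diff);
    std::cout << "differing pixels: " << cv::countNonZero(diff) << std::endl;   // expected: 4
}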

Short answer for a general kernel: Yes for dilation/erosion, but not necessarily with an equivalent kernel.
From wikipedia:
Dilation: (A⊕B)⊕C = A⊕(B⊕C)
Erosion: (A⊖B)⊖C = A⊖(B⊕C)
Where ⊕ denotes the morphological dilation, and ⊖ denotes the morphological erosion.
Basically, performing erosion/dilation on image A with kernel B and then kernel C is equivalent to performing erosion/dilation on image A with the kernel obtained by dilating B with C. This can easily be extended to an arbitrary number of erosions/dilations.
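As a sketch of how one might use this with OpenCV (the kernels and test shape below are arbitrary choices): "dilating B with C" can itself be done with cv::dilate by treating the kernel B as a small image, padded first so the result is not cropped; the combined kernel then reproduces the two-step erosion.

// Sketch: (A erode B) erode C  ==  A erode (B dilate C).
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    cv::Mat A = cv::Mat::zeros(64, 64, CV_8U);
    cv::rectangle(A, cv::Point(20, 20), cv::Point(44, 44), cv::Scalar(255), -1);   // test shape

    cv::Mat B = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5));
    cv::Mat C = cv::getStructuringElement(cv::MORPH_CROSS, cv::Size(3, 3));

    // Build the combined kernel B (dilate) C, padding B so nothing is cropped.
    cv::Mat Bpad, BC;
    cv::copyMakeBorder(B, Bpad, 1, 1, 1, 1, cv::BORDER_CONSTANT, cv::Scalar(0));
    cv::dilate(Bpad, BC, C);                 // 7x7 combined kernel

    cv::Mat tmp, twoStep, oneStep, diff;
    cv::erode(A, tmp, B);
    cv::erode(tmp, twoStep, C);              // (A erode B) erode C
    cv::erode(A, oneStep, BC);               // A erode (B dilate C)
    cv::absdiff(twoStep, oneStep, diff);
    std::cout << "differing pixels: " << cv::countNonZero(diff) << std::endl;   // expected: 0
}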

Related

How do I optimize my code to find the product of all the contiguous subsequences of an array?

This is my attempt to count the contiguous subsequences of an array whose product mod 4 is not equal to 2:
#include <iostream>
using namespace std;

int main() {
    long long int n, i, j, s, t, count = 0;
    cin >> n;
    long long int arr[n];
    count = 0;
    for(i = 0; i < n; i++) {
        cin >> arr[i];
    }
    for(i = 0; i < n; i++) {
        s = 1;
        for(j = i; j < n; j++) {
            s = s * arr[j];
            if(s % 4 != 2) {
                count++;
            }
        }
    }
    cout << count;
    return 0;
}
However, I want to reduce the time taken by my code to execute. I am looking for a way to do it. Any help/hint would be appreciated.
Thank you.
What does this definition of contiguous subsequences mean?
Listing all the subsequences
Suppose we have the sequence:
A B C D E F
First of all, we should recognize that there is one substring for every unique start and end point. Let's use the notation C-F to mean all items from C through F: i.e.: C D E F.
We can list all subsequences in a triangular arrangement like this:
A B C D E F
A-B B-C C-D D-E E-F
A-C B-D C-E D-F
A-D B-E C-F
A-E B-F
A-F
The first row lists all the subsequences of length 1.
The second row lists all the subsequences of length 2.
The third row lists all the subsequences of length 3. Etc.
The last row is the full sequence.
Modular arithmetic
Computing the product MOD 4 of a set of numbers
To figure out the product of a bunch of numbers MOD 4, we just need to look at each element of the set MOD 4. Intuitively, this is because when you multiply a bunch of numbers, the last digit of the result is determined entirely by the last digit of each factor. In this case "the last digit base 4" is the number mod 4.
The identity we are using is:
(A * B) MOD N == ((A MOD N) * (B MOD N)) MOD N
The table of products
Now we also have to look at the matrix of possible multiplications that might happen. It's a fairly small table and the interesting entries are given here:
2 * 2 = 4    4 MOD 4 = 0
2 * 3 = 6    6 MOD 4 = 2
3 * 3 = 9    9 MOD 4 = 1
So the results of multiplying any 2 numbers MOD 4 are given by this table:
+--------+---+---+---+---+
| Factor | 0 | 1 | 2 | 3 |
+--------+---+---+---+---+
|   0    | 0 | / | / | / |
|   1    | 0 | 1 | / | / |
|   2    | 0 | 2 | 0 | / |
|   3    | 0 | 3 | 2 | 1 |
+--------+---+---+---+---+
The /'s are omitted because of the symmetry of multiplication (A * B = B * A)
An example sequence
Now for each subsequence, let's compute the product MOD 4 of its elements.
Consider the following list of numbers
242 496 681 685 795 410
The first thing we do is take all these numbers MOD 4 and list them as the first row of our list of all subsequences triangle.
2 0 1 1 3 2
The second row is just the product of the pairs above it.
2 0 1 1 3 2
0 0 1 3 2
In general, the Nth element of each row is the product, MOD 4, of:
the number directly above it in the row above, and
the element in the first row that is diagonally up and to its right.
For example C = A * B
* * * * B *
* * * / *
* A / *
* C *
* *
*
Again,
A is immediately above C
B is diagonally up and to the right of C, all the way in the top row
Now we can complete our triangle
2 0 1 1 3 2
0 0 1 3 2
0 0 3 2
0 0 2
0 0
0
This can be computed easily in O(n^2) time.
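A sketch of that O(n^2) pass in C++ (the same shape as the code in the question, just keeping the running product reduced MOD 4 so it never overflows):

// Sketch: O(n^2) count of contiguous subsequences whose product MOD 4 != 2.
// Only the running product MOD 4 is kept, using
// (A * B) MOD N == ((A MOD N) * (B MOD N)) MOD N.
// The early-exit ideas from the Optimization section below can be layered on top.
#include <iostream>
#include <vector>

int main() {
    long long n;
    std::cin >> n;
    std::vector<int> a(n);
    for (auto &x : a) {
        long long v;
        std::cin >> v;
        x = static_cast<int>(((v % 4) + 4) % 4);   // element MOD 4 (normalised if negative)
    }
    long long count = 0;
    for (long long i = 0; i < n; ++i) {
        int s = 1;
        for (long long j = i; j < n; ++j) {
            s = (s * a[j]) % 4;        // one entry of the triangle row
            if (s != 2) ++count;
        }
    }
    std::cout << count << '\n';
}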
Optimization
These optimizations do not improve the time complexity of the algorithm in its worst case, but they can cause an early exit in the computation, so they should be included if time is to be reduced and the input is unknown.
Contagious 0's
Furthermore, as a matter of optimization, notice how contagious the 0's are. Anything times 0 is 0, so you can skip computing the products of cells below a 0. In your case those subsequences can never have a product equal to 2 MOD 4 once one of their sub-subsequences is known to be 0 MOD 4.
* * * 0 * * // <-- this zero infects all cells below it
* * 0 0 *
* 0 0 0
0 0 0
0 0
0
Need a 2 to make a 2.
Look back at the table of factors and products. Notice that the only way to get a product that is equal to 2 MOD 4 is to have one of the factors be equal to 2 MOD 4. What that means is that there can only be a 2 below another 2. So we are only interested in computing entries in the table that are below a 2; other entries in rows below can never become a 2.
You don't have to store more than two whole rows.
You only need O(n) storage to implement this. Working line by line, you can compute the values in a row entirely from the values in the first row and the values in the row above.
Reading the answers from the table
Now you can look at the rows of the triangle list as you generate them and read off which subsequences are to be included.
Entries with a 2 are to be excluded. All others are to be included.
2 0 1 1 3 2
0 0 1 3 2
0 0 3 2
0 0 2
0 0
0
The excluded subsequences for the example (which I will list only because there are fewer of them in my example) are:
A
F
E-F
D-F
C-F
Which remember, according to our convention refer to the elements:
A
F
E F
D E F
C D E F
Which are:
242
410
795 410
685 795 410
681 685 795 410
Hopefully it's obvious how to display the "included" sequences, rather than the "excluded" sequences, as I have shown above.
Displaying all the elements makes it take much longer.
Sadly, actually displaying all of the elements of such subsequences is still an O(N^3) operation in the worst case. (Imagine a sequence of all zeros.)
Summary
For me, I feel like an average developer could take the magic bullet observation made in the diagram below and write an implementation that has optimal time complexity.
C = A * B
* * * * B *
* * * / *
* A / *
* C *
* *
*

Intuition behind working with `k` to find the kth-symbol in the grammar

I took part in a coding contest wherein I encountered the following question:
On the first row, we write a 0. Now in every subsequent row, we look at the previous row and replace each occurrence of 0 with 01, and each occurrence of 1 with 10. Given row N and index K, return the K-th indexed symbol in row N. (The values of K are 1-indexed.)
While solving the question, I approached it like a level-order traversal of a tree, trying to form the new string at each level. Unfortunately, it timed out. I then tried to think along the lines of caching the results, etc., with no luck.
One of the highly upvoted solutions is like this:
class Solution {
public:
    int kthGrammar(int N, int K) {
        if (N == 1) return 0;
        if (K % 2 == 0) return (kthGrammar(N - 1, K / 2) == 0) ? 1 : 0;
        else return (kthGrammar(N - 1, (K + 1) / 2) == 0) ? 0 : 1;
    }
};
My question is simple - what is the intuition behind working with the value of K (especially, the parities of K)? (I hope to be able to identify such questions when I encounter them in future).
Thanks.
Look at the sequence recursively. In generating a new row, the first half is identical to the process you used to get the previous row, so that part of the expansion is already done. The second half is merely the same sequence inverted (0 for 1, 1 for 0). This is one classic way to generate a parity map: flip all the bits and append, representing adding a 1 to the start of each binary number. Thinking of expanding the sequence 0-3 to 0-7, we start with
00 => 0
01 => 1
10 => 1
11 => 0
We now replicate the 2-digit sequence twice: first with a leading 0, which preserves the original parity; second with a leading 1, which inverts the parity.
000 => 0
001 => 1
010 => 1
011 => 0
100 => 1
101 => 0
110 => 0
111 => 1
Is that an intuition that works for you?
Just for fun, as a different way to solve this, consider that the nth row (0-indexed) has 2^n elements in it, and the value of the kth (0-indexed) element can be determined solely from the parity of the number of bits set in k.
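A sketch of that bit-counting idea in C++ (K is 1-indexed as in the problem statement, so the parity is taken over K-1):

// Sketch: the K-th symbol in row N is the parity of the set bits of K-1.
#include <iostream>

int kthGrammar(int N, int K) {
    (void)N;                          // N only bounds how long row N is (2^(N-1) symbols)
    int parity = 0;
    for (int x = K - 1; x > 0; x >>= 1)
        parity ^= (x & 1);            // XOR of all bits = popcount parity
    return parity;
}

int main() {
    for (int K = 1; K <= 8; ++K)      // row 4 is 0 1 1 0 1 0 0 1
        std::cout << kthGrammar(4, K) << ' ';
    std::cout << '\n';
}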
The check for parity in the code you posted is just there to make the division by two correct; there's no advanced math or mystery hiding here :) Since the pattern is akin to a tree, where the pattern size doubles for each added row, correctly dividing points to the element's parent. The indexes in this question are said to be "1-indexed": if the index is 2, dividing by two yields the parent index (1) in the row before; and if the index is 1, dividing (1+1) by two yields that same parent index. I'll leave it to the reader to generalize that to K's parity. After finding the parent, the code follows the rule stated in the question: if the parent is 0, the left child must be 0 and the right child 1, and vice versa.
0
0 1
0 1 1 0
0 1 1 0 1 0 0 1
0 1 1 0 1 0 0 1 1 0 0 1 0 1 1 0
a a b a b b a
0 01 0110 01101001 0110100110010110
a b b a b a a b
0110100110010110 1001011001101001

Combinational Circuit with LED Lighting

Combinational Circuit design question.
      A
     ____
    |    |
  F |    | B
    |    |
     ____
    | G  |
  E |    | C
    |    |
     ____
      D
Suppose this is an LED display. It takes a 4-bit input, (0000)-(1111), and displays the corresponding hex digit. For example, if (1100) comes in, it displays C by turning on A, F, E, D and turning off B, C, G. If (1010) comes in, it displays A by turning on A, B, C, E, F, G and turning off D.
The displayed characters are all capital letters, so there is no visual difference between 0 and D, or between 8 and B.
Develop a truth table and an optimized expression using Karnaugh maps.
I'm not exactly sure how to begin. For the truth table, would I use (w,x,y,z) as the input variables, or just the A-G variables, since those are the ones turning on and off?
input (1010) --> A --> ABCEFG~D   (~ stands for NOT)
input (1011) --> B --> ABCDEFG
input (1100) --> C --> ADEF~B~C~G
So would I do that for all hex digits 0-F, which would give me the canonical minterm form, and then use a Karnaugh map to optimize it? Any help would be appreciated!
1) Map your lights to bits:
ABCDEFG, so the truth table will be:
ABCDEFG
input (1010) --> A --> 1110111
and so on.
You will have a big table (with 16 rows).
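For step 1, a small generator can print all 16 rows (a sketch; the hex-to-segment patterns below are the conventional 7-segment shapes, which is an assumption, so adjust them if your assignment defines the shapes differently):

// Sketch: print the 16-row truth table wxyz -> ABCDEFG.
// Bit order within each pattern: A is the MSB, G the LSB. With these patterns,
// 0 and D, and 8 and B, look the same, as the question notes.
#include <cstdio>

int main() {
    static const unsigned char seg[16] = {
        0x7E, 0x30, 0x6D, 0x79, 0x33, 0x5B, 0x5F, 0x70,   // 0-7
        0x7F, 0x7B, 0x77, 0x7F, 0x4E, 0x7E, 0x4F, 0x47    // 8-F (B shown as 8, D as 0)
    };
    std::printf("w x y z | A B C D E F G\n");
    for (int v = 0; v < 16; ++v) {
        std::printf("%d %d %d %d |", (v >> 3) & 1, (v >> 2) & 1, (v >> 1) & 1, v & 1);
        for (int s = 6; s >= 0; --s)
            std::printf(" %d", (seg[v] >> s) & 1);
        std::printf("\n");
    }
    return 0;
}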
2) Then follow the sample on Wikipedia for every output light.
You need to do 7 of these: one for each segment in the 7-segment display.
This figure is for illustration only. It doesn't necessarily map to any segment in your problem.
      cd=00  01  11  10      <-- where abcd = 0000 for 0 : put '1' if the light is on
ab=00     1   1   1   1                      = 0001 for 1 : put '0' if it's off for
ab=01     1   1   1   0                      = 0010 for 2 ...   the given segment
ab=11     0   1   1   1
ab=10     1   1   1   0                      = 1111 for f
             ^^^^^^^^    = d==1 region
                 ^^^^^^^^    = c==1 region
The two middle rows represent "b==1" region and the two last rows are a==1 region.
From that map, find maximum-size rectangles (of size [1, 2 or 4] x [1, 2 or 4]); they may overlap. The middle 2x4 region is coded as 'd'. The top row is '~a~b'. The top-left 2x2 square is '~a~c'. A bottom-left square that wraps from row 4 to row 1 is '~b~c'. Finally, the small 2x1 region that covers position x=4, y=3 is 'abc'.
This function would thus be 'd + ~a~b + ~a~c + ~b~c + abc'. If there are no redundant rectangles (ones completely covered by the other rectangles), this formula should be in optimal canonical form (not counting XOR operations). Repeat 7 times for the real data!
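If you want to double-check an expression read off a map like this, a brute-force sketch over all 16 cells of the illustrative map above does it:

// Sketch: brute-force check of  d + ~a~b + ~a~c + ~b~c + abc  against the
// illustrative map above (rows ab = 00,01,11,10; columns cd = 00,01,11,10).
#include <cstdio>

int main() {
    const int kmap[4][4] = { {1,1,1,1}, {1,1,1,0}, {0,1,1,1}, {1,1,1,0} };
    const int gray[4] = {0, 1, 3, 2};          // K-map row/column order 00,01,11,10

    int mismatches = 0;
    for (int r = 0; r < 4; ++r) {
        for (int col = 0; col < 4; ++col) {
            int a = (gray[r] >> 1) & 1, b = gray[r] & 1;
            int c = (gray[col] >> 1) & 1, d = gray[col] & 1;
            int f = d | (!a & !b) | (!a & !c) | (!b & !c) | (a & b & c);
            if (f != kmap[r][col]) ++mismatches;
        }
    }
    std::printf("mismatches: %d\n", mismatches);   // expected: 0
}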
Any selection/permutation of the variables should give the same logical circuit, whether you use abcd or dcba or acbd etc.

Find rank of a number on basis of number of 1's

Let f(k) = y where k is the y-th number in the increasing sequence of non-negative integers with the same number of ones in its binary representation as k, e.g. f(0) = 1, f(1) = 1, f(2) = 2, f(3) = 1, f(4) = 3, f(5) = 2, f(6) = 3 and so on. Given k >= 0, compute f(k).
Many of us have seen this question.
One solution to this problem is to categorise numbers on the basis of their number of 1's and then find the rank. I did find some patterns going this way, but it would be a lengthy process. Can anyone suggest a better solution?
This is a counting problem. I think that if you approach it with this in mind, you can do much better than literally enumerating values and checking how many bits they have.
Consider the number 17. The binary representation is 10001. The number of 1s is 2. We can get smaller numbers with two 1s by (in this case) re-distributing the 1s to any of the four low-order bits. 4 choose 2 is 6, so 17 should be the 7th number with 2 ones in the binary representation. We can check this...
0 00000 -
1 00001 -
2 00010 -
3 00011 1
4 00100 -
5 00101 2
6 00110 3
7 00111 -
8 01000 -
9 01001 4
10 01010 5
11 01011 -
12 01100 6
13 01101 -
14 01110 -
15 01111 -
16 10000 -
17 10001 7
And we were right. Generalize that idea and you should get an efficient function for which you simply compute the rank of k.
EDIT: Hint for generalization
17 is special in that if you don't consider the high-order bit, the number has rank 1; that is, f(z) = 1 where z is everything except the higher order bit. For numbers where this is not the case, how can you account for the fact that you can get smaller numbers without moving the high-order bit?
f(k) counts the integers less than or equal to k that have the same number of ones in their binary representation as k.
Say k needs m bits, that is, k = 2^(m-1) + a with a < 2^(m-1). The number of integers less than 2^(m-1) that have the same number of ones as k is choose(m-1, bitcount(k)), since you can freely redistribute those ones among the m-1 least significant bits.
Integers between 2^(m-1) and k have the same most significant bit as k (which is 1), so there are f(k - 2^(m-1)) of them with the same number of ones. This implies f(k) = choose(m-1, bitcount(k)) + f(k - 2^(m-1)).
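A sketch of that recurrence in C++ (written iteratively, peeling off the most significant bit each round; choose() is the usual multiplicative binomial):

// Sketch of the recurrence: f(k) = choose(m-1, bitcount(k)) + f(k - 2^(m-1)), f(0) = 1.
#include <cstdint>
#include <iostream>

uint64_t choose(unsigned n, unsigned k) {
    if (k > n) return 0;
    uint64_t c = 1;
    for (unsigned i = 1; i <= k; ++i)
        c = c * (n - i + 1) / i;       // exact at every step
    return c;
}

uint64_t f(uint64_t k) {
    uint64_t rank = 1;                                           // f(0) = 1
    while (k > 0) {
        unsigned m = 0, ones = 0;
        for (uint64_t t = k; t > 0; t >>= 1) { ++m; ones += t & 1; }
        rank += choose(m - 1, ones);   // numbers < 2^(m-1) with the same number of ones
        k -= uint64_t(1) << (m - 1);   // recurse on k without its most significant bit
    }
    return rank;
}

int main() {
    for (uint64_t k = 0; k <= 6; ++k)
        std::cout << "f(" << k << ") = " << f(k) << "\n";        // 1 1 2 1 3 2 3
    std::cout << "f(17) = " << f(17) << "\n";                    // 7, as in the example
}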
See "Efficiently Enumerating the Subsets of a Set". Look at Table 3, the "Bankers sequence". This is a method to generate exactly the sequence you need (if you reverse the bit order). Just run K iterations for the word with K bits. There is code to generate it included in the paper.

Ranking and unranking of permutations with duplicates

I'm reading about permutations and I'm interested in ranking/unranking methods.
From the abstract of a paper:
A ranking function for the permutations on n symbols assigns a unique
integer in the range [0, n! - 1] to each of the n! permutations. The corresponding
unranking function is the inverse: given an integer between 0 and n! - 1, the
value of the function is the permutation having this rank.
I made a ranking and an unranking function in C++ using next_permutation. But this isn't practical for n>8. I'm looking for a faster method and factoradics seem to be quite popular.
But I'm not sure if this also works with duplicates. So what would be a good way to rank/unrank permutations with duplicates?
I will cover one half of your question in this answer - 'unranking'. The goal is to find the lexicographically 'K'th permutation of an ordered string [abcd...] efficiently.
We need to understand Factorial Number System (factoradics) for this. A factorial number system uses factorial values instead of powers of numbers (binary system uses powers of 2, decimal uses powers of 10) to denote place-values (or base).
The place values (bases) are:
5! = 120, 4! = 24, 3! = 6, 2! = 2, 1! = 1, 0! = 1, and so on.
The digit in the zeroth place is always 0. The digit in the first place (with base 1!) can be 0 or 1. The digit in the second place (with base 2!) can be 0, 1 or 2, and so on. Generally speaking, the digit at the nth place can take any value between 0 and n.
First few numbers represented as factoradics-
0 -> 0 = 0*0!
1 -> 10 = 1*1! + 0*0!
2 -> 100 = 1*2! + 0*1! + 0*0!
3 -> 110 = 1*2! + 1*1! + 0*0!
4 -> 200 = 2*2! + 0*1! + 0*0!
5 -> 210 = 2*2! + 1*1! + 0*0!
6 -> 1000 = 1*3! + 0*2! + 0*1! + 0*0!
7 -> 1010 = 1*3! + 0*2! + 1*1! + 0*0!
8 -> 1100 = 1*3! + 1*2! + 0*1! + 0*0!
9 -> 1110
10-> 1200
There is a direct relationship between n-th lexicographical permutation of a string and its factoradic representation.
For example, here are the permutations of the string “abcd”.
0 abcd 6 bacd 12 cabd 18 dabc
1 abdc 7 badc 13 cadb 19 dacb
2 acbd 8 bcad 14 cbad 20 dbac
3 acdb 9 bcda 15 cbda 21 dbca
4 adbc 10 bdac 16 cdab 22 dcab
5 adcb 11 bdca 17 cdba 23 dcba
We can see a pattern here if we observe carefully. The first letter changes after every 6th (3!) permutation. The second letter changes after every 2nd (2!) permutation. The third letter changes after every (1!) permutation, and the fourth letter changes after every (0!) permutation. We can use this relation to directly find the n-th permutation.
Once we represent n in factoradic form, we consider each digit in it and add a character from the given string to the output. Say we need to find the 14th permutation of 'abcd'. 14 in factoradics -> 2100.
Start with the first digit -> 2. The string is 'abcd'. Assuming the index starts at 0, take the element at position 2 from the string and add it to the output.
Output    String
c         abd
2         012
The next digit -> 1. The string is now 'abd'. Again, pluck the character at position 1 and add it to the output.
Output    String
cb        ad
21        01
Next digit -> 0. The string is 'ad'. Add the character at position 0 to the output.
Output    String
cba       d
210       0
Next digit -> 0. The string is 'd'. Add the character at position 0 to the output.
Output    String
cbad      ''
2100
To convert a given number to the factorial number system, successively divide the number by 1, 2, 3, 4, 5 and so on until the quotient becomes zero. The remainders at each step form the factoradic representation.
For eg, to convert 349 to factoradic,
Quotient Reminder Factorial Representation
349/1 349 0 0
349/2 174 1 10
174/3 58 0 010
58/4 14 2 2010
14/5 2 4 42010
2/6 0 2 242010
Factoradic representation of 349 is 242010.
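Putting the two steps together, here is a sketch in C++ for strings with distinct characters (duplicates need the counting approach in the other answers): convert n to its factoradic, then pick characters out of the shrinking string.

// Sketch: n-th (0-indexed, n < s.size()!) lexicographic permutation of a string
// with distinct characters, via the factoradic of n.
#include <iostream>
#include <string>
#include <vector>

std::string nth_permutation(std::string s, unsigned long long n) {
    if (s.empty()) return s;
    // Factoradic digits of n, least significant first: divide by 1, 2, 3, ...
    std::vector<int> digits;
    for (unsigned long long d = 1; digits.size() < s.size(); ++d) {
        digits.push_back(static_cast<int>(n % d));
        n /= d;
    }
    std::string out;
    for (auto it = digits.rbegin(); it != digits.rend(); ++it) {
        out += s[*it];                 // take the digit-th remaining character
        s.erase(*it, 1);               // and remove it from the pool
    }
    return out;
}

int main() {
    std::cout << nth_permutation("abcd", 14) << '\n';   // "cbad" (factoradic 2100)
}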
One way is to rank and unrank the choice of indices by a particular group of equal numbers, e.g.,
def choose(n, k):
    c = 1
    for f in xrange(1, k + 1):
        c = (c * (n - f + 1)) // f
    return c

def rank_choice(S):
    k = len(S)
    r = 0
    j = k - 1
    for n in S:
        for i in xrange(j, n):
            r += choose(i, j)
        j -= 1
    return r

def unrank_choice(k, r):
    S = []
    for j in xrange(k - 1, -1, -1):
        n = j
        while r >= choose(n, j):
            r -= choose(n, j)
            n += 1
        S.append(n)
    return S

def rank_perm(P):
    P = list(P)
    r = 0
    for n in xrange(max(P), -1, -1):
        S = []
        for i, p in enumerate(P):
            if p == n:
                S.append(i)
        S.reverse()
        for i in S:
            del P[i]
        r *= choose(len(P) + len(S), len(S))
        r += rank_choice(S)
    return r

def unrank_perm(M, r):
    P = []
    for n, m in enumerate(M):
        S = unrank_choice(m, r % choose(len(P) + m, m))
        r //= choose(len(P) + m, m)
        S.reverse()
        for i in S:
            P.insert(i, n)
    return tuple(P)

if __name__ == '__main__':
    for i in xrange(60):
        print rank_perm(unrank_perm([2, 3, 1], i))
For large n you need an arbitrary-precision library like GMP.
This is my previous post for an unranking function written in Python. I think it's readable, almost like pseudocode, and there is also some explanation in the comments: Given a list of elements in lexicographical order (i.e. ['a', 'b', 'c', 'd']), find the nth permutation - Average time to solve?
Based on this you should be able to figure out the ranking function; it's basically the same logic ;)
Java, from https://github.com/timtiemens/permute/blob/master/src/main/java/permute/PermuteUtil.java (my public domain code, minus the error checking):
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class PermuteUtil {
    public <T> List<T> nthPermutation(List<T> original, final BigInteger permutationNumber) {
        final int size = original.size();
        // the return list:
        List<T> ret = new ArrayList<>();
        // local mutable copy of the original list:
        List<T> numbers = new ArrayList<>(original);
        // Our input permutationNumber is [1,N!], but array indexes are [0,N!-1], so subtract one:
        BigInteger permNum = permutationNumber.subtract(BigInteger.ONE);
        for (int i = 1; i <= size; i++) {
            BigInteger factorialNminusI = factorial(size - i);
            // casting to integer is ok here, because even though permNum _could_ be big,
            // the factorialNminusI is _always_ big
            int j = permNum.divide(factorialNminusI).intValue();
            permNum = permNum.mod(factorialNminusI);
            // remove item at index j, and put it in the return list at the end
            T item = numbers.remove(j);
            ret.add(item);
        }
        return ret;
    }

    // factorial(n) helper, added here so the snippet is self-contained:
    private BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int k = 2; k <= n; k++) {
            result = result.multiply(BigInteger.valueOf(k));
        }
        return result;
    }
}