Runtime of KMP algorithm and LPS table construction - c++

I recently came across the KMP algorithm, and I have spent a lot of time trying to understand why it works. While I do understand the basic functionality now, I simply fail to understand the runtime computations.
I have taken the below code from the geeksForGeeks site: https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/
This site claims that if the text has length n and the pattern has length m, then KMP finds a match in at most O(n) time. It also states that the LPS array can be computed in O(m) time.
// C++ program for implementation of KMP pattern searching
// algorithm
#include <bits/stdc++.h>
void computeLPSArray(char* pat, int M, int* lps);
// Prints occurrences of pat[] in txt[]
void KMPSearch(char* pat, char* txt)
{
int M = strlen(pat);
int N = strlen(txt);
// create lps[] that will hold the longest prefix suffix
// values for pattern
int lps[M];
// Preprocess the pattern (calculate lps[] array)
computeLPSArray(pat, M, lps);
int i = 0; // index for txt[]
int j = 0; // index for pat[]
while (i < N) {
if (pat[j] == txt[i]) {
j++;
i++;
}
if (j == M) {
printf("Found pattern at index %d ", i - j);
j = lps[j - 1];
}
// mismatch after j matches
else if (i < N && pat[j] != txt[i]) {
// Do not match lps[0..lps[j-1]] characters,
// they will match anyway
if (j != 0)
j = lps[j - 1];
else
i = i + 1;
}
}
}
// Fills lps[] for given pattern pat[0..M-1]
void computeLPSArray(char* pat, int M, int* lps)
{
// length of the previous longest prefix suffix
int len = 0;
lps[0] = 0; // lps[0] is always 0
// the loop calculates lps[i] for i = 1 to M-1
int i = 1;
while (i < M) {
if (pat[i] == pat[len]) {
len++;
lps[i] = len;
i++;
}
else // (pat[i] != pat[len])
{
// This is tricky. Consider the example.
// AAACAAAA and i = 7. The idea is similar
// to search step.
if (len != 0) {
len = lps[len - 1];
// Also, note that we do not increment
// i here
}
else // if (len == 0)
{
lps[i] = 0;
i++;
}
}
}
}
// Driver program to test above function
int main()
{
char txt[] = "ABABDABACDABABCABAB";
char pat[] = "ABABCABAB";
KMPSearch(pat, txt);
return 0;
}
I am really confused why that is the case.
For LPS computation, consider: aaaaacaaac
In this case, when we try to compute LPS for the first c, we would keep going back until we hit LPS[0], which is 0, and stop. So, essentially, we would travel back at least the length of the pattern up to that point. If this happens multiple times, how can the time complexity be O(m)?
I have similar confusion on runtime of KMP to be O(n).
I have read other threads on Stack Overflow before posting, and also various other sites on the topic. I am still very confused. I would really appreciate it if someone could help me understand the best and worst case scenarios for these algorithms and how their runtime is computed, using some examples. Again, please don't suggest I google this; I have done it, spent a whole week trying to gain any insight, and failed.

One way to establish an upper bound on the runtime for construction of the LPS array is to consider a pathological case - how can we maximize the number of times we have to execute len = lps[len - 1]? Consider the following string, ignoring spaces: x1 x2 x1x3 x1x2x1x4 x1x2x1x3x1x2x1x5 ...
The second term needs to be compared to the first term because, if it ended in 1 instead of 2, it would match the first term. Similarly, the third term needs to be compared to the first two terms because, if it ended in 1 or 2 instead of 3, it would match those partial terms. And so forth.
In the example string, only one character in every 2^n can be compared n times, so the total runtime is m + m/2 + m/4 + ... = 2m = O(m), where m is the length of the pattern string. I suspect it's impossible to construct a string with a worse runtime than the example string, and this can probably be proven formally.
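A complementary way to see the bound is to count the fallback steps directly. The sketch below is an instrumented variant of computeLPSArray from the question; the function name computeLpsCounted and the counter fallbacks are hypothetical, added only for illustration. Since len grows by exactly 1 per matched character and each len = lps[len - 1] step shrinks it by at least 1, the fallbacks can never outnumber the matches, so the whole loop runs fewer than 2m times.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the LPS table for pat and counts how often the fallback step
// len = lps[len - 1] runs. `fallbacks` is an illustrative counter only.
std::vector<int> computeLpsCounted(const std::string& pat, std::size_t& fallbacks) {
    std::vector<int> lps(pat.size(), 0);
    fallbacks = 0;
    int len = 0;          // length of the previous longest prefix suffix
    std::size_t i = 1;
    while (i < pat.size()) {
        if (pat[i] == pat[len]) {
            lps[i++] = ++len;     // len grows by exactly 1 per match
        } else if (len != 0) {
            len = lps[len - 1];   // len strictly shrinks; i does not move
            ++fallbacks;
        } else {
            lps[i++] = 0;
        }
    }
    // Every fallback undoes at least one earlier increment of len.
    assert(fallbacks <= pat.size());
    return lps;
}
```

For the pattern aaaaacaaac from the question, the first c costs 4 fallbacks and the second c costs 3, which is 7 in total for a pattern of length 10.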

Related

Reversing positive sequences in array

So, I have a loop that goes over an array and should reverse each sequence of consecutive positive numbers, but it seems to count an extra negative number as part of a sequence, thus changing its position. I can't find the error myself, and will be happy to hear any tips!
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
using namespace std;
int Arr[100];
int Arr2[100];
int main()
{
srand(time(NULL));
int n, x;
bool seq = false;
int ib = 0;
printf("Starting array\n");
for (n = 0; n < 100; Arr[n++] = (rand() % 101) - 50);
for (n = 0; n < 100; printf("%3d ", Arr[n++]));
putchar('\n');
for (n = 0; n < 100; n++) //sorting
{
if (Arr[n] > 0) //check if the number is positive
{
if (seq == false) //if it isn't the part of a sequence
{
seq = true; ib = n; //declare it now is, and remember index of sequence's beginning
}
else
seq = true; //do nothing if it isn't first
}
else //if the number is negative
{
if (seq==true) //if sequence isn't null
for (x = n; ib <= n; ib++, x--) //new variable so that n will stay unchanged, number of iterations = length of sequence
{
Arr2[x] = Arr[ib]; //assigning array's value to a new one, reversing it in the process
}
seq = false; //declaring sequence's end
Arr2[n + 1] = Arr[n + 1]; //assigning negative numbers at the same place of a new array
}
}
printf("Modified array\n");
for (n = 0; n < 100; printf("%3d ", Arr2[n++]));
putchar('\n');
system('pause');
return 0;
}
Following what we discussed in the comments, I've listed a couple of rules here to shape my answer around.
Rules:
The length of a sequence can vary. So if there are 5 positive numbers in a row within an array, we would reverse those 5 elements. For example,
array[5] = {1,2,3,4,5} would become array[5] = {5,4,3,2,1}.
If a single positive number is neighboured by negatives, no reversal can happen:
array[4] = {-1,0,-2,1} would result in the same array.
No processing happens when a negative number is discovered.
Based on these rules,
here is what I think is going wrong in your code.
Problems:
1 - Consider this array = {1,2,-1}. Notice that the last value is negative. Because of this, the following code runs when the 3rd element of the array is processed:
` Arr2[n + 1] = Arr[n + 1]; //assigning negative numbers at the same place of a new array`
This is a no-no. Since you are already at the end of Arr2, n+1 would indicate that there is a 4th element in the array (in your case, the 101st element of the array). This causes undefined behaviour.
2 - Consider the same array mentioned above. When that array is looped over, the outcome would be array = {-1,2,1}. The -1 and 1 are swapped instead of 1 and 2.
3 - You assign ib = n at the first positive number after a negative one, because seq = false is forced whenever a negative value is hit. But ib is never put to use until the next negative number is found. Here is an example:
array = {...2, 6}
In such a scenario, 2 and 6 would never get reversed because no negative value follows this positive sequence.
4 - Consider the scenario arr = {-10,-1,....}. This would result in arr = {0,-1,....}. This happens because of the same code that causes the undefined behaviour mentioned above:
`Arr2[n + 1] = Arr[n + 1];`
Suggestion
Most of the problems mentioned above happen because you are trying to work out the sequence of positive numbers only when a negative number is found.
else //if the number is negative
{
if (seq==true) //if sequence isn't null
for (x = n; ib <= n; ib++, x--) //new variable so that n will stay unchanged, number of iterations = length of sequence
{
Arr2[x] = Arr[ib]; //assigning array's value to a new one, reversing it in the process
}
You should get rid of that entirely and ignore the negative numbers, unless you forgot to mention some key details in your question. Instead, just focus on the positive numbers. I'm not going to send you the entire code, but here is how I approached the problem. Feel free to let me know if you need help, and I would be more than happy to go through it in detail.
Start your for loop as usual.
for (n = 0; n < 100; n++) //sorting
{
Don't try to do anything when an element in the array is a negative value.
if (Arr[n] > 0) //check if the number is positive
If the number is positive, start recording the sequence indices. For one, we know the sequence will start at n once if (Arr[n] > 0) is true, so we can do something like this:
int sequenceStart = n;
We also need to know where the positive number sequence ends.
int sequenceEnd = sequenceStart;
The reason for int sequenceEnd = sequenceStart; is that we start from the same n value. We can now loop through the array and increment sequenceEnd until we reach a negative number or the end of the array.
while (currentElement > 0)
{
n++;//increment n
if(n < arraySiz) //make sure we still in the range
{
currentElement = Arr[n]; // get the new element
if (currentElement > 0)
{
sequenceEnd++;
}
}
else
break; // we ran past the end of the array, so stop the while loop.
}
Notice the n++; //increment n. This keeps incrementing n until we reach a negative number, which is great because at the end of the sequence the for loop will continue from the updated n.
After the while loop, you can create an array that has the same size as the sequence you iterated through. You can then store the elements from arr[sequenceStart] to arr[sequenceEnd]; this will make reversing the sequence in the array easier.
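Putting the pieces above together, a minimal sketch could look like the following. It assumes the data lives in a std::vector<int> rather than the fixed-size Arr, and it reverses each run in place with std::reverse instead of copying into a temporary array; the function name reversePositiveRuns is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Reverses every maximal run of consecutive positive values in place.
// Zero and negative values act as separators and never move.
void reversePositiveRuns(std::vector<int>& arr) {
    std::size_t n = 0;
    while (n < arr.size()) {
        if (arr[n] <= 0) {      // not part of any sequence, skip it
            ++n;
            continue;
        }
        const std::size_t sequenceStart = n;
        while (n < arr.size() && arr[n] > 0) {
            ++n;                // n ends one past the last positive value
        }
        std::reverse(arr.begin() + static_cast<std::ptrdiff_t>(sequenceStart),
                     arr.begin() + static_cast<std::ptrdiff_t>(n));
    }
}
```

Because the end of the array also terminates a run, a trailing sequence such as {...2, 6} is reversed as well, and nothing is ever read or written past the last element.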

Knapsack Backtracking Using only O(W) space

So I have this code that I have written that correctly finds the optimal value for the knapsack problem.
int mat[2][size + 1];
memset(mat, 0, sizeof(mat));
int i = 0;
while(i < nItems)
{
int j = 0;
if(i % 2 != 0)
{
while(++j <= size)
{
if(weights[i] <= j) mat[1][j] = max(values[i] + mat[0][j - weights[i]], mat[0][j]);
else mat[1][j] = mat[0][j];
}
}
else
{
while(++j <= size)
{
if(weights[i] <= j) mat[0][j] = max(values[i] + mat[1][j - weights[i]], mat[1][j]);
else mat[0][j] = mat[1][j];
}
}
i++;
}
int val = (nItems % 2 != 0)? mat[0][size] : mat[1][size];
cout << val << endl;
return 0;
This part I understand. However, I am trying to keep the same memory space, i.e. O(W), but also now compute the optimal solution using backtracking. This is where I am running into trouble. The hint I have been given is this:
Now suppose that we also want the optimal set of items. Recall that the goal
in finding the optimal solution in part 1 is to find the optimal path from
entry K(0,0) to entry K(W,n). The optimal path must pass through an
intermediate node (k,n/2) for some k; this k corresponds to the remaining
capacity in the knapsack of the optimal solution after items n/2 + 1,...n
have been considered
The question asked is this.
Implement a modified version of the algorithm from part 2 that returns not
only the optimal value, but also the remaining capacity of the optimal
solution after the last half of items have been considered
Any help to get me started would be appreciated. Thanks
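One possible reading of the hint, as a rough sketch rather than the intended solution: alongside the two value rows, keep two more rows that remember, for every cell, the column at which its best path crossed row n/2. The function name knapsackWithMidpoint and the arrays prevMid and curMid are hypothetical; the code assumes non-negative integer weights and the same weights, values and size inputs as in the question, and it still uses only O(W) memory.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns {optimal value, k}, where k is the column at which an optimal
// path crosses row n/2, i.e. the capacity left over for items 1..n/2.
std::pair<int, int> knapsackWithMidpoint(const std::vector<int>& weights,
                                         const std::vector<int>& values,
                                         int size) {
    assert(weights.size() == values.size());
    const int nItems = static_cast<int>(weights.size());
    const int half = nItems / 2;
    std::vector<int> prev(size + 1, 0), cur(size + 1, 0);
    std::vector<int> prevMid(size + 1), curMid(size + 1);
    for (int j = 0; j <= size; ++j) prevMid[j] = j;  // row 0 crosses at itself

    for (int i = 1; i <= nItems; ++i) {
        const int w = weights[i - 1], v = values[i - 1];
        for (int j = 0; j <= size; ++j) {
            int from = j;                       // column we came from in row i-1
            cur[j] = prev[j];                   // skip item i
            if (w <= j && v + prev[j - w] > cur[j]) {
                cur[j] = v + prev[j - w];       // take item i
                from = j - w;
            }
            // Up to row n/2 a cell is its own crossing point; afterwards
            // it inherits the crossing point of its predecessor.
            curMid[j] = (i <= half) ? j : prevMid[from];
        }
        std::swap(prev, cur);
        std::swap(prevMid, curMid);
    }
    return {prev[size], prevMid[size]};
}
```

With k in hand, the item set itself could presumably be recovered by recursing on the first half with capacity k and on the second half with capacity size - k, but that step is not shown here.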

C++ Longest Common Subsequence Implementation errors O(n*m)

I'm going through some dynamic programming articles on geeksforgeeks and ran across the Longest Common Subsequence problem. I did not come up with an implementation of the exponential naive solution on my own; however, after working out some examples of the problem on paper I came up with what I thought was a successful implementation of an O(n*m) version. However, an OJ proved me wrong. My algorithm fails with the input strings:
"LRBBMQBHCDARZOWKKYHIDDQSCDXRJMOWFRXSJYBLDBEFSARCBYNECDYGGXXPKLORELLNMPAPQFWKHOPKMCO"
"QHNWNKUEWHSQMGBBUQCLJJIVSWMDKQTBXIXMVTRRBLJPTNSNFWZQFJMAFADRRWSOFSBCNUVQHFFBSAQXWPQCAC"
My thought process for the algorithm is as follows. I want to maintain a DP array whose length is the length of string a where a is the smaller of the input strings. dpA[i] would be the Longest Common Subsequence ending in a[i]. To do this I need to iterate through string a from index 0 => length-1 and see if a[i] exists in b. If a[i] exists in b it will be at position pos.
First mark dp[i] as 1 if dp[i] was 0
To know that a[i] is an extension of an existing subsequence we must go through a and find the first character behind i that matches a value in b behind pos. Let's call the indices of these matching values j and k respectively. This value is guaranteed to be a value we've seen before since we've covered all of a[0...i-1] and have filled out dpA[0...i-1]. When we find the first match, dpA[i] = dpA[j]+1 because we're extending the previous subsequence that ends in a[j]. Rinse repeat.
Obviously this method is not perfect or I wouldn't be asking this question, but I can't quite seem to see the problem with the algorithm. I've been looking at it so long I can hardly think about it anymore but any ideas on how to fix it would be greatly appreciated!
int longestCommonSubsequenceString(const string& x, const string& y) {
string a = (x.length() < y.length()) ? x : y;
string b = (x.length() >= y.length()) ? x : y;
vector<int> dpA(a.length(), 0);
int pos;
bool breakFlag = false;
for (int i = 0; i < a.length(); ++i) {
pos = b.find_last_of(a[i]);
if (pos != string::npos) {
if (!dpA[i]) dpA[i] = 1;
for (int j = i-1; j >= 0; --j) {
for (int k = pos-1; k >= 0; --k) {
if (a[j] == b[k]) {
dpA[i] = dpA[j]+1;
breakFlag = true;
break;
}
if (breakFlag) break;
}
}
}
breakFlag = false;
}
return *max_element(dpA.begin(), dpA.end());
}
EDIT
I think the complexity might actually be O(n*n*m)
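For comparison, here is a minimal sketch of the standard table-based formulation, which is a different technique from the one attempted above. Instead of tracking subsequences that end at a[i], it stores the LCS length of every pair of prefixes, keeping only two rows at a time. The function name lcsLength is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest common subsequence of x and y.
// O(|x| * |y|) time, O(|y|) extra space (two rolling rows).
int lcsLength(const std::string& x, const std::string& y) {
    std::vector<int> prev(y.size() + 1, 0), cur(y.size() + 1, 0);
    for (std::size_t i = 1; i <= x.size(); ++i) {
        for (std::size_t j = 1; j <= y.size(); ++j) {
            if (x[i - 1] == y[j - 1]) {
                cur[j] = prev[j - 1] + 1;               // extend the LCS of both shorter prefixes
            } else {
                cur[j] = std::max(prev[j], cur[j - 1]); // drop one character from x or from y
            }
        }
        std::swap(prev, cur);  // cur[0] stays 0 in both rows
    }
    return prev[y.size()];
}
```

The key difference is that each cell depends on the best answer for shorter prefixes of both strings, not on the first earlier match found, so a longer subsequence that uses a different earlier character is never missed.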

What is the fastest way to find longest 'consecutive numbers' streak in vector ?

I have a sorted std::vector<int> and I would like to find the longest 'streak of consecutive numbers' in this vector and then return both the length of it and the smallest number in the streak.
To visualize it for you :
suppose we have :
1 3 4 5 6 8 9
I would like it to return: maxStreakLength = 4 and streakBase = 3
There might be occasion where there will be 2 streaks and we have to choose which one is longer.
What is the best (fastest) way to do this ? I have tried to implement this but I have problems with coping with more than one streak in the vector. Should I use temporary vectors and then compare their lengths?
No, you can do this in one pass through the vector, storing only the longest start point and length found so far. You also need far fewer than 'N' comparisons. *
Hint: if you already have, say, a 4-long match ending at the 5th position (=6), which position do you have to check next?
[*] left as exercise to the reader to work out what's the likely O( ) complexity ;-)
It would be interesting to see if the fact that the array is sorted can be exploited somehow to improve the algorithm. The first thing that comes to mind is this: if you know that all numbers in the input array are unique, then for a range of elements [i, j] in the array, you can immediately tell whether elements in that range are consecutive or not, without actually looking through the range. If this relation holds
array[j] - array[i] == j - i
then you can immediately say that elements in that range are consecutive. This criterion, obviously, uses the fact that the array is sorted and that the numbers don't repeat.
Now, we just need to develop an algorithm which will take advantage of that criterion. Here's one possible recursive approach:
Input of recursive step is the range of elements [i, j]. Initially it is [0, n-1] - the whole array.
Apply the above criterion to range [i, j]. If the range turns out to be consecutive, there's no need to subdivide it further. Send the range to output (see below for further details).
Otherwise (if the range is not consecutive), divide it into two equal parts [i, m] and [m+1, j].
Recursively invoke the algorithm on the lower part ([i, m]) and then on the upper part ([m+1, j]).
The above algorithm will perform binary partition of the array and recursive descent of the partition tree using the left-first approach. This means that this algorithm will find adjacent subranges with consecutive elements in left-to-right order. All you need to do is to join the adjacent subranges together. When you receive a subrange [i, j] that was "sent to output" at step 2, you have to concatenate it with previously received subranges, if they are indeed consecutive. Or you have to start a new range, if they are not consecutive. All the while you have to keep track of the "longest consecutive range" found so far.
That's it.
The benefit of this algorithm is that it detects subranges of consecutive elements "early", without looking inside these subranges. Obviously, its worst-case performance (if there are no consecutive subranges at all) is still O(n). In the best case, when the entire input array is consecutive, this algorithm will detect it instantly. (I'm still working on a meaningful O estimation for this algorithm.)
The usability of this algorithm is, again, undermined by the uniqueness requirement. I don't know whether it is something that is "given" in your case.
Anyway, here's a possible C++ implementation
#include <cassert>
#include <utility>
#include <vector>
typedef std::vector<int> vint;
typedef std::pair<vint::size_type, vint::size_type> range;
class longest_sequence
{
public:
const range& operator ()(const vint &v)
{
current = max = range(0, 0);
process_subrange(v, 0, v.size() - 1);
check_record();
return max;
}
private:
range current, max;
void process_subrange(const vint &v, vint::size_type i, vint::size_type j);
void check_record();
};
void longest_sequence::process_subrange(const vint &v,
vint::size_type i, vint::size_type j)
{
assert(i <= j && v[i] <= v[j]);
assert(i == 0 || i == current.second + 1);
if (v[j] - v[i] == j - i)
{ // Consecutive subrange found
assert(v[current.second] <= v[i]);
if (i == 0 || v[i] == v[current.second] + 1)
// Append to the current range
current.second = j;
else
{ // Range finished
// Check against the record
check_record();
// Start a new range
current = range(i, j);
}
}
else
{ // Subdivision and recursive calls
assert(i < j);
vint::size_type m = (i + j) / 2;
process_subrange(v, i, m);
process_subrange(v, m + 1, j);
}
}
void longest_sequence::check_record()
{
assert(current.second >= current.first);
if (current.second - current.first > max.second - max.first)
// We have a new record
max = current;
}
int main()
{
int a[] = { 1, 3, 4, 5, 6, 8, 9 };
std::vector<int> v(a, a + sizeof a / sizeof *a);
range r = longest_sequence()(v);
return 0;
}
I believe that this should do it?
size_t beginStreak = 0;
size_t streakLen = 1;
size_t longest = 0;
size_t longestStart = 0;
for (size_t i=1; i < vec.size(); i++) {
if (vec[i] == vec[i-1] + 1) {
streakLen++;
}
else {
if (streakLen > longest) {
longest = streakLen;
longestStart = beginStreak;
}
beginStreak = i;
streakLen = 1;
}
}
if (streakLen > longest) {
longest = streakLen;
longestStart = beginStreak;
}
You can't solve this problem in less than O(N) time. Imagine your list is the first N-1 even numbers, plus a single odd number (chosen from among the first N-1 odd numbers). Then there is a single streak of length 3 somewhere in the list, but worst case you need to scan the entire list to find it. Even on average you'll need to examine at least half of the list to find it.
Similar to Rodrigo's solutions but solving your example as well:
#include <vector>
#include <cstdio>
#define len(x) (sizeof(x) / sizeof((x)[0]))
using namespace std;
int nums[] = {1,3,4,5,6,8,9};
int streakBase = nums[0];
int maxStreakLength = 1;
void updateStreak(int currentStreakLength, int currentStreakBase) {
if (currentStreakLength > maxStreakLength) {
maxStreakLength = currentStreakLength;
streakBase = currentStreakBase;
}
}
int main(void) {
vector<int> v;
for(size_t i=0; i < len(nums); ++i)
v.push_back(nums[i]);
int lastBase = v[0], currentStreakBase = v[0], currentStreakLength = 1;
for(size_t i=1; i < v.size(); ++i) {
if (v[i] == lastBase + 1) {
currentStreakLength++;
lastBase = v[i];
} else {
updateStreak(currentStreakLength, currentStreakBase);
currentStreakBase = v[i];
lastBase = v[i];
currentStreakLength = 1;
}
}
updateStreak(currentStreakLength, currentStreakBase);
printf("maxStreakLength = %d and streakBase = %d\n", maxStreakLength, streakBase);
return 0;
}

Understanding Recursion to generate permutations

I find recursion, apart from very straightforward cases like factorial, very difficult to understand. The following snippet prints all permutations of a string. Can anyone help me understand it? What is the way to go about understanding recursion properly?
void permute(char a[], int i, int n)
{
int j;
if (i == n)
cout << a << endl;
else
{
for (j = i; j <= n; j++)
{
swap(a[i], a[j]);
permute(a, i+1, n);
swap(a[i], a[j]);
}
}
}
int main()
{
char a[] = "ABCD";
permute(a, 0, 3);
getchar();
return 0;
}
PaulR has the right suggestion. You have to run through the code by "hand" (using whatever tools you want - debuggers, paper, logging function calls and variables at certain points) until you understand it. For an explanation of the code I'll refer you to quasiverse's excellent answer.
Perhaps this visualization of the call graph with a slightly smaller string makes it more obvious how it works:
The graph was made with graphviz.
// x.dot
// dot x.dot -Tpng -o x.png
digraph x {
rankdir=LR
size="16,10"
node [label="permute(\"ABC\", 0, 2)"] n0;
node [label="permute(\"ABC\", 1, 2)"] n1;
node [label="permute(\"ABC\", 2, 2)"] n2;
node [label="permute(\"ACB\", 2, 2)"] n3;
node [label="permute(\"BAC\", 1, 2)"] n4;
node [label="permute(\"BAC\", 2, 2)"] n5;
node [label="permute(\"BCA\", 2, 2)"] n6;
node [label="permute(\"CBA\", 1, 2)"] n7;
node [label="permute(\"CBA\", 2, 2)"] n8;
node [label="permute(\"CAB\", 2, 2)"] n9;
n0 -> n1 [label="swap(0, 0)"];
n0 -> n4 [label="swap(0, 1)"];
n0 -> n7 [label="swap(0, 2)"];
n1 -> n2 [label="swap(1, 1)"];
n1 -> n3 [label="swap(1, 2)"];
n4 -> n5 [label="swap(1, 1)"];
n4 -> n6 [label="swap(1, 2)"];
n7 -> n8 [label="swap(1, 1)"];
n7 -> n9 [label="swap(1, 2)"];
}
To use recursion effectively in design, you solve the problem by assuming you've already solved it.
The mental springboard for the current problem is "if I could calculate the permutations of n-1 characters, then I could calculate the permutations of n characters by choosing each one in turn and appending the permutations of the remaining n-1 characters, which I'm pretending I already know how to do".
Then you need a way to do what's called "bottoming out" the recursion. Since each new sub-problem is smaller than the last, perhaps you'll eventually get to a sub-sub-problem that you REALLY know how to solve.
In this case, you already know all the permutations of ONE character - it's just the character. So you know how to solve it for n=1 and for every number that's one more than a number you can solve it for, and you're done. This is very closely related to something called mathematical induction.
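The "assume it is already solved" idea can be written down almost literally. The following is only an illustrative sketch, not the swap-based code from the question: it returns the permutations as a list instead of printing them, and the function name permutations is hypothetical. It assumes the characters of s are distinct; with repeated characters it would produce duplicates.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns all permutations of s by choosing each character in turn and
// prepending it to every permutation of the remaining n-1 characters.
std::vector<std::string> permutations(const std::string& s) {
    if (s.size() <= 1) {
        return {s};   // bottoming out: one character (or none) has one permutation
    }
    std::vector<std::string> result;
    for (std::size_t k = 0; k < s.size(); ++k) {
        const std::string rest = s.substr(0, k) + s.substr(k + 1);
        // Pretend we already know how to permute the n-1 remaining characters.
        for (const std::string& tail : permutations(rest)) {
            result.push_back(s[k] + tail);
        }
    }
    return result;
}
```

Calling permutations("ABC") yields six strings, starting with "ABC" and ending with "CBA".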
It chooses each character from all the possible characters left:
void permute(char a[], int i, int n)
{
int j;
if (i == n) // If we've chosen all the characters then:
cout << a << endl; // we're done, so output it
else
{
for (j = i; j <= n; j++) // Otherwise, we've chosen characters a[0] to a[i-1]
{ // so let's try all possible characters for a[i]
swap(a[i], a[j]); // Choose which one out of a[i] to a[n] goes into a[i]
permute(a, i+1, n); // Choose the remaining letters
swap(a[i], a[j]); // Undo the previous swap so we can choose the next possibility for a[i]
}
}
}
This code and reference might help you to understand it.
// C program to print all permutations with duplicates allowed
#include <stdio.h>
#include <string.h>
/* Function to swap values at two pointers */
void swap(char *x, char *y)
{
char temp;
temp = *x;
*x = *y;
*y = temp;
}
/* Function to print permutations of string
This function takes three parameters:
1. String
2. Starting index of the string
3. Ending index of the string. */
void permute(char *a, int l, int r)
{
int i;
if (l == r)
printf("%s\n", a);
else
{
for (i = l; i <= r; i++)
{
swap((a+l), (a+i));
permute(a, l+1, r);
swap((a+l), (a+i)); //backtrack
}
}
}
/* Driver program to test above functions */
int main()
{
char str[] = "ABC";
int n = strlen(str);
permute(str, 0, n-1);
return 0;
}
Reference: Geeksforgeeks.org
Though this is a rather old question that has already been answered, I thought of adding my input to help new visitors. I also plan to explain the running time without focusing on Recursive Reconciliation.
I have written the sample in C#, but it should be easy to understand for most programmers.
static int noOfFunctionCalls = 0;
static int noOfCharDisplayCalls = 0;
static int noOfBaseCaseCalls = 0;
static int noOfRecursiveCaseCalls = 0;
static int noOfSwapCalls = 0;
static int noOfForLoopCalls = 0;
static StringBuilder strBldr; // shared output buffer (needs using System.Text;)
static string Permute(char[] elementsList, int currentIndex)
{
++noOfFunctionCalls;
if (currentIndex == elementsList.Length)
{
++noOfBaseCaseCalls;
foreach (char element in elementsList)
{
++noOfCharDisplayCalls;
strBldr.Append(" " + element);
}
strBldr.AppendLine("");
}
else
{
++noOfRecursiveCaseCalls;
for (int lpIndex = currentIndex; lpIndex < elementsList.Length; lpIndex++)
{
++noOfForLoopCalls;
if (lpIndex != currentIndex)
{
++noOfSwapCalls;
Swap(ref elementsList[currentIndex], ref elementsList[lpIndex]);
}
Permute(elementsList, (currentIndex + 1));
if (lpIndex != currentIndex)
{
Swap(ref elementsList[currentIndex], ref elementsList[lpIndex]);
}
}
}
return strBldr.ToString();
}
static void Swap(ref char Char1, ref char Char2)
{
char tempElement = Char1;
Char1 = Char2;
Char2 = tempElement;
}
public static void StringPermutationsTest()
{
strBldr = new StringBuilder();
Debug.Flush();
noOfFunctionCalls = 0;
noOfCharDisplayCalls = 0;
noOfBaseCaseCalls = 0;
noOfRecursiveCaseCalls = 0;
noOfSwapCalls = 0;
noOfForLoopCalls = 0;
//string resultString = Permute("A".ToCharArray(), 0);
//string resultString = Permute("AB".ToCharArray(), 0);
string resultString = Permute("ABC".ToCharArray(), 0);
//string resultString = Permute("ABCD".ToCharArray(), 0);
//string resultString = Permute("ABCDE".ToCharArray(), 0);
resultString += "\nNo of Function Calls : " + noOfFunctionCalls;
resultString += "\nNo of Base Case Calls : " + noOfBaseCaseCalls;
resultString += "\nNo of General Case Calls : " + noOfRecursiveCaseCalls;
resultString += "\nNo of For Loop Calls : " + noOfForLoopCalls;
resultString += "\nNo of Char Display Calls : " + noOfCharDisplayCalls;
resultString += "\nNo of Swap Calls : " + noOfSwapCalls;
Debug.WriteLine(resultString);
MessageBox.Show(resultString);
}
Steps:
1. For example, when we pass "ABC" as input.
2. The Permute method is called from Main for the first time, with index 0. That is the first call.
3. In the else part, the for loop repeats from 0 to 2, making 1 call each time.
4. Under each loop iteration we recursively call with LpCnt + 1.
4.1 When index is 1, there are 2 recursive calls.
4.2 When index is 2, there is 1 recursive call.
5. So from point 2 to 4.2, the total is 5 calls for each loop iteration, 15 calls in all, plus the main entry call = 16.
6. Each time loopCnt is 3, the if condition gets executed.
7. From the diagram we can see the loop count becoming 3 a total of 6 times, i.e. the factorial of 3, the length of the input "ABC".
8. The for loop inside the if statement repeats 'n' times to display the chars; for the example "ABC" that is 3.
9. In total we enter the if 6 times (factorial times) to display the permutations.
10. So the total running time = n X n!.
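The counts in the steps above can be reproduced with a small C++ sketch that mirrors the structure of the C# sample. The struct PermuteCounts and the function permuteCounted are hypothetical names introduced for illustration; instead of printing, the base case only adds up how many characters it would have written.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Illustrative counters, loosely matching the static fields of the C# sample.
struct PermuteCounts {
    long calls = 0;         // every entry into permuteCounted
    long baseCases = 0;     // one per complete permutation
    long charsWritten = 0;  // n characters per complete permutation
};

void permuteCounted(std::string& s, std::size_t index, PermuteCounts& c) {
    ++c.calls;
    if (index == s.size()) {
        ++c.baseCases;
        c.charsWritten += static_cast<long>(s.size());
        return;
    }
    for (std::size_t k = index; k < s.size(); ++k) {
        std::swap(s[index], s[k]);
        permuteCounted(s, index + 1, c);
        std::swap(s[index], s[k]);   // backtrack
    }
}
```

For "ABC" starting at index 0 this gives 16 calls (1 + 3 + 6 + 6), 6 base cases and 18 characters written, i.e. n X n! output work for n = 3.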
I have given some static CallCnt variables and the table to understand each line execution in detail.
Experts, feel free to edit my answer or comment if any of my details are unclear or incorrect; I am happy to correct them.
Think about the recursion as simply a number of levels. At each level, you are running a piece of code; here you are running a for loop n-i times at each level. This window shrinks at each level: n-i times, n-(i+1) times, n-(i+2) times, ..., 2, 1, 0 times.
With respect to string manipulation and permutation, think of the string as simply a 'set' of chars: "abcd" as {'a', 'b', 'c', 'd'}. Permutation is rearranging these 4 items in all possible ways, or choosing 4 items out of these 4 items in different ways. In permutations the order does matter: abcd is different from acbd, and we have to generate both.
The recursive code you provided does exactly that. For my string "abcd" above, your recursive code runs 4 iterations (levels). In the first iteration you have 4 elements to choose from, in the second iteration 3 elements, in the third 2 elements, and so on. So your code runs 4! calculations. This is explained below.
First iteration:
choose a char from {a,b,c,d}
Second Iteration:
choose a char from subtracted set {{a,b,c,d} - {x}} where x is the char chosen in the first iteration. i.e. if 'a' has been chosen in the first iteration, this iteration has {b,c,d} to choose from.
Third Iteration:
choose a char from subtracted set {{a,b,c,d} - {x,y}} where x and y are chosen chars from previous iterations. i.e. if 'a' is chosen at first iteration, and 'c' is chosen from 2nd, we have {b,d} to play with here.
This repeats until we have chosen 4 chars overall. Once we have chosen 4 chars, we print them. Then we backtrack and choose a different char from the possible set, i.e. when we backtrack to the third iteration, we choose the next char from the possible set {b,d}. This way we generate all possible permutations of the given string.
We do this set manipulation so that we do not select the same char twice, i.e. abcc, abbc, abbd, bbbb are invalid.
The swap statement in your code does this set construction. It splits the string into two sets: a free set to choose from and a used set of chars that are already used. All chars to the left of i+1 are the used set and those to the right are the free set. In the first iteration, you choose among {a,b,c,d} and then pass {a}:{b,c,d} to the next iteration. The next iteration chooses one of {b,c,d} and passes {a,b}:{c,d} to the next iteration, and so on. When control backtracks to this iteration, you then choose c and construct {a,c}:{b,d} using swapping.
That's the concept. Otherwise, the recursion here is simple: it runs n deep, with each level running a loop n, n-1, n-2, n-3, ..., 2, 1 times.
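The used set and free set can also be made explicit rather than encoded through swaps. The sketch below is a variation for illustration, not the code from the question: a vector of flags marks which chars are in the used set, and chosen holds the chars picked so far. The names permuteWithSets and allPermutations are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Each level picks one char from the free set, marks it used, recurses,
// then puts it back (backtracks) so the next candidate can be tried.
void permuteWithSets(const std::string& chars, std::vector<bool>& used,
                     std::string& chosen, std::vector<std::string>& out) {
    if (chosen.size() == chars.size()) {
        out.push_back(chosen);        // all chars chosen: one permutation done
        return;
    }
    for (std::size_t k = 0; k < chars.size(); ++k) {
        if (used[k]) continue;        // only the free set is eligible
        used[k] = true;
        chosen.push_back(chars[k]);
        permuteWithSets(chars, used, chosen, out);
        chosen.pop_back();            // backtrack
        used[k] = false;
    }
}

std::vector<std::string> allPermutations(const std::string& chars) {
    std::vector<bool> used(chars.size(), false);
    std::string chosen;
    std::vector<std::string> out;
    permuteWithSets(chars, used, chosen, out);
    return out;
}
```

For "abcd" this produces 24 strings, beginning with "abcd" and "abdc". The swap-based version does the same bookkeeping without the extra flags, at the cost of being harder to read.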