Intuition behind using backtracking (and not DFS) - C++

I am solving Word Search question on LeetCode.com:
Given a 2D board and a word, find if the word exists in the grid.
The word can be constructed from letters of sequentially adjacent cell, where "adjacent" cells are those horizontally or vertically neighboring. The same letter cell may not be used more than once.
The solution which I wrote with online help is as follows:
class Solution {
public:
    // Compare this with Max Area of Island:
    // they 'look' similar, but this one uses a backtracking approach since we retract when we don't find a solution.
    // In case of Max Area of Island, we are not 'coming back' - technically we do come back due to recursion, but we
    // don't track that since we don't actually intend to do anything - we are just counting the 1s.
    bool exist(vector<vector<char>>& board, string& word) {
        if (board.empty()) return false;
        for (int i = 0; i < board.size(); i++) {
            for (int j = 0; j < board[0].size(); j++) {
                //if(word[0] == board[i][j])
                if (existUtil(board, word, i, j, 0)) // matching board[i][j] with the 0th character of word
                    return true;
            }
        }
        return false;
    }

    bool existUtil(vector<vector<char>>& board, string& word, int i, int j, int match) {
        if (match == word.size()) return true;
        if (i < 0 || i >= board.size() || j < 0 || j >= board[0].size()) return false;
        if (board[i][j] != word[match]) return false;
        board[i][j] = '*';
        bool ans = existUtil(board, word, i+1, j, match+1) || // [i+1, j]
                   existUtil(board, word, i-1, j, match+1) || // [i-1, j]
                   existUtil(board, word, i, j+1, match+1) || // [i, j+1]
                   existUtil(board, word, i, j-1, match+1);   // [i, j-1]
        board[i][j] = word[match];
        return ans;
    }
};
My question is simple - why are we using a backtracking approach and not just a conventional DFS? Pretty similar to what we have done, we could start at each character and do a DFS to determine if we could find the target word. But we are not doing so, why?
I thought a lot about it and came up with the following reasoning, but I am not sure - we use a backtracking approach because the same letter cell may not be used more than once. So, when we backtrack, we replace the original character with a '*' and then later re-substitute it when we come back. But this somehow does not feel right, because we could have used a visited matrix instead.

Q: My question is simple - why are we using a backtracking approach and not just a conventional DFS?
Because backtracking is far more efficient for solving this class of problems than plain DFS.
The difference between DFS and backtracking is subtle, but we can summarize it like this: DFS is a technique for searching a graph, while backtracking is a problem-solving technique (which consists of DFS + pruning; such programs are called backtrackers). So, DFS visits each node until it finds the required value (in your case, the target word), while backtracking is smarter - it doesn't even visit particular branches when it is certain that the target word would not be found there.
Imagine that you have a dictionary of all possible words and are searching the board to find all words that exist on it (the Boggle game). You start to traverse the board and stumble upon the letters 'J', 'A', 'C' in that order, so the current prefix is "JAC". Great. Let's look at the neighbors of the letter 'C'; e.g. they are 'A', 'Q', 'D', 'F'. What would plain DFS do? It would skip 'A' because it came from that node to 'C', but it would then blindly visit each of the remaining nodes hoping to find some word, even though we know there are no words starting with "JACQ", "JACD" and "JACF". A backtracker would immediately prune the branches with "JACQ", "JACD" and "JACF", e.g. by consulting an auxiliary trie data structure built from the dictionary. At some point even DFS would backtrack, but only when it has nowhere left to go - i.e. all surrounding letters have already been visited.
To conclude - in your example, conventional DFS would, for each node, blindly check all neighbor nodes until it finds the target word or until all its neighbors are visited - only then would it backtrack. A backtracker, on the other hand, constantly checks whether we are on the "right track", and the key line in your code that performs this is:
if (board[i][j] != word[match]) return false;
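To make the pruning concrete, here is a rough sketch (not the answer's code) of a Boggle-style backtracker in C++ that consults a precomputed set of dictionary prefixes; `collectWords`, `prefixes` and `dictionary` are names invented for this sketch, and a real implementation would normally use a trie instead of a hash set of prefixes:

```cpp
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

// Backtracker with pruning: abandon a path as soon as the current prefix
// is not a prefix of any dictionary word. `prefixes` would be precomputed
// from the dictionary (every prefix of every word).
void collectWords(vector<vector<char>>& board, int i, int j, string& prefix,
                  const unordered_set<string>& prefixes,
                  const unordered_set<string>& dictionary,
                  unordered_set<string>& found) {
    if (i < 0 || i >= (int)board.size() || j < 0 || j >= (int)board[0].size()) return;
    char c = board[i][j];
    if (c == '*') return;                        // cell already used on this path
    prefix.push_back(c);
    if (prefixes.count(prefix)) {                // the pruning step
        if (dictionary.count(prefix)) found.insert(prefix);
        board[i][j] = '*';                       // mark, explore, unmark (backtrack)
        collectWords(board, i + 1, j, prefix, prefixes, dictionary, found);
        collectWords(board, i - 1, j, prefix, prefixes, dictionary, found);
        collectWords(board, i, j + 1, prefix, prefixes, dictionary, found);
        collectWords(board, i, j - 1, prefix, prefixes, dictionary, found);
        board[i][j] = c;
    }
    prefix.pop_back();
}
```

A branch whose prefix is not in `prefixes` is cut immediately, which is exactly what plain DFS would not do.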


Reversing the positions of words in a string without changing order of special characters in O(1) space limit

I came up with this question during a mock interview. The interviewer first asked the question without any space limitations, then continued with the space-limited version. To be on the same page: in the question, a string and a container class consisting of delimiters are given. It is up to you to decide on a suitable container class and the language of the response. I think a sample input and output should be enough to understand what the question really is.
Input:
"Reverse#Strings Without%Changing-Delimiters"
Output:
"Delimiters#Changing Without%Strings-Reverse"
Note That: Position of "#", "%"," ","-" is not changed
I came up with the solution below:
string ChangeOrderWithoutSpecial(string s, unordered_set<char> delimiter)
{
    stack<string> words;  // since the last word outs first
    queue<char> limiter;  // since the first delimiter outs first
    string response = ""; // return value
    int index = -1;       // index of last delimiter visited
    int len = s.length();
    for (int i = 0; i < len; i++)
    {
        if (delimiter.find(s[i]) != delimiter.end()) // i-th char is a delimiter character
        {
            string temp = s.substr(index + 1, i - index - 1);
            words.push(temp);
            char t = s.at(i);
            limiter.push(t);
            index = i;
        }
    }
    // I realized this part after the interview; it assumes the string starts with a word
    // and has no double delimiters, i.e. each word is followed by one delimiter.
    if (index != s.length() - 1)
    {
        string temp = s.substr(index + 1, s.length() - index - 1); // until the end
        cout << temp << endl;
        words.push(temp);
    }
    while (!limiter.empty())
    {
        response += words.top() + limiter.front();
        words.pop();
        limiter.pop();
    }
    response += words.top();
    return response;
}
However, I couldn't find an O(1) space solution. Does anyone know how? I also could not figure out how to handle multiple consecutive delimiters; help with that would also be appreciated. Thank you to anyone who spends time even reading this.
Find the first word and the last word. Rotate the string by length(last_word)-length(first_word): this would put the middle part in the correct position. In the example, that'll produce
ersReverse#Strings Without%Changing-Delimit
Then rotate the first and last part of the string, skipping the middle, by length(first_word):
Delimiters#Strings Without%Changing-Reverse
Repeat this algorithm for the substring between the two outermost delimiters.
"Rotate by m" operation can be performed in O(1) space and O(n) time, where n is the length of the sequence being rotated.
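For reference, the "rotate by m" step can be sketched with the classic three-reversal trick; this is a generic illustration (a left rotation, with the name rotateLeft invented here), not necessarily the exact routine the answer has in mind:

```cpp
#include <algorithm>
#include <string>
using namespace std;

// Rotate s left by m positions in O(n) time and O(1) extra space
// using the three-reversal trick.
void rotateLeft(string& s, size_t m) {
    if (s.empty()) return;
    m %= s.size();
    reverse(s.begin(), s.begin() + m);  // reverse the first m chars
    reverse(s.begin() + m, s.end());    // reverse the rest
    reverse(s.begin(), s.end());        // reverse the whole string
}
```

The standard library also provides std::rotate, which performs the same job in linear time.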
Instead of rotating the string, it can also be solved by successively reversing the string.
Reverse the whole string. This is an O(n) operation. In your case the string becomes sretimileD-gnignahC%tuohtiW sgnirtS#esreveR.
Find all words and reverse each of them. This is an O(n) operation. The string is now equal to Delimiters-Changing%Without Strings#Reverse.
Reverse the delimiters. This is an O(n) operation. You'll get the wanted result: Delimiters#Changing Without%Strings-Reverse.
Each of these operations can be done in place, so the total memory complexity is O(1) and time complexity is O(n).
It is worth noting that with this approach each character will be visited 4 times (first reverse, finding words, reverse word, reverse delimiter), so (in the general case) it should be faster than Igor Tandetnik's answer, where characters in the middle of the string are visited many times. However, in the special case where each word has the same length, Igor's solution will be faster because the first rotate operation won't exist.
Edit:
Reversing the delimiters can be done in O(n) without extra memory in a similar way to the standard reverse. Just iterate through the delimiters instead of the whole set of characters:
Iterate forward until you reach delimiter;
Reverse iterate until you reach delimiter from the back;
Swap the current delimiters;
Continue procedure until your iterators meet.
Here is a procedure in C++ which will do this job:
void reverseDelimiters(string& s, unordered_set<char>& delimiters)
{
    auto i = s.begin();
    auto j = s.end() - 1;
    auto dend = delimiters.end();
    while (i < j) {
        while (i < j && delimiters.find(*i) == dend) i++;
        while (i < j && delimiters.find(*j) == dend) j--;
        if (i < j) swap(*i, *j), i++, j--;
    }
}
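Putting the three reversal steps together, a self-contained sketch might look like this (the name reverseWordsKeepDelimiters is assumed; the final pass repeats the same two-pointer idea as reverseDelimiters above):

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
using namespace std;

// O(1)-space pipeline: 1) reverse the whole string, 2) re-reverse each word
// in place, 3) reverse the order of the delimiters.
string reverseWordsKeepDelimiters(string s, const unordered_set<char>& delims) {
    if (s.empty()) return s;
    reverse(s.begin(), s.end());                 // step 1: reverse everything
    size_t start = 0;
    for (size_t i = 0; i <= s.size(); ++i) {     // step 2: re-reverse each word
        if (i == s.size() || delims.count(s[i])) {
            reverse(s.begin() + start, s.begin() + i);
            start = i + 1;
        }
    }
    size_t i = 0, j = s.size() - 1;              // step 3: reverse the delimiters
    while (i < j) {
        while (i < j && !delims.count(s[i])) ++i;
        while (i < j && !delims.count(s[j])) --j;
        if (i < j) { swap(s[i], s[j]); ++i; --j; }
    }
    return s;
}
```

Each of the three passes touches every character at most once, so the whole function is O(n) time and O(1) extra space.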

Given a string, find two identical subsequences with consecutive indexes C++

I need to construct an algorithm (not necessarily efficient) that, given a string, finds and prints two identical subsequences (by print I mean color them, for example). What's more, the union of the sets of indexes of these two subsequences has to be a set of consecutive natural numbers (a full segment of integers).
In mathematics, the thing what I am looking for is called "tight twins", if it helps anything. (E.g., see the paper (PDF) here.)
Let me give a few examples:
1) consider string 231213231
It has two subsequences I am looking for in the form of "123". To see it better look at this image:
The first subsequence is marked with underlines and the second with overlines. As you can see they have all the properties I need.
2) consider string 12341234
3) consider string 12132344.
Now it gets more complicated:
4) consider string: 13412342
It is also not that easy:
I think that these examples explain well enough what I meant.
I've been thinking a long time about an algorithm that could do that but without success.
For coloring, I wanted to use this piece of code:
#include <windows.h>
using namespace std;
HANDLE hConsole = GetStdHandle(STD_OUTPUT_HANDLE);
SetConsoleTextAttribute(hConsole, k);
where k is the color.
Any help, even hints, would be highly appreciated.
Here's a simple recursion that tests for tight twins. When there's a duplicate, it splits the decision tree in case the duplicate is still part of the first twin. You'd have to run it on each substring of even length. Other optimizations for longer substrings could include hashing tests for char counts, as well as matching the non-duplicate portions of the candidate twins (characters that only appear twice in the whole substring).
Explanation of the function:
First, a hash is created with each character as key and the indexes it appears in as values. Then we traverse the hash: if a character count is odd, the function returns false; and indexes of characters with a count greater than 2 are added to a list of duplicates - characters half of which belong in one twin but we don't know which.
The basic rule of the recursion is to only increase i when a match for it is found later in the string, while maintaining a record of chosen matches (js) that i must skip without looking for a match. It works because if we find n/2 matches, in order, by the time j reaches the end, that's basically just another way of saying the string is composed of tight twins.
JavaScript code:
function isTightTwins(s){
  var n = s.length,
      char_idxs = {};
  for (var i=0; i<n; i++){
    if (char_idxs[s[i]] == undefined){
      char_idxs[s[i]] = [i];
    } else {
      char_idxs[s[i]].push(i);
    }
  }
  var duplicates = new Set();
  for (var i in char_idxs){
    // character with odd count
    if (char_idxs[i].length & 1){
      return false;
    }
    if (char_idxs[i].length > 2){
      for (let j of char_idxs[i]){
        duplicates.add(j);
      }
    }
  }
  function f(i,j,js){
    // base case positive
    if (js.size == n/2 && j == n){
      return true;
    }
    // base case negative
    if (j > n || (n - j < n/2 - js.size)){
      return false;
    }
    // i is not less than j
    if (i >= j) {
      return f(i,j + 1,js);
    }
    // this i is in the list of js
    if (js.has(i)){
      return f(i + 1,j,js);
    // yet to find twin, no match
    } else if (s[i] != s[j]){
      return f(i,j + 1,js);
    } else {
      // maybe it's a twin and maybe it's a duplicate
      if (duplicates.has(j)) {
        var _js = new Set(js);
        _js.add(j);
        return f(i,j + 1,js) | f(i + 1,j + 1,_js);
      // it's a twin
      } else {
        js.add(j);
        return f(i + 1,j + 1,js);
      }
    }
  }
  return f(0,1,new Set());
}
console.log(isTightTwins("1213213515")); // true
console.log(isTightTwins("11222332")); // false
WARNING: Commenter גלעד ברקן points out that this algorithm gives the wrong answer of 6 (higher than should be possible!) for the string 1213213515. My implementation gets the same wrong answer, so there seems to be a serious problem with this algorithm. I'll try to figure out what the problem is, but in the meantime DO NOT TRUST THIS ALGORITHM!
I've thought of a solution that will take O(n^3) time and O(n^2) space, which should be usable on strings of up to length 1000 or so. It's based on a tweak to the usual notion of longest common subsequences (LCS). For simplicity I'll describe how to find a minimal-length substring with the "tight twin" property that starts at position 1 in the input string, which I assume has length 2n; just run this algorithm 2n times, each time starting at the next position in the input string.
"Self-avoiding" common subsequences
If the length-2n input string S has the "tight twin" (TT) property, then it has a common subsequence with itself (or equivalently, two copies of S have a common subsequence) that:
is of length n, and
obeys the additional constraint that no character position in the first copy of S is ever matched with the same character position in the second copy.
In fact we can safely tighten the latter constraint to no character position in the first copy of S is ever matched to an equal or lower character position in the second copy, due to the fact that we will be looking for TT substrings in increasing order of length, and (as the bottom section shows) in any minimal-length TT substring, it's always possible to assign characters to the two subsequences A and B so that for any matched pair (i, j) of positions in the substring with i < j, the character at position i is assigned to A. Let's call such a common subsequence a self-avoiding common subsequence (SACS).
The key thing that makes efficient computation possible is that no SACS of a length-2n string can have more than n characters (since clearly you can't cram more than 2 sets of n characters into a length-2n string), so if such a length-n SACS exists then it must be of maximum possible length. So to determine whether S is TT or not, it suffices to look for a maximum-length SACS between S and itself, and check whether this in fact has length n.
Computation by dynamic programming
Let's define f(i, j) to be the length of the longest self-avoiding common subsequence of the length-i prefix of S with the length-j prefix of S. To actually compute f(i, j), we can use a small modification of the usual LCS dynamic programming formula:
f(0, _) = 0
f(_, 0) = 0
f(i>0, j>0) = max(f(i-1, j), f(i, j-1), m(i, j))
m(i, j) = (if S[i] == S[j] && i < j then 1 else 0) + f(i-1, j-1)
As you can see, the only difference is the additional condition && i < j. As with the usual LCS DP, computing it takes O(n^2) time, since the 2 arguments each range between 0 and 2n, and the computation required outside of recursive steps is O(1). (Actually we need only compute the "upper triangle" of this DP matrix, since every cell (i, j) below the diagonal will be dominated by the corresponding cell (j, i) above it -- though that doesn't alter the asymptotic complexity.)
To determine whether the length-2j prefix of the string is TT, we need the maximum value of f(i, 2j) over all 0 <= i <= 2n -- that is, the largest value in column 2j of the DP matrix. This maximum can be computed in O(1) time per DP cell by recording the maximum value seen so far and updating as necessary as each DP cell in the column is calculated. Proceeding in increasing order of j from j=1 to j=2n lets us fill out the DP matrix one column at a time, always treating shorter prefixes of S before longer ones, so that when processing column 2j we can safely assume that no shorter prefix is TT (since if there had been, we would have found it earlier and already terminated).
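A direct C++ transcription of the DP above, checking only whether the whole (even-length) string is TT rather than scanning all substrings as the full algorithm does, could look like this (isTightTwinsDP is an assumed name):

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// f[i][j]: length of the longest self-avoiding common subsequence of the
// length-i and length-j prefixes of s (1-based positions, with the i < j rule).
// s is tight twins iff the best value in the last column reaches len/2.
bool isTightTwinsDP(const string& s) {
    int len = s.size();
    if (len % 2 != 0) return false;
    vector<vector<int>> f(len + 1, vector<int>(len + 1, 0));
    for (int i = 1; i <= len; ++i) {
        for (int j = 1; j <= len; ++j) {
            int m = ((s[i-1] == s[j-1] && i < j) ? 1 : 0) + f[i-1][j-1];
            f[i][j] = max({f[i-1][j], f[i][j-1], m});
        }
    }
    int best = 0;
    for (int i = 0; i <= len; ++i) best = max(best, f[i][len]);
    return best == len / 2;
}
```

Running it over every substring, as the answer describes, multiplies the O(n^2) table cost by the O(n) choice of starting position, giving the stated O(n^3) total.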
Let the string length be N.
There are two approaches.
Approach 1. This approach is always exponential-time.
For each possible subsequence of length 1..N/2, list all occurrences of this subsequence. For each occurrence, list the positions of all characters.
For example, for 123123 it should be:
(1, ((1), (4)))
(2, ((2), (5)))
(3, ((3), (6)))
(12, ((1,2), (4,5)))
(13, ((1,3), (4,6)))
(23, ((2,3), (5,6)))
(123, ((1,2,3),(4,5,6)))
(231, ((2,3,4)))
(312, ((3,4,5)))
The latter two are not necessary, as they appear only once.
One way to do it is to start with subsequences of length 1 (i.e. characters), then proceed to subsequences of length 2, etc. At each step, drop all subsequences which appear only once, as you don't need them.
Another way to do it is to check all 2**N binary strings of length N. Whenever a binary string has not more than N/2 "1" digits, add it to the table. At the end drop all subsequences which appear only once.
Now you have a list of subsequences which appear more than 1 time. For each subsequence, check all the pairs, and check whether such a pair forms a tight twin.
Approach 2. Seek tight twins more directly. For each of the N*(N-1)/2 substrings, check whether the substring has even length and each character appears in it an even number of times; then, its length being L, check whether it contains two tight twins of length L/2. There are 2**L ways to divide it; the simplest thing you can do is check all of them. There are more interesting ways to seek for t.t., though.
I would like to approach this as a dynamic programming/pattern matching problem. We deal with characters one at a time, left to right, and we maintain a herd of Non-Deterministic Finite Automata / NDFA, which correspond to partial matches. We start off with a single null match, and with each character we extend each NDFA in every possible way, with each NDFA possibly giving rise to many children, and then de-duplicate the result - so we need to minimise the state held in the NDFA to put a bound on the size of the herd.
I think an NDFA needs to remember the following:
1) That it skipped a stretch of k characters before the match region.
2) A suffix which is a p-character string, representing characters not yet matched which will need to be matched by overlines.
I think that you can always assume that the p-character string needs to be matched with overlines because you can always swap overlines and underlines in an answer if you swap throughout the answer.
When you see a new character you can extend NDFAs in the following ways:
a) An NDFA with nothing except skips can add a skip.
b) An NDFA can always add the new character to its suffix, which may be null.
c) An NDFA with a p character string whose first character matches the new character can turn into an NDFA with a p-1 character string which consists of the last p-1 characters of the old suffix. If the string is now of zero length then you have found a match, and you can work out what it was if you keep links back from each NDFA to its parent.
I thought I could use a neater encoding which would guarantee only a polynomial herd size, but I couldn't make that work, and I can't prove polynomial behaviour here, but I notice that some cases of degenerate behaviour are handled reasonably, because they lead to multiple ways to get to the same suffix.
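Under the assumption that I've read the transitions above correctly, here is a small C++ sketch of the herd (a state is a region start plus the pending suffix, de-duplicated in a std::set); it only detects whether some segment splits into tight twins, without the parent links that would be needed for printing them:

```cpp
#include <set>
#include <string>
#include <utility>
using namespace std;

// Herd-of-NDFAs sketch: each state is (start of the match region, suffix of
// underlined characters still waiting for their overlined partners).
// Returns true if some segment of s splits into tight twins.
bool hasTightTwinSegment(const string& s) {
    set<pair<int, string>> herd;
    for (int i = 0; i < (int)s.size(); ++i) {
        herd.insert({i, string()});            // rule (a): skip everything before i
        set<pair<int, string>> next;
        for (const auto& state : herd) {
            int st = state.first;
            const string& suf = state.second;
            next.insert({st, suf + s[i]});     // rule (b): s[i] becomes an underline
            if (!suf.empty() && suf[0] == s[i]) {
                if (suf.size() == 1) return true;  // suffix emptied: twins found
                next.insert({st, suf.substr(1)});  // rule (c): s[i] overlines the head
            }
        }
        herd = move(next);
    }
    return false;
}
```

As the answer itself warns, the herd is not guaranteed to stay polynomial in size; de-duplication merely keeps degenerate inputs manageable.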

Choosing an efficient data structure to find rhymes

I've been working on a program that reads in a whole dictionary and utilizes WordNet from CMU, which splits every word into its pronunciation.
The goal is to utilize the dictionary to find the best rhymes and alliterations of a given word, given the number of syllables in the word we need to find and its part of speech.
I've decided to use std::map<std::string, vector<Sound> > and std::multimap<int, std::string> where the map maps each word in the dictionary to its pronunciation in a vector, and the multimap is returned from a function that finds all the words that rhyme with a given word.
The int is the number of syllables of the corresponding word, and the string holds the word.
I've been working on the efficiency, but can't seem to get it to be more efficient than O(n). The way I'm finding all the words that rhyme with a given word is
vector<string> *rhymingWords = new vector<string>;
for (const auto& it : soundMap) { // soundMap is the map<string, vector<Sound>>
    if (rhymingSyllables(word, it.first) >= 1 && it.first != word) {
        rhymingWords->push_back(it.first);
    }
}
return rhymingWords;
And when I find the best rhyme for a word (a word that rhymes the most syllables with the given word), I do
vector<string> rhymes = *getAllRhymes(rhymesWith);
int x = 0; // most syllables matched so far
string bestRhyme;
for (const string& s : rhymes) {
    if (countSyllables(s) == numberOfSyllables) {
        int a = rhymingSyllables(s, rhymesWith);
        if (a > x) {
            x = a;
            bestRhyme = s;
        }
    }
}
return bestRhyme;
The drawback is the O(n) access time in terms of the number of words in the dictionary. I'm thinking of ideas to drop this down to O(log n) , but seem to hit a dead end every time. I've considered using a tree structure, but can't work out the specifics.
Any suggestions? Thanks!
The rhymingSyllables function is implemented as such:
int rhymingSyllables(const string& word1, const string& word2) {
    int syllableCount = 0;
    if ((soundMap.count(word1) == 0) || (soundMap.count(word2) == 0)) {
        return 0;
    }
    vector<Sound> &firstSounds = soundMap.at(word1), &secondSounds = soundMap.at(word2);
    for (int i = firstSounds.size() - 1, j = secondSounds.size() - 1; i >= 0 && j >= 0; --i, --j) {
        if (firstSounds[i] != secondSounds[j]) return syllableCount;
        else if (firstSounds[i].isVowel()) ++syllableCount;
    }
    return syllableCount;
}
P.S.
The vector<Sound> is the pronunciation of the word, where Sound is a class that contains every different pronunciation of a morpheme in English: i.e,
AA vowel AE vowel AH vowel AO vowel AW vowel AY vowel B stop CH affricate D stop DH fricative EH vowel ER vowel EY vowel F fricative G stop HH aspirate IH vowel IY vowel JH affricate K stop L liquid M nasal N nasal NG nasal OW vowel OY vowel P stop R liquid S fricative SH fricative T stop TH fricative UH vowel UW vowel V fricative W semivowel Y semivowel Z fricative ZH fricative
Perhaps you could group the morphemes that will be matched during rhyming and compare not the vectors of morphemes, but vectors of associated groups. Then you can sort the dictionary once and get a logarithmic search time.
After looking at the rhymingSyllables implementation, it seems that you convert words to sounds and then match any vowels to each other, matching other sounds only if they are the same. So, applying the advice above, you could introduce an extra auxiliary sound 'anyVowel'; then, during dictionary building, convert each word to its sounds, replace all vowels with 'anyVowel', and push that representation to the dictionary. Once you're done, sort the dictionary. When you want to search for a rhyme for a word, convert it to the same representation and do a binary search on the dictionary: first by the last sound as a key, then by the previous one, and so on. This gives you m*log(n) worst-case complexity, where n is the dictionary size and m is the word length, but typically it will terminate faster.
You could also exploit the fact that for the best rhyme you consider only words with certain syllable counts, and maintain a separate dictionary per syllable count. Then you count the number of syllables in the word you are looking for rhymes for, and search in the appropriate dictionary. Asymptotically it doesn't give you any gain, but the speedup it gives may be useful in your application.
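As an illustration of the sorted-key idea, here is a toy sketch (not the answerer's code: the phoneme strings, the 'V' placeholder for 'anyVowel' and the '.' separator are all invented for this example). The dictionary is keyed by the reversed pronunciation with vowels collapsed, so rhyming words become neighbors sharing a key prefix, and lookups become binary searches:

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

struct Entry { string key; string word; };

// Build the lookup key: pronunciation reversed, vowels collapsed to 'V',
// with '.' separating phonemes.
string rhymeKey(const vector<string>& sounds, const vector<string>& vowels) {
    string key;
    for (auto it = sounds.rbegin(); it != sounds.rend(); ++it) {
        bool isVowel = find(vowels.begin(), vowels.end(), *it) != vowels.end();
        key += isVowel ? string("V") : *it;
        key += '.';
    }
    return key;
}

// All words whose pronunciation ends with the sounds encoded by `prefix`:
// one binary search, then a walk over the contiguous matching range.
vector<string> rhymesWithSuffix(const vector<Entry>& sorted, const string& prefix) {
    auto lo = lower_bound(sorted.begin(), sorted.end(), prefix,
                          [](const Entry& e, const string& k) { return e.key < k; });
    vector<string> out;
    for (auto it = lo; it != sorted.end() && it->key.compare(0, prefix.size(), prefix) == 0; ++it)
        out.push_back(it->word);
    return out;
}
```

The entries must be sorted by key once up front; after that each query costs O(log n) for the search plus the size of the output.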
I've been thinking about this and I could probably suggest an approach to an algorithm.
I would first take the dictionary and divide it into multiple buckets or batches, where each batch represents the number of syllables each word has. The traversal of the vector to store words into different buckets should be linear, as you are traversing a large vector of strings. From here, since the first bucket will have all words of 1 syllable, there is nothing to do for it at the moment, so you can skip to bucket two; each bucket after that will need to take each word and separate the syllables of each word. If you have, say, 25 buckets, you know the first few and the last few will not hold many words, so their processing time shouldn't be significant and they can be done first. However, the buckets in the middle, with words of say 3-6 syllables, will be the largest, so you could run each of these buckets on a separate thread if its size is over a certain amount and have them run in parallel. Once you are done, each bucket should return a std::vector<std::shared_ptr<Word>>, where your structure might look like this:
enum SpeechSound {
    SS_AA,
    SS_AE,
    // ...
    SS_ZH
};

enum SpeechSoundType {
    ASPIRATE,
    // ...
    VOWEL
};

struct SyllableMorpheme {
    SpeechSound sound;
    SpeechSoundType type;
};

class Word {
private:
    std::string m_strWord;
    // These two containers should match in size! One string for each
    // syllable & one matching struct from above containing two enums.
    std::vector<std::string> m_vSyllables;
    std::vector<SyllableMorpheme> m_vMorphemes;

public:
    explicit Word( const std::string& word );

    std::string getWord() const;
    std::string getSyllable( unsigned index ) const;
    unsigned getSyllableCount() const;
    SyllableMorpheme getMorpheme( unsigned index ) const;

    bool operator==( const Word& other ) const;
    bool operator!=( const Word& other ) const;

private:
    Word( const Word& c );                // Not Implemented
    Word& operator=( const Word& other ); // Not Implemented
};
This time you will now have new buckets or vectors of shared pointers of these class objects. Then you can easily write a function to traverse through each bucket or even multiple buckets since the buckets will have the same signature only a different amount of syllables. Remember; each bucket should already be sorted alphabetically since we only added them in by the syllable count and never changed the order that was read in from the dictionary.
Then with this you can easily compare if two words are equal or not while checking For Matching Syllables and Morphemes. And these are contained in std::vector<std::shared_ptr<Word>>. So you don't have to worry about memory clean up as much either.
The idea is to use linear search, separation and comparison as much as possible; yet if your container gets too large, then create buckets and run multiple threads in parallel, or maybe use a hash table if it suits your needs.
Another possibility with this class structure is that you could even add more to it later on if you wanted or needed to such as another std::vector for its definitions, and another std::vector<string> for its part of speech {noun, verb, etc.} You could even add in other vector<string> for things such as homonyms, homophomes and even a vector<string> for a list of all words that rhyme with it.
Now for your specific task of finding the best matching rhyme you may find that some words may end up having a list of Words that would all be considered a Best Match or Fit! Due to this you wouldn't want to store or return a single string, but rather a vector of strings!
Case Example:
To Too Two Blue Blew Hue Hew Knew New,
Bare Bear Care Air Ayre Heir Fair Fare There Their They're
Plain, Plane, Rain, Reign, Main, Mane, Maine
Yes these are all single syllable rhyming words, but as you can see there are many cases where there are multiple valid answers, not just a single best case match. This is something that does need to be taken into consideration.

Backtrack in any programming language [closed]

I was given the task of writing a program for string permutation. I understand the logic, but not the exact meaning of the backtracking in this program. Please explain the for-loop functionality: when swap will be called, when permute() will be called, and the exact meaning of backtracking.
#include <stdio.h>

void swap(char *x, char *y)
{
    char temp;
    temp = *x;
    *x = *y;
    *y = temp;
}

void permute(char *a, int i, int n)
{
    int j;
    if (i == n)
        printf("%s\n", a);
    else
    {
        for (j = i; j <= n; j++)
        {
            swap((a+i), (a+j));
            permute(a, i+1, n);
            swap((a+i), (a+j)); //backtrack
        }
    }
}

int main()
{
    char a[] = "ABC";
    permute(a, 0, 2);
    getchar();
    return 0;
}
Sketching the call stack can help you understanding how the algorithm works. The example string "ABC" is a good starting point. Basically, this is what will happen with ABC:
permute(ABC, 0, 2)
    i = 0
    j = 0
    permute(ABC, 1, 2)
        i = 1
        j = 1
        permute(ABC, 2, 2)
            print "ABC"
        j = 2
        string = ACB
        permute(ACB, 2, 2)
            print "ACB"
        string = ABC
    j = 1
    string = BAC
    permute(BAC, 1, 2)
        .... (everything starts over)
As usual, in the example above, indentation defines what happens inside each of the recursive calls.
The reasoning behind the for loop is that every permutation of string ABC is given by ABC, BAC and CBA, plus every permutation of the substrings BC, AC and BA (remove the first letter from each of the previous ones). For any string S, the possible permutations are obtained by swapping every position with the first one, plus all of the permutations of each of these strings. Think of it like this: any permuted string must start with one of the letters in the original string, so you place every possible letter in the first position, and recursively apply the same method to the rest of the string (without the first letter).
That's what the loop is doing: we scan the string from the current starting point (which is i) up to the end, and at each step we swap that position with the starting point, recursively call permute() to print every permutation of this new string, and after that we restore the string back to its previous state, so that we have the original string to repeat the same process with the next position.
Personally, I don't like that comment saying "backtrack". A better term would be "winding back", because at that point the recursion winds back and you prepare your string for the next recursive call. Backtracking is normally used for a situation in which you have explored a subtree and didn't find a solution, so you go back up (backtrack) and try a different branch. Taken from Wikipedia:
Backtracking is a general algorithm for finding all (or some) solutions to some computational problem, that incrementally builds candidates to the solutions, and abandons each partial candidate c ("backtracks") as soon as it determines that c cannot possibly be completed to a valid solution.
Note that this algorithm does not generate the set of distinct permutations, because it can print the same string more than once when there are repeated letters. An extreme case is applying it to the string "aaaaa", or any other string with a single unique letter.
"Backtracking" means you are going back one step in your solution space (think of it as a decision tree where you are going up one level). It is usually used when you can rule out certain subtrees of the decision space, and it gives a significant performance boost compared to full exploration of the decision tree if and only if it is very likely that you can rule out larger parts of the solution space.
You can find an exhaustive expalnation of a similar algorithm here: Using recursion and backtracking to generate all possible combinations

Given a 2D matrix of characters we have to check whether the given word exist in it or not

Given a 2D matrix of characters we have to check whether the given word exist in it or not.
eg
s f t
d a h
r y o
we can find "rat" in it (top-down, straight, diagonal or any path), even in reverse order, with the least complexity.
my approach is
While traversing the 2d matrix ( a[][] ) row wise.
If ( a[i][j] == first character of given word ) {
search for rest of the letters in 4 directions i.e. right, right diagonally down, down and left diagonally down.
} else if( a[i][j] == last character of the given word ) {
search for remaining characters in reverse order in 4 directions i.e. left, right diagonally up, up, left diagonally up.
}
is there any better approach?
Let me describe a very cool data structure for this problem.
Go ahead and look up Tries.
It takes O(k) time to insert a k-length word into the Trie, and O(k) to look-up the presence of a k-length word.
Video tutorial
If you have problems understanding the data structure, or implementing it, I'll be happy to help you there.
I think I would do this in two phases:
1) Iterate over the array, looking for instances of the first letter in the word.
2) Whenever you find an instance of the first letter, call a function that examines all adjacent cells (e.g. up to 9 of them) to see if any of them are the second letter of the word. For any second-letter-matches that are found, this function would call itself recursively and look for third-letter matches in cells adjacent to that (and so on). If the recursion ever gets all the way to the final letter of the word and finds a match for it, then the word exists in the array. (Note that if you're not allowed to use a letter twice you'll need to flag cells as 'already used' in order to prevent the algorithm from re-using them. Probably the easiest way to do that would be to pass-by-value a vector of already-used-cell-coordinates in to the recursive function, and have the recursive function ignore the contents of any cells that are in that list)
In fact you have 16 sequences here:
sft
dah
ryo
sdr
fay
tho
sao
rat
tfs
had
oyr
rds
yaf
oht
oas
tar
(3 horizontal + 3 vertical + 2 diagonals) * 2 (reversed) = 16. Let n be a size of a matrix. In your example n = 3. Number of sequences = (n + n + 2) * 2 = 4n + 4.
Now you need to determine whether a sequence is a word or not. Create a hash set (unordered_set in C++, HashSet in Java) with words from dictionary (found on the internet). You can check one sequence in O(1).
Look for the first letter or your word using a simple loop and when you find it use the following recursive function.
The function will get 5 parameters as input: the word you are looking for, str; your current position in that word, k; i and j as the position in your array at which to search for the letter; and the direction d.
The stop conditions will be:
- if k == strlen(str), return 1;
- if arr[i][j] != str[k], return 0.
If neither of the above statements is true, you increment your letter counter (k++), update your i and j according to the value of d, and call your function again via return func(str, k, i, j, d);
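A sketch of that recursive directional search in C++ (the names searchDir and wordExists are mine; the direction is passed as a (di, dj) step, which covers the five parameters described above):

```cpp
#include <string>
#include <vector>
using namespace std;

// Try to match str starting at its k-th character from (i, j),
// moving only in the fixed direction (di, dj).
bool searchDir(const vector<string>& a, const string& str, size_t k,
               int i, int j, int di, int dj) {
    if (k == str.size()) return true;                 // whole word matched
    if (i < 0 || i >= (int)a.size() || j < 0 || j >= (int)a[0].size()) return false;
    if (a[i][j] != str[k]) return false;
    return searchDir(a, str, k + 1, i + di, j + dj, di, dj);
}

// Driver: from every cell holding the first letter, try all 8 directions.
// Reversed words come for free, since every direction has its opposite.
bool wordExists(const vector<string>& a, const string& str) {
    if (a.empty() || str.empty()) return false;
    const int dirs[8][2] = {{-1,-1},{-1,0},{-1,1},{0,-1},{0,1},{1,-1},{1,0},{1,1}};
    for (int i = 0; i < (int)a.size(); ++i)
        for (int j = 0; j < (int)a[0].size(); ++j)
            if (a[i][j] == str[0])
                for (const auto& d : dirs)
                    if (searchDir(a, str, 0, i, j, d[0], d[1])) return true;
    return false;
}
```

On the example grid, "rat" is found from the 'r' at the bottom-left corner moving diagonally up-right.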