Searching a tree and outputting the path to that node in C++ - c++

Say we have the following structure:
struct ATree {
string id;
int numof_children;
ATree *children[5];
};
How would I be able to search for an id and output the path to that id? I have a way of finding if the id is in the tree, but outputting the proper path is another story. I have tried using a string stream, but it doesn't seem to work properly (I get paths that include ids not leading to the id I want). NOTE: assume ids may only appear once in the tree.
Should this be done using recursion? Or can it be done using loops?

I believe the following code gives you a notion of what you should do (recursion):
bool find(const string& i_id, const ATree* i_tree, string& o_path) {
if(!i_tree) return false;
if (i_id == i_tree->id) {
o_path = i_tree->id;
return true;
}
string path;
for (int i = 0; i < i_tree->numof_children; ++i) {
if (find(i_id, i_tree->children[i], path)) {
o_path = i_tree->id + '/' + path;
return true;
}
}
return false;
}

For each node you go through, you should keep track of the node from which you arrived at it. Then, once you have found the id, just pop those nodes and print them to get the path.
If you keep them in a std::stack structure it will be easy for you to just pop them when you're going back after reaching leaves and not finding the needed id.
If you do it recursively then you just have the stack of your calls and it should be enough, if you convert them to loops (iteratively), then you need the std::stack to remember the states, but it's fairly simple, really.
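The loop-based variant described above could look roughly like the following. This is only a sketch: the function name findPath and the '/' separator are assumptions for illustration, and a std::vector is used as the stack (instead of std::stack) so the path can be read front to back once the id is found.

```cpp
#include <string>
#include <utility>
#include <vector>

struct ATree {
    std::string id;
    int numof_children;
    ATree *children[5];
};

// Iterative depth-first search. Each frame holds a node and the index of the
// next child to try, so the stack always spells out the path from the root.
std::string findPath(const ATree *root, const std::string &target) {
    std::vector<std::pair<const ATree *, int>> stack;
    if (root) stack.push_back({root, 0});
    while (!stack.empty()) {
        const ATree *node = stack.back().first;
        if (node->id == target) {
            // Found: join the ids of everything currently on the stack.
            std::string path;
            for (const auto &frame : stack) {
                if (!path.empty()) path += '/';
                path += frame.first->id;
            }
            return path;
        }
        int &next = stack.back().second;
        if (next < node->numof_children) {
            const ATree *child = node->children[next++];
            if (child) stack.push_back({child, 0});
        } else {
            stack.pop_back();  // all children tried: backtrack
        }
    }
    return "";  // id not present in the tree
}
```

For a root "a" with children "b" and "c", where "b" has a child "d", findPath(&a, "d") yields "a/b/d"; an id that is not in the tree yields an empty string.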

Rough outline of algorithm:
std::vector< const ATree* >
getPath(const ATree* tree, const std::string& id)
{
std::vector< const ATree* > path;
if (!tree) return path; // guard against a null child pointer
if (tree->id == id) {
path.push_back(tree);
} else {
for (int i=0; i < tree->numof_children; i++) {
std::vector< const ATree* > tmp=
getPath(tree->children[i], id);
if (tmp.size() > 0) {
path.push_back(tree);
path.insert(path.end(), tmp.begin(), tmp.end());
break;
}
}
}
return path;
}

Related

Properly exiting out of recursions?

TrieNode and Trie Object:
struct TrieNode {
char nodeChar = NULL;
map<char, TrieNode> children;
TrieNode() {}
TrieNode(char c) { nodeChar = c; }
};
struct Trie {
TrieNode *root = new TrieNode();
typedef pair<char, TrieNode> letter;
typedef map<char, TrieNode>::iterator it;
Trie(vector<string> dictionary) {
for (int i = 0; i < dictionary.size(); i++) {
insert(dictionary[i]);
}
}
void insert(string toInsert) {
TrieNode * curr = root;
int increment = 0;
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end()) { //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
//when it doesn't exist we know that this will be a new branch
for (int i = increment; i < toInsert.length(); i++) {
TrieNode temp(toInsert[i]);
curr->children.insert(letter(toInsert[i], temp));
curr = &(curr->children.find(toInsert[i])->second);
if (i == toInsert.length() - 1) {
temp.nodeChar = NULL;
curr->children.insert(letter(NULL, temp));
}
}
}
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
}
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
if (curr->nodeChar == NULL) {
list.push_back(prefix);
return;
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
};
The problem is this function:
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
/*if children of TrieNode contains NULL char, it means this branch up to this point is a complete word*/
if (curr->nodeChar == NULL) {
list.push_back(prefix);
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
The purpose is to return all words with the same prefix from a trie using DFS. I manage to retrieve all the necessary strings but I can't exit out of the recursion.
The code completes the last iteration of the if statement and breaks. Visual Studio doesn't return any error code.
The typical end to a recursion is just as you said- return all words. A standard recursion looks something like this:
returnType function(params...){
//Do stuff
if(need to recurse){
return function(next params...);
}else{ //This should be your defined base-case
return base-case;
}
}
The issue arises in that your recursive function can never return- it can either execute the push_back, or it can call itself again. Neither of these seems to properly exit, so it'll either end quietly (with an inferred return of nothing), or it'll keep recursing.
In your situation, you likely need to store the results from recursion in an intermediate structure like a list or such, and then return that list after iteration (since it's a tree search and ought to check all the children, not return the first one only)
On that note, you seem to be missing part of the point of recursions- they exist to fill a purpose: break down a problem into pieces until those pieces are trivial to solve. Then return that case and build back to a full solution. Any tree-searching must come from this base structure, or you may miss something- like forgetting to return your results.
Check the integrity of your Trie structure. The function appears to be correct. The reason why it wouldn't terminate is if one or more of your leaf nodes doesn't have curr->nodeChar == NULL.
Another case is that any node (leaf or non-leaf) has a garbage child node. This will cause the recursion to break into reading garbage values and no reason to stop. Running in debug mode should break the execution with segmentation fault.
Write another function to test if all leaf-nodes have NULL termination.
EDIT:
After posting the code, the original poster has already pointed out that the problem was that he/she was not returning the list of strings.
Apart from that, there are a few more suggestions I would like to provide based on the code:
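How does this while loop terminate if the toInsert string is already in the Trie?
Because insert stores a NULL child after the last letter, the lookup for the string's terminating '\0' also succeeds, so increment moves past the end of toInsert and the next read is out of bounds.
The loop will most likely exit after that, but reading beyond your string is undefined behavior.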
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end())
{ //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
This can be written as follows:
while (increment < toInsert.length() &&
curr->children.find(toInsert[increment]) != curr->children.end())
Also,
Trie( vector<string> dictionary)
should be
Trie( const vector<string>& dictionary )
because dictionary can be a large object. If you don't pass by reference, it will create a second copy. This is not efficient.
I am an idiot. I forgot to return list in the first findPre() function.
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
return list; //<----- this thing
}

Need to reference and update value from nested class C++

Bear with me, I'm new to C++. I'm trying to update a value which is stored in a vector, but I'm getting this error:
non-const lvalue reference to type 'Node'
I'm using a simple wrapper around std::vector so I can share methods like contains and others (similar to how the ArrayList is in Java).
#include <vector>
using namespace std;
template <class T> class NewFrames {
public:
// truncated ...
bool contains(T data) {
for(int i = 0; i < this->vec->size(); i++) {
if(this->vec->at(i) == data) {
return true;
}
}
return false;
}
int indexOf(T data) {
for(int i = 0; i < this->vec->size(); i++) {
if(this->vec->at(i) == data) {
return i;
}
}
return -1;
}
T get(int index) {
if(index > this->vec->size()) {
throw std::out_of_range("Cannot get index that exceeds the capacity");
}
return this->vec->at(index);
}
private:
vector<T> *vec;
};
#endif // A2_NEWFRAMES_H
The class which utilizes this wrapper is defined as follows:
#include "Page.h"
#include "NewFrames.h"
class Algo {
private:
typedef struct Node {
unsigned reference:1;
int data;
unsigned long _time;
Node() { }
Node(int data) {
this->data = data;
this->reference = 0;
this->_time = (unsigned long) time(NULL);
}
} Node;
unsigned _faults;
Page page;
NewFrames<Node> *frames;
};
I'm at a point where I need to reference one of the Node objects inside of the vector, but I need to be able to change reference to a different value. From what I've found on SO, I need to do this:
const Node &n = this->frames->get(this->frames->indexOf(data));
I've tried just using:
Node n = this->frames->get(this->frames->indexOf(data));
n.reference = 1;
and then viewing the data in the debugger, but the value is not updated when I check later on. Consider this:
const int data = this->page.pages[i];
const bool contains = this->frames->contains(Node(data));
Node node = this->frames->get(index);
for(unsigned i = 0; i < this->page.pages.size(); i++) {
if(node == NULL && !contains) {
// add node
} else if(contains) {
Node n = this->frames->get(this->frames->indexOf(data));
if(n.reference == 0) {
n.reference = 1;
} else {
n.reference = 0;
}
} else {
// do other stuff
}
}
With subsequent passes of the loop, the node with that particular data value is somehow different.
But if I attempt to change n.reference, I'll get an error because const is preventing the object from changing. Is there a way I can get this node so I can change it? I'm coming from the friendly Java world where something like this would work, but I want to know/understand why this doesn't work in C++.
Node n = this->frames->get(this->frames->indexOf(data));
n.reference = 1;
This copies the Node from frames and stores the copy as the object n. Modifying the copy does not change the original node.
The simplest "fix" is to use a reference. That means changing the return type of get from T to T&, and changing the previous two lines to
Node& n = this->frames->get(this->frames->indexOf(data));
n.reference = 1;
That should get the code to work. But there is so much indirection in the code that there are likely to be other problems that haven't shown up yet. As @nwp said in a comment, using vector<T> instead of vector<T>* will save you many headaches.
And while I'm giving style advice, get rid of those this->s; they're just noise. And simplify the belt-and-suspenders validity checks: when you loop from 0 to vec.size() you don't need to check that the index is okay when you access the element; change vec.at(i) to vec[i]. And in get, note that vec.at(index) will throw an exception if index is out of bounds, so you can either skip the initial range check or keep the check (after fixing it so that it checks the actual range) and, again, use vec[index] instead of vec.at(index).
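Putting those suggestions together, a trimmed-down wrapper might look like the sketch below. The names Frames and add are made up for illustration, the Node here is a simplified stand-in for the one in the question (no bit-field, no timestamp), and operator== is an assumption, since the original comparison is not shown.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper: holds the vector by value and hands out references.
template <class T>
class Frames {
public:
    void add(const T &value) { vec.push_back(value); }

    // Index of the first element equal to value, or -1 if there is none.
    int indexOf(const T &value) const {
        for (std::size_t i = 0; i < vec.size(); ++i) {
            if (vec[i] == value) return static_cast<int>(i);
        }
        return -1;
    }

    // Returns a reference, so callers modify the stored element, not a copy.
    T &get(int index) {
        if (index < 0 || static_cast<std::size_t>(index) >= vec.size()) {
            throw std::out_of_range("index out of range");
        }
        return vec[index];
    }

private:
    std::vector<T> vec;
};

// Simplified stand-in for the question's Node; equality compares data only.
struct Node {
    int data;
    unsigned reference;
    bool operator==(const Node &other) const { return data == other.data; }
};
```

With this, Node &n = frames.get(frames.indexOf(Node{7, 0})); n.reference = 1; updates the element stored in the vector, whereas Node n = frames.get(...) would still only change a local copy.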

Logic flaw in trie search

I'm currently working on a trie implementation for practice and have run into a mental roadblock.
The issue is with my searching function. I am attempting to have my trie be able to retrieve a list of strings for a supplied prefix after they are loaded into the program's memory.
I also understand I could be using a queue, shouldn't use C functions in C++, etc. This is just a 'rough draft', so to speak.
This is what I have so far:
bool SearchForStrings(vector<string> &output, string data)
{
Node *iter = GetLastNode("an");
Node *hold = iter;
stack<char> str;
while (hold->visited == false)
{
int index = GetNextChild(iter);
if (index > -1)
{
str.push(char('a' + index));
//current.push(iter);
iter = iter->next[index];
}
//We've hit a leaf so we want to unwind the stack and print the string
else if (index < 0 && IsLeaf(iter))
{
iter->visited = true;
string temp("");
stringstream ss;
while (str.size() > 0)
{
temp += str.top();
str.pop();
}
int i = 0;
for (std::string::reverse_iterator it = temp.rbegin(); it != temp.rend(); it++)
ss << *it;
//Store the string we have
output.push_back(data + ss.str());
//Move our iterator back to the root node
iter = hold;
}
//We know this isnt a leaf so we dont want to print out the stack
else
{
iter->visited = true;
iter = hold;
}
}
return (output.size() > 0);
}
int GetNextChild(Node *s)
{
for (int i = 0; i < 26; i++)
{
if (s->next[i] != nullptr && s->next[i]->visited == false)
return i;
}
return -1;
}
bool IsLeaf(Node *s)
{
for (int i = 0; i < 26; i++)
{
if (s->next[i] != nullptr)
return false;
}
return true;
}
struct Node{
int value;
Node *next[26];
bool visited;
};
The code is too long or I'd post it all. GetLastNode() retrieves the node at the end of the data passed in, so if the prefix was 'su' and the string was 'substring', the node would be pointing to the 'u' to use as an artificial root node.
(might be completely wrong... just typed it here, no testing)
something like:
First of all, we need a way of indicating that a node represents an entry.
So let's have:
struct Node{
int value;
Node *next[26];
bool entry;
};
I've removed your visited flag because I don't have a use for it.
You should modify your insert/update/delete functions to support this flag. If the flag is true it means there's an actual entry up to that node.
Now we can modify the isLeaf function:
bool isLeaf(Node *s) {
return s->entry;
}
Meaning that we consider a node a leaf when there's an entry. Perhaps the name is wrong now, as such a "leaf" might have children (with "any" and "anywhere" stored, the "y" node is a leaf, but it has children).
Now for the search:
First a public function that can be called.
bool searchForStrings(std::vector<std::string> &output, const std::string &key) {
// start the recursion
// theTrieRoot is the root node for the whole structure
return searchForStrings(theTrieRoot,output,key);
}
Then the internal function that will use for recursion.
bool searchForStrings(Node *node, std::vector<std::string> &output, const std::string &key) {
if(key.empty()) {
// Key is empty, so this node is at or below the prefix.
// If it is an entry, add an empty string; the callers
// prepend the characters on the way back up.
if(isLeaf(node)) {
output.push_back(std::string());
}
// Then collect all child nodes.
for (int i = 0; i < 26; i++)
{
if (node->next[i] != nullptr) {
std::vector<std::string> partial;
searchForStrings(node->next[i],partial,key);
// so we got a list of the children,
// add the key of this node to them.
for(auto s:partial) {
output.push_back(std::string(1, char('a'+i))+s);
}
}
} // end for
} // end if key.empty
else {
// key is not empty, try to get the node for the
// first character of the key.
int c=key[0]-'a';
if(c<0 || c>=26) {
// first character was not a letter.
return false;
}
if(node->next[c]==nullptr) {
// no match (no node where we expect it)
return false;
}
// recurse into the node matching the key
std::vector<std::string> partial;
searchForStrings(node->next[c],partial,key.substr(1));
// add the key of this node to the result
for(auto s:partial) {
output.push_back(std::string(1, key[0])+s);
}
}
// provide a meaningful return value
if(output.empty()) {
return false;
} else {
return true;
}
}
And the execution for "an" search is.
Call searchForStrings(root,[],"an")
root is not leaf, key is not empty. Matched next node keyed by "a"
Call searchForStrings(node(a),[],"n")
node(a) is not leaf, key is not empty. Matched next node keyed by "n"
Call searchForStrings(node(n),[],"")
node(n) is not leaf, key is empty. Need to recurse on all non-null children:
Call searchForStrings(node(s),[],"")
node(s) is not leaf, key is empty. Need to recurse on all non-null children:
... eventually we will reach node(r), which is a leaf node, so it will return [""]; going back up, it becomes ["r"] -> ["er"] -> ["wer"] -> ["swer"]
Call searchForStrings(node(y),[],"")
node(y) is leaf (add "" to the output), key is empty,
recurse, we will get ["time"]
we will return ["","time"]
At this point we will add the "y" to get ["y","ytime"]
And here we will add the "n" to get ["nswer","ny","nytime"]
Adding the "a" to get ["answer","any","anytime"]
we're done
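For comparison, here is a compact sketch of the same prefix search that can be compiled as is. It is not the code above: the names TrieNode, insert, collect and withPrefix are made up for illustration, words are assumed to contain only lowercase 'a'..'z', and the current word is carried down the recursion instead of being prepended on the way back up.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal trie over 'a'..'z'; entry marks the end of a stored word.
struct TrieNode {
    std::unique_ptr<TrieNode> next[26];
    bool entry = false;
};

// Assumes word contains only lowercase letters.
void insert(TrieNode &root, const std::string &word) {
    TrieNode *node = &root;
    for (char ch : word) {
        std::unique_ptr<TrieNode> &slot = node->next[ch - 'a'];
        if (!slot) slot.reset(new TrieNode());
        node = slot.get();
    }
    node->entry = true;
}

// Depth-first walk below node; word holds the characters on the current path.
void collect(const TrieNode *node, std::string &word, std::vector<std::string> &out) {
    if (node->entry) out.push_back(word);
    for (int i = 0; i < 26; ++i) {
        if (node->next[i]) {
            word.push_back(static_cast<char>('a' + i));
            collect(node->next[i].get(), word, out);
            word.pop_back();  // backtrack before trying the next child
        }
    }
}

// Returns every stored word that starts with prefix, in alphabetical order.
std::vector<std::string> withPrefix(const TrieNode &root, const std::string &prefix) {
    std::vector<std::string> out;
    const TrieNode *node = &root;
    for (char ch : prefix) {
        if (ch < 'a' || ch > 'z' || !node->next[ch - 'a']) return out;
        node = node->next[ch - 'a'].get();
    }
    std::string word = prefix;
    collect(node, word, out);
    return out;
}
```

After inserting "answer", "any" and "anytime", withPrefix(root, "an") returns those three words in that order.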

Palindrome Partitioning (how to figure out how to use DFS)

My general question is how to figure out how to use DFS. It seems to be a weak part of my knowledge. I have vague idea but often get stuck when the problem changes. It caused a lot of confusion for me.
For this question, I got stuck with how to write DFS with recursion.
Given a string s, partition s such that every substring of the partition is a palindrome.
Return all possible palindrome partitioning of s.
For example, given s = "aab",
Return
[
["aa","b"],
["a","a","b"]
]
My first attempt was stuck in the loop of the helper function. Then from searching on internet, I found that bool palindrome(string s) can be written as a different signature.
bool palindrome(string &s, int start, int end)
This leads to the correct solution.
Here's the code of my initial attempt:
class Solution {
public:
bool palindrome(string s)
{
int len = s.size();
for (int i=0;i<len/2; i++)
{
if (s[i]!=s[len-i])
return false;
}
return true;
}
void helper( int i, string s, vector<string> &p, vector<vector<string>> &ret)
{
int slen = s.size();
if (i==slen-1&&flag)
{
ret.push_back(p);
}
for (int k=i; k<slen; k++)
{
if (palindrome(s.substr(0,k)))
{
p.push_back(s.substr(0,k)); //Got stuck
}
}
i++;
}
vector<vector<string>> partition(string s) {
vector<vector<string>> ret;
int len=s.size();
if (len==0) return ret;
vector<string> p;
helper(0,s,p,ret);
return ret;
}
};
Correct one:
class Solution {
public:
bool palindrome(string &s, int start, int end)
{
while(start<end)
{
if (s[start]!=s[end])
return false;
start++;
end--;
}
return true;
}
void helper( int start, string &s, vector<string> &p, vector<vector<string>> &ret)
{
int slen = s.size();
if (start==slen)
{
ret.push_back(p);
return;
}
for (int i=start; i<s.size(); i++)
{
if (palindrome(s, start, i))
{
p.push_back(s.substr(start,i-start+1));
helper(i+1,s,p,ret);
p.pop_back();
}
}
}
vector<vector<string>> partition(string s) {
vector<vector<string>> ret;
int len=s.size();
if (len==0) return ret;
vector<string> p;
helper(0,s,p,ret);
return ret;
}
};
Edit Dec. 4, 2014: I saw an approach using dynamic programming but can't understand the code completely,
especially isPalin[i][j] = (s[i] == s[j]) && ((j - i < 2) || isPalin[i+1][j-1]);
Why j-i<2 instead of j-i<1?
class Solution {
public:
vector<vector<string>> partition(string s) {
int len = s.size();
vector<vector<string>> subPalins[len+1];
subPalins[0] = vector<vector<string>>();
subPalins[0].push_back(vector<string>());
bool isPalin[len][len];
for (int i=len-1; i>=0; i--)
{
for (int j=i; j<len; j++)
{
isPalin[i][j] = (s[i]==s[j])&&((j-i<2)||isPalin[i+1][j-1]);
}
}
for (int i=1; i<=len;i++)
{
subPalins[i]=vector<vector<string>>();
for (int j=0; j<i; j++)
{
string rightStr=s.substr(j,i-j);
if (isPalin[j][i-1])
{
vector<vector<string>> prepar=subPalins[j];
for (int t=0; t<prepar.size(); t++)
{
prepar[t].push_back(rightStr);
subPalins[i].push_back(prepar[t]);
}
}
}
}
return subPalins[len];
}
};
What exactly are you asking? You have correct working code and your non-working code which is not that different.
I guess I can point out several issues with your code; maybe it will be helpful to you:
In the palindrome() function you should compare s[i] to s[len-1-i] rather than to s[len-i]: with s[len-i], the first element (index 0) is compared to the non-existent element at index len. That might be the reason helper() got stuck.
In the helper() function flag is never declared or initialized. In the for loop, s.substr(0,k) takes the first k characters, so with the end condition k<slen you never check the substring that includes the last character of the string; the condition should be k<=slen. Also, incrementing i at the end of helper() is pointless. Finally, the indentation in helper() is messy.
Not sure why you use DFS - what is the meaning of your graph, what are the vertices and edges here? As to how the recursion works here: in the helper() function you start checking substrings of increased length for being palindrome. If the palindrome is found, you place it into p vector (which represent your current partitioning) and try to break the remainder of the string into palindromes by calling helper() recursively. If you succeed in that (i.e. if the whole string is completely partitioned into palindromes) you place the contents of p vector (current partitioning) into ret (set of all found partitionings), and then clear p to prepare it for the analysis of the next partition (purge of p is achieved by pop_back() call that follows recursive call of helper()). If, on the other hand, you fail to completely break string into palindromes, the p is purged as well, but without transferring its content into ret (this is due to the fact that recursive call for the last piece of string - which is not a palindrome - returns without calling helper() for the final symbol and thus pushing p into ret does not occur). Therefore you end up having all possible palindrome partitionings in the ret.
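On the j-i<2 question from the edit: when j-i<2 the substring s[i..j] has length 1 or 2, so nothing lies between its two ends and there is no inner substring to look up. For those cases isPalin[i+1][j-1] would refer to a cell with i+1 > j-1, which the loops never fill; with j-i<1 the length-2 case would read that cell. A minimal sketch of just the table, using std::vector instead of the variable-length arrays in the question (the function name palindromeTable is made up):

```cpp
#include <string>
#include <vector>

// isPalin[i][j] is true when s[i..j] (inclusive) is a palindrome.
std::vector<std::vector<bool>> palindromeTable(const std::string &s) {
    const int len = static_cast<int>(s.size());
    std::vector<std::vector<bool>> isPalin(len, std::vector<bool>(len, false));
    // i runs downward so that row i+1 is complete before row i needs it.
    for (int i = len - 1; i >= 0; --i) {
        for (int j = i; j < len; ++j) {
            // j - i < 2: length 1 or 2, so only the two ends need to match.
            isPalin[i][j] = (s[i] == s[j]) && (j - i < 2 || isPalin[i + 1][j - 1]);
        }
    }
    return isPalin;
}
```

For "aab" this marks "a", "a", "b" and "aa" as palindromes, and "ab" and "aab" as not.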
Hi~ this is my code using DFS + backtracking.
class Solution
{
public:
bool isPalindrome (string s) {
int i = 0, j = s.length() - 1;
while(i <= j && s[i] == s[j]) {
i++;
j--;
}
return (j < i);
}
void my_partition(string s, vector<vector<string> > &final_result, vector<string> &every_result ) {
if (s.length() ==0)
final_result.push_back(every_result);
for (int i =1; i <= s.length();++i) {
string left = s.substr(0,i);
string right = s.substr(i);
if (isPalindrome(left)) {
every_result.push_back(left);
my_partition(right, final_result, every_result);
every_result.pop_back();
}
}
}
vector<vector<string>> partition(string s) {
vector<vector<string> > final_result;
vector<string> every_result;
my_partition(s, final_result, every_result);
return final_result;
}
};
I have done Palindrome Partitioning using backtracking. Depth-first search is used here: the idea is to split the given string so that the prefix is a palindrome, push that prefix onto a vector, explore the rest of the string after the prefix, and finally pop the last inserted element.
Backtracking generally takes this form: choose an element, explore the remainder, then unchoose it.
#include<iostream>
#include<vector>
#include<string>
using namespace std;
bool ispalidrome(string x ,int start ,int end){
while(end>=start){
if(x[end]!=x[start]){
return false;
}
start++;
end--;
}
return true;
}
void sub_palidrome(string A,int size,int start,vector<string>&small, vector < vector < string > >&big ){
if(start==size){
big.push_back(small);
return;
}
for(int i=start;i<size;i++){
if( ispalidrome(A,start,i) ){
small.push_back(A.substr(start,i-start+1));
sub_palidrome(A,size,i+1,small,big);
small.pop_back();
}
}
}
vector<vector<string> > partition(string A) {
int size=A.length();
int start=0;
vector <string>small;
vector < vector < string > >big;
sub_palidrome(A,size,start,small,big);
return big;
}
int main(){
vector<vector<string> > sol= partition("aab");
for(int i=0;i<sol.size();i++){
for(int j=0;j<sol[i].size();j++){
cout<<sol[i][j]<<" ";
}
cout<<endl;
}
}

Finding cycle in Aho-Corasick automaton

I'm facing a problem which should be solved using an Aho-Corasick automaton. I'm given a set of words (composed of '0' and '1'), the patterns, and I must decide whether it is possible to create an infinite text which doesn't contain any of the given patterns. I think the solution is to build the Aho-Corasick automaton and search for a cycle without matching states, but I can't come up with a good way to do that. I thought of searching the state graph using DFS, but I'm not sure it will work, and I have an implementation problem. Let's assume that we are in a state which has a '1' edge, but the state pointed to by that edge is marked as matching, so we cannot use that edge. We can try the fail link (the current state doesn't have a '0' edge), but we must also remember that we cannot take the '1' edge from the state pointed to by the fail link of the current one.
Could anyone correct me and show me how to do that? I've written Aho-Corasick in C++ and I'm sure it works; I also understand the entire algorithm.
Here is the base code:
class AhoCorasick
{
static const int ALPHABET_SIZE = 2;
struct State
{
State* edge[ALPHABET_SIZE];
State* fail;
State* longestMatchingSuffix;
//Vector used to remember which pattern matches in this state.
vector< int > matching;
short color;
State()
{
for(int i = 0; i < ALPHABET_SIZE; ++i)
edge[i] = 0;
color = 0;
}
~State()
{
for(int i = 0; i < ALPHABET_SIZE; ++i)
{
delete edge[i];
}
}
};
private:
State root;
vector< int > lenOfPattern;
bool isFailComputed;
//Helper function used to traverse state graph.
State* move(State* curr, char letter)
{
while(curr != &root && curr->edge[letter] == 0)
{
curr = curr->fail;
}
if(curr->edge[letter] != 0)
curr = curr->edge[letter];
return curr;
}
//Function which computes fail links and longestMatchingSuffix.
void computeFailLink()
{
queue< State* > Q;
root.fail = root.longestMatchingSuffix = 0;
for(int i = 0; i < ALPHABET_SIZE; ++i)
{
if(root.edge[i] != 0)
{
Q.push(root.edge[i]);
root.edge[i]->fail = &root;
}
}
while(!Q.empty())
{
State* curr = Q.front();
Q.pop();
if(!curr->fail->matching.empty())
{
curr->longestMatchingSuffix = curr->fail;
}
else
{
curr->longestMatchingSuffix = curr->fail->longestMatchingSuffix;
}
for(int i = 0; i < ALPHABET_SIZE; ++i)
{
if(curr->edge[i] != 0)
{
Q.push(curr->edge[i]);
State* state = curr->fail;
state = move(state, i);
curr->edge[i]->fail = state;
}
}
}
isFailComputed = true;
}
public:
AhoCorasick()
{
isFailComputed = false;
}
//Add pattern to automaton.
//pattern - pointer to pattern, which will be added
//fun - function which will be used to transform character to 0-based index.
void addPattern(const char* const pattern, int (*fun) (const char *))
{
isFailComputed = false;
int len = strlen(pattern);
State* curr = &root;
const char* pat = pattern;
for(; *pat; ++pat)
{
char tmpPat = fun(pat);
if(curr->edge[tmpPat] == 0)
{
curr = curr->edge[tmpPat] = new State;
}
else
{
curr = curr->edge[tmpPat];
}
}
lenOfPattern.push_back(len);
curr->matching.push_back(lenOfPattern.size() - 1);
}
};
int alphabet01(const char * c)
{
return *c - '0';
}
I didn't look through your code, but I know a very simple and efficient implementation.
First of all, let's add dictionary suffix links to the tree (you can find their description on Wikipedia). Then go through the whole tree and mark as bad every matching node and every node that has a dictionary suffix link. The reason is simple: a text that must avoid all patterns can never enter a matching node, or a node that has a matching suffix.
Now you have an Aho-Corasick automaton without any matching nodes. If you run a DFS from the root over the remaining nodes and find a cycle, an infinite text exists; if there is no cycle, it doesn't.
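A rough sketch of that idea follows. It does not use the class from the question: it is a separate array-based automaton over '0' and '1', and the function name infiniteTextExists is made up. Instead of following fail links during the search, it fills in the missing transitions up front, so every state has both a '0' and a '1' edge and the DFS only has to skip bad states.

```cpp
#include <queue>
#include <string>
#include <vector>

// Returns true if an infinite binary text that avoids all patterns exists.
bool infiniteTextExists(const std::vector<std::string> &patterns) {
    std::vector<std::vector<int>> go(1, std::vector<int>(2, -1));
    std::vector<bool> bad(1, false);

    // 1. Build the trie; the last state of each pattern is bad.
    for (const std::string &p : patterns) {
        int cur = 0;
        for (char ch : p) {
            int c = ch - '0';
            if (go[cur][c] == -1) {
                go[cur][c] = static_cast<int>(go.size());
                go.push_back(std::vector<int>(2, -1));
                bad.push_back(false);
            }
            cur = go[cur][c];
        }
        bad[cur] = true;
    }

    // 2. BFS: compute fail links and fill in missing transitions. A state is
    //    also bad if its fail state is bad (some pattern is a suffix of it).
    std::vector<int> fail(go.size(), 0);
    std::queue<int> q;
    for (int c = 0; c < 2; ++c) {
        if (go[0][c] == -1) go[0][c] = 0;
        else q.push(go[0][c]);
    }
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        bad[u] = bad[u] || bad[fail[u]];
        for (int c = 0; c < 2; ++c) {
            int v = go[u][c];
            if (v == -1) {
                go[u][c] = go[fail[u]][c];
            } else {
                fail[v] = go[fail[u]][c];
                q.push(v);
            }
        }
    }

    // 3. Iterative DFS over good states only. Reaching a state that is still
    //    on the current DFS path means a cycle, i.e. an infinite safe text.
    if (bad[0]) return false;
    std::vector<int> color(go.size(), 0);  // 0 = unseen, 1 = on path, 2 = done
    std::vector<int> nextEdge(go.size(), 0);
    std::vector<int> path(1, 0);
    color[0] = 1;
    while (!path.empty()) {
        int u = path.back();
        if (nextEdge[u] == 2) {  // both edges tried
            color[u] = 2;
            path.pop_back();
            continue;
        }
        int v = go[u][nextEdge[u]++];
        if (bad[v]) continue;
        if (color[v] == 1) return true;
        if (color[v] == 0) {
            color[v] = 1;
            path.push_back(v);
        }
    }
    return false;
}
```

For example, with the single pattern "0" the text "1111..." is safe, so the function returns true; with the patterns "0" and "1" every text is rejected and it returns false.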