Why is this C++ Trie implementation showing odd behaviour? - c++

I implemented this class to create a trie data structure. It has two main functions:
unsigned long Insert(string) //inserts the string in trie & return no of words in trie
void PrintAllWords(); // prints all words in trie separated by space in dictionary order
The implementation works correctly and prints all the words inserted from a text file of English dictionary words when the number of words is not very large, but when supplied with a file of some 350k words it only prints a b c d up to z.
private variables
struct TrieTree
{
std::map<char,struct TrieTree*> map_child;
std::map<char,unsigned long> map_count; //keeps incrementing count of char in map during insertion.
bool _isLeaf=false; // this flag is set true at node where word ends
};
struct TrieTree* _root=NULL;
unsigned long _wordCount=0;
unsigned long _INITIALIZE=1;
Below is complete implementation with driver program. The program is executable.
#include<iostream>
#include<map>
#include<fstream>
class Trie
{
private:
struct TrieTree
{
std::map<char,struct TrieTree*> map_child;
std::map<char,unsigned long> map_count;
bool _isLeaf=false;
};
struct TrieTree* _root=NULL;
unsigned long _wordCount=0;
unsigned long _INITIALIZE=1;
struct TrieTree* getNode()
{
return new TrieTree;
};
void printWords(struct TrieTree* Tptr,std::string pre)
{
if(Tptr->_isLeaf==true)
{
std::cout<<pre<<" ";
return;
}
std::map<char,struct TrieTree*>::iterator it;
it=Tptr->map_child.begin();
while(it!=Tptr->map_child.end())
{
pre.push_back(it->first);
printWords(it->second,pre);
pre.erase(pre.length()-1); //erase last prefix character
it++;
}
}
public:
Trie()
{
_root=getNode();
}
unsigned long WordCount()
{
return _wordCount;
}
unsigned long WordCount(std::string pre) //count words with prefix pre
{
if(WordCount()!=0)
{
struct TrieTree *Tptr=_root;
std::map<char,unsigned long>::iterator it;
char lastChar;
for(int i=0;i<pre.length()-1;i++)
{
Tptr=Tptr->map_child[pre[i]];
}
lastChar=pre[pre.length()-1];
it=Tptr->map_count.find(lastChar);
if(it!=Tptr->map_count.end())
{
return Tptr->map_count[lastChar];
}
else
{
return 0;
}
}
return 0;
}
unsigned long Insert(std::string key) //return word count after insertion
{
struct TrieTree *Tptr =_root;
std::map<char,struct TrieTree*>::iterator it;
if(!SearchWord(key))
{
for(int level=0;level<key.length();level++)
{
it=Tptr->map_child.find(key[level]);
if(it==Tptr->map_child.end())
{
//alphabet does not exist in map
Tptr->map_child[key[level]]=getNode(); // new node with value pointing to it
Tptr->map_count[key[level]] = _INITIALIZE;
Tptr=Tptr->map_child[key[level]]; //assign pointer to newly obtained node
if(level==key.length()-1)
Tptr->_isLeaf=true;
}
else
{ //alphabet exists at this level
Tptr->map_count[key[level]]++;
Tptr=Tptr->map_child[key[level]];
}
}
_wordCount++;
}
return _wordCount;
}
bool SearchWord(std::string key)
{
struct TrieTree *Tptr =_root;
std::map<char,struct TrieTree*>::iterator it;
for(int level=0;level<key.length();level++)
{
it=Tptr->map_child.find(key[level]);
// cout<<" "<<Tptr->map_child.size()<<endl; //test to count entries at each map level
if(it!=Tptr->map_child.end())
{
Tptr=Tptr->map_child[key[level]];
}
else
{
return false;
}
}
if(Tptr->_isLeaf==true)
return true;
return false;
}
void PrintAllWords()
{ //print all words in trie in dictionary order
struct TrieTree *Tptr =_root;
if(Tptr->map_child.empty())
{
std::cout<<"Trie is Empty"<<std::endl;
return;
}
printWords(Tptr,"");
}
void PrintAllWords(std::string pre)
{ //print all words in trie with prefix pre in Dictionary order
struct TrieTree *Tptr =_root;
if(Tptr->map_child.empty())
{
std::cout<<"Trie is Empty"<<std::endl;
return;
}
for(int i=0;i<pre.length();i++)
{
Tptr=Tptr->map_child[pre[i]];
}
printWords(Tptr,pre);
}
};
int main(){
Trie t;
std::string str;
std::fstream fs;
fs.open("words.txt",std::ios::in);
while(fs>>str){
t.Insert(str);
}
t.PrintAllWords();
return 0;
}
I don't understand the output. Please take a look at the code and suggest a fix. Thanks.

When you add the word "a", if there is no word starting with 'a' in the tree, you will add a "leaf" node with 'a' as the value. If you then add a word starting with 'a', such as "an", you will add the 'n' node as a child of the 'a' node. However, when you print all the words, you stop recursing when you hit a leaf node, meaning you ignore all the other words starting with that word.
Simple solution: remove the return from printWords.
Similarly, if you already have "an" in the tree when you add "a", you don't mark the 'a' node as a leaf, so "a" will never be output.
Simple solution: set _isLeaf when adding a word, even if the node already exists (i.e. add Tptr->_isLeaf=true; to the else clause in Insert when the current character is the last one in the word).
I think you would be better off changing _isLeaf to something like _isWord as it seems odd to have leaf nodes with child items.
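A minimal sketch of both fixes together, assuming the flag is renamed to isWord as suggested above. The WordTrie class and its insert and words members are hypothetical names used for illustration, not the class from the question, and the words are returned in a vector rather than printed so the result is easy to check.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Illustrative trie (not the original class): a node can end a word
// and still have children, so "a" and "an" can coexist.
class WordTrie {
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        bool isWord = false;  // true if a word ends at this node
    };
    Node root_;

    static void collect(const Node& node, std::string& prefix,
                        std::vector<std::string>& out) {
        if (node.isWord) out.push_back(prefix);  // record, but keep descending
        for (const auto& entry : node.children) {
            prefix.push_back(entry.first);
            collect(*entry.second, prefix, out);
            prefix.pop_back();
        }
    }

public:
    void insert(const std::string& key) {
        Node* cur = &root_;
        for (char ch : key) {
            std::unique_ptr<Node>& child = cur->children[ch];
            if (!child) child.reset(new Node());
            cur = child.get();
        }
        cur->isWord = true;  // set even when the node already existed
    }

    // All stored words in dictionary order (std::map iterates keys in order).
    std::vector<std::string> words() const {
        std::vector<std::string> out;
        std::string prefix;
        collect(root_, prefix, out);
        return out;
    }
};
```

Inserting "an", then "a", then "b" and calling words() should give a, an, b, regardless of insertion order.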

Related

Printing all the words from a prefixtree in order

I've set up a program that can take in user input to create a prefix tree. Each character is a node, and the nodes are linked together. I have a "print" command that will print the words out as follows if the user gave this input: cat, car, sat, saw:
ca(R,T),sa(T,W).
I'm trying to create two functions that will instead print out the words given by the user in alphabetical order. One function, printAllWords(), will do most of the work. I'm thinking of making it a recursive function that builds each word in a global string of some sort through push_back(), then removes that word's characters with pop_back() and moves on to the next. The second function, printWordList(), would call printAllWords() and just print out the list of words created.
I've started with some code, trying to slowly get to where I want, but at the moment when I use the command "list" (the command for the new functions) my code only gives me the parent nodes C and S, as follows: cs.
How can I get more than just the first node of each word? As a first step, I'd like to get the first word in the prefix tree, "cat".
My Header File:
#ifndef __PREFIX_TREE_H
#define __PREFIX_TREE_H
#include <iostream>
using namespace std;
const int ALPHABET_SIZE = 26;
class PrefixTreeNode;
/*
Prefix tree
Stores a collection of strings as a tree
*/
class PrefixTree
{
private:
PrefixTreeNode* root;
public:
//Constructs an empty prefix tree
PrefixTree();
//Copy constructor
PrefixTree(const PrefixTree&);
//Copy assignment
const PrefixTree& operator=(const PrefixTree&);
//Utility func: checks whether all characters in str are letters
bool isAllLetters(const string&) const;
//Returns the root of the prefix tree
PrefixTreeNode* getRoot() { return root; };
//Returns the root of the prefix tree
const PrefixTreeNode* getRoot() const { return root; };
//Returns whether or not the given word belongs to the prefixtree
bool contains(const string&) const;
//Adds the given word to the prefix tree
void addWord(const string&);
//Prints all of the words in the prefix tree
void printWordList() const;
//Destructor
~PrefixTree();
};
/*
Node of a prefix tree
*/
class PrefixTreeNode
{
friend PrefixTree;
private:
char c;
bool final;
PrefixTreeNode* link[ALPHABET_SIZE];
public:
//Constructs a new node
PrefixTreeNode();
//Copy constructor
PrefixTreeNode(const PrefixTreeNode&);
//Copy assignment
const PrefixTreeNode& operator=(const PrefixTreeNode&);
//Returns the character this node contains
char getChar() const { return c; }
//Returns whether this node is the end of a word
bool isFinal() const { return final; }
//Changes whether this node is the end of a word
void setFinal(bool b) { final = b; }
//Returns the node corresponding to the given character
PrefixTreeNode* getChild(char);
//Returns the node corresponding to the given character
const PrefixTreeNode* getChild(char) const;
//Adds a child corresponding to the given character
void addChild(char);
//Removes the child corresponding to the given character
void deleteChild(char);
//print all words that end at or below this PrefixTreeNode
void printAllWords() const;
//Destructor
~PrefixTreeNode();
};
ostream& operator<<(ostream&, const PrefixTree&);
ostream& operator<<(ostream&, const PrefixTreeNode&);
#endif
My Source File functions:
void PrefixTreeNode::printAllWords() const{
for (char c = 'a'; c < 'z' + 1; c++)
{
if (this->getChild(c) == nullptr)
continue;
this->getChild(c);
cout << c;
}
}
//Calls all words
void PrefixTree::printWordList() const{
PrefixTreeNode* node = root;
node->printAllWords();
}
PrefixTreeNode* PrefixTreeNode::getChild(char c)
{
if (isalpha(c))
return link[tolower(c)-'a'];
else
return nullptr;
}
void PrefixTree::addWord(const string& str)
{
PrefixTreeNode* node = root;
for (int i = 0; i < str.size(); i++)
{
if (node->getChild(str[i]) == nullptr)
node->addChild(str[i]);
node = node->getChild(str[i]);
}
node->setFinal(true);
}
We use recursion to print all the stored strings in the tree in order. Call the function from main using printAllWords(root, ""). If the current node is nullptr, we return. If its final flag is true, we print the word. Then we append the current character to word, loop through all its children, and call printAllWords for each of them.
The same will happen for every node.
void printAllWords(Node* current, std::string word)
{
if (current == nullptr)
return;
if (current->final)
std::cout << (word+current->c) << std::endl;
for (int i = 0; i < ALPHABET_SIZE; ++i)
printAllWords(current->link[i], word + current->c);
}
Edit: Although I must confess I'm not sure what the use of c in the tree node is. If you construct the trie such that, say, the 2nd child (b) of the current node is not null, then that already means b is part of the path of one or more words passing through it, so the character can be derived from the child's index. The following code should make it clear:
// Two-argument overload first, so the one-argument version below can call it.
void printAllWords(Node* current, std::string word)
{
if (current == nullptr)
return;
if (current->final) // was a bare `final`, which does not name the node's flag
std::cout << word << std::endl;
for (int i = 0; i < ALPHABET_SIZE; ++i)
printAllWords(current->link[i], word + (char)(i + 'a'));
}
void printAllWords(Node* root)
{
std::string word = "";
for (int i = 0; i < ALPHABET_SIZE; ++i)
printAllWords(root->link[i], word + (char)(i + 'a'));
}
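A self-contained sketch of the same index-based traversal, assuming a stripped-down Node that keeps only the final flag and the link array from the header. The addWord and collectWords helpers are hypothetical names for illustration, and the words are collected into a vector instead of being printed.

```cpp
#include <string>
#include <vector>

const int ALPHABET_SIZE = 26;

// Simplified stand-in for PrefixTreeNode: no stored character, because the
// child's index in `link` already identifies its letter.
struct Node {
    bool final = false;
    Node* link[ALPHABET_SIZE] = {};  // all children start as nullptr
};

// Hypothetical helper: inserts a lowercase a-z word. Nodes are never freed,
// which is acceptable only in a short demo.
void addWord(Node* root, const std::string& word) {
    Node* node = root;
    for (char ch : word) {
        int i = ch - 'a';
        if (node->link[i] == nullptr) node->link[i] = new Node();
        node = node->link[i];
    }
    node->final = true;
}

// Depth-first walk; visiting link[0..25] in order yields alphabetical output.
void collectWords(const Node* current, const std::string& word,
                  std::vector<std::string>& out) {
    if (current == nullptr) return;
    if (current->final) out.push_back(word);
    for (int i = 0; i < ALPHABET_SIZE; ++i)
        collectWords(current->link[i], word + static_cast<char>('a' + i), out);
}
```

With the words sat, cat, saw and car added in any order, collectWords(&root, "", out) should fill out with car, cat, sat, saw.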

Properly exiting out of recursions?

TrieNode and Trie Object:
struct TrieNode {
char nodeChar = NULL;
map<char, TrieNode> children;
TrieNode() {}
TrieNode(char c) { nodeChar = c; }
};
struct Trie {
TrieNode *root = new TrieNode();
typedef pair<char, TrieNode> letter;
typedef map<char, TrieNode>::iterator it;
Trie(vector<string> dictionary) {
for (int i = 0; i < dictionary.size(); i++) {
insert(dictionary[i]);
}
}
void insert(string toInsert) {
TrieNode * curr = root;
int increment = 0;
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end()) { //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
//when it doesn't exist we know that this will be a new branch
for (int i = increment; i < toInsert.length(); i++) {
TrieNode temp(toInsert[i]);
curr->children.insert(letter(toInsert[i], temp));
curr = &(curr->children.find(toInsert[i])->second);
if (i == toInsert.length() - 1) {
temp.nodeChar = NULL;
curr->children.insert(letter(NULL, temp));
}
}
}
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
}
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
if (curr->nodeChar == NULL) {
list.push_back(prefix);
return;
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
};
The problem is this function:
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
/*if children of TrieNode contains NULL char, it means this branch up to this point is a complete word*/
if (curr->nodeChar == NULL) {
list.push_back(prefix);
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
The purpose is to return all words with the same prefix from a trie using DFS. I manage to retrieve all the necessary strings but I can't exit out of the recursion.
The code completes the last iteration of the if statement and breaks. Visual Studio doesn't return any error code.
The typical end to a recursion is just as you said: return all words. A standard recursion looks something like this:
returnType function(params...){
//Do stuff
if(need to recurse){
return function(next params...);
}else{ //This should be your defined base-case
return base-case;
}
}
The issue is that your recursive function never returns a value: it either executes the push_back or calls itself again. Neither path hands a result back to the caller, so it will either end quietly (returning nothing) or keep recursing.
In your situation, you likely need to store the results of the recursion in an intermediate structure such as a list, and then return that list after the iteration (since it's a tree search and ought to check all the children, not return only the first one).
On that note, you seem to be missing part of the point of recursion. It exists to break a problem down into pieces until those pieces are trivial to solve, then return each solved case and build back up to a full solution. Any tree search must follow this base structure, or you may miss something, like forgetting to return your results.
Check the integrity of your Trie structure. The function appears to be correct. The reason why it wouldn't terminate is if one or more of your leaf nodes doesn't have curr->nodeChar == NULL.
Another possibility is that some node (leaf or non-leaf) has a garbage child node. The recursion would then wander into garbage values with no reason to stop. Running in debug mode should halt execution with a segmentation fault.
Write another function to test if all leaf-nodes have NULL termination.
EDIT:
After posting the code, the original poster has already pointed out that the problem was that he/she was not returning the list of strings.
Apart from that, there are a few more suggestions I would like to provide based on the code:
How does this while loop terminate if the toInsert string is already in the Trie?
You will run past the end of the toInsert string and read characters that are not part of it.
The loop will exit after that, but reading beyond the end of your string is bad practice.
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end())
{ //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
This can be written as follows:
while (increment < toInsert.length() &&
curr->children.find(toInsert[increment]) != curr->children.end())
Also,
Trie( vector<string> dictionary)
should be
Trie( const vector<string>& dictionary )
because dictionary can be a large object. If you don't pass by reference, it will create a second copy. This is not efficient.
I am an idiot. I forgot to return list in the first findPre() function.
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
return list; //<----- this thing
}
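A compact sketch that pulls the suggestions from this thread together: a bounded insert loop, the dictionary passed by const reference, and a findPre that returns the vector on every path. It is an illustration under assumptions, not the poster's code: PrefixTrie, collect and the isWord flag are hypothetical names, and an explicit end-of-word flag replaces the NULL-character sentinel child.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Variant of the question's node: an explicit end-of-word flag replaces the
// NULL-character sentinel child (a deliberate simplification).
struct TrieNode {
    bool isWord = false;
    std::map<char, std::unique_ptr<TrieNode>> children;
};

struct PrefixTrie {
    TrieNode root;

    // Taking the dictionary by const reference avoids copying it.
    explicit PrefixTrie(const std::vector<std::string>& dictionary) {
        for (const std::string& word : dictionary) insert(word);
    }

    void insert(const std::string& word) {
        TrieNode* curr = &root;
        for (char ch : word) {  // bounded by the word, so no overrun
            std::unique_ptr<TrieNode>& child = curr->children[ch];
            if (!child) child.reset(new TrieNode());
            curr = child.get();
        }
        curr->isWord = true;
    }

    std::vector<std::string> findPre(const std::string& pre) const {
        std::vector<std::string> list;
        const TrieNode* curr = &root;
        for (char ch : pre) {
            auto it = curr->children.find(ch);
            if (it == curr->children.end()) return list;  // prefix absent
            curr = it->second.get();
        }
        std::string word = pre;
        collect(*curr, word, list);
        return list;  // the DFS results are handed back to the caller
    }

private:
    // Depth-first search below `node`; `word` is the string spelled so far.
    static void collect(const TrieNode& node, std::string& word,
                        std::vector<std::string>& list) {
        if (node.isWord) list.push_back(word);
        for (const auto& entry : node.children) {
            word.push_back(entry.first);
            collect(*entry.second, word, list);
            word.pop_back();
        }
    }
};
```

For a dictionary of car, cat, dog and ca, findPre("ca") should return ca, car, cat, and findPre("x") should return an empty vector.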

Logic flaw in trie search

I'm currently working on a trie implementation for practice and have run into a mental roadblock.
The issue is with my search function. I am attempting to have my trie retrieve a list of strings for a supplied prefix after they are loaded into the program's memory.
I also understand I could be using a queue, shouldn't use C functions in C++, etc. This is just a 'rough draft', so to speak.
This is what I have so far:
bool SearchForStrings(vector<string> &output, string data)
{
Node *iter = GetLastNode("an");
Node *hold = iter;
stack<char> str;
while (hold->visited == false)
{
int index = GetNextChild(iter);
if (index > -1)
{
str.push(char('a' + index));
//current.push(iter);
iter = iter->next[index];
}
//We've hit a leaf so we want to unwind the stack and print the string
else if (index < 0 && IsLeaf(iter))
{
iter->visited = true;
string temp("");
stringstream ss;
while (str.size() > 0)
{
temp += str.top();
str.pop();
}
int i = 0;
for (std::string::reverse_iterator it = temp.rbegin(); it != temp.rend(); it++)
ss << *it;
//Store the string we have
output.push_back(data + ss.str());
//Move our iterator back to the root node
iter = hold;
}
//We know this isnt a leaf so we dont want to print out the stack
else
{
iter->visited = true;
iter = hold;
}
}
return (output.size() > 0);
}
int GetNextChild(Node *s)
{
for (int i = 0; i < 26; i++)
{
if (s->next[i] != nullptr && s->next[i]->visited == false)
return i;
}
return -1;
}
bool IsLeaf(Node *s)
{
for (int i = 0; i < 26; i++)
{
if (s->next[i] != nullptr)
return false;
}
return true;
}
struct Node{
int value;
Node *next[26];
bool visited;
};
The code is too long or I'd post it all. GetLastNode() retrieves the node at the end of the data passed in, so if the prefix was 'su' and the string was 'substring', the node would point to the 'u', to be used as an artificial root node.
(might be completely wrong... just typed it here, no testing)
something like:
First of all, we need a way of indicating that a node represents an entry.
So let's have:
struct Node{
int value;
Node *next[26];
bool entry;
};
I've removed your visited flag because I don't have a use for it.
You should modify your insert/update/delete functions to support this flag. If the flag is true it means there's an actual entry up to that node.
Now we can modify the isLeaf function:
bool isLeaf(Node *s) {
return s->entry;
}
Meaning that we consider a node a leaf when there's an entry... perhaps the name is wrong now, as the leaf might have children (the "y" node with "any" and "anywhere" is a leaf, but it has children).
Now for the search:
First a public function that can be called.
bool searchForStrings(std::vector<string> &output, const std::string &key) {
// start the recursion
// theTrieRoot is the root node for the whole structure
return searchForStrings(theTrieRoot,output,key);
}
Then the internal function that will use for recursion.
bool searchForStrings(Node *node, std::vector<string> &output, const std::string &key) {
if(isLeaf(node)) {
// leaf node - add an empty string.
output.push_back(std::string());
}
if(key.empty()) {
// Key is empty, collect all child nodes.
for (int i = 0; i < 26; i++)
{
if (node->next[i] != nullptr) {
std::vector<std::string> partial;
searchForStrings(node->next[i],partial,key);
// so we got a list of the childs,
// add the key of this node to them.
for(auto s:partial) {
output.push_back(std::string(1, char('a'+i))+s);
}
}
} // end for
} // end if key.empty
else {
// key is not empty, try to get the node for the
// first character of the key.
int c=key[0]-'a';
if(c<0 || c>=26) {
// first character was not a letter.
return false;
}
if(node->next[c]==nullptr) {
// no match (no node where we expect it)
return false;
}
// recurse into the node matching the key
std::vector<std::string> partial;
searchForStrings(node->next[c],partial,key.substr(1));
// add the key of this node to the result
for(auto s:partial) {
output.push_back(std::string(1, key[0])+s);
}
}
// provide a meaningful return value
if(output.empty()) {
return false;
} else {
return true;
}
}
And the execution for "an" search is.
Call searchForStrings(root,[],"an")
root is not leaf, key is not empty. Matched next node keyed by "a"
Call searchForStrings(node(a),[],"n")
node(a) is not leaf, key is not empty. Matched next node keyed by "n"
Call searchForStrings(node(n),[],"")
node(n) is not leaf, key is empty. Need to recurse on all non-null children:
Call searchForStrings(node(s),[],"")
node(s) is not leaf, key is empty. Need to recurse on all non-null children:
... eventually we will reach node(r), which is a leaf node, so it will return [""]; going back up, it becomes ["r"] -> ["er"] -> ["wer"] -> ["swer"]
Call searchForStrings(node(y),[],"")
node(y) is leaf (add "" to the output), key is empty,
recurse, we will get ["time"]
we will return ["","time"]
At this point we will add the "y" to get ["y","ytime"]
And here we will add the "n" to get ["nswer","ny","nytime"]
Adding the "a" to get ["answer","any","anytime"]
we're done
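A runnable variant of the idea, offered as a sketch rather than the exact code above: instead of building partial lists and prepending a character at every level, it first walks down the key and then collects every entry below that node using a shared path string. insertKey and collectEntries are hypothetical helpers, and the Node here keeps only the entry flag and the child array.

```cpp
#include <string>
#include <vector>

struct Node {
    bool entry = false;   // true if a stored string ends at this node
    Node* next[26] = {};  // children for 'a'..'z', nullptr when absent
};

// Hypothetical insert helper for lowercase a-z keys; nodes are leaked on
// purpose to keep the demo short.
void insertKey(Node* root, const std::string& key) {
    Node* node = root;
    for (char ch : key) {
        int c = ch - 'a';
        if (node->next[c] == nullptr) node->next[c] = new Node();
        node = node->next[c];
    }
    node->entry = true;
}

// Appends every entry at or below `node` to `output`, using `path` as the
// string spelled out so far.
void collectEntries(const Node* node, std::string& path,
                    std::vector<std::string>& output) {
    if (node->entry) output.push_back(path);
    for (int i = 0; i < 26; ++i) {
        if (node->next[i] != nullptr) {
            path.push_back(static_cast<char>('a' + i));
            collectEntries(node->next[i], path, output);
            path.pop_back();
        }
    }
}

// Walks down the key, then collects everything below the node it reaches.
bool searchForStrings(const Node* root, std::vector<std::string>& output,
                      const std::string& key) {
    const Node* node = root;
    for (char ch : key) {
        int c = ch - 'a';
        if (c < 0 || c >= 26 || node->next[c] == nullptr) return false;
        node = node->next[c];
    }
    std::string path = key;
    collectEntries(node, path, output);
    return !output.empty();
}
```

With "answer", "any" and "anytime" inserted, searching for "an" should produce the same three strings as the trace above, in alphabetical order.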

Calling new on object with pointer to of same type, seems to allocate memory to pointer

I'm trying to implement a Trie data structure on my own, without looking at other implementations, so simply based on my conceptual knowledge of the structure. I would like to avoid using vectors, simply because they are easy to use... I prefer to use pointers for dynamically allocating memory for arrays when I'm programming as practice.
That said, with the structure that I currently have, I have a Node class that contains a pointer to a Node array, a letter (bool), and a marker (bool). My Trie class has a pointer to the starting Node array. Each node array has 26 elements to refer to each letter of the English alphabet from 'a' to 'z' lowercase (I convert each word inserted to lowercase). When a letter is set to 'true' then its letterArray is allocated new memory. Node has a constructor to set letter and marker to false and letterArray to nullptr.
I can insert the first letter no problem and go to the next letterArray (which is nullptr at this point), after which memory is allocated to the new array. The problem is, the next letterArray of each Node is also allocated memory, but the constructor is not called on them, resulting in their letter and marker containing garbage, and I'm wondering what is the reason the constructor is not called? Hopefully the code will make this more clear:
class Node {
private:
bool letter;
bool marker;
Node* letterArray;
void initNode();
public:
Node();
bool setLetter(bool set);
bool setMarker(bool set);
bool checkLetter();
bool checkMarker();
char getLetter();
Node*& getNextLetterArray();
};
class Trie {
private:
Node* start;
int wordCount;
int letterCount;
const int totalLetters = 26;
void destroyTrie();
bool initBranch(Node*& nextBranch);
void insertCharAndMove(Node*& ptr, int, int, int);
public:
Trie();
Trie(string firstWord);
~Trie();
bool insertWord(string word);
bool deleteWord(string word);
bool getToLetter(char letter);
string getLowerCase(string word);
bool wordExists(string word);
};
insertWord:
bool Trie::insertWord(string word) {
Node* ptr = start;
string wordLower = getLowerCase(word);
int wordLength = word.length();
if (wordLength <= 0) return false;
for (int i = 0; i < wordLength; i++) {
int charIndex = (word[i] - 'a');
insertCharAndMove(ptr, charIndex, wordLength, i);
}
wordCount++;
return true;
}
void Trie::insertCharAndMove(Node*& ptr, int charIndex, int wordLength, int i) {
if (ptr[charIndex].setLetter(true)) letterCount++;
if (i < wordLength) {
ptr = ptr[i].getNextLetterArray();
initBranch(ptr);
}
else ptr[i].setMarker(true);
}
initBranch:
bool Trie::initBranch(Node*& nextBranch) {
if (nextBranch != nullptr) return false;
nextBranch = new Node[letterCount];
return true;
}
Trie Constructor:
Trie::Trie() {
start = new Node[totalLetters];
wordCount = 0;
letterCount = 0;
}
Node Constructor:
Node::Node() {
initNode();
}
void Node::initNode() {
letter = false;
marker = false;
letterArray = nullptr;
}
getNextLetterArray:
Node*& Node::getNextLetterArray() {
return letterArray;
}
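One possible reading of the code above, offered as an assumption rather than a confirmed diagnosis: new Node[n] does run Node::Node() on each of its n elements, but initBranch sizes the array with letterCount (the running number of letters inserted) instead of totalLetters, so the new array can hold far fewer than 26 elements and indexing past its end would read uninitialized memory. insertCharAndMove also indexes with the position i where charIndex seems intended. A minimal sketch under those assumptions, where the free functions and kTotalLetters are hypothetical names:

```cpp
#include <cstddef>
#include <string>

const int kTotalLetters = 26;  // hypothetical constant, mirrors totalLetters

struct Node {
    bool letter = false;
    bool marker = false;
    Node* letterArray = nullptr;  // next level: array of kTotalLetters nodes
};

// Inserts a lowercase a-z word starting from the top-level array `start`.
// Returns false for an empty word. Arrays are never freed in this demo.
bool insertWord(Node* start, const std::string& word) {
    if (word.empty()) return false;
    Node* level = start;
    for (std::size_t i = 0; i < word.size(); ++i) {
        Node& slot = level[word[i] - 'a'];  // index by letter, not by position
        slot.letter = true;
        if (i + 1 == word.size()) {
            slot.marker = true;             // last letter ends the word
        } else {
            if (slot.letterArray == nullptr)
                slot.letterArray = new Node[kTotalLetters];  // full 26 slots
            level = slot.letterArray;
        }
    }
    return true;
}

// True only if `word` was inserted as a complete word.
bool wordExists(const Node* start, const std::string& word) {
    if (word.empty()) return false;
    const Node* level = start;
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (level == nullptr) return false;
        const Node& slot = level[word[i] - 'a'];
        if (!slot.letter) return false;
        if (i + 1 == word.size()) return slot.marker;
        level = slot.letterArray;
    }
    return false;
}
```

Because every branch is allocated with all 26 slots, each slot is default-initialized before it is ever indexed, so untouched slots report false for both letter and marker.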

What's the correct approach to solve SPOJ www.spoj.com/problems/PRHYME/?

I have been trying to solve the SPOJ problem www.spoj.com/problems/PRHYME/ for several days, but have had no success.
Here is the problem in brief:
Given is a wordlist L, and a word w. Your task is to find a word in L that forms a perfect rhyme with w. This word u is uniquely determined by these properties:
It is in L.
It is different from w.
Their common suffix is as long as possible.
Out of all words that satisfy the previous points, u is the lexicographically smallest one.
The length of a word will be <= 30.
The number of words in both the dictionary and the queries can be up to 250,000.
I am using a trie to store all the words in the dictionary, reversed.
Then, to solve the queries, I proceed as follows:
If the word is present in the trie, delete it from the trie.
Now traverse the trie from the root for as long as the characters of the query string match the trie values. Let the point where the last character matched be P.
From this point P onward, I traverse the trie using DFS, and on encountering a leaf node, push the string formed onto the list of possible results.
Then I return the lexicographically smallest result from this list.
When I submit my solution on SPOJ, it gets a Time Limit Exceeded error.
Can someone please suggest a detailed algorithm or hint to solve this problem?
I can post my code if required.
#include<iostream>
#include<cstdio>
#include<cstring>
#include<climits>
#include<vector>
#include<string>
#include<algorithm>
#include<cctype>
#include<cstdlib>
#include<utility>
#include<map>
#include<queue>
#include<set>
#define ll long long signed int
#define ull unsigned long long int
const int alpha=26;
using namespace std;
struct node
{
int value;
node * child[alpha];
};
node * newnode()
{
node * newt=new node;
newt->value=0;
for(int i=0;i<alpha;i++)
{
newt->child[i]=NULL;
}
return newt;
}
struct trie
{
node * root;
int count;
trie()
{
count=0;
root=newnode();
}
};
trie * dict=new trie;
string reverse(string s)
{
int l=s.length();
string rev=s;
for(int i=0;i<l;i++)
{
int j=l-1-i;
rev[j]=s[i];
}
return rev;
}
void insert(string s)
{
int l=s.length();
node * ptr=dict->root;
dict->count++;
for(int i=0;i<l;i++)
{
int index=s[i]-'a';
if(ptr->child[index]==NULL)
{
ptr->child[index]=newnode();
}
ptr=ptr->child[index];
}
ptr->value=dict->count;
}
void dfs1(node *ptr,string p)
{
if(ptr==NULL) return;
if(ptr->value) cout<<"word" <<p<<endl;
for(int i=0;i<26;i++)
{
if(ptr->child[i]!=NULL)
dfs1(ptr->child[i],p+char('a'+i));
}
}
vector<string> results;
pair<node *,string> search(string s)
{
int l=s.length();
node * ptr=dict->root;
node *save=ptr;
string match="";
int i=0;
bool no_match=false;
while(i<l and !no_match)
{
int in=s[i]-'a';
if(ptr->child[in]==NULL)
{
save=ptr;
no_match=true;
}
else
{
ptr=ptr->child[in];
save=ptr;
match+=in+'a';
}
i++;
}
//cout<<s<<" matched till here"<<match <<" "<<endl;
return make_pair(save,match);
}
bool find(string s)
{
int l=s.length();
node * ptr=dict->root;
string match="";
for(int i=0;i<l;i++)
{
int in=s[i]-'a';
//cout<<match<<"match"<<endl;
if(ptr->child[in]==NULL)
{
return false;
}
ptr=ptr->child[in];
match+=char(in+'a');
}
//cout<<match<<"match"<<endl;
return true;
}
bool leafNode(node *pNode)
{
return (pNode->value != 0);
}
bool isItFreeNode(node *pNode)
{
int i;
for(i = 0; i < alpha; i++)
{
if( pNode->child[i] )
return false;
}
return true;
}
bool deleteHelper(node *pNode, string key, int level, int len)
{
if( pNode )
{
// Base case
if( level == len )
{
if( pNode->value )
{
// Unmark leaf node
pNode->value = 0;
// If empty, node to be deleted
if( isItFreeNode(pNode) )
{
return true;
}
return false;
}
}
else // Recursive case
{
int index = (key[level])-'a';
if( deleteHelper(pNode->child[index], key, level+1, len) )
{
// last node marked, delete it
free(pNode->child[index]);
pNode->child[index]=NULL;
// recursively climb up, and delete eligible nodes
return ( !leafNode(pNode) && isItFreeNode(pNode) );
}
}
}
return false;
}
void deleteKey(string key)
{
int len = key.length();
if( len > 0 )
{
deleteHelper(dict->root, key, 0, len);
}
}
string result="***";
void dfs(node *ptr,string p)
{
if(ptr==NULL) return;
if(ptr->value )
{
if((result)=="***")
{
result=reverse(p);
}
else
{
result=min(result,reverse(p));
}
}
for(int i=0;i<26;i++)
{
if(ptr->child[i]!=NULL)
dfs(ptr->child[i],p+char('a'+i));
}
}
int main(int argc ,char ** argv)
{
#ifndef ONLINE_JUDGE
freopen("prhyme.in","r",stdin);
#endif
string s;
while(getline(cin,s,'\n'))
{
if(s[0]<'a' and s[0]>'z')
break;
int l=s.length();
if(l==0) break;
string rev;//=new char[l+1];
rev=reverse(s);
insert(rev);
//cout<<"...........traverse..........."<<endl;
//dfs(dict->root);
//cout<<"..............traverse end.............."<<endl;
}
while(getline(cin,s))
{
results.clear();
//cout<<s<<endl;
int l=s.length();
if(!l) break;
string rev;//=new char[l+1];
rev=reverse(s);
//cout<<rev<<endl;
bool del=false;
if(find(rev))
{
del=true;
//cout<<"here found"<<endl;
deleteKey(rev);
}
if(find(rev))
{
del=true;
//cout<<"here found"<<endl;
deleteKey(rev);
}
else
{
//cout<<"not here found"<<endl;
}
// cout<<"...........traverse..........."<<endl;
//dfs1(dict->root,"");
// cout<<"..............traverse end.............."<<endl;
pair<node *,string> pp=search(rev);
result="***";
dfs(pp.first,pp.second);
//cout<<"search results"<<endl;
//dfs1(pp.first,pp.second);
//cout<<"end of search results"<<
for(int i=0;i<results.size();i++)
{
results[i]=reverse(results[i]);
// cout<<s<<" "<<results[i]<<endl;
}
string smin=result;
if(del)
{
insert(rev);
}
cout<<smin<<endl;
}
return 0;
}
Your algorithm (using a trie that stores all reversed words) is a good start. But one issue with it is that for each lookup, you have to enumerate all words with a certain suffix in order to find the lexicographically smallest one. For some cases, this can be a lot of work.
One way to fix this: In each node (corresponding to each suffix), store the two lexicographically smallest words that have that suffix. This is easy to maintain while building the trie by updating all ancestor nodes of each newly added leaf (see pseudo-code below).
Then to perform a lookup of a word w, start at the node corresponding to the word, and go up in the tree until you reach a node which contains a descendant word other than w. Then return the lexicographically smallest word stored in that node, or the second smallest in case the smallest is equal to w.
To create the trie, the following pseudo-code can be used:
for each word:
add word to trie
let n be the node corresponding to the new word.
for each ancestor a of n (including n):
if a.smallest==null or word < a.smallest:
a.second_smallest = a.smallest
a.smallest = word
else if a.second_smallest==null or word < a.second_smallest:
a.second_smallest = word
To lookup a word w:
let n be the node corresponding to longest possible suffix of w.
while ((n.smallest==w || n.smallest==null) &&
(n.second_smallest==w || n.second_smallest==null)):
n = n.parent
if n.smallest==w:
return n.second_smallest
else:
return n.smallest
Another similar possibility is to use a hash table mapping all suffixes to the two lexicographically smallest words instead of using a trie. This is probably easier to implement if you can use std::unordered_map.
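A rough sketch of that hash-table variant, assuming non-empty lowercase words. TwoSmallest, SuffixTable, buildTable and findRhyme are hypothetical names, and the table stores a key for every suffix of every word, so it trades memory for simplicity. Unlike the pseudo-code above, it also skips a word that is already stored as the smallest, in case the list contains duplicates.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// For each suffix, the two lexicographically smallest words ending in it.
// An empty string means "not set" (words are assumed non-empty).
struct TwoSmallest {
    std::string smallest;
    std::string second;
};

using SuffixTable = std::unordered_map<std::string, TwoSmallest>;

// Registers every suffix of every word, including the empty suffix, which
// plays the role of the trie root.
SuffixTable buildTable(const std::vector<std::string>& words) {
    SuffixTable table;
    for (const std::string& word : words) {
        for (std::size_t start = 0; start <= word.size(); ++start) {
            TwoSmallest& slot = table[word.substr(start)];
            if (slot.smallest.empty() || word < slot.smallest) {
                slot.second = slot.smallest;
                slot.smallest = word;
            } else if (word != slot.smallest &&
                       (slot.second.empty() || word < slot.second)) {
                slot.second = word;
            }
        }
    }
    return table;
}

// Returns the rhyme for w, or an empty string if no other word exists.
std::string findRhyme(const SuffixTable& table, const std::string& w) {
    // Try the longest suffix first (w itself), then shorter ones.
    for (std::size_t start = 0; start <= w.size(); ++start) {
        auto it = table.find(w.substr(start));
        if (it == table.end()) continue;
        const TwoSmallest& slot = it->second;
        if (!slot.smallest.empty() && slot.smallest != w) return slot.smallest;
        if (!slot.second.empty() && slot.second != w) return slot.second;
    }
    return std::string();
}
```

For the word list perfect, rhyme, crime, time, lime, looking up "crime" should return "lime" (the longest shared suffix with another word is "ime"), and looking up "time" should return "crime".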