Trie Implementation in C++

I am trying to implement the trie as shown on the TopCoder page, modified a bit to store the phone numbers of the users. I am getting a segmentation fault. Can someone please point out the error?
#include<iostream>
#include<stdlib.h>
using namespace std;
struct node{
int words;
int prefix;
long phone;
struct node* children[26];
};
struct node* initialize(struct node* root) {
root = new (struct node);
for(int i=0;i<26;i++){
root->children[i] = NULL;
}
root->words = 0;
root->prefix = 0;
return root;
}
int getIndex(char l) {
if(l>='A' && l<='Z'){
return l-'A';
}else if(l>='a' && l<='z'){
return l-'a';
}
}
void add(struct node* root, char * name, int data) {
if(*(name)== '\0') {
root->words = root->words+1;
root->phone = data;
} else {
root->prefix = root->prefix + 1;
char ch = *name;
int index = getIndex(ch);
if(root->children[ch]==NULL) {
struct node* temp = NULL;
root->children[ch] = initialize(temp);
}
add(root->children[ch],name++, data);
}
}
int main(){
struct node* root = NULL;
root = initialize(root);
add(root,(char *)"test",1111111111);
add(root,(char *)"teser",2222222222);
cout<<root->prefix<<endl;
return 0;
}
Added a new function after making suggested changes:
void getPhone(struct node* root, char* name){
while(*(name) != '\0' || root!=NULL) {
char ch = *name;
int index = getIndex(ch);
root = root->children[ch];
++name;
}
if(*(name) == '\0'){
cout<<root->phone<<endl;
}
}

Change this:
add(root->children[ch], name++, data);
// ---------------------^^^^^^
To this:
add(root->children[ch], ++name, data);
// ---------------------^^^^^^
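The remainder of the issues in this code I leave to you, but that is the cause of your runaway call stack: name++ passes the old pointer value, so every recursive call sees the same character and the recursion never reaches the terminating '\0'.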
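To see why the position of the increment matters, here is a minimal sketch using a hypothetical helper, countChars, which is not part of the original code. It recurses on s + 1, which is the same value ++name would pass along; writing s++ instead would hand the callee the unchanged pointer.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only: counts characters recursively.
std::size_t countChars(const char* s) {
    if (*s == '\0') {
        return 0;  // base case: reached the end of the string
    }
    // Pass the ADVANCED pointer. `s++` would evaluate to the old value of s,
    // so the callee would see the same character again and never terminate.
    return 1 + countChars(s + 1);
}
```

Calling countChars("test") walks four characters and then stops at the terminator, which is the behaviour add() needs from its own recursion.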
EDIT: The OP asked for further analysis, and while I normally don't do so, this was a fairly simple application on which to expand.
This is done in several places:
int index = getIndex(ch);
root = root->children[ch];
... etc. continue using ch instead of index
It begs the question: "Why did we just ask for an index that we promptly ignore and use the char anyway?" This is done in add() and getPhone(). You should use index after computing it for all peeks inside children[] arrays.
Also, the initialize() function needs to be either revamped or thrown out outright in favor of a constructor-based solution, where that code truly belongs. Finally, if this trie is supposed to track usage counts of the words generated and of the prefixes each level participates in, I'm not clear why you need both the words and prefix counters; in either case, to update the counters your recursive descent in add() should bump them up on the back-recurse.
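Putting those points together, one possible shape for the fixed code is sketched below. It is not the original poster's final version: it assumes an iterative walk instead of recursion, in-class member initializers in place of initialize(), a long long phone field (2222222222 does not fit in a 32-bit int), and bool return values to report failure. All of those choices are mine.

```cpp
#include <cassert>

struct Node {
    int words = 0;            // words ending exactly at this node
    int prefix = 0;           // insertions that continued past this node
    long long phone = 0;      // assumption: long long so 10-digit numbers fit
    Node* children[26] = {};  // all null initially

    Node() = default;
    Node(const Node&) = delete;             // a node owns its children
    Node& operator=(const Node&) = delete;
    ~Node() { for (Node* child : children) delete child; }
};

// Maps a letter to 0..25, or returns -1 for anything else.
int getIndex(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a';
    return -1;
}

// Inserts name with its phone number; returns false if name has a non-letter.
bool add(Node* root, const char* name, long long phone) {
    assert(root != nullptr);
    for (const char* p = name; *p != '\0'; ++p) {
        if (getIndex(*p) < 0) return false;  // validate before touching the trie
    }
    Node* cur = root;
    for (; *name != '\0'; ++name) {
        int index = getIndex(*name);         // use index, never the raw char
        cur->prefix += 1;
        if (cur->children[index] == nullptr) cur->children[index] = new Node();
        cur = cur->children[index];
    }
    cur->words += 1;
    cur->phone = phone;
    return true;
}

// Looks name up; writes the number to out and returns true only if a word ends there.
bool getPhone(const Node* root, const char* name, long long& out) {
    const Node* cur = root;
    for (; cur != nullptr && *name != '\0'; ++name) {
        int index = getIndex(*name);
        if (index < 0) return false;
        cur = cur->children[index];
    }
    if (cur == nullptr || cur->words == 0) return false;
    out = cur->phone;
    return true;
}
```

Note that the lookup loop stops when either the name is exhausted or the node is null (an && condition); with ||, the loop keeps going after the pointer has become null and dereferences it.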


Properly exiting out of recursions?

TrieNode and Trie Object:
struct TrieNode {
char nodeChar = NULL;
map<char, TrieNode> children;
TrieNode() {}
TrieNode(char c) { nodeChar = c; }
};
struct Trie {
TrieNode *root = new TrieNode();
typedef pair<char, TrieNode> letter;
typedef map<char, TrieNode>::iterator it;
Trie(vector<string> dictionary) {
for (int i = 0; i < dictionary.size(); i++) {
insert(dictionary[i]);
}
}
void insert(string toInsert) {
TrieNode * curr = root;
int increment = 0;
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end()) { //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
//when it doesn't exist we know that this will be a new branch
for (int i = increment; i < toInsert.length(); i++) {
TrieNode temp(toInsert[i]);
curr->children.insert(letter(toInsert[i], temp));
curr = &(curr->children.find(toInsert[i])->second);
if (i == toInsert.length() - 1) {
temp.nodeChar = NULL;
curr->children.insert(letter(NULL, temp));
}
}
}
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
}
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
if (curr->nodeChar == NULL) {
list.push_back(prefix);
return;
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
};
The problem is this function:
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
/*if children of TrieNode contains NULL char, it means this branch up to this point is a complete word*/
if (curr->nodeChar == NULL) {
list.push_back(prefix);
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
The purpose is to return all words with the same prefix from a trie using DFS. I managed to retrieve all the necessary strings, but I can't exit out of the recursion.
The code completes the last iteration of the if statement and then breaks. Visual Studio doesn't return any error code.
The typical end to a recursion is just as you said: return all words. A standard recursion looks something like this:
returnType function(params...){
//Do stuff
if(need to recurse){
return function(next params...);
}else{ //This should be your defined base case
return base-case;
}
}
The issue is that your recursive function never hands anything back to its caller: it either executes the push_back or calls itself again, and neither branch returns a result. A void helper that fills a list passed by reference can work that way, but then whichever function declares a return type still has to return that list explicitly.
In your situation, you likely need to store the results of the recursion in an intermediate structure such as a list, and return that list once the iteration is done (it is a tree search, so it ought to check all the children rather than return only the first one).
On that note, you seem to be missing part of the point of recursion. It exists to break a problem down into pieces until those pieces are trivial to solve, then return each trivial case and build back up to a full solution. Any tree search must follow this basic structure, or you may miss something, like forgetting to return your results.
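As a rough illustration of that pattern, the sketch below collects results from every child into one list and returns it from a wrapper. TreeNode, collectLeaves and leaves are hypothetical names on a generic tree, not the poster's Trie.

```cpp
#include <string>
#include <vector>

// Hypothetical generic tree node, for illustration only.
struct TreeNode {
    std::string label;
    std::vector<TreeNode> children;
};

// Recursive helper: appends the label of every leaf under node to out.
void collectLeaves(const TreeNode& node, std::vector<std::string>& out) {
    if (node.children.empty()) {      // base case: a leaf contributes its label
        out.push_back(node.label);
        return;
    }
    for (const TreeNode& child : node.children) {
        collectLeaves(child, out);    // visit every child, not just the first
    }
}

// Wrapper with a real return type: it must end with an explicit return.
std::vector<std::string> leaves(const TreeNode& root) {
    std::vector<std::string> out;
    collectLeaves(root, out);
    return out;
}
```

The helper can stay void because it writes through the reference; only the wrapper owns the list and is responsible for returning it.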
Check the integrity of your Trie structure. The function appears to be correct. The reason why it wouldn't terminate is if one or more of your leaf nodes doesn't have curr->nodeChar == NULL.
Another case is that any node (leaf or non-leaf) has a garbage child node. This will cause the recursion to break into reading garbage values and no reason to stop. Running in debug mode should break the execution with segmentation fault.
Write another function to test if all leaf-nodes have NULL termination.
EDIT:
After posting the code, the original poster has already pointed out that the problem was that he/she was not returning the list of strings.
Apart from that, there are a few more suggestions I would like to provide based on the code:
How does this while loop terminate if the toInsert string is already in the Trie?
You will run past the end of the toInsert string and read a garbage character.
The loop will exit after that, but reading beyond the end of your string is a bug waiting to happen.
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end())
{ //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
This can be written as follows:
while (increment < toInsert.length() &&
curr->children.find(toInsert[increment]) != curr->children.end())
Also,
Trie( vector<string> dictionary)
should be
Trie( const vector<string>& dictionary )
because dictionary can be a large object. If you don't pass by reference, it will create a second copy. This is not efficient.
I am an idiot. I forgot to return list from the first findPre() function.
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
return list; //<----- this thing
}
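For comparison, here is a compact sketch that folds the suggestions from this thread into one class: the dictionary is taken by const reference, the traversal is bounded by the string itself, and findPre() returns its list on every path. It deliberately swaps the NULL-character sentinel child for an isWord flag and stores children behind std::unique_ptr; PrefixTrie and collect are names invented for this example, not the poster's code.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

class PrefixTrie {
    struct TrieNode {
        bool isWord = false;  // replaces the NULL-char sentinel child
        std::map<char, std::unique_ptr<TrieNode>> children;
    };
    TrieNode root_;

    // DFS below node; current always holds the characters on the path so far.
    static void collect(const TrieNode& node, std::string& current,
                        std::vector<std::string>& out) {
        if (node.isWord) out.push_back(current);
        for (const auto& entry : node.children) {
            current.push_back(entry.first);
            collect(*entry.second, current, out);
            current.pop_back();  // undo before trying the next sibling
        }
    }

public:
    explicit PrefixTrie(const std::vector<std::string>& dictionary) {
        for (const std::string& word : dictionary) insert(word);
    }

    void insert(const std::string& word) {
        TrieNode* curr = &root_;
        for (char c : word) {  // the range-for cannot run past the string
            std::unique_ptr<TrieNode>& child = curr->children[c];
            if (!child) child.reset(new TrieNode());
            curr = child.get();
        }
        curr->isWord = true;
    }

    std::vector<std::string> findPre(const std::string& pre) const {
        std::vector<std::string> out;
        const TrieNode* curr = &root_;
        for (char c : pre) {
            auto it = curr->children.find(c);
            if (it == curr->children.end()) return out;  // prefix not present
            curr = it->second.get();
        }
        std::string current = pre;
        collect(*curr, current, out);
        return out;  // returned on every path
    }
};
```

Because std::map iterates its keys in sorted order, the words come back in alphabetical order; a word that is itself a prefix of another (such as "tea" and "team") is reported thanks to the flag rather than a sentinel node.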

Implementation of stack in C++ without using <stack>

I want to implement a stack. I found a working model on the internet, but unfortunately it assumes that I know the size of the stack up front. What I want is to be able to add segments to my stack as they are needed: the potential maximum number of slots runs into the tens of thousands, and from my understanding fixing the size in advance (when most of it is not needed most of the time) is a huge waste of memory and slows the program down. I also do not want to use any complex prewritten functionality in my implementation (the STL or other libraries, such as vector), as I want to understand these structures better by building them myself, with brief help.
struct variabl {
char *given_name;
double value;
};
variabl* variables[50000];
int c = 0;
int end_of_stack = 0;
class Stack
{
private:
int top, length;
char *z;
int index_struc = 0;
public:
Stack(int = 0);
~Stack();
char pop();
void push();
};
Stack::Stack(int size) /*
This is where the problem begins, I want to be able to allocate the size
dynamically.
*/
{
top = -1;
length = size;
z = new char[length];
}
void Stack::push()
{
++top;
z[top] = variables[index_struc]->value;
index_struc++;
}
char Stack::pop()
{
end_of_stack = 0;
if (z == 0 || top == -1)
{
end_of_stack = 1;
return NULL;
}
char top_stack = z[top];
top--;
length--;
return top_stack;
}
Stack::~Stack()
{
delete[] z;
}
I had somewhat of an idea, and tried doing
Stack stackk
//whenever I want to put another thing into stack
stackk.push = new char;
but then I didn't completely understand how it would work for my purpose. I don't think it would be fully accessible with the pop method etc., because it would be a set of separate arrays/variables, right? I want the implementation to remain reasonably simple so I can understand it.
Change your push function to take a parameter, rather than needing to reference variables.
To handle pushes, start with an initial length of your array z (and change z to a better variable name). When you are pushing a new value, check if the new value will mean that the size of your array is too small (by comparing length and top). If it will exceed the current size, allocate a bigger array and copy the values from z to the new array, free up z, and make z point to the new array.
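The grow-and-copy step described above might look roughly like this. It is a sketch under a few assumptions of mine: the class is called GrowableStack, the buffer is named data_ rather than z, the capacity doubles each time, and popping an empty stack is treated as a programming error caught by assert.

```cpp
#include <cassert>
#include <cstddef>

class GrowableStack {
    char* data_;            // heap buffer holding the elements
    std::size_t size_;      // number of elements currently stored
    std::size_t capacity_;  // number of slots allocated

    // Allocate a buffer twice as large, copy the elements over, free the old one.
    void grow() {
        std::size_t newCapacity = capacity_ * 2;
        char* bigger = new char[newCapacity];
        for (std::size_t i = 0; i < size_; ++i) bigger[i] = data_[i];
        delete[] data_;
        data_ = bigger;
        capacity_ = newCapacity;
    }

public:
    explicit GrowableStack(std::size_t initial = 16)
        : data_(new char[initial > 0 ? initial : 1]),
          size_(0),
          capacity_(initial > 0 ? initial : 1) {}
    ~GrowableStack() { delete[] data_; }
    GrowableStack(const GrowableStack&) = delete;             // avoid double delete
    GrowableStack& operator=(const GrowableStack&) = delete;

    void push(char value) {
        if (size_ == capacity_) grow();  // full: make room first
        data_[size_++] = value;
    }

    char pop() {
        assert(size_ > 0 && "pop on empty stack");
        return data_[--size_];
    }

    bool empty() const { return size_ == 0; }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
};
```

Doubling keeps the number of reallocations small: growing from 16 slots to tens of thousands takes only about a dozen copies, so starting small costs very little.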
Here you have a simple implementation without the need of reallocating arrays. It uses the auxiliary class Node, that holds a value, and a pointer to another Node (that is set to NULL to indicate the end of the stack).
main() tests the stack by reading commands of the form
p c: push c to the stack
g: print top of stack and pop
#include <cstdlib>
#include <iostream>
using namespace std;
class Node {
private:
char c;
Node *next;
public:
Node(char cc, Node *nnext){
c = cc;
next = nnext;
}
char getChar(){
return c;
}
Node *getNext(){
return next;
}
~Node(){}
};
class Stack {
private:
Node *start;
public:
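Stack(){
start = NULL;
}
~Stack(){
// Free any nodes still on the stack so they are not leaked.
while(start != NULL){
Node* next = (*start).getNext();
delete start;
start = next;
}
}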
void push(char c){
start = new Node(c, start);
}
char pop(){
if(start == NULL){
//Handle error
cerr << "pop on empty stack" << endl;
exit(1);
}
else {
char r = (*start).getChar();
Node* newstart = (*start).getNext();
delete start;
start = newstart;
return r;
}
}
bool empty(){
return start == NULL;
}
};
int main(){
char c, k;
Stack st;
while(cin>>c){
switch(c){
case 'p':
cin >> k;
st.push(k);
break;
case 'g':
cout << st.pop()<<endl;
break;
}
}
return 0;
}

Why this code failed to run

I want to generate a tree of siblings as shown below:
      ABCD
    / |  \  \
   A  B   C  D
ABCD has four child nodes, so I have used an array of pointers, *next[], to hold them. The code produces the sequence but then fails at run time. main() passes characters to the enqueue function, e.g. str.at(x), where x is the loop variable of a for loop.
struct node
{
string info;
struct node *next[];
}*root,*child;
string str, goal;
int dept=0,bnod=0,cl,z=0;
void enqueue(string n);
void enqueue(string n)
{
node *p, *temp;
p=new node[sizeof(str.length())];
p->info=n;
for (int x=0;x<str.length();x++)
p->next[x]=NULL;
if(root==NULL)
{
root=p;
child=p;
}
else
{
cout<<" cl="<<cl<<endl;
if(cl<str.length())
{
child->next[cl]=p;
temp=child->next[cl];
cout<<"chile-info "<<temp->info<<endl;
}
else
cout<<" clif="<<cl<<endl;
}
}
OUTPUT
Enter String: sham
cl=0
chile-info s
cl=1
chile-info h
cl=2
chile-info a
cl=3
chile-info m
RUN FAILED (exit value 1, total time: 2s)
Firstly, where does "RUN FAILED" come from? Is that specific to your compiler?
Secondly, the line p=new node[sizeof(str.length())]; probably won't give you what you wanted: sizeof is applied to the type returned by str.length() (an unsigned integer type), so it yields a fixed value such as 4 or 8 depending on your platform, regardless of the string's length. What you're after is the actual length of the string.
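A small sketch makes the difference visible. The two helper names, bytesOfLengthType and characterCount, are made up for this example.

```cpp
#include <cstddef>
#include <string>

// What the question's expression computes: the size in bytes of the TYPE
// returned by length(). The string's contents never influence the result.
std::size_t bytesOfLengthType(const std::string& s) {
    return sizeof(s.length());
}

// What was intended: the number of characters in the string.
std::size_t characterCount(const std::string& s) {
    return s.length();
}
```

For "sham" the second function returns 4, while the first returns the same platform-dependent constant for every string, so the allocation in the question never depended on the input.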
So - since you're already using std::string, why not use std::vector? Your code would look a lot friendlier :-)
If I take the first couple of lines as your desired output (sorry, the code you posted is very hard to decipher, and I don't think it compiles either, so I'm ignoring it ;-) ), would something like this work better for you?
#include <iostream>
#include <vector>
#include <string>
typedef struct node
{
std::string info;
std::vector<struct node*> children;
}Node;
Node * enqueue(std::string str)
{
Node * root;
root = new Node();
root->info = str;
for (int x = 0; x < str.length(); x++)
{
Node * temp = new Node();
temp->info = str[x];
root->children.push_back(temp);
}
return root;
}
int main()
{
Node * myRoot = enqueue("ABCD");
std::cout << myRoot->info << "\n";
for( int i = 0; i < myRoot->children.size(); i++)
{
std::cout << myRoot->children[i]->info << ", ";
}
char c;
std::cin >> c;
return 0;
}
Your code seems incomplete.
At least the line
p=new node[sizeof(str.length())];
seems wrong.
I guess enqueue should be something similar to the following:
struct node
{
string info;
struct node **next; // pointer to a dynamically allocated array of child pointers
}*root,*child;
string str, goal;
int dept=0,bnod=0,cl,z=0;
void enqueue(string n)
{
node *p, *temp;
p = new node;
p->next = new node*[str.length()];
p->info=n;
for (int x=0;x<str.length();x++)
{
p->next[x] = new node;
p->next[x]->next = 0;
p->next[x]->info = str[x];
}
if(root==NULL)
{
root=p;
child=p;
}
}
Please provide more information so that we can give a more accurate answer.

Creating an n array with a linked list of ints

I recently made a 26-ary tree (an n26 trie) and tried to simulate a dictionary with it.
I can't seem to figure out how to make this work with numbers. I've tried passing in a linked list of ints instead of a string. My current code creates 26 nodes (a-z), and each of those nodes has 26 child nodes (a-z). I would like to do the same with ints, say 1-26. These int nodes will represent items, and the linked list of ints I pass in will contain a set of ints that I want represented in the tree the same way a string would be.
Example: pass in the set {1, 6 , 8}, instead of a string such as "hello"
#include <iostream>
using namespace std;
class N26
{
private:
struct N26Node
{
bool isEnd;
struct N26Node *children[26];
}*head;
public:
N26();
~N26();
void insert(string word);
bool isExists(string word);
void printPath(char searchKey);
};
N26::N26()
{
head = new N26Node();
head->isEnd = false;
}
N26::~N26()
{
}
void N26::insert(string word)
{
N26Node *current = head;
for(int i = 0; i < word.length(); i++)
{
int letter = (int)word[i] - (int)'a';
if(current->children[letter] == NULL)
{
current->children[letter] = new N26Node();
}
current = current->children[letter];
}
current->isEnd = true;
}
/* Pre: A search key
* Post: True if the search key is found in the tree, otherwise false
* Purpose: To determine whether the given data exists in the tree or not
******************************************************************************/
bool N26::isExists(string word)
{
N26Node *current = head;
for(int i=0; i<word.length(); i++)
{
if(current->children[((int)word[i]-(int)'a')] == NULL)
{
return false;
}
current = current->children[((int)word[i]-(int)'a')];
}
return current->isEnd;
}
class N26
{
private:
N26Node newNode(void);
N26Node *mRootNode;
...
};
N26Node *newNode(void)
{
N26Node *mRootNode = new N26Node;
mRootNode = NULL;
mRootNode->mData = NULL;
for ( int i = 0; i < 26; i++ )
mRootNode->mAlphabet[i] = NULL;
return mRootNode;
}
Ah! My eyes!
Seriously, you are attempting something much too advanced. Your code is full of bugs and cannot work as intended. Tinkering will not help, you must go back to basics of pointers and linked lists. Study the basics and do not attempt anything like a linked list of linked lists until you understand what is wrong with the code above.
I'll give you some hints: "memory leak", "dangling pointer", "type mismatch", "undefined behavior".
I didn't quite use linked lists, but I managed to get it working using arrays.
/* *** Author: Jamie Roland
* Class: CSI 281
* Institute: Champlain College
* Last Update: October 31, 2012
*
* Description:
* This class is to implement an n26 trie. The
* operations
* available for this implementation are:
*
* 1. insert
* 2. isEmpty
* 3. isExists
* 4. remove
* 5. showInOrder
* 6. showPreOrder
* 7. showPostOrder
*
* Certification of Authenticity:
* I certify that this assignment is entirely my own work.
**********************************************************************/
#include <iostream>
using namespace std;
class N26
{
private:
struct N26Node
{
bool isEnd;
struct N26Node *children[26];
}*head;
public:
N26();
~N26();
void insert(int word[]);
bool isExists(int word[]);
void printPath(char searchKey);
};
N26::N26()
{
head = new N26Node();
head->isEnd = false;
}
N26::~N26()
{
}
void N26::insert(int word[])
{
int size = sizeof word/sizeof(int);
N26Node *current = head;
for(int i = 0; i < size; i++)
{
int letter = word[i] - 1;
if(current->children[letter] == NULL)
{
current->children[letter] = new N26Node();
}
current = current->children[letter];
}
current->isEnd = true;
}
/* Pre: A search key
* Post: True if the search key is found in the tree, otherwise false
* Purpose: To determine whether the given data exists in the tree or not
******************************************************************************/
bool N26::isExists(int word[])
{
int size = sizeof word/sizeof(int);
N26Node *current = head;
for(int i=0; i<size; i++)
{
if(current->children[(word[i]-1)] == NULL)
{
return false;
}
current = current->children[(word[i]-1)];
}
return current->isEnd;
}
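One caveat about the array version above: inside a function, a parameter declared as int word[] is really a pointer, so sizeof word/sizeof(int) does not give the number of elements the caller passed. A possible alternative is sketched below, with the element count passed explicitly. IntTrie and its signatures are my own assumptions; it also range-checks each item and frees its nodes, neither of which the original does.

```cpp
#include <cassert>
#include <cstddef>

class IntTrie {
    struct Node {
        bool isEnd = false;
        Node* children[26] = {};  // slot i holds the child for item i + 1
        ~Node() { for (Node* child : children) delete child; }
    };
    Node head_;

    static bool inRange(int item) { return item >= 1 && item <= 26; }

public:
    IntTrie() = default;
    IntTrie(const IntTrie&) = delete;             // nodes are owned, not shared
    IntTrie& operator=(const IntTrie&) = delete;

    // Inserts items[0..count); returns false if any item is outside 1..26.
    bool insert(const int* items, std::size_t count) {
        assert(items != nullptr || count == 0);
        for (std::size_t i = 0; i < count; ++i) {
            if (!inRange(items[i])) return false;  // reject before modifying
        }
        Node* current = &head_;
        for (std::size_t i = 0; i < count; ++i) {
            int slot = items[i] - 1;
            if (current->children[slot] == nullptr) {
                current->children[slot] = new Node();
            }
            current = current->children[slot];
        }
        current->isEnd = true;
        return true;
    }

    // True only if exactly this sequence was inserted earlier.
    bool isExists(const int* items, std::size_t count) const {
        assert(items != nullptr || count == 0);
        const Node* current = &head_;
        for (std::size_t i = 0; i < count; ++i) {
            if (!inRange(items[i])) return false;
            current = current->children[items[i] - 1];
            if (current == nullptr) return false;
        }
        return current->isEnd;
    }
};
```

With this interface the set {1, 6, 8} from the question is stored by calling insert on a three-element array with a count of 3, and a lookup of just {1, 6} reports false because no sequence ends there.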

C++ vector and segmentation faults

I am working on a simple mathematical parser. Something that just reads number = 1 + 2;
I have a vector containing these tokens. They store a type and string value of the character. I am trying to step through the vector to build an AST of these tokens, and I keep getting segmentation faults, even when I am under the impression my code should prevent this from happening.
Here is the bit of code that builds the AST:
struct ASTGen
{
const vector<Token> &Tokens;
unsigned int size,
pointer;
ASTGen(const vector<Token> &t) : Tokens(t), pointer(0)
{
size = Tokens.size() - 1;
}
unsigned int next()
{
return pointer + 1;
}
Node* Statement()
{
if(next() <= size)
{
switch(Tokens[next()].type)
{
case EQUALS:
Node* n = Assignment_Expr();
return n;
}
}
advance();
}
void advance()
{
if(next() <= size) ++pointer;
}
Node* Assignment_Expr()
{
Node* lnode = new Node(Tokens[pointer], NULL, NULL);
advance();
Node* n = new Node(Tokens[pointer], lnode, Expression());
return n;
}
Node* Expression()
{
if(next() <= size)
{
advance();
if(Tokens[next()].type == SEMICOLON)
{
Node* n = new Node(Tokens[pointer], NULL, NULL);
return n;
}
if(Tokens[next()].type == PLUS)
{
Node* lnode = new Node(Tokens[pointer], NULL, NULL);
advance();
Node* n = new Node(Tokens[pointer], lnode, Expression());
return n;
}
}
}
};
...
ASTGen AST(Tokens);
Node* Tree = AST.Statement();
cout << Tree->Right->Data.svalue << endl;
I can access Tree->Data.svalue and get the = Node's token info, so I know that node is getting spawned, and I can also get Tree->Left->Data.svalue and get the variable to the left of the =
I have re-written it many times trying out different methods for stepping through the vector, but I always get a segmentation fault when I try to access the = right node (which should be the + node)
Any help would be greatly appreciated.
There's plenty more code that we haven't seen, so I can't tell you precisely what's going on, but I see a few things that are reasons for concern. One is that the Statement() method doesn't always return a value. If the first if test doesn't pass, then we call advance() and fall off the bottom of the routine without an explicit return. The caller will try to get the return value of the function but it'll get garbage. This could lead to all sorts of problems, including things like double free() calls, etc, which can easily cause segfaults.
Expression() has the same problem.
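To illustrate the point about return paths, here is a reduced sketch in which every path through Statement() and Expression() returns either a node or nullptr, and every index is bounds-checked before use. The Token, TokenType and Node definitions are assumptions, since the question does not show them, and the grammar is cut down to an operand, '=', and a chain of '+' operands.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed shapes for types the question does not show.
enum TokenType { IDENT, NUMBER, EQUALS, PLUS, SEMICOLON };
struct Token { TokenType type; std::string svalue; };

struct Node {
    Token Data;
    Node* Left;
    Node* Right;
    Node(const Token& t, Node* l, Node* r) : Data(t), Left(l), Right(r) {}
};

// Releases a whole tree; safe to call with nullptr.
void freeTree(Node* n) {
    if (n == nullptr) return;
    freeTree(n->Left);
    freeTree(n->Right);
    delete n;
}

struct ASTGen {
    const std::vector<Token>& Tokens;
    std::size_t pos = 0;
    explicit ASTGen(const std::vector<Token>& t) : Tokens(t) {}

    bool has(std::size_t i) const { return i < Tokens.size(); }

    // statement := operand '=' expression   (a trailing ';' is left unconsumed)
    Node* Statement() {
        if (!has(pos + 1) || Tokens[pos + 1].type != EQUALS) return nullptr;
        Node* lhs = new Node(Tokens[pos], nullptr, nullptr);
        const Token& eq = Tokens[pos + 1];
        pos += 2;
        Node* rhs = Expression();
        if (rhs == nullptr) { delete lhs; return nullptr; }
        return new Node(eq, lhs, rhs);
    }

    // expression := operand ('+' expression)?
    Node* Expression() {
        if (!has(pos)) return nullptr;  // ran out of tokens
        Node* operand = new Node(Tokens[pos], nullptr, nullptr);
        ++pos;
        if (has(pos) && Tokens[pos].type == PLUS) {
            const Token& plus = Tokens[pos];
            ++pos;
            Node* rhs = Expression();
            if (rhs == nullptr) { delete operand; return nullptr; }
            return new Node(plus, operand, rhs);
        }
        return operand;                 // a lone operand is a valid expression
    }
};
```

The caller then checks the result of Statement() against nullptr before touching Tree->Right, and hands the tree to freeTree() when done. Compiling the original with warnings enabled (for example -Wall on GCC or Clang) also flags functions that can reach their end without returning a value.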