Calling new on an object that holds a pointer of the same type seems to allocate memory to that pointer - c++

I'm trying to implement a Trie data structure on my own, without looking at other implementations, based only on my conceptual knowledge of the structure. I would like to avoid vectors precisely because they are easy to use; as practice, I prefer to use pointers and allocate memory for arrays dynamically. With my current structure, I have a Node class that contains a pointer to a Node array (letterArray), a letter flag (bool), and a marker flag (bool). My Trie class has a pointer to the starting Node array. Each Node array has 26 elements, one for each lowercase letter of the English alphabet from 'a' to 'z' (I convert each inserted word to lowercase). When a node's letter is set to true, its letterArray is allocated new memory. Node has a constructor that sets letter and marker to false and letterArray to nullptr. I can insert the first letter without a problem and move to the next letterArray (which is nullptr at this point), after which memory is allocated for the new array. The problem is that the letterArray of each Node in that new array also appears to have memory allocated, yet the constructor is not called on those nodes, so their letter and marker contain garbage. Why is the constructor not called? Hopefully the code makes this clearer:
class Node {
private:
bool letter;
bool marker;
Node* letterArray;
void initNode();
public:
Node();
bool setLetter(bool set);
bool setMarker(bool set);
bool checkLetter();
bool checkMarker();
char getLetter();
Node*& getNextLetterArray();
};
class Trie {
private:
Node* start;
int wordCount;
int letterCount;
const int totalLetters = 26;
void destroyTrie();
bool initBranch(Node*& nextBranch);
void insertCharAndMove(Node*& ptr, int, int, int);
public:
Trie();
Trie(string firstWord);
~Trie();
bool insertWord(string word);
bool deleteWord(string word);
bool getToLetter(char letter);
string getLowerCase(string word);
bool wordExists(string word);
};
insertWord:
bool Trie::insertWord(string word) {
Node* ptr = start;
string wordLower = getLowerCase(word);
int wordLength = word.length();
if (wordLength <= 0) return false;
for (int i = 0; i < wordLength; i++) {
int charIndex = (word[i] - 'a');
insertCharAndMove(ptr, charIndex, wordLength, i);
}
wordCount++;
return true;
}
void Trie::insertCharAndMove(Node*& ptr, int charIndex, int wordLength, int i) {
if (ptr[charIndex].setLetter(true)) letterCount++;
if (i < wordLength) {
ptr = ptr[i].getNextLetterArray();
initBranch(ptr);
}
else ptr[i].setMarker(true);
}
initBranch:
bool Trie::initBranch(Node*& nextBranch) {
if (nextBranch != nullptr) return false;
nextBranch = new Node[letterCount];
return true;
}
Trie Constructor:
Trie::Trie() {
start = new Node[totalLetters];
wordCount = 0;
letterCount = 0;
}
Node Constructor:
Node::Node() {
initNode();
}
void Node::initNode() {
letter = false;
marker = false;
letterArray = nullptr;
}
getNextLetterArray:
Node*& Node::getNextLetterArray() {
return letterArray;
}
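A possible explanation, reading only the code shown above: `new Node[n]` does run the default constructor on every one of the n elements it creates, so the garbage more likely comes from reading past the end of an array that is too small. initBranch allocates `letterCount` elements (1 after the first letter) rather than `totalLetters`, and insertCharAndMove indexes with `i` where `charIndex` seems intended. Separately, `ptr = ptr[i].getNextLetterArray()` copies the pointer value, so initBranch(ptr) updates the local pointer and not the node's letterArray. The sketch below uses a simplified stand-in for Node (public members and a constructor counter, both hypothetical and added only for illustration) to show that a correctly sized array is fully constructed.

```cpp
// Simplified stand-in for the question's Node. The static counter is
// hypothetical and exists only to make constructor calls observable.
struct Node {
    static int constructed;
    bool letter;
    bool marker;
    Node* letterArray;
    Node() : letter(false), marker(false), letterArray(nullptr) { ++constructed; }
};
int Node::constructed = 0;

const int totalLetters = 26;

// Allocates a full branch: array new default-constructs all totalLetters nodes.
Node* makeBranch() {
    return new Node[totalLetters];
}

// Returns true if every node in the branch is still in its constructed state.
bool branchIsClean(const Node* branch) {
    for (int i = 0; i < totalLetters; ++i) {
        if (branch[i].letter || branch[i].marker || branch[i].letterArray != nullptr)
            return false;
    }
    return true;
}
```

Calling makeBranch() raises Node::constructed by totalLetters, and branchIsClean() holds for the result. With `new Node[1]`, only index 0 is a valid element, and reading index 1 or higher is undefined behavior that can look like unconstructed nodes.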

Related

Permutations of String Using Stack C++

The program I have below finds all the permutations of a given string using a stack without recursion. I am having some trouble understanding what the place in the struct is for and how it plays into the logic for the algorithm. Could anyone help me understand this code? I have a struct that only has two entities:
class Node
{
public:
string word; // stores the word in the node
Node *next;
};
I would just like to understand why the place entity is needed.
Here is the code that finds all the permutations of a given string:
struct State
{
State (std::string topermute_, int place_, int nextchar_, State* next_ = 0)
: topermute (topermute_)
, place (place_)
, nextchar (nextchar_)
, next (next_)
{
}
std::string topermute;
int place;
int nextchar;
State* next;
};
std::string swtch (std::string topermute, int x, int y)
{
std::string newstring = topermute;
newstring[x] = newstring[y];
newstring[y] = topermute[x]; //avoids temp variable
return newstring;
}
void permute (std::string topermute, int place = 0)
{
// Linked list stack.
State* top = new State (topermute, place, place);
while (top != 0)
{
State* pop = top;
top = pop->next;
if (pop->place == pop->topermute.length () - 1)
{
std::cout << pop->topermute << std::endl;
}
for (int i = pop->place; i < pop->topermute.length (); ++i)
{
top = new State (swtch (pop->topermute, pop->place, i), pop->place + 1, i, top);
}
delete pop;
}
}
int main (int argc, char* argv[])
{
if (argc!=2)
{
std::cout<<"Proper input is 'permute string'";
return 1;
}
else
{
permute (argv[1]);
}
return 0;
}
place tells you the position at which the next character swap will happen. Each state pushed inside the for loop carries place + 1, so within that loop place behaves like a pivot, while i increments and acts as the permutator by swapping each remaining character into the pivot position.

Resizing and copying elements in a Hashtable Array

Right now I have struct IndexLocation that defines a page number pageNum and a word number wordNum on a page, and a struct IndexRecord that consists of a specific word and its locations that is a vector of IndexLocations.
In IndexRecord.h:
struct IndexLocation {
int pageNum; //1 = first page
int wordNum; //1 = first word on page
IndexLocation(int pageNumber, int wordNumber);
};
struct IndexRecord {
//indexed word
std::string word;
//list of locations it appears
std::vector<IndexLocation> locations;
IndexRecord();
//Constructor - make a new index record with no locations
explicit IndexRecord(const std::string& wordVal);
//Add an IndexLocation to the record
// Does NOT check for duplicate records
void addLocation(const IndexLocation& loc);
//Returns true if the record contains the indicated location
bool hasLocation(const IndexLocation& loc) const;
};
Then, I have a hash map, IndexMap, which stores IndexRecords using the word as the key. For example, an IndexRecord stored at bucket 3 might have the word apple and the locations (1, 2) and (2, 5).
#include "IndexRecord.h"
class IndexMap
{
private:
int numBuckets;
int keyCount;
IndexRecord* buckets;
//handle resizing the hash table into a new array with twice as many buckets
void grow();
//Get the location this key should be placed at.
// Will either contain an IndexRecord with that key or an empty IndexRecord
unsigned int getLocationFor(const std::string& key) const;
public:
//Construct HashMap with given number of buckets
IndexMap(int startingBuckets = 10);
//Destructor
~IndexMap();
//Copy constructor and assignment operators
IndexMap(const IndexMap &other);
IndexMap& operator=(const IndexMap& other);
//Returns true if the indicated key is in the map
bool contains(const std::string& key) const;
//Add indicated location to the map.
// If the key does not exist in the map, add an IndexRecord for it
// If the key does exist, add a Location to its IndexRecord
void add(const std::string& key, int pageNumber, int wordNumber);
void add2(const std::string& key, IndexLocation location);
};
Furthermore, in IndexMap.cpp, I have the add function, the add2 function, and grow function.
void IndexMap::add(const std::string &key, int pageNumber, int wordNumber) {
if (keyCount == numBuckets)
grow();
int bucketNumber = getLocationFor(key);
if (this->contains(key) == true)
buckets[bucketNumber].addLocation(IndexLocation(pageNumber, wordNumber));
else if (this->contains(key) == false) {
while (buckets[bucketNumber].word != "?") {
if (bucketNumber < numBuckets)
bucketNumber++;
else if (bucketNumber == numBuckets)
bucketNumber = 0;
}
string foo = key;
buckets[bucketNumber].word = key;
buckets[bucketNumber].addLocation(IndexLocation(pageNumber, wordNumber));
keyCount++;
}
return;
}
void IndexMap::add2(const std::string &key, IndexLocation location) {
if (keyCount > 0.7 * numBuckets)
grow();
int bucketNumber = getLocationFor(key);
if (this->contains(key) == true)
buckets[bucketNumber].addLocation(location);
else if (this->contains(key) == false) {
while (buckets[bucketNumber].word != "?") {
if (bucketNumber < numBuckets)
bucketNumber++;
else if (bucketNumber == numBuckets)
bucketNumber = 0;
}
string foo = key;
buckets[bucketNumber].word = key;
buckets[bucketNumber].addLocation(location);
keyCount++;
}
return;
}
void IndexMap::grow() {
IndexRecord* oldTable = buckets;
int oldSize = numBuckets;
numBuckets = numBuckets * 2 + 1;
IndexRecord* newArray = new IndexRecord[numBuckets];
keyCount = 0;
for (int i = 0; i < oldSize; i++) {
if (oldTable[i].word != "?") {
this->add2(oldTable[i].word, oldTable[i].locations[i]); // having trouble here
}
}
buckets = newArray;
delete [] oldTable;
}
My issue begins here. I believe my basic logic is sound: keep a pointer to the old array, make a new, larger one and reset the size of the hash table, iterate through the old array and add everything it contains back into the hash table with the add function, and then delete the old array. However, this just results in a segmentation fault (SIGSEGV) once keyCount hits numBuckets. (I have an add2 function that is almost identical to add, and use it in grow, because I didn't know how to get a pageNumber and a wordNumber for the this->add2 line within grow; the assignment specification says we cannot modify the original add function's header.)
In grow, you do not assign the new array to buckets until after the re-insertion loop. While the loop runs, add2 still reads and writes the old array through buckets, even though numBuckets has already been increased, so the enlarged array is not accessible to your other functions when they need it. Assign buckets = newArray before calling add2.
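The ordering problem can be sketched with a stripped-down map. This is an illustration under stated assumptions and not the assignment's required design: it hashes with std::hash, the helpers slotFor, locationCount and size are hypothetical, and grow moves whole records into the new array instead of calling add2, so every location of a word survives the resize (the original loop passes only locations[i]).

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct IndexLocation {
    int pageNum;
    int wordNum;
};

struct IndexRecord {
    std::string word = "?";  // "?" marks an empty bucket, as in the question
    std::vector<IndexLocation> locations;
};

class IndexMap {
    int numBuckets;
    int keyCount = 0;
    IndexRecord* buckets;

    // Linear probing: returns the bucket holding key, or the first empty one.
    int slotFor(const std::string& key) const {
        int slot = static_cast<int>(std::hash<std::string>{}(key) % numBuckets);
        while (buckets[slot].word != "?" && buckets[slot].word != key)
            slot = (slot + 1) % numBuckets;
        return slot;
    }

    void grow() {
        IndexRecord* oldTable = buckets;
        int oldSize = numBuckets;
        numBuckets = numBuckets * 2 + 1;
        buckets = new IndexRecord[numBuckets];  // install the new array first
        keyCount = 0;
        for (int i = 0; i < oldSize; i++) {
            if (oldTable[i].word == "?") continue;
            // Move the whole record so every location survives the resize.
            int slot = slotFor(oldTable[i].word);
            buckets[slot] = std::move(oldTable[i]);
            keyCount++;
        }
        delete[] oldTable;
    }

public:
    explicit IndexMap(int startingBuckets = 10)
        : numBuckets(startingBuckets), buckets(new IndexRecord[startingBuckets]) {}
    ~IndexMap() { delete[] buckets; }
    IndexMap(const IndexMap&) = delete;
    IndexMap& operator=(const IndexMap&) = delete;

    void add(const std::string& key, int pageNumber, int wordNumber) {
        if (keyCount + 1 > 0.7 * numBuckets) grow();  // keep at least one bucket empty
        int slot = slotFor(key);
        if (buckets[slot].word == "?") {
            buckets[slot].word = key;
            keyCount++;
        }
        buckets[slot].locations.push_back(IndexLocation{pageNumber, wordNumber});
    }

    // Number of locations stored for key (0 if the key is absent).
    std::size_t locationCount(const std::string& key) const {
        int slot = slotFor(key);
        return buckets[slot].word == key ? buckets[slot].locations.size() : 0;
    }
    int size() const { return keyCount; }
};
```

Because buckets already points at the enlarged array when slotFor runs inside grow, the modulus by the new numBuckets and the array being indexed agree. Growing before the table is full also guarantees that the probing loop always finds an empty bucket.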

Printing all the words from a prefixtree in order

I've set up a program that takes user input to create a prefix tree. Each character is a node, and the nodes are linked together. I have a "print" command that prints the words as follows when the user gives the input cat, car, sat, saw:
ca(R,T),sa(T,W).
I'm trying to create two functions that will instead print the user's words in alphabetical order. printAllWords() will do most of the work; I'm thinking of making it a recursive function that builds each word in a global string of some sort using push_back(), then removes the current word with pop_back() and moves on to the next. The second function, printWordList(), would call printAllWords() and simply print the resulting list of words.
I've started with some code, trying to get there slowly, but at the moment when I use the command "list" (the command for the new functions) my code only gives me the parent nodes C and S, as follows: cs.
How can I get past the first node of each word and, for a start, print the first word in the prefix tree, "cat"?
My Header File:
#ifndef __PREFIX_TREE_H
#define __PREFIX_TREE_H
#include <iostream>
using namespace std;
const int ALPHABET_SIZE = 26;
class PrefixTreeNode;
/*
Prefix tree
Stores a collection of strings as a tree
*/
class PrefixTree
{
private:
PrefixTreeNode* root;
public:
//Constructs an empty prefix tree
PrefixTree();
//Copy constructor
PrefixTree(const PrefixTree&);
//Copy assignment
const PrefixTree& operator=(const PrefixTree&);
//Utility func: checks whether all characters in str are letters
bool isAllLetters(const string&) const;
//Returns the root of the prefix tree
PrefixTreeNode* getRoot() { return root; };
//Returns the root of the prefix tree
const PrefixTreeNode* getRoot() const { return root; };
//Returns whether or not the given word belongs to the prefixtree
bool contains(const string&) const;
//Adds the given word to the prefix tree
void addWord(const string&);
//Prints all of the words in the prefix tree
void printWordList() const;
//Destructor
~PrefixTree();
};
/*
Node of a prefix tree
*/
class PrefixTreeNode
{
friend PrefixTree;
private:
char c;
bool final;
PrefixTreeNode* link[ALPHABET_SIZE];
public:
//Constructs a new node
PrefixTreeNode();
//Copy constructor
PrefixTreeNode(const PrefixTreeNode&);
//Copy assignment
const PrefixTreeNode& operator=(const PrefixTreeNode&);
//Returns the character this node contains
char getChar() const { return c; }
//Returns whether this node is the end of a word
bool isFinal() const { return final; }
//Changes whether this node is the end of a word
void setFinal(bool b) { final = b; }
//Returns the node corresponding to the given character
PrefixTreeNode* getChild(char);
//Returns the node corresponding to the given character
const PrefixTreeNode* getChild(char) const;
//Adds a child corresponding to the given character
void addChild(char);
//Removes the child corresponding to the given character
void deleteChild(char);
//print all words that end at or below this PrefixTreeNode
void printAllWords() const;
//Destructor
~PrefixTreeNode();
};
ostream& operator<<(ostream&, const PrefixTree&);
ostream& operator<<(ostream&, const PrefixTreeNode&);
#endif
My Source File functions:
void PrefixTreeNode::printAllWords() const{
for (char c = 'a'; c < 'z' + 1; c++)
{
if (this->getChild(c) == nullptr)
continue;
this->getChild(c);
cout << c;
}
}
//Calls all words
void PrefixTree::printWordList() const{
PrefixTreeNode* node = root;
node->printAllWords();
}
PrefixTreeNode* PrefixTreeNode::getChild(char c)
{
if (isalpha(c))
return link[tolower(c)-'a'];
else
return nullptr;
}
void PrefixTree::addWord(const string& str)
{
PrefixTreeNode* node = root;
for (int i = 0; i < str.size(); i++)
{
if (node->getChild(str[i]) == nullptr)
node->addChild(str[i]);
node = node->getChild(str[i]);
}
node->setFinal(true);
}
We use recursion to print all the strings stored in the tree in order. Call the function from main using printAllWords(root, ""). If root points to nullptr, we return. If root->final is true, we print the word. Then we append the current character to word, loop through all its children, and call printAllWords for each of them.
The same will happen for every node.
void printAllWords(Node* current, std::string word)
{
if (current == nullptr)
return;
if (current->final)
std::cout << (word+current->c) << std::endl;
for (int i = 0; i < ALPHABET_SIZE; ++i)
printAllWords(current->link[i], word + current->c);
}
Edit: I must confess I'm not sure what c is for in the tree node. If the trie is built so that a non-null child at index 1 (the 'b' slot) of the current node means that b is part of the path of one or more words, then the character is already implied by the child's index. The following code should make this clear:
void printAllWords(Node* current, std::string word)
{
    if (current == nullptr)
        return;
    if (current->final)
        std::cout << word << std::endl;
    for (int i = 0; i < ALPHABET_SIZE; ++i)
        printAllWords(current->link[i], word + (char)(i + 'a'));
}
// Entry point: declared after the helper so the two-argument overload is visible.
void printAllWords(Node* root)
{
    std::string word = "";
    for (int i = 0; i < ALPHABET_SIZE; ++i)
        printAllWords(root->link[i], word + (char)(i + 'a'));
}
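The push_back/pop_back idea from the question and the index-implies-character idea from the answer can be combined. The following is a minimal, self-contained sketch; Node here is a simplified stand-in for PrefixTreeNode, and collectWords and wordList are hypothetical names. It assumes words contain only lowercase a-z, and it collects the words into a vector so the result can be printed or inspected.

```cpp
#include <string>
#include <vector>

const int ALPHABET_SIZE = 26;

// Minimal stand-in for PrefixTreeNode: the character is implied by the index.
struct Node {
    bool final = false;
    Node* link[ALPHABET_SIZE] = {};
    ~Node() {
        for (Node* child : link) delete child;
    }
};

// Adds a lowercase a-z word below root.
void addWord(Node* root, const std::string& str) {
    Node* node = root;
    for (char ch : str) {
        int index = ch - 'a';
        if (node->link[index] == nullptr)
            node->link[index] = new Node();
        node = node->link[index];
    }
    node->final = true;
}

// Depth-first walk; visiting children from 'a' to 'z' yields sorted output.
void collectWords(const Node* current, std::string& word, std::vector<std::string>& out) {
    if (current == nullptr) return;
    if (current->final) out.push_back(word);
    for (int i = 0; i < ALPHABET_SIZE; ++i) {
        word.push_back(static_cast<char>('a' + i));  // extend the current path
        collectWords(current->link[i], word, out);
        word.pop_back();                             // backtrack
    }
}

// Returns every word in the tree in alphabetical order.
std::vector<std::string> wordList(const Node* root) {
    std::vector<std::string> out;
    std::string word;
    collectWords(root, word, out);
    return out;
}
```

Passing the working string by reference avoids a global and avoids copying it at every level. For the input cat, car, sat, saw, the walk visits car before cat because 'r' comes before 't'.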

Properly exiting out of recursions?

TrieNode and Trie Object:
struct TrieNode {
char nodeChar = NULL;
map<char, TrieNode> children;
TrieNode() {}
TrieNode(char c) { nodeChar = c; }
};
struct Trie {
TrieNode *root = new TrieNode();
typedef pair<char, TrieNode> letter;
typedef map<char, TrieNode>::iterator it;
Trie(vector<string> dictionary) {
for (int i = 0; i < dictionary.size(); i++) {
insert(dictionary[i]);
}
}
void insert(string toInsert) {
TrieNode * curr = root;
int increment = 0;
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end()) { //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
//when it doesn't exist we know that this will be a new branch
for (int i = increment; i < toInsert.length(); i++) {
TrieNode temp(toInsert[i]);
curr->children.insert(letter(toInsert[i], temp));
curr = &(curr->children.find(toInsert[i])->second);
if (i == toInsert.length() - 1) {
temp.nodeChar = NULL;
curr->children.insert(letter(NULL, temp));
}
}
}
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
}
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
if (curr->nodeChar == NULL) {
list.push_back(prefix);
return;
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
};
The problem is this function:
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
/*if children of TrieNode contains NULL char, it means this branch up to this point is a complete word*/
if (curr->nodeChar == NULL) {
list.push_back(prefix);
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
The purpose is to return all words with the same prefix from a trie using DFS. I manage to retrieve all the necessary strings but I can't exit out of the recursion.
The code completes the last iteration of the if statement and breaks. Visual Studio doesn't return any error code.
The typical end to a recursion is just as you said: return all words. A standard recursion looks something like this:
returnType function(params...){
    //Do stuff
    if(need to recurse){
        return function(next params...);
    }else{ //This should be your defined base case
        return base-case;
    }
}
The issue is that your recursive function never returns anything: it either executes the push_back or calls itself again. Neither path explicitly exits, so it will either end quietly (with an implicit return of nothing) or keep recursing.
In your situation, you likely need to store the results of the recursion in an intermediate structure such as a list, and then return that list after iterating (since it's a tree search, it ought to check all the children, not return only the first one).
On that note, you seem to be missing part of the point of recursion. It exists to serve a purpose: break a problem down into pieces until those pieces are trivial to solve, then return that case and build back up to a full solution. Any tree search must follow this basic structure, or you may miss something, like forgetting to return your results.
Check the integrity of your Trie structure. The function appears to be correct. The reason why it wouldn't terminate is if one or more of your leaf nodes doesn't have curr->nodeChar == NULL.
Another possibility is that some node (leaf or non-leaf) has a garbage child node. The recursion would then wander into garbage values with no reason to stop. Running in debug mode should halt execution with a segmentation fault.
Write another function to test if all leaf-nodes have NULL termination.
EDIT:
After posting the code, the original poster has already pointed out that the problem was that he/she was not returning the list of strings.
Apart from that, there are a few more suggestions I would like to provide based on the code:
How does this while loop terminate if the toInsert string is already in the Trie?
You will overrun the toInsert string and read a garbage character.
It will exit after that, but reading beyond your string is a bad way to program.
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end())
{ //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
This can be written as follows:
while (increment < toInsert.length() &&
curr->children.find(toInsert[increment]) != curr->children.end())
Also,
Trie( vector<string> dictionary)
should be
Trie( const vector<string>& dictionary )
because dictionary can be a large object. If you don't pass by reference, it will create a second copy. This is not efficient.
I am an idiot. I forgot to return list in the first findPre() function.
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
return list; //<----- this thing
}
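For comparison, here is a compact sketch of the same prefix search with the return in place. It is an illustration under assumptions and not a drop-in replacement: it marks word ends with a hypothetical isWord flag instead of a NULL-keyed child, stores children through std::unique_ptr, and uses a private helper named collect.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

struct TrieNode {
    bool isWord = false;  // assumption: a flag replaces the NULL-keyed child
    std::map<char, std::unique_ptr<TrieNode>> children;
};

struct Trie {
    TrieNode root;

    void insert(const std::string& word) {
        TrieNode* curr = &root;
        for (char c : word) {
            std::unique_ptr<TrieNode>& child = curr->children[c];
            if (!child) child.reset(new TrieNode());  // create the branch on demand
            curr = child.get();
        }
        curr->isWord = true;
    }

    // Returns every stored word that starts with pre, in sorted order.
    std::vector<std::string> findPre(const std::string& pre) const {
        std::vector<std::string> list;
        const TrieNode* curr = &root;
        for (char c : pre) {
            auto it = curr->children.find(c);
            if (it == curr->children.end())
                return list;  // prefix not present
            curr = it->second.get();
        }
        collect(*curr, pre, list);
        return list;  // the return that was missing in the question
    }

private:
    // Depth-first search below node; prefix is the path taken so far.
    static void collect(const TrieNode& node, const std::string& prefix,
                        std::vector<std::string>& list) {
        if (node.isWord)
            list.push_back(prefix);
        for (const auto& child : node.children)
            collect(*child.second, prefix + child.first, list);
    }
};
```

Iterating over the characters of the word with a range-based for also removes the overrun discussed above, because the loop cannot step past the end of the string.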

C++ malloc() memory corruption(fast)

I am fairly new to programming and am having memory issues with my program. Somewhere I am overusing memory, but I can't find the source. I don't understand why it is giving me issues with malloc allocation, as I don't dynamically allocate any variables. Thanks
//returns the index of the character in the string
int find(string line, int begin, int end, char character) {
for (int i = begin; i <= end; i++) {
if (line[i] == character) {
return i;
}
}
//return -1 if not found
return -1;
}
//Get the characters from levelorder that align with inorder
char* getCharacters(char inOrder[], char levelOrder[], int a, int b) {
char *newLevelOrder = new char[a];
int j = 0;
for (int i = 0; i <= b; i++)
if (find(inOrder, 0, a-1, levelOrder[i]) != -1)
newLevelOrder[j] = levelOrder[i], j++;
return newLevelOrder;
}
//creates a new Node given a character
Node* newNode(char character) {
Node *node = new Node;
node->character = character;
node->left = NULL;
node->right = NULL;
return node;
}
//creates the huffman tree from inorder and levelorder
Node* createInLevelTree(char inOrder[], char levelOrder[], int beginning, int end, int size) {
//if start index is out of range
if (beginning > end) {
return NULL;
}
//the head of the tree is the 1st item in level order's traversal
Node *head = newNode(levelOrder[0]);
//if there are no children we can't go farther down
if (beginning == end) {
return head;
}
//get the index of the node
int index = find(inOrder, beginning, end, head->character);
//get the subtree on the left
char *leftTree = getCharacters(inOrder, levelOrder, index, size);
//get the subtree on the right
char *rightTree = getCharacters(inOrder + index + 1, levelOrder, size-index-1, size);
//branch off to the left and right
head->left = createInLevelTree(inOrder, leftTree, beginning, index-1, size);
head->right = createInLevelTree(inOrder, rightTree, index+1, end, size);
//delete
delete [] leftTree;
delete [] rightTree;
return head;
}
Fixed with this line. Thanks Sam.
char *newLevelOrder = new char[b];
Somewhere I am overusing memory, but can't find the source.
I'd suggest you at least replace your character arrays with std::vector<char> or std::string and put some size assertions in, or use the at member function so that any over-indexing is caught. Furthermore, operator new is more than likely implemented in terms of malloc, and operator delete in terms of free. Therefore you are allocating dynamically.
Also, look up RAII on the wiki. Try to employ RAII for dynamically allocated memory ... always. std::vector and std::string give you this for free.
Also, consider the code below:
char* getCharacters(char inOrder[], char levelOrder[], int a, int b) {
char *newLevelOrder = new char[a];
int j = 0;
for (int i = 0; i <= b; i++)
if (find(inOrder, 0, a-1, levelOrder[i]) != -1)
newLevelOrder[j] = levelOrder[i], j++;
return newLevelOrder;
}
Reading this, I'm not sure what the value of b is. No restriction is imposed at the call site. How do I know that the for loop won't invoke undefined behavior (by over-indexing)? Typically a correct for loop would use "a" here, as "a" was used to create the array... If you want to code like this, use asserts liberally, as you are making assumptions about the calling code (but just use a vector....).
char *newLevelOrder = new char[a];
int j = 0;
for (int i = 0; (i < a) && (i <= b); i++)
{
or
assert (b < a);
char *newLevelOrder = new char[a];
int j = 0;
for (int i = 0; (i <= b); i++)
{
I leave the task of replacing your arrays with vectors and strings as an exercise for you, as well as liberally spraying asserts into the for loops mentioned... That will likely solve your problems.
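As a starting point for that exercise, here is one way getCharacters might look with std::string. This is a sketch under the assumption that the function only needs to keep the level-order characters that also occur in the given in-order range; the two-parameter signature is hypothetical and replaces the raw arrays and the a/b size arguments.

```cpp
#include <string>

// Keeps the characters of levelOrder that also occur in inOrder,
// preserving their level-order sequence. std::string tracks its own
// size, so no separate length parameters are needed and the result
// grows as required instead of being sized up front.
std::string getCharacters(const std::string& inOrder, const std::string& levelOrder) {
    std::string newLevelOrder;
    newLevelOrder.reserve(levelOrder.size());
    for (char ch : levelOrder) {
        if (inOrder.find(ch) != std::string::npos)
            newLevelOrder.push_back(ch);
    }
    return newLevelOrder;
}
```

The caller would pass the relevant slice of the in-order traversal, for example inOrder.substr(0, index) for the left subtree, and no delete [] is needed afterwards because the string releases its own memory.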