Trie Tree Initialization - c++

I am using C++ to construct a trie from a dictionary of words. This is how I defined my TrieNode and built the tree:
struct TrieNode {
    TrieNode *children[26];
    bool isWord;
    TrieNode(): isWord(false) {} /* to be deleted */
};
void buildTree(TrieNode *root, string word) {
    TrieNode *cur = root;
    for (char c : word) {
        if (!cur->children[c-'a'])
            cur->children[c-'a'] = new TrieNode();
        cur = cur->children[c-'a'];
    }
    cur->isWord = true;
}
This works fine on some compilers, but on others it produces strange results. For example, one time I found isWord was initialized to 152, and the whole program crashed. When I deleted the line marked above in the code, things worked again. What is going on here?
Also, what is the difference between "new TrieNode()" and "new TrieNode"? Sometimes I found they produce different results too.

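The problem with your code is that you assume the members are initialized. Unfortunately, that is not guaranteed. Your constructor initializes only isWord, so the pointers in children are not necessarily nullptr. Your code then dereferences invalid pointers, which is undefined behavior (UB): memory corruption, crashes, and so on.
This also explains the difference you noticed. Once the user-provided constructor is deleted, new TrieNode() value-initializes the object, which zeroes every member, whereas new TrieNode default-initializes it and leaves the members with indeterminate values. With the constructor in place, both forms simply call it, and children stays uninitialized either way.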
Easy solution:
Add a default initializer for your array in the class:
TrieNode *children[26]{};
My advice:
Use vectors instead of native arrays. Their default constructor ensures they start out empty.
Read this article about initialisation.
Add some bounds checking, because if any capital letters slip into your data, the index goes out of range, and that is UB again.
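The two fixes above (default member initializers and bounds checking) can be combined in a minimal sketch. The names insertWord, contains and destroy are hypothetical helpers, not part of the question's code, and the range check assumes the letters 'a' to 'z' are contiguous, as they are in ASCII.

```cpp
#include <cassert>
#include <string>

// Variant of the question's TrieNode: every member has a default member
// initializer, so both `new TrieNode` and `new TrieNode()` produce a node
// with null children and isWord == false.
struct TrieNode {
    TrieNode* children[26]{};  // all 26 pointers start as nullptr
    bool isWord = false;
};

// Hypothetical helper: inserts `word`, or returns false without touching
// the tree if any character falls outside 'a'..'z'.
bool insertWord(TrieNode* root, const std::string& word) {
    for (char c : word)
        if (c < 'a' || c > 'z') return false;  // bounds check before indexing
    TrieNode* cur = root;
    for (char c : word) {
        TrieNode*& next = cur->children[c - 'a'];
        if (!next) next = new TrieNode;
        cur = next;
    }
    cur->isWord = true;
    return true;
}

// Hypothetical helper: true only if `word` was inserted as a whole word.
bool contains(const TrieNode* root, const std::string& word) {
    const TrieNode* cur = root;
    for (char c : word) {
        if (c < 'a' || c > 'z') return false;
        cur = cur->children[c - 'a'];
        if (!cur) return false;
    }
    return cur->isWord;
}

// Frees a node and everything below it.
void destroy(TrieNode* node) {
    if (!node) return;
    for (TrieNode* child : node->children) destroy(child);
    delete node;
}
```

With this layout, a word containing a capital letter is rejected up front instead of indexing outside the array.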

Related

Why does the array length change when an array that contains a pointer is a parameter of a method?

I have a Node class. When I created an array of Node pointers (Node*) and passed it to a function, the array appeared to have a different length inside the function.
Node* hands[4];
Deal(deck,hands,4,"one-at-a-time",13);
void Deal(Node* &deck, Node* hands[], int people, std::string type, int count){
    Node*& temp = deck;
    for (int i = 0; i < count; ++i) {
        for (int j = 0; j < people; ++j) {
            append(hands[j], CopyDeck(temp));
            temp = temp->after;
        }
    }
}
When I use the CLion debugger to inspect the variables, I see that the hands array I created has these values:
hands[0] = 0x746365667265700e
hands[1] = NULL
hands[2] = NULL
hands[3] = 0x00007fc44b402430
And once it is passed to the function, inside the function hands is
*hands=0x746365667265700e
hands[1]=NULL
hands[2]=NULL
hands[3]=0x00007fc44b402430
hands[4]=0x00007fc44b402570
What does "*hands" stand for? And why are the initial values in hands not NULL? The smallest example I can come up with is something like:
class Node{};
void test(Node* list[]){}
int main(int argc, char* argv[]){
Node * temp[4];
test(temp);
}
But that works. I have also written the same code in other files, and there it works as I expected.
The deck is simply a doubly-linked list of Node. Node has a member "after" that points to the next Node. My debugger told me that before
Node* &temp = deck;
the parameter "hands" has already become a 5-element array.
I think I found a possible reason, but I can't understand the connection. There are two test functions called from my main function. The first one is called "SortingTest" and the second one is "DealingTest". When I comment the first test out, my DealingTest works properly, but after I uncomment it, DealingTest doesn't work. After SortingTest ends, nothing it created is left in main. Can anyone explain this to me? Thank you all. Or maybe my clear function is wrong, so it does not free the memory correctly?
void DeleteAllCards(Node* root){
    Node *current, *next;
    current = root;
    while (current != nullptr){
        next = current->after;
        delete current;
        current = next;
    }
}
The array you created is a C-style array with a fixed size of 4 elements. In your case, the element type is pointer to Node.
Unlike arrays in many other popular languages, C-style arrays with automatic storage duration are not initialized with default values. Therefore, the pointer values you are seeing in hands are either valid pointers to a Node (or a derived type) or garbage memory addresses, with some exceptions to this rule (see below for the edge cases defined by the Standard). The entries shown as NULL simply hold the address 0x0000...
Update: edited to reflect a comment made by @AlgirdasPreidZius.
In C and C++, there is a standard rule that a C-style array with static or thread storage duration is populated with default (zero) values. C++ standard section 6.8.3.2.2 ([basic.start.static]): "If constant initialization is not performed, a variable with static storage duration or thread storage duration is zero-initialized."
As to why your array has those values in them from the function provided, we need more context. A reproducible example is always the best.
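Your nested loop, judging by the arguments passed in, runs count * people = 13 * 4 = 52 iterations. Note that deck is passed by reference and temp is a reference to deck, so temp = temp->after advances the caller's deck pointer itself. The hands parameter, on the other hand, decays to a plain Node** inside Deal. The function no longer knows that the array has 4 elements, which is why the debugger shows *hands and will happily display hands[4], a slot past the end of the array. Because hands is never initialized, its contents are whatever was left on the stack by earlier calls, which would likely explain why running SortingTest first changes the behavior.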
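The two points above (no default initialization, and decay to a pointer when passed to a function) can be shown in a minimal sketch. The Node type here is a stand-in for the question's class, and countNullRaw and countNullArray are hypothetical helper names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for the question's Node; only the link member matters here.
struct Node {
    Node* after = nullptr;
};

// `Node* hands[]` as a parameter is really `Node**`: the length is not part
// of the type, so the caller has to pass it separately.
std::size_t countNullRaw(Node* hands[], std::size_t n) {
    std::size_t nulls = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (hands[i] == nullptr) ++nulls;
    return nulls;
}

// std::array keeps its length in the type, so nothing decays.
std::size_t countNullArray(const std::array<Node*, 4>& hands) {
    std::size_t nulls = 0;
    for (Node* h : hands)
        if (h == nullptr) ++nulls;
    return nulls;
}
```

At the call site, writing Node* hands[4] = {}; (or std::array<Node*, 4> hands{};) value-initializes every element to a null pointer, so the array no longer depends on leftover stack contents.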

How do you create a pointer-based binary tree without using dynamic memory allocation?

Some C++ programmers say that dynamic memory allocation is bad and should be avoided whenever possible. I tried making a binary tree data structure without using dynamic memory allocation, and it doesn't work. Here's what I tried:
struct BTNode {
    BTNode *left = 0, *right = 0;
    int data;
    BTNode(int d_) { data = d_; }
    void insert(int d_) {
        BTNode n(d_);
        if (d_ <= data)
            if (left == 0) left = &n;
            else left->insert(d_);
        else
            if (right == 0) right = &n;
            else right->insert(d_);
    }
};
And then doing this in main...
BTNode root(8);
root.insert(9);
root.insert(10);
cout << root.right->right->data;
results in a segfault, because the BTNode containing the data went out of scope a long time ago.
My question is, how is one supposed to structure a pointer-based binary tree like this without the use of new and delete?
The short answer is: you pretty much can't.
The only possible way is for the entire tree to live in either automatic or global scope and to be constructed manually:
BTNode root(8);            // BTNode has no default constructor; the values are arbitrary
BTNode left(4), right(9);
root.left=&left;
root.right=&right;
But either the whole thing gets destroyed when the automatic scope is left, or you now have a bunch of ugly globals.
There's nothing wrong with dynamic storage duration and dynamic memory allocation, provided that they are used correctly.
You can have all your nodes in a single std::array<>. They can freely point to each other, and when you release your array, all your nodes are safely released as well. You just have to keep track of which elements in your array have already been used.
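A minimal sketch of the single-std::array idea, assuming a fixed capacity chosen at compile time. PoolTree, its members, and the simplified BTNode (given a default state so it can sit in an array) are illustrative names, not code from the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Simplified node that can be default-constructed inside an array.
struct BTNode {
    BTNode* left = nullptr;
    BTNode* right = nullptr;
    int data = 0;
};

// Hypothetical fixed-capacity tree: all nodes live in `pool`, and `used`
// records how many slots have been handed out so far. No new/delete.
template <std::size_t Capacity>
class PoolTree {
public:
    PoolTree() = default;
    // Copying would leave the copy's pointers aimed at the original pool.
    PoolTree(const PoolTree&) = delete;
    PoolTree& operator=(const PoolTree&) = delete;

    // Inserts `value`; returns false when the pool is exhausted.
    bool insert(int value) {
        if (used == Capacity) return false;
        BTNode* fresh = &pool[used++];
        fresh->data = value;
        if (!root) { root = fresh; return true; }
        BTNode* cur = root;
        for (;;) {
            // Same ordering rule as the question: <= goes left.
            BTNode*& next = (value <= cur->data) ? cur->left : cur->right;
            if (!next) { next = fresh; return true; }
            cur = next;
        }
    }

    const BTNode* top() const { return root; }

private:
    std::array<BTNode, Capacity> pool{};
    std::size_t used = 0;
    BTNode* root = nullptr;
};
```

Declared as, say, PoolTree<16> tree; in main, the whole tree has automatic storage and disappears in one piece when tree goes out of scope. The trade-off is the hard capacity limit.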
Anyway, please only implement your own trees and similar containers for educational reasons, or if you have very, very, very good reasons not to use the ones provided by the standard library. In the latter case, mimic the standard interface as closely as possible in order to enable standard algorithms, range-based for loops, and code that other C++ developers can easily understand.
The insert method allocates the BTNode object on the stack, which means the object's memory is invalid once insert returns.
You could do
BTNode newObj( 9 );
root.insert( newObj );
You would also have to modify the insert method to
void insert(BTNode &node) { ...
In this case the newObj object remains in scope until you leave your main function.
Even better, you can give it static storage duration; then it will be around for the duration of the program.

Problems dereferencing node defined by struct

This is for a homework assignment, so explanations (and not direct code) are what I need.
We recently started learning about copy constructors, assignment operators (=), and such. In the handout we got in class, our professor showed us that if you want to deep copy pointers, you have to dereference them and copy the values directly.
Eg: (from handout)
class IntCellFixed {
public:
    IntCellFixed(int initialValue = 0) { storedValue = new int(initialValue); }
    //v This bit here v
    IntCellFixed(const IntCellFixed &source) {
        storedValue = new int();
        *storedValue = *source.storedValue;
    }
    //^ This bit here ^
    (...)
private:
    int *storedValue;
};
This makes sense to me. You have a pointer, and you want to have the value it points to be equal to the value of the pointer you're copying from. So you do * to have it change the value at the address it's pointing at, and not the address itself. That makes sense to me.
So when we went to apply that in lab, I tried a similar thing, but with linked lists instead of just pointers to an integer. Even with the TA's help (with the TA looking at and tweaking my code until it was the "correct" thing), it still did not work and just gave me a segfault. I did not have a chance to ask what exactly the code was supposed to be, and the solution hasn't been posted yet.
We're doing almost the same thing in our homework. Here is the structure of our node (in a binary search tree):
struct Node {
int data;
int count;
Node *left;
Node *right;
};
In my void BinarySearchTree::insert(Node *node,Node *parent, int value) function, I have this bit of code:
if (node == NULL) {
    node = new Node();
    *node->data = value;
    *node->count = 1;
    node->left = NULL;
    node->right = NULL;
}
When I try to do this, however, it gives me the error: invalid type argument of unary ‘*’ (have ‘int’).
When I take off the *, it runs fine, but doesn't actually save the data for the node outside of the insert function. (ie: when trying to insert multiple values, it always starts with an empty tree, even after I've supposedly inserted values already).
As far as I understand it, I WOULD need to dereference the node because I don't want the address it's pointing to to change, I want what's AT the address to change.
Am I just completely misunderstanding how/when to dereference? If I am, could somebody explain when I would need to, and why I wouldn't need to in this case? Or, if it's a syntax error, could somebody tell me what the syntax should be?
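For reference, the two symptoms described above can be reproduced and avoided in a small sketch. It is not the assignment's required signature: the parent parameter is dropped, duplicates are assumed to bump count, and destroy is a hypothetical cleanup helper. The arrow in node->data already dereferences node, so *node->data parses as *(node->data) and tries to dereference an int. Separately, a pointer parameter passed by value is a local copy, so assigning the result of new to it never reaches the caller. Taking the pointer by reference (Node*&) addresses that.

```cpp
#include <cassert>

struct Node {
    int data = 0;
    int count = 0;
    Node* left = nullptr;
    Node* right = nullptr;
};

// `node` is a reference to the caller's pointer, so `node = new Node`
// updates the caller's tree instead of a local copy of the pointer.
void insert(Node*& node, int value) {
    if (node == nullptr) {
        node = new Node;
        node->data = value;   // identical to (*node).data = value
        node->count = 1;
    } else if (value == node->data) {
        ++node->count;        // assumption: duplicates increase the count
    } else if (value < node->data) {
        insert(node->left, value);
    } else {
        insert(node->right, value);
    }
}

// Hypothetical cleanup helper: frees the whole subtree.
void destroy(Node* node) {
    if (!node) return;
    destroy(node->left);
    destroy(node->right);
    delete node;
}
```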

linked list first member variable of node structure always next node

Suppose you have a linked list of nodes as defined below:
C++ code
struct node {
node *next;
int i ;
};
Is there any benefit in making the next pointer the first member variable of the structure?
I think people do this to enable something like the following (I may be wrong here):
node n, m;
*n=&m;
If the above is right, is it correct to write code like that? What's the right way?
Is there any benefit in making the next pointer the first member variable of the structure?
A very small performance benefit is possible: loads from and stores to a member at offset zero can be encoded in shorter instructions. This applies only to classes without a virtual table, because the vtable pointer is typically a hidden first member.
If you want to prebuild a list in local or global scope, its elements can be initialized as shown below.
You can try it:
struct node {
    struct node* next;
    int i;
};
node z = {0}, c={&z}, b={&c}, a={&b};
node * stack = &a;
You can find very useful information about linked lists by searching for 'linux kernel linked list':
Linux Kernel Linked List Explained
How does the kernel implements Linked Lists?
I am currently working on my own design of general-purpose 'intrusive node' containers using C++ templates; perhaps this question might be interesting.
node n, m;
*n = &m;
is not legal code, perhaps you mean
node n, m;
n.next = &m;
but normally this would be done with dynamic allocation
node* n = new node;
n->next = new node;
because normally you would use node to create a variable-length list. Since the length of the list varies, there is no way to declare the right number of variables; instead, you must allocate the nodes dynamically.
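A minimal sketch of the dynamic approach, reusing the question's node layout. pushFront, length and freeList are hypothetical helper names, and every node created with new has to be released again.

```cpp
#include <cassert>
#include <cstddef>

struct node {
    node* next;
    int i;
};

// Allocates a node holding `value` in front of `head`; returns the new head.
node* pushFront(node* head, int value) {
    return new node{head, value};
}

// Counts the nodes by following the next pointers to the end.
std::size_t length(const node* head) {
    std::size_t n = 0;
    for (; head != nullptr; head = head->next) ++n;
    return n;
}

// Deletes every node in the list.
void freeList(node* head) {
    while (head != nullptr) {
        node* next = head->next;
        delete head;
        head = next;
    }
}
```

The list grows by one node per pushFront call, so its length can depend on run-time input, which a fixed set of node variables cannot.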

c++ outputting and inputting a single character

I am writing a program in C++ that implements a doubly-linked list holding a single character in each node. I am inserting characters via the append function:
doubly_linked_list adam;
adam.append('a');
This function is implemented as follows:
//Append node
node* append(const item c){
    //If the list is not empty...
    if(length){
        //maintain pointers to end nodes
        node* old_last_node = last;
        node* new_last_node = new node;
        //re-assign the double link and exit link
        old_last_node->next = new_last_node;
        new_last_node->back = old_last_node;
        new_last_node->next = NULL;
        //re-assign the last pointer
        last = new_last_node;
    }
    //If this is the first node
    else{
        //assign first and last to the new node
        last = first = new node;
        //assign nulls to the pointers on new node
        first->next = first->back = NULL;
    }
    //increase length and exit
    ++length;
    return last;
}
However, I think there is an issue, perhaps with the way C++ handles characters. When I go to print my list, somehow I never get the characters to print which I have appended to my list. This is what I'm using to print:
//Friendly output function
friend std::ostream& operator << (std::ostream& out_s, const doubly_linked_list& source_list){
    //create iteration node pointer
    node* traverse_position = source_list.first;
    //iterate through, reading from start
    for(int i = 1; i <= source_list.length; ++i){
        //print the character
        out_s << (traverse_position->data);
        traverse_position = traverse_position->next;
    }
    //return the output stream
    return out_s;
}
I just get crap when I print it. It prints characters that I never appended to my list - you know, just characters just from somewhere in the memory. What could possibly be causing this?
Where are you assigning the value c in the append() function? I fear you may have concentrated too much on the doubly-linked-list part and not enough on the storing-data part. :)
As others have already mentioned, you forgot to store the characters you were supposedly appending. It's a reasonable mistake to make. To avoid it in the future, you can let the compiler help you.
Most compilers offer warnings about things that are technically OK, but probably aren't what you really want to do. In your case, you declared the parameter c, but you never used it. With warnings enabled, your compiler could have noticed that and told you that you hadn't used it. That would probably have been enough of a reminder for you that you weren't finished writing that function.
GCC's option to enable common warnings is -Wall. (That's "W" for "warning," plus "all"; it has nothing to do with walls. But it's not really all warnings, either.) In particular, the unused-parameter warning is only enabled once you add -Wextra as well. For example:
g++ -Wall -Wextra list-program.cpp
Other compilers have similar options. Check your compiler's documentation for details.
Nowhere in your append method do you actually place the item into the new node. When you go to print, it just prints whatever value happens to be in that memory location (some random value).
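A minimal sketch of an append that stores the item, as the answers above describe. The question does not show the full class, so the member layout, the item typedef, the size accessor and the destructor here are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

typedef char item;  // assumption: the question's `item` is a char

class doubly_linked_list {
public:
    struct node {
        item data;
        node* next;
        node* back;
    };

    doubly_linked_list() : first(nullptr), last(nullptr), length(0) {}
    doubly_linked_list(const doubly_linked_list&) = delete;
    doubly_linked_list& operator=(const doubly_linked_list&) = delete;
    ~doubly_linked_list() {
        while (first) {
            node* next = first->next;
            delete first;
            first = next;
        }
    }

    // Appends c at the end and returns the new last node.
    node* append(const item c) {
        // The step missing from the question: c is stored in the node.
        node* fresh = new node{c, nullptr, last};
        if (last) last->next = fresh;  // non-empty list: link after old tail
        else first = fresh;            // empty list: new node is also first
        last = fresh;
        ++length;
        return last;
    }

    std::size_t size() const { return length; }

    // Prints the stored characters from first to last.
    friend std::ostream& operator<<(std::ostream& out,
                                    const doubly_linked_list& list) {
        for (const node* p = list.first; p != nullptr; p = p->next)
            out << p->data;
        return out;
    }

private:
    node* first;
    node* last;
    std::size_t length;
};
```

Initializing all three node members in one expression leaves no field with an indeterminate value, which is what produced the random characters when printing.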