Trie C++ implementation segmentation fault

I'm implementing a simple trie data structure in C++ using a struct and pointers. When I pass a string to add to the trie, it gives a segmentation fault in the addString() function.
struct node {
char ch;
node *link[26];
node() : link(){}
};
node head;
void addString(node *n, string s) {
if (!s.length()) return;
if (!n -> link[(int)s[0] - 97]) {
node m;
m.ch = s[0];
n -> link[(int)s[0] - 97] = &m;
}
addString(n -> link[(int)s[0] - 97], s.substr(1));
}
int main(){
addString(&head, "red");
return 0;
}
I tried debug statements and even printed and compared the addresses of the newly created node and the one passed recursively; they were the same.
PS I'm using head node as epsilon state.

You are using addresses of objects allocated on the stack. node m; is on the stack. It is destroyed as soon as you leave the if block in which it is declared. Yet with n -> link[(int)s[0] - 97] = &m; you store its address in a link that lives longer than m does.

n -> link[(int)s[0] - 97] = &m;
You're storing the address of m, but m is destroyed at the end of its scope.
You should redesign your project with proper memory management.

There are two problems that could explain the segmentation fault:
The first is that you add a pointer to a local object m into your array of links. As soon as m goes out of scope, the pointer dangles and using it is UB. Allocate the node properly: node *m = new node; Better: use unique_ptr instead of raw pointers.
The second is that you assume the string contains only lowercase letters between 'a' and 'z'. If the string contains anything else, you'll index out of bounds, which might corrupt memory and is UB. You should have at least an assert().
Here is a small fix to address both issues, based on your current structure and approach:
struct node {
...
node(char c=0) : ch(c), link() {} // initializers listed in declaration order (ch, then link)
~node() { for (int i=0;i<26; i++) delete link[i]; }
};
...
void addString(node *n, string s) {
if (!s.length()) return;
int c = tolower((unsigned char)s[0]); // the cast avoids UB when char is signed and negative
if (c<'a' || c>'z') return; // char not ok-> do like end of string
if (!n -> link[c-'a']) {
n -> link[c-'a'] = new node(c);
}
addString(n -> link[c-'a'], s.substr(1));
}
Note that when you use owning pointers in a struct, you have to be extra careful about the rule of three (destructor, copy constructor, copy assignment). It will not hurt here, though, as you don't copy nodes yet.
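The unique_ptr suggestion above could look roughly like the following. This is only a sketch: the names TrieNode, isWord and contains are made up for illustration and are not part of the original code, and the loop replaces the recursion with substr().

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <memory>
#include <string>

// Hypothetical node type: each child is owned by its parent through unique_ptr,
// so no hand-written destructor (and no rule-of-three worry) is needed.
struct TrieNode {
    char ch = 0;
    bool isWord = false;                             // set on the last node of an inserted string
    std::array<std::unique_ptr<TrieNode>, 26> link;  // every slot starts out null
};

// Inserts the leading run of letters of s, case-insensitively.
// A non-letter ends the insertion, like the raw-pointer fix above.
void addString(TrieNode& root, const std::string& s) {
    TrieNode* n = &root;
    for (char raw : s) {
        int c = std::tolower(static_cast<unsigned char>(raw));
        if (c < 'a' || c > 'z') break;
        std::unique_ptr<TrieNode>& slot = n->link[c - 'a'];
        if (!slot) {
            slot.reset(new TrieNode());              // heap node, owned by the parent's slot
            slot->ch = static_cast<char>(c);
        }
        n = slot.get();
    }
    if (n != &root) n->isWord = true;
}

// Returns true only if s was inserted as a whole string.
bool contains(const TrieNode& root, const std::string& s) {
    const TrieNode* n = &root;
    for (char raw : s) {
        int c = std::tolower(static_cast<unsigned char>(raw));
        if (c < 'a' || c > 'z') return false;
        n = n->link[c - 'a'].get();
        if (!n) return false;
    }
    return n != &root && n->isWord;
}
```

With a TrieNode head; used as the epsilon state, addString(head, "red") followed by contains(head, "red") should return true, and all nodes are released when head goes out of scope.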

Related

why does "a->content" give me a address instead of a value?

Now, I have been making games for a few years using the GM:S engine (though I assure you I'm not some newbie who uses drag and drop, as is all too often the case), and I have decided to start learning C++ on its own, you know, to expand my knowledge and all that good stuff =D
While doing this, I have been attempting to make a list class as a practice project: a set of nodes linked together, which you loop through to get the value at an index. Here is my code; I ask because it has a single major issue that I struggle to understand.
template<class type>
class ListNode
{
public:
type content;
ListNode<type>* next;
ListNode<type>* prev;
ListNode(type content) : content(content), next(NULL), prev(NULL) {}
protected:
private:
};
template<class type>
class List
{
public:
List() : SIZE(0), start(NULL), last(NULL) {}
unsigned int Add(type value)
{
if (this->SIZE == 0)
{
ListNode<type> a(value);
this->start = &a;
this->last = &a;
}
else
{
ListNode<type> a(value);
this->last->next = &a;
a.prev = this->last;
this->last = &a;
}
this->SIZE++;
return (this->SIZE - 1);
}
type Find(unsigned int pos)
{
ListNode<type>* a = this->start;
for(unsigned int i = 0; i<this->SIZE; i++)
{
if (i < pos)
{
a = a->next;
continue;
}
else
{
return (*a).content;
}
continue;
}
}
protected:
private:
unsigned int SIZE;
ListNode<type>* start;
ListNode<type>* last;
};
To me at least, this code looks fine, and it works in that I am able to create a new list without crashing and add elements to it, with Add returning the proper index of each element. The problem arises when getting the value of an element from the list: when I ran the following test code, it didn't give me what it was built to give me.
List<int> a;
unsigned int b = a.Add(313);
unsigned int c = a.Add(433);
print<unsigned int>(b);
print<int>(a.Find(b));
print<unsigned int>(c);
print<int>(a.Find(c));
I expected this code to give me
0
313
1
433
as that's what it has been told to do. However, it only half does this, giving me
0
2686684
1
2686584
Now I am at a loss. I assume that the values printed are some kind of pointer address, but I simply don't understand what they are meant to be or what is causing the value to become that.
Hence I ask the internet: what is causing these values to be given? I am quite confused at this point.
my apologies if that was a tad long and rambling, i tend to write such things often =D
thanks =D
You have a lot of undefined behavior in your code: you store pointers to local variables and later dereference those pointers. Local variables are destroyed once the scope they were declared in ends.
Example:
if (this->SIZE == 0)
{
ListNode<type> a(value);
this->start = &a;
this->last = &a;
}
Once the closing brace is reached, the scope of the if body ends and the variable a is destroyed. The pointer to this variable is now a so-called stray (dangling) pointer, and using it in any way will lead to undefined behavior.
The solution is to allocate the objects dynamically using new:
auto* a = new ListNode<type>(value);
Or if you don't have a C++11 capable compiler
ListNode<type>* a = new ListNode<type>(value);
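To show where that allocation fits, here is a rough sketch of the list with heap-allocated nodes. It assumes a C++11 compiler. The destructor, the deleted copy operations and the out_of_range check in Find are additions that the question's code did not have; the original Find also falls off the end without returning when pos is out of range.

```cpp
#include <cassert>
#include <stdexcept>

template <class T>
struct ListNode {
    T content;
    ListNode* next = nullptr;
    ListNode* prev = nullptr;
    explicit ListNode(const T& c) : content(c) {}
};

template <class T>
class List {
public:
    List() = default;
    // Copying is disabled so two lists can never delete the same nodes.
    List(const List&) = delete;
    List& operator=(const List&) = delete;

    // Every node was created with new, so every node is deleted here.
    ~List() {
        while (start) {
            ListNode<T>* next = start->next;
            delete start;
            start = next;
        }
    }

    // Appends value and returns its index.
    unsigned int Add(const T& value) {
        ListNode<T>* a = new ListNode<T>(value);  // lives until deleted, not until scope exit
        if (size == 0) {
            start = a;
        } else {
            last->next = a;
            a->prev = last;
        }
        last = a;
        return size++;
    }

    // Returns a copy of the element at pos; throws if pos is out of range.
    T Find(unsigned int pos) const {
        if (pos >= size) throw std::out_of_range("List::Find");
        const ListNode<T>* a = start;
        for (unsigned int i = 0; i < pos; ++i) a = a->next;
        return a->content;
    }

private:
    unsigned int size = 0;
    ListNode<T>* start = nullptr;
    ListNode<T>* last = nullptr;
};
```

With this version, the test from the question (Add(313), Add(433), then Find on both indices) should yield 313 and 433.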
First suggestion: run this program under valgrind or a similar memory checker. You will probably find many memory errors caused by dereferencing pointers to stack objects that have gone out of scope.
Second suggestion: learn about the difference between objects on the stack and objects on the heap. (Hint: you want to use heap objects here.)
Third suggestion: learn about the concept of "ownership" of pointers. Usually you want to be very clear which pointer variable should be used to delete an object. The best way to do this is to use the std::unique_ptr smart pointer. For example, you could decide that each ListNode is owned by its predecessor:
std::unique_ptr<ListNode<type>> next;
ListNode<type>* prev;
and that the List container owns the head node of the list
std::unique_ptr<ListNode<type>> start;
ListNode<type>* last;
This way the compiler will do a lot of your work for you at compile time, and you won't have to depend so much on using valgrind at runtime.
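One way the ownership scheme described above might be written out, as a sketch rather than a finished container. OwnedNode and OwnedList are illustrative names, not from the question, and the iterative destructor is an extra precaution that the answer does not mention.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

template <class T>
struct OwnedNode {
    T content;
    std::unique_ptr<OwnedNode> next;  // owning: destroying a node destroys its successor
    OwnedNode* prev = nullptr;        // non-owning back pointer
    explicit OwnedNode(T c) : content(std::move(c)) {}
};

template <class T>
class OwnedList {
public:
    // Appends value and returns its index.
    unsigned int Add(T value) {
        std::unique_ptr<OwnedNode<T>> node(new OwnedNode<T>(std::move(value)));
        OwnedNode<T>* raw = node.get();
        if (!start) {
            start = std::move(node);       // the list owns the head node
        } else {
            raw->prev = last;
            last->next = std::move(node);  // the old tail owns the new node
        }
        last = raw;
        return size++;
    }

    // Returns the element at pos; throws if pos is out of range.
    const T& Find(unsigned int pos) const {
        if (pos >= size) throw std::out_of_range("OwnedList::Find");
        const OwnedNode<T>* a = start.get();
        for (unsigned int i = 0; i < pos; ++i) a = a->next.get();
        return a->content;
    }

    // Unlink nodes one at a time so a very long list does not recurse
    // through one nested destructor call per node.
    ~OwnedList() {
        while (start) start = std::move(start->next);
    }

private:
    std::unique_ptr<OwnedNode<T>> start;
    OwnedNode<T>* last = nullptr;
    unsigned int size = 0;
};
```

Because unique_ptr cannot be copied, an accidental copy of OwnedList fails to compile instead of producing two owners of the same nodes.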

C++ object pointer scope

I started writing a binary tree and then came up with this example and I'm not sure what's going on. So here's the code:
#include<iostream>
using namespace std;
struct Node
{
Node *left, *right;
int key;
Node()
{
left = NULL;
right = NULL;
key = 0;
}
Node(int key)
{
left = NULL;
right = NULL;
key = key;
}
};
struct Tree
{
Node* root;
void Add(int k)
{
Node* t;
t->key = k;
root->left = t;
}
Tree(int key)
{
this->root = new Node(key);
}
};
int main()
{
Tree* tree = new Tree(5);
tree->Add(4);
cout<<tree->root->left->key;
return 0;
}
The Add function in Tree is what confuses me. There is a pointer to a Node object, but the new keyword is not used, and yet it appears that memory is allocated on the heap anyway, because I can reach the object. Shouldn't it go out of scope and be destroyed? And why can I reach that object and print out its key?
Probably that memory belongs to your program, and nothing bad seems to happen because you are using so little memory. If you use more memory, some other object will occupy the space t happens to point at and expect it to remain unmodified. Then this code will start giving you problems.
You are dereferencing an uninitialized pointer. Your program may crash if you do this, or it may not: the behaviour is undefined. Anything might happen, including the appearance that things are working.
Use new, like you should.
This line …
Node* t;
… is like:
Node* t = random_address;
It means that the next line …
t->key = k;
… is able to corrupt interesting memory locations.
The code is invalid. In this function
void Add(int k)
{
Node* t;
t->key = k;
root->left = t;
}
the local variable t is not initialized and has an indeterminate value. So the execution of the statement
t->key = k;
results in undefined behaviour.
You correctly pointed out that operator new must be used. For example
Node* t = new Node( k );
Nevertheless, even in this case the function is incorrect, because it has to check whether the new key is less or greater than the key of root. Depending on the condition there should be either
root->left = t;
or
root->right = t;
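Putting these points together, Add might look like the sketch below. It goes a little further than the answers: it walks down to a free slot instead of only looking at root, and it returns false for a duplicate key, which is an assumption about the desired behaviour. Note also that in the question's Node(int key) constructor, the statement key = key; assigns the parameter to itself and leaves the member unset; the initializer list here avoids that.

```cpp
#include <cassert>

struct Node {
    Node* left = nullptr;
    Node* right = nullptr;
    int key = 0;
    // Initializer list: the member key is set from the parameter k.
    explicit Node(int k) : key(k) {}
};

struct Tree {
    Node* root;

    explicit Tree(int key) : root(new Node(key)) {}
    // Copying is disabled so two trees never delete the same nodes.
    Tree(const Tree&) = delete;
    Tree& operator=(const Tree&) = delete;
    ~Tree() { destroy(root); }

    // Inserts k below the first free slot on its search path.
    // Returns false if k is already present.
    bool Add(int k) {
        Node* cur = root;
        while (true) {
            if (k == cur->key) return false;
            Node*& child = (k < cur->key) ? cur->left : cur->right;
            if (!child) {
                child = new Node(k);  // allocated with new, so it outlives this call
                return true;
            }
            cur = child;
        }
    }

private:
    // Post-order deletion of a subtree.
    static void destroy(Node* n) {
        if (!n) return;
        destroy(n->left);
        destroy(n->right);
        delete n;
    }
};
```

For example, after Tree tree(5); tree.Add(4); the expression tree.root->left->key should be 4, this time without relying on undefined behaviour.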

What causes run time error in the following program?

I am using this simple function to create a new node
node* Tree::createNewNode(int score, const char* word)
{
// Create a new node with the information available
node* n = new node;
n->left=NULL;
n->right = NULL;
n->parent = NULL;
n->score = score;
strcpy(n->word,word);
return n;
}
node is a structure:
struct node
{
int score; // the score or label of the node
char *word; // the word stored in the node
node *left; // the pointer to left child of the node
node *right; // the pointer to right child of the node
node *parent; // the pointer to parent node
};
And I am calling the createNewNode function from another function
temp = t->createNewNode(score,"");
The function runs properly for only one time and then it crashes while executing:
node* n = new node;
You need to allocate memory for the word field. You are trying to copy data into word without allocating space for it.
Change char *word to char word[100];
char *word; // this is a pointer to string, aka this is not a string
char word[100]; // this is a string
n->word is uninitialized. When you call strcpy, you are copying the contents of word to an unknown address.
This results in undefined behavior (the first call looks like it works and the second makes the program crash). You need to allocate memory to hold the word string inside the structure.
Your error is due to word not being allocated memory.
You could fix this using legacy C functionality like in the other answers, or you could actually write idiomatic C++.
All of the initialization done in the createNewNode function should be done in the node constructor. You should use std::string instead of char* to avoid memory allocation failures like you currently have. You should also protect the members of your node class, instead providing mutators to attach/detach them from the tree so you don't need to do it manually.
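A minimal sketch of that suggestion, assuming the rest of the tree code can work with std::string. Only the constructor and the std::string member are shown; protecting the members behind mutators, as the answer also recommends, is left out to keep the example short.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct node {
    int score;           // the score or label of the node
    std::string word;    // owns and sizes its own buffer, no strcpy needed
    node* left = nullptr;
    node* right = nullptr;
    node* parent = nullptr;

    // All initialization happens here instead of in createNewNode.
    node(int s, std::string w) : score(s), word(std::move(w)) {}
};

// Kept as a free function for the sketch; word must not be a null pointer.
// The caller is responsible for deleting the returned node.
node* createNewNode(int score, const char* word) {
    return new node(score, word);
}
```

Calling createNewNode(score, "") repeatedly is then safe, because each node's word manages its own storage.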
Your program crashes in the following line,
strcpy(n->word,word);
because, n->word in struct node
char *word; // the word stored in the node
was not allocated any memory.
Use a char array instead of a char pointer, or change the function definition like this:
node* createNewNode(int score, const char* word, int wordLen) // wordLen is the new parameter
{
// Create a new node with the information available
node* n = new node;
n->left=NULL;
n->right = NULL;
n->parent = NULL;
n->score = score;
n->word = (char *) malloc(wordLen + 1); // +1 leaves room for the terminating '\0' if wordLen is strlen(word)
strcpy(n->word,word);
return n;
}
strcpy(n->word, word) copies the input string into n->word, which has not been initialized. For that expression to work correctly, n->word must point to an allocated buffer.
The strdup function (POSIX) allocates that buffer for you with malloc and copies the input string into it, so release it later with free(), e.g.:
n->word = strdup(word);

C++ bad PTR in char* (Expression cannot be evaluated)

I've been searching for quite a time for an answer, although there were similar problems I still couldn't improve my code so it would work.
I have a simple LIFO structure to which I am trying to add one element and then print the structure. It prints nothing, and when I am debugging I see <bad ptr> in char * nameOfVariable.
I would appreciate any help! Here is my source code:
#include<stdio.h>
struct Variable
{
double value;
char *name;
struct Variable *next;
} *variables[80000];
void pop(Variable * head);
void push(Variable * head, char *name, double value);
void show(Variable * head);
int main(){
for(int i = 0; i <80000; i++){
variables[i] = nullptr;
}
char *nameOfVariable = "aaab";
double value = 5;
push(variables[0], nameOfVariable, value );
show(variables[0]);
system("pause");
return 0;
}
void push(Variable * head, char *name, double value)
{
Variable * p ;
p = head;
head = new Variable;
head -> name = name;
head -> value = value;
head -> next = p;
}
void pop(Variable * head)
{
Variable *p;
if (head != NULL)
{
p = head;
head = head -> next;
free(p);
}
}
void show(Variable * head)
{
Variable *p;
p = head;
while (p!=NULL){
printf("%c %f ", p->name, p->value);
p=p->next;
}
printf("\n");
}
PS - I can't use STL so string is not an option :)
You are storing a pointer into a parameter location:
void push(Variable * head, char *name, double value)
{
Variable * p ;
p = head;
head = new Variable;
But the parameter location is local to the function and discarded upon return.
Why do you allocate an array of 80000 elements?
For a function to change a location, you must either pass the address of that location (a Variable** head in your case) or use a reference.
Much better would be the definition of a class for your stack...
And another one: storing a variable's name as a bare char* will almost certainly cause trouble later on. Allocate a char[] and copy the name string into it.
You do not save the variables you create in push, so they all get lost.
void push(Variable * head, char *name, double value) {
Variable * p ;
p = head;
head = new Variable;
head -> name = name;
head -> value = value;
head -> next = p;
}
When the function is entered, head is null.
At head = new Variable; head now points to a newly created Variable on the heap.
When the function exits, nothing keeps track of the newly created Variable on the heap. The memory is leaked and there is no way to access that element.
NOTE: Changes you make to head inside push do not affect the variables[0] you passed to the function. variables[0] is a pointer to a Variable; initially it is nullptr, meaning it does not point at anything. head is a copy of variables[0], that is, a different pointer that happens to hold the same value (in your case nullptr). If you change head, it points at something else and no longer matches variables[0].
Suggested changes:
Make push a function that returns a Variable* to the caller, which is the new head.
Make push a function that accepts a Variable*& as an in/out parameter and returns the new head through it.
(My preference) Create a deque struct that holds a Variable* head member. Pass a deque* to all these functions (push/pop) and manage the memory inside them.
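As an illustration of the second option (the Variable*& parameter), here is a sketch that avoids the STL, as the question requires. Two details go beyond what the answers say: the original pop() calls free() on memory obtained with new, which should be delete, and the original show() prints a char* with %c, which should be %s. Copying the name into an owned buffer is a design assumption, not something the question asked for.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

struct Variable {
    double value;
    char* name;      // owned copy of the name
    Variable* next;
};

// head is taken by reference, so the caller's pointer is the one that changes.
void push(Variable*& head, const char* name, double value) {
    Variable* v = new Variable;
    v->name = new char[std::strlen(name) + 1];  // +1 for the terminating '\0'
    std::strcpy(v->name, name);
    v->value = value;
    v->next = head;
    head = v;
}

// Removes the top element, if any, and releases its memory.
void pop(Variable*& head) {
    if (head == nullptr) return;
    Variable* p = head;
    head = head->next;
    delete[] p->name;  // matches new[]
    delete p;          // matches new (free() would be wrong here)
}

// Prints the stack from top to bottom.
void show(const Variable* head) {
    for (const Variable* p = head; p != nullptr; p = p->next)
        std::printf("%s %f ", p->name, p->value);  // %s for a C string
    std::printf("\n");
}
```

With these signatures, push(variables[0], "aaab", 5) updates variables[0] itself, so a following show(variables[0]) has something to print.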

Pointers and reference issue

I'm creating something similar to structure list. At the beginning of main I declare a null pointer. Then I call insert() function a couple of times, passing reference to that pointer, to add new elements.
However, something seems to be wrong. I can't display the list's element; std::cout just breaks the program, even though it compiles without a warning.
#include <iostream>
struct node {
node *p, *left, *right;
int key;
};
void insert(node *&root, const int key)
{
node newElement = {};
newElement.key = key;
node *y = NULL;
std::cout << root->key; // this line
while(root)
{
if(key == root->key) exit(EXIT_FAILURE);
y = root;
root = (key < root->key) ? root->left : root->right;
}
newElement.p = y;
if(!y) root = &newElement;
else if(key < y->key) y->left = &newElement;
else y->right = &newElement;
}
int main()
{
node *root = NULL;
insert(root, 5);
std::cout << root->key; // works perfectly if I delete cout in insert()
insert(root, 2);
std::cout << root->key; // program breaks before this line
return 0;
}
As you can see, I create a new structure element in the insert function and save it in the root pointer. In the first call, the while loop isn't even entered, so it works, and I'm able to display root's element in the main function.
But in the second call, the while loop does run, and I get the problem I described.
There's something wrong with the root->key syntax, because it doesn't work even if I place this in the first call.
What's wrong, and what's the reason?
Also, I've always seen inserting new list's elements through pointers like this:
node newElement = new node();
newElement->key = 5;
root->next = newElement;
Is this code equal to:
node newElement = {};
newElement.key = 5;
root->next = &newElement;
? It would be a bit cleaner, and there wouldn't be need to delete memory.
The problem is because you are passing a pointer to a local variable out of a function. Dereferencing such pointers is undefined behavior. You should allocate newElement with new.
This code
node newElement = {};
creates a local variable newElement. Once the function is over, the scope of newElement ends, the object is destroyed, and its memory may be reused. However, you are passing a pointer to that destroyed object out of the function. All pointers and references to it become invalid as soon as the function exits.
This code, on the other hand
node *newElement = new node(); // Don't forget the asterisk
allocates an object on the free store. Such objects remain available until you delete them explicitly. That's why you can use them after the function creating them has exited. Of course, since newElement is a pointer, you need to use -> to access its members.
The key thing you need to learn here is the difference between stack-allocated objects and heap-allocated objects. In your insert function, node newElement = {} is stack allocated, which means that its lifetime is determined by the enclosing scope. In this case that means that when the function exits your object is destroyed. That's not what you want. You want the root of your tree to be stored in your node *root pointer. To do that you need to allocate memory from the heap. In C++ that is normally done with the new operator. That allows you to pass the pointer from one function to another without having its lifetime determined by the scope that it's in. This also means you need to be careful about managing the lifetime of heap-allocated objects.
Well, you have got one problem with your "Also" comment. The second version may be cleaner, but it is wrong. You have to allocate the memory with new and later delete it. Otherwise you end up with pointers to objects which no longer exist. That's exactly the problem that new solves.
Another problem
void insert(node *&root, const int key)
{
node newElement = {};
newElement.key = key;
node *y = NULL;
std::cout << root->key; // this line
On the first insert root is still NULL, so this line dereferences a null pointer and will likely crash the program.
It's already been explained that you would have to allocate objects dynamically (with new), however doing so is fraught with perils (memory leaks).
There are two (simple) solutions:
Have an ownership scheme.
Use an arena to put your nodes, and keep references to them.
1 Ownership scheme
In C and C++, there are two forms of obtaining memory where to store an object: automatic storage and dynamic storage. Automatic is what you use when you declare a variable within your function, for example, however such objects only live for the duration of the function (and thus you have issues when using them afterward because the memory is probably overwritten by something else). Therefore you often must use dynamic memory allocation.
The issue with dynamic memory allocation is that you have to explicitly give it back to the system, lest it leaks. In C this is pretty difficult and requires rigor. In C++ though it's made easier by the use of smart pointers. So let's use those!
struct Node {
Node(Node* p, int k): parent(p), key(k) {}
Node* parent;
std::unique_ptr<Node> left, right;
int key;
};
// Note: I added a *constructor* to the type to initialize `parent` and `key`
// without proper initialization they would have some garbage value.
Note the different declaration of parent and left ? A parent owns its children (unique_ptr) whereas a child just refers to its parent.
void insert(std::unique_ptr<Node>& root, const int key)
{
if (root.get() == nullptr) {
root.reset(new Node{nullptr, key});
return;
}
Node* parent = root.get();
Node* y = nullptr;
while(parent)
{
if(key == parent->key) exit(EXIT_FAILURE);
y = parent;
parent = (key < parent->key) ? parent->left.get() : parent->right.get();
}
if (key < y->key) { y->left.reset(new Node{y, key}); }
else { y->right.reset(new Node{y, key}); }
}
In case you don't know what unique_ptr is: it just contains an object allocated with new, and the get() method returns a pointer to that object. You can also reset its content (in which case it properly disposes of the object it already contained, if any).
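A small sketch of get() and reset() in isolation. Tracked and demoResetAndGet are hypothetical names used only here; the live-instance counter exists purely to make the destruction observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper type that counts live instances.
struct Tracked {
    static int alive;
    int id;
    explicit Tracked(int i) : id(i) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Walks through get() and reset(); returns the number of live objects
// just before the unique_ptr itself goes out of scope.
int demoResetAndGet() {
    std::unique_ptr<Tracked> p;          // empty: get() returns nullptr
    assert(p.get() == nullptr);

    p.reset(new Tracked(1));             // p now owns object 1
    Tracked* observer = p.get();         // non-owning view, must never be deleted by hand
    assert(observer->id == 1 && Tracked::alive == 1);

    p.reset(new Tracked(2));             // object 1 is destroyed, p owns object 2
    // observer now dangles and must not be used any more.
    assert(p->id == 2 && Tracked::alive == 1);

    return Tracked::alive;
}   // p is destroyed here, which destroys object 2
```

After demoResetAndGet() returns, Tracked::alive is back to 0 without a single explicit delete.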
I would note I am not too sure about your algorithm, but hey, it's yours :)
2 Arena
If this dealing with memory got your head all mushy, that's pretty normal at first, and that's why sometimes arenas might be easier to use. The idea of using an arena is pretty general; instead of bothering with memory ownership on a piece by piece basis you use "something" to hold onto the memory and then only manipulate references (or pointers) to the pieces. You just have to keep in mind that those references/pointers are only ever alive as long as the arena is.
struct Node {
Node(): parent(nullptr), left(nullptr), right(nullptr), key(0) {}
Node* parent;
Node* left;
Node* right;
int key;
};
void insert(std::list<Node>& arena, Node *&root, const int key)
{
arena.push_back(Node{}); // add a new node
Node& newElement = arena.back(); // get a reference to it.
newElement.key = key;
Node *y = NULL;
Node *cur = root; // walk with a local cursor so the caller's root is not overwritten
while(cur)
{
if(key == cur->key) exit(EXIT_FAILURE);
y = cur;
cur = (key < cur->key) ? cur->left : cur->right;
}
newElement.parent = y; // the member is named parent in this Node
if(!y) root = &newElement;
else if(key < y->key) y->left = &newElement;
else y->right = &newElement;
}
Just remember two things:
as soon as your arena dies, all your references/pointers are pointing into the ether, and bad things happen should you try to use them
if you ever only push things into the arena, it'll grow until it consumes all available memory and your program crashes; at some point you need cleanup!
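One way to keep the first caveat under control is to store the arena and the root pointer in the same object, so neither can outlive the other. The sketch below assumes that design; ArenaTree is a made-up name, and returning false for a duplicate key (instead of calling exit) is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

struct Node {
    Node* parent = nullptr;
    Node* left = nullptr;
    Node* right = nullptr;
    int key = 0;
};

// Hypothetical wrapper: the arena and every pointer into it share one lifetime.
class ArenaTree {
public:
    ArenaTree() = default;
    // A copy would hold pointers into the original's arena, so copying is disabled.
    ArenaTree(const ArenaTree&) = delete;
    ArenaTree& operator=(const ArenaTree&) = delete;

    // Inserts key; returns false if it is already present.
    bool insert(int key) {
        Node* y = nullptr;
        for (Node* cur = root_; cur != nullptr; ) {
            if (key == cur->key) return false;
            y = cur;
            cur = (key < cur->key) ? cur->left : cur->right;
        }
        arena_.push_back(Node{});  // std::list never relocates existing elements
        Node& n = arena_.back();
        n.key = key;
        n.parent = y;
        if (!y) root_ = &n;
        else if (key < y->key) y->left = &n;
        else y->right = &n;
        return true;
    }

    const Node* root() const { return root_; }
    std::size_t size() const { return arena_.size(); }

private:
    std::list<Node> arena_;   // owns every node
    Node* root_ = nullptr;    // points into arena_
};
```

std::list is used because adding elements does not invalidate pointers to existing ones; a std::vector arena could move its nodes when it grows, which would break every stored pointer.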