search function causes program to crash - c++

I have been stepping through the debugger but can't pinpoint exactly what is going wrong. My own conclusion is that I must be missing a nullptr check somewhere. If anyone can provide some help, it would be greatly appreciated.
Error message from debugger:
[image: error msg]
It looks like the program crashes on this line:
if (node->children_[index] == nullptr) {
search function
Node* search(const string& word, Node* node, int index) const {
Node* temp;
//same as the recursive lookup; the only difference is that it returns the node whether it is terminal or not
if (index < word.length()) {
index = node->getIndex(word[index]);
if (node->children_[index] == nullptr) {
return nullptr;
}
else {
temp = search(word, node->children_[index], index++);
}
}
return temp; // this would give you ending node of partialWord
}
Node struct for reference
struct Node {
bool isTerminal_;
char ch_;
Node* children_[26];
Node(char c = '\0') {
isTerminal_ = false;
ch_ = c;
for (int i = 0; i < 26; i++) {
children_[i] = nullptr;
}
}
//given a lower case alphabetic character ch, returns
//the associated index 'a' --> 0, 'b' --> 1...'z' --> 25
int getIndex(char ch) {
return ch - 'a';
}
};
Node* root_;
int suggest(const string& partialWord, string suggestions[]) const {
Node* temp;
temp = search(partialWord, root_, 0);
int count = 0;
suggest(partialWord, temp, suggestions, count);
return count;
}

Might be a very simple thing. Without digging, I am not sure about the precedence of the -> operator versus the == operator. I would take a second and try putting parentheses around the left-hand side of "node->children_[index] == nullptr", like this:
(node->children_[index]) == nullptr
just to make sure that the logic runs like you seem to intend.
Dr t

I believe the root cause is that you're using index for two distinct purposes: as an index into the word you're looking for, and as an index into the node's children.
When you get to the recursion, index has changed meaning, and it's all downhill from there.
You're also passing index++ to the recursion, but the value of index++ is the value it had before the increment.
You should pass index + 1.
[An issue in a different program would be that the order of evaluation of function parameters is unspecified, and you should never both modify a variable and use it in the same parameter list. (I would go so far as to say that you should never modify anything in a parameter list, but many disagree.)
But you shouldn't use the same variable here at all, so...]
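To make the index++ point concrete, here is a minimal sketch. The helper names (received, passedByPostIncrement, passedByPlusOne) are made up for illustration and are not part of the original code. It only shows which value the callee sees in each case.

```cpp
// Hypothetical helper: simply reports the value it was called with.
int received(int value) { return value; }

// The callee is handed index++: it sees the value from BEFORE the increment.
int passedByPostIncrement(int index) { return received(index++); }

// The callee is handed index + 1: it sees the next position.
int passedByPlusOne(int index) { return received(index + 1); }
```

With an argument of 3, the first function hands 3 to the callee and the second hands 4, which is why the recursion in the question never moves on to the next character.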
I would personally restructure the code a little, something like this:
Node* search(const string& word, Node* node, int index) const {
// Return immediately on failure.
if (index >= word.length())
{
return nullptr;
}
int child_index = node->getIndex(word[index]);
// The two interesting cases: we either have this child or we don't.
if (node->children_[child_index] == nullptr) {
return nullptr;
}
else {
return search(word, node->children_[child_index], index + 1);
}
}
(Side note: returning a pointer to a non-const internal Node from a const function is questionable.)
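For completeness, here is a self-contained sketch of the same idea, under a few assumptions of mine: Node is a trimmed-down stand-in for the struct in the question, search is a free function rather than a member, and the position in the word (pos) is kept separate from the index into children_ (childIndex). One thing to watch in the version above: it returns nullptr once index reaches the end of the word, so as far as I can tell it never hands back the ending node. The sketch returns node at that point instead.

```cpp
#include <cstddef>
#include <string>

// Simplified stand-in for the Node in the question (same 26-way layout).
struct Node {
    bool isTerminal_ = false;
    Node* children_[26] = {};  // every slot starts out as nullptr

    static int getIndex(char ch) { return ch - 'a'; }
};

// Returns the node reached after consuming word[pos..], or nullptr if that
// path does not exist. Characters outside 'a'..'z' count as "not found".
const Node* search(const std::string& word, const Node* node, std::size_t pos = 0) {
    if (node == nullptr) return nullptr;         // defensive: empty trie or missing child
    if (pos == word.length()) return node;       // whole word consumed: this is the ending node
    int childIndex = Node::getIndex(word[pos]);  // index into children_, not into word
    if (childIndex < 0 || childIndex >= 26) return nullptr;
    return search(word, node->children_[childIndex], pos + 1);
}
```

Returning a const Node* also sidesteps the side note about handing out non-const internals from a const function.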

Related

C Creating a binary tree based on a sequence

I need help adjusting the createTree function.
It accepts a string and traverses it character by character, creating a binary tree based on it.
If it encounters the character 0, it recursively creates two sub-branches.
If it encounters any other character, it saves it in a leaf node.
For the string in the example, I need to build a tree as in the picture, but the function does not work properly for me. Thank you in advance for your advice.
int x = 0;
Node* createTree(string str, int si, int ei)
{
if (si > ei)
return NULL;
Node *root = new Node((str[si] - '0'));
if(str[si] != '0')
{
x++;
root->m_Data = (str[si] - '0');
return root;
}
if(str[si]=='0')
{
x++;
root->m_Left = createTree(str,x,ei);
root->m_Right = createTree(str,x,ei);
}
return root;
}
int main ()
{
string str = "050067089";
Node *node = createTree(str,0,str.length());
printPreorder(node);
return 0;
}
The problem can quite easily be broken down into small steps (which you partly did in your question):
1. Start iterating at the first character.
2. Create the root node.
3. If the current character is non-zero, set the value of this node to this character.
4. If the current character is a zero, set this node to zero, create a left and a right node, and go back to step 3 for each of them. (That's the recursive part.)
Below is my implementation of this algorithm.
First, a little bit of setting up:
#include <iostream>
#include <string>
#include <memory>
struct Node;
// Iterator to a constant character, NOT a constant iterator
using StrConstIt = std::string::const_iterator;
using UniqueNode = std::unique_ptr<Node>;
struct Node
{
int value;
UniqueNode p_left;
UniqueNode p_right;
Node(int value)
: value(value) {}
Node(int value, UniqueNode p_left, UniqueNode p_right)
: value(value), p_left(std::move(p_left)), p_right(std::move(p_right)) {}
};
As you can see, I'm using std::unique_ptr for managing memory. This way, you don't have to worry about manually deallocating memory. Using smart pointers is often considered the more "modern" approach, and they should virtually always be preferred over raw pointers.
UniqueNode p_createNodeAndUpdateIterator(StrConstIt& it, StrConstIt stringEnd)
{
if (it >= stringEnd)
return nullptr;
UniqueNode node;
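if (*it == '0')
{
// Skip the '0', then build the branches in separate statements.
// Function arguments are evaluated in an unspecified order, so passing
// both recursive calls straight to make_unique could build the right
// branch first (or recurse on the same '0' forever).
++it;
UniqueNode p_left = p_createNodeAndUpdateIterator(it, stringEnd);
UniqueNode p_right = p_createNodeAndUpdateIterator(it, stringEnd);
node = std::make_unique<Node>(0, std::move(p_left), std::move(p_right));
}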
else
{
// Create leaf node with appropriate value
node = std::make_unique<Node>(*it - '0');
// Increment iterator
++it;
}
return node;
}
UniqueNode p_createTree(StrConstIt begin, StrConstIt end)
{
return p_createNodeAndUpdateIterator(begin, end);
}
The first function takes a reference to the iterator pointing at the next character it should process. That is because you can't know beforehand how many characters a branch will consume for its leaf nodes. Therefore, as the function's name suggests, it updates the iterator as each character is processed.
I'm using iterators instead of a string and indices. They are clearer and easier to work with in my opinion — changing it back should be fairly easy anyway.
The second function is basically syntactic sugar: it is just there so that you don't have to pass an lvalue as the first argument.
You can then just call p_createTree with:
int main()
{
std::string str = "050067089";
UniqueNode p_root = p_createTree(str.begin(), str.end());
return 0;
}
I also wrote a function to print out the tree's nodes for debugging:
void printTree(const UniqueNode& p_root, int indentation = 0)
{
// Print the value of the node
for (int i(0); i < indentation; ++i)
std::cout << "| ";
std::cout << p_root->value << '\n';
// Do nothing more in case of a leaf node
if (!p_root->p_left.get() && !p_root->p_right.get())
;
// Otherwise, print a blank line for empty children
else
{
if (p_root->p_left.get())
printTree(p_root->p_left, indentation + 1);
else
std::cout << '\n';
if (p_root->p_right.get())
printTree(p_root->p_right, indentation + 1);
else
std::cout << '\n';
}
}
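The answer mentions that switching back from iterators to a string and indices should be easy. A possible sketch of that variant follows. The names createTreeAt and pos are my own, and the Node here is a reduced version of the one above. The position is passed by reference for the same reason the iterator is: the caller cannot know in advance how many characters a branch consumes.

```cpp
#include <cstddef>
#include <memory>
#include <string>

struct Node {
    int value;
    std::unique_ptr<Node> p_left;
    std::unique_ptr<Node> p_right;
    explicit Node(int value) : value(value) {}
};

// Builds the subtree that starts at str[pos] and advances pos past every
// character it consumed. Returns nullptr once the string is exhausted.
std::unique_ptr<Node> createTreeAt(const std::string& str, std::size_t& pos) {
    if (pos >= str.size()) return nullptr;
    std::unique_ptr<Node> node(new Node(str[pos] - '0'));
    const bool isBranch = (str[pos] == '0');
    ++pos;
    if (isBranch) {
        // Two separate statements: the left branch is complete before the right one starts.
        node->p_left = createTreeAt(str, pos);
        node->p_right = createTreeAt(str, pos);
    }
    return node;
}

// Wrapper so that callers do not have to supply the position variable.
std::unique_ptr<Node> createTree(const std::string& str) {
    std::size_t pos = 0;
    return createTreeAt(str, pos);
}
```

Calling createTree("050067089") should give a root 0 with the leaf 5 on the left and, on the right, a 0 node whose two 0 children hold the leaves 6, 7 and 8, 9.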
Assuming that the code which is not included in your question is correct, there is only one issue that could pose a problem if more than one tree is built. The problem is that x is a global variable which your functions change as a side-effect. But if that x is not reset before creating another tree, things will go wrong.
It is better to make x a local variable, and pass it by reference.
A minor thing: don't use NULL but nullptr.
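As a quick illustration of why nullptr is preferred, consider two overloads that differ only in taking an integer or a pointer. The function name which is hypothetical and exists only for this sketch.

```cpp
#include <string>

// Two hypothetical overloads: one takes an integer, the other a pointer.
std::string which(int) { return "int"; }
std::string which(const char*) { return "pointer"; }
```

which(nullptr) always selects the pointer overload, because nullptr has its own type that converts to pointers but not to int. which(NULL), by contrast, may pick the int overload or fail to compile as ambiguous, depending on how the implementation defines NULL.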
Below is your code with x passed by reference and the class definition included. I also include a printSideways function, which makes it easier to see that the tree has the expected shape:
#include <iostream>
#include <string>
using namespace std;
class Node {
public:
int m_Data;
Node* m_Left = nullptr;
Node* m_Right = nullptr;
Node(int v) : m_Data(v) {}
};
// Instead of si, accept x by reference:
Node* createTree(string str, int &x, int ei)
{
if (x >= ei)
return nullptr;
Node *root = new Node((str[x] - '0'));
if(str[x] != '0')
{
root->m_Data = (str[x] - '0');
x++;
return root;
}
if(str[x]=='0')
{
x++;
root->m_Left = createTree(str,x,ei);
root->m_Right = createTree(str,x,ei);
}
return root;
}
// Overload with a wrapper that defines x
Node* createTree(string str)
{
int x = 0;
return createTree(str, x, str.length());
}
// Utility function to visualise the tree with the root at the left
void printSideways(Node *node, string tab) {
if (node == nullptr) return;
printSideways(node->m_Right, tab + " ");
cout << tab << node->m_Data << "\n";
printSideways(node->m_Left, tab + " ");
}
// Wrapper for above function
void printSideways(Node *node) {
printSideways(node, "");
}
int main ()
{
string str = "050067089";
Node *node = createTree(str);
printSideways(node);
return 0;
}
So, as you see, nothing much was altered. Just si was replaced with x, which is passed around by reference, and x is defined locally in a wrapper function.
Here is the output:
   9
  0
   8
 0
   7
  0
   6
0
 5

C++ return a const pointer inside a non-const pointer function

I have written this function to find the shallowest leaf in a binary search tree. It is not the best, but it does the job. The leaf has to be returned once it has been found.
The function prototype must not be changed.
My problem is marked by a comment below.
The problem is that I am returning a const pointer from a function whose return type is a non-const pointer. I searched before posting, but all the questions I found were about member functions of classes. I have not studied classes yet, so I don't know whether the same applies to functions outside of classes. Is there any workaround for the problem?
struct Node {
int _data;
struct Node *_left;
struct Node *_right;
};
//-----------------------------------------------------------------------------------
struct Node *min_depth_leaf(const struct Node *root, int &depth) {
int left_depth;
int right_depth;
if (root == NULL) {
depth = INT32_MAX;
return NULL;
} else if (root->_left == NULL && root->_right == NULL) {
depth = 0;
return root;//<-------------- The problem lays here
} else if (root->_left != NULL || root->_right != NULL) {
struct Node *left_node = min_depth_leaf(root->_left, left_depth);
struct Node *right_node = min_depth_leaf(root->_right, right_depth);
if (right_depth < left_depth) {
right_depth += 1;
depth = right_depth;
return right_node;
} else {
left_depth += 1;
depth = left_depth;
return left_node;
}
}
return NULL;
}
Two ways can be used. The first helps keep the project well-behaved; the second risks undefined behavior, giving unstable software that can behave differently in the same situation.
The first way is to return a copy of the const Node, so the API user of min_depth_leaf can modify the returned copy without modifying the original node in the tree (the caller then owns the copy and has to delete it). The code would look like:
#include <cstdint> // INT32_MAX
#include <cstdlib> // NULL
struct Node {
int _data;
struct Node *_left;
struct Node *_right;
};
//-----------------------------------------------------------------------------------
struct Node *min_depth_leaf(const struct Node *root, int &depth) {
int left_depth;
int right_depth;
if (root == NULL) {
depth = INT32_MAX;
return NULL;
} else if (root->_left == NULL && root->_right == NULL) {
depth = 0;
// return a copy
Node * p = new Node();
p->_data=root->_data;
p->_left = root->_left;
p->_right = root->_right;
return p;
} else if (root->_left != NULL || root->_right != NULL) {
struct Node *left_node = min_depth_leaf(root->_left, left_depth);
struct Node *right_node = min_depth_leaf(root->_right, right_depth);
if (right_depth < left_depth) {
right_depth += 1;
depth = right_depth;
return right_node;
} else {
left_depth += 1;
depth = left_depth;
return left_node;
}
}
return NULL;
}
The other way (to be avoided) is to cast away the const. The cast itself compiles, but writing through the resulting pointer is undefined behavior (UB) if the node was originally defined as const, and even when it was not, the caller can now damage the tree. For example:
If the API user deletes the Node returned from min_depth_leaf, it is destroyed while its parent in the tree still points to it.
If the tree is built on the stack in a function f1() and the result of min_depth_leaf is kept after f1() returns, the pointer dangles and any access through it reads garbage.
This way uses const_cast:
return const_cast<Node *>(root); //never use this
Without changing the function's signature the only way to solve this problem is with const_cast:
return const_cast<Node*>(root);
Since your code looks like C rather than C++ to me, a C-style cast may be more appropriate:
return (struct Node*)root;
In any case changing the function signature is a way cleaner approach. If you make your function a template, it will work with both const and non-const nodes:
template<typename T> T* min_depth_leaf(T* root, int &depth)
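Here is a rough sketch of what the templated version could look like. It assumes the same Node layout as in the question and uses INT_MAX from <climits> in place of INT32_MAX. T is deduced as either Node or const Node, so the returned pointer has the same constness as the tree that was passed in.

```cpp
#include <climits>

struct Node {
    int _data;
    Node* _left;
    Node* _right;
};

// depth receives the depth of the returned leaf (0 if root itself is a leaf),
// or INT_MAX when the tree is empty.
template <typename T>
T* min_depth_leaf(T* root, int& depth) {
    if (root == nullptr) {
        depth = INT_MAX;
        return nullptr;
    }
    if (root->_left == nullptr && root->_right == nullptr) {
        depth = 0;
        return root;  // same constness as the argument, so no cast is needed
    }
    int left_depth = 0;
    int right_depth = 0;
    // Passing T explicitly keeps the constness of the caller's pointer.
    T* left_leaf = min_depth_leaf<T>(root->_left, left_depth);
    T* right_leaf = min_depth_leaf<T>(root->_right, right_depth);
    if (right_depth < left_depth) {
        depth = right_depth + 1;
        return right_leaf;
    }
    depth = left_depth + 1;
    return left_leaf;
}
```

Called with a Node*, it returns a Node*; called with a const Node*, it returns a const Node*, and the compiler rejects any attempt to modify the result.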

Properly exiting out of recursions?

TrieNode and Trie Object:
struct TrieNode {
char nodeChar = NULL;
map<char, TrieNode> children;
TrieNode() {}
TrieNode(char c) { nodeChar = c; }
};
struct Trie {
TrieNode *root = new TrieNode();
typedef pair<char, TrieNode> letter;
typedef map<char, TrieNode>::iterator it;
Trie(vector<string> dictionary) {
for (int i = 0; i < dictionary.size(); i++) {
insert(dictionary[i]);
}
}
void insert(string toInsert) {
TrieNode * curr = root;
int increment = 0;
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end()) { //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
//when it doesn't exist we know that this will be a new branch
for (int i = increment; i < toInsert.length(); i++) {
TrieNode temp(toInsert[i]);
curr->children.insert(letter(toInsert[i], temp));
curr = &(curr->children.find(toInsert[i])->second);
if (i == toInsert.length() - 1) {
temp.nodeChar = NULL;
curr->children.insert(letter(NULL, temp));
}
}
}
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
}
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
if (curr->nodeChar == NULL) {
list.push_back(prefix);
return;
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
};
The problem is this function:
void findPre(vector<string> &list, TrieNode *curr, string prefix) {
/*if children of TrieNode contains NULL char, it means this branch up to this point is a complete word*/
if (curr->nodeChar == NULL) {
list.push_back(prefix);
}
else {
prefix += curr->nodeChar;
for (it i = curr->children.begin(); i != curr->children.end(); i++) {
findPre(list, &i->second, prefix);
}
}
}
The purpose is to return all words with the same prefix from a trie using DFS. I managed to retrieve all the necessary strings, but I can't exit out of the recursion.
The code completes the last iteration of the if statement and breaks. Visual Studio doesn't return any error code.
The typical end to a recursion is just as you said- return all words. A standard recursion looks something like this:
returnType function(params...){
//Do stuff
if(need to recurse){
return function(next params...);
}else{ //This should be your defined base-case
return base-case;
}
}
The issue arises in that your recursive function can never return- it can either execute the push_back, or it can call itself again. Neither of these seems to properly exit, so it'll either end quietly (with an inferred return of nothing), or it'll keep recursing.
In your situation, you likely need to store the results from recursion in an intermediate structure like a list or such, and then return that list after iteration (since it's a tree search and ought to check all the children, not return the first one only)
On that note, you seem to be missing part of the point of recursions- they exist to fill a purpose: break down a problem into pieces until those pieces are trivial to solve. Then return that case and build back to a full solution. Any tree-searching must come from this base structure, or you may miss something- like forgetting to return your results.
Check the integrity of your Trie structure. The function appears to be correct. The reason why it wouldn't terminate is if one or more of your leaf nodes doesn't have curr->nodeChar == NULL.
Another possibility is that some node (leaf or non-leaf) has a garbage child node. The recursion then wanders into garbage values and has no reason to stop. Running in debug mode should break execution with a segmentation fault.
Write another function to test if all leaf-nodes have NULL termination.
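A possible shape for such a check is sketched below. It assumes the TrieNode layout from the question, where a child with nodeChar '\0' marks the end of a word. The function name leavesAreTerminated is my own invention.

```cpp
#include <map>

// Same layout as the TrieNode in the question: '\0' marks the end of a word.
struct TrieNode {
    char nodeChar = '\0';
    std::map<char, TrieNode> children;
    TrieNode() {}
    explicit TrieNode(char c) : nodeChar(c) {}
};

// Returns true if every node without children carries the '\0' marker,
// i.e. every branch of the trie ends in a proper terminator.
bool leavesAreTerminated(const TrieNode& node) {
    if (node.children.empty()) {
        return node.nodeChar == '\0';
    }
    for (const auto& entry : node.children) {
        if (!leavesAreTerminated(entry.second)) {
            return false;
        }
    }
    return true;
}
```

Running it on the root right after building the trie would show whether insert ever leaves a branch without its terminator.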
EDIT:
After posting the code, the original poster has already pointed out that the problem was that he/she was not returning the list of strings.
Apart from that, there are a few more suggestions I would like to provide based on the code:
How does this while loop terminate if the toInsert string is already in the Trie?
You will run past the end of the toInsert string and read a garbage character.
It will exit after that, but reading beyond your string is a bad way to program.
// while letters still exist within the trie traverse through the trie
while (curr->children.find(toInsert[increment]) != curr->children.end())
{ //letter found
curr = &(curr->children.find(toInsert[increment])->second);
increment++;
}
This can be written as follows:
while (increment < toInsert.length() &&
curr->children.find(toInsert[increment]) != curr->children.end())
Also,
Trie( vector<string> dictionary)
should be
Trie( const vector<string>& dictionary )
because dictionary can be a large object. If you don't pass by reference, it will create a second copy. This is not efficient.
I am an idiot. I forgot to return list in the first findPre() function.
vector<string> findPre(string pre) {
vector<string> list;
TrieNode * curr = root;
/*First find if the pre actually exist*/
for (int i = 0; i < pre.length(); i++) {
if (curr->children.find(pre[i]) == curr->children.end()) { //DNE
return list;
}
else {
curr = &(curr->children.find(pre[i])->second);
}
}
/*Now curr is at the end of the prefix, now we will perform a DFS*/
pre = pre.substr(0, pre.length() - 1);
findPre(list, curr, pre);
return list; //<----- this thing
}

Need to reference and update value from nested class C++

Bear with me, I'm new to C++. I'm trying to update a value which is stored in a vector, but I'm getting this error:
non-const lvalue reference to type 'Node'
I'm using a simple wrapper around std::vector so I can share methods like contains and others (similar to how the ArrayList is in Java).
#ifndef A2_NEWFRAMES_H
#define A2_NEWFRAMES_H
#include <stdexcept>
#include <vector>
using namespace std;
template <class T> class NewFrames {
public:
// truncated ...
bool contains(T data) {
for(int i = 0; i < this->vec->size(); i++) {
if(this->vec->at(i) == data) {
return true;
}
}
return false;
}
int indexOf(T data) {
for(int i = 0; i < this->vec->size(); i++) {
if(this->vec->at(i) == data) {
return i;
}
}
return -1;
}
T get(int index) {
if(index > this->vec->size()) {
throw std::out_of_range("Cannot get index that exceeds the capacity");
}
return this->vec->at(index);
}
private:
vector<T> *vec;
};
#endif // A2_NEWFRAMES_H
The class which utilizes this wrapper is defined as follows:
#include "Page.h"
#include "NewFrames.h"
class Algo {
private:
typedef struct Node {
unsigned reference:1;
int data;
unsigned long _time;
Node() { }
Node(int data) {
this->data = data;
this->reference = 0;
this->_time = (unsigned long) time(NULL);
}
} Node;
unsigned _faults;
Page page;
NewFrames<Node> *frames;
};
I'm at a point where I need to reference one of the Node objects inside of the vector, but I need to be able to change reference to a different value. From what I've found on SO, I need to do this:
const Node &n = this->frames->get(this->frames->indexOf(data));
I've tried just using:
Node n = this->frames->get(this->frames->indexOf(data));
n.reference = 1;
and then viewing the data in the debugger, but the value is not updated when I check later on. Consider this:
const int data = this->page.pages[i];
const bool contains = this->frames->contains(Node(data));
Node node = this->frames->get(index);
for(unsigned i = 0; i < this->page.pages.size(); i++) {
if(node == NULL && !contains) {
// add node
} else if(contains) {
Node n = this->frames->get(this->frames->indexOf(data));
if(n.reference == 0) {
n.reference = 1;
} else {
n.reference = 0;
}
} else {
// do other stuff
}
}
With subsequent passes of the loop, the node with that particular data value is somehow different.
But if I attempt to change n.reference, I'll get an error because const is preventing the object from changing. Is there a way I can get this node so I can change it? I'm coming from the friendly Java world where something like this would work, but I want to know/understand why this doesn't work in C++.
Node n = this->frames->get(this->frames->indexOf(data));
n.reference = 1;
This copies the Node from frames and stores the copy as the object n. Modifying the copy does not change the original node.
The simplest "fix" is to use a reference. That means changing the return type of get from T to T&, and changing the previous two lines to
Node& n = this->frames->get(this->frames->indexOf(data));
n.reference = 1;
That should get the code to work. But there is so much indirection in the code that there are likely to be other problems that haven't shown up yet. As @nwp said in a comment, using vector<T> instead of vector<T>* will save you many headaches.
And while I'm giving style advice, get rid of those this->s; they're just noise. And simplify the belt-and-suspenders validity checks: when you loop from 0 to vec.size() you don't need to check that the index is okay when you access the element; change vec.at(i) to vec[i]. And in get, note that vec.at(index) will throw an exception if index is out of bounds, so you can either skip the initial range check or keep the check (after fixing it so that it checks the actual range) and, again, use vec[index] instead of vec.at(index).
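Putting those suggestions together, a trimmed-down wrapper might look roughly like this. Frames, add and the two-field Node are hypothetical stand-ins for the classes in the question, not the original code. The sketch stores the vector by value, checks the full range in get, and returns a reference so that changes reach the stored element.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical, trimmed-down version of the NewFrames wrapper.
template <class T>
class Frames {
public:
    void add(const T& item) { vec.push_back(item); }

    // Returns a reference to the stored element; throws if index is out of range.
    T& get(std::size_t index) {
        if (index >= vec.size()) {
            throw std::out_of_range("index out of range");
        }
        return vec[index];
    }

    // Read-only access when the wrapper itself is const.
    const T& get(std::size_t index) const {
        if (index >= vec.size()) {
            throw std::out_of_range("index out of range");
        }
        return vec[index];
    }

private:
    std::vector<T> vec;  // held by value, no pointer indirection
};

// Reduced stand-in for the Node in the question.
struct Node {
    unsigned reference;
    int data;
};
```

With this, Node& n = frames.get(i); n.reference = 1; updates the element inside the vector, while Node n = frames.get(i); still makes a copy. Keep in mind that a reference into a vector stops being valid if a later add makes the vector reallocate.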

C++ vector and segmentation faults

I am working on a simple mathematical parser. Something that just reads number = 1 + 2;
I have a vector containing these tokens. They store a type and string value of the character. I am trying to step through the vector to build an AST of these tokens, and I keep getting segmentation faults, even when I am under the impression my code should prevent this from happening.
Here is the bit of code that builds the AST:
struct ASTGen
{
const vector<Token> &Tokens;
unsigned int size,
pointer;
ASTGen(const vector<Token> &t) : Tokens(t), pointer(0)
{
size = Tokens.size() - 1;
}
unsigned int next()
{
return pointer + 1;
}
Node* Statement()
{
if(next() <= size)
{
switch(Tokens[next()].type)
{
case EQUALS:
Node* n = Assignment_Expr();
return n;
}
}
advance();
}
void advance()
{
if(next() <= size) ++pointer;
}
Node* Assignment_Expr()
{
Node* lnode = new Node(Tokens[pointer], NULL, NULL);
advance();
Node* n = new Node(Tokens[pointer], lnode, Expression());
return n;
}
Node* Expression()
{
if(next() <= size)
{
advance();
if(Tokens[next()].type == SEMICOLON)
{
Node* n = new Node(Tokens[pointer], NULL, NULL);
return n;
}
if(Tokens[next()].type == PLUS)
{
Node* lnode = new Node(Tokens[pointer], NULL, NULL);
advance();
Node* n = new Node(Tokens[pointer], lnode, Expression());
return n;
}
}
}
};
...
ASTGen AST(Tokens);
Node* Tree = AST.Statement();
cout << Tree->Right->Data.svalue << endl;
I can access Tree->Data.svalue and get the = Node's token info, so I know that node is getting spawned, and I can also get Tree->Left->Data.svalue and get the variable to the left of the =
I have re-written it many times trying out different methods for stepping through the vector, but I always get a segmentation fault when I try to access the = right node (which should be the + node)
Any help would be greatly appreciated.
There's plenty more code that we haven't seen, so I can't tell you precisely what's going on, but I see a few things that are reasons for concern. One is that the Statement() method doesn't always return a value. If the first if test doesn't pass, then we call advance() and fall off the bottom of the routine without an explicit return. The caller will try to get the return value of the function but it'll get garbage. This could lead to all sorts of problems, including things like double free() calls, etc, which can easily cause segfaults.
Expression() has the same problem.
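To illustrate the point, here is a small sketch of a parser for the number = 1 + 2; case in which every path returns something, with nullptr signalling a parse failure. It is not the original ASTGen: the Parser class, the peekIs, leaf and branch helpers, the IDENT and NUMBER token types, and the switch to std::unique_ptr are all assumptions of mine.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum TokenType { IDENT, NUMBER, EQUALS, PLUS, SEMICOLON };

struct Token {
    TokenType type;
    std::string svalue;
};

struct Node {
    Token Data;
    std::unique_ptr<Node> Left;
    std::unique_ptr<Node> Right;
};
using NodePtr = std::unique_ptr<Node>;

class Parser {
public:
    // The token vector must outlive the parser.
    explicit Parser(const std::vector<Token>& tokens) : tokens_(tokens), pos_(0) {}

    // statement := IDENT '=' expression ';'   (nullptr on any mismatch)
    NodePtr statement() {
        if (!peekIs(IDENT, 0) || !peekIs(EQUALS, 1)) return nullptr;
        NodePtr target = leaf(tokens_[pos_]);
        Token op = tokens_[pos_ + 1];
        pos_ += 2;
        NodePtr value = expression();
        if (!value || !peekIs(SEMICOLON, 0)) return nullptr;
        ++pos_;
        return branch(op, std::move(target), std::move(value));
    }

private:
    // expression := NUMBER ('+' expression)?
    NodePtr expression() {
        if (!peekIs(NUMBER, 0)) return nullptr;
        NodePtr left = leaf(tokens_[pos_++]);
        if (!peekIs(PLUS, 0)) return left;
        Token op = tokens_[pos_++];
        NodePtr right = expression();
        if (!right) return nullptr;
        return branch(op, std::move(left), std::move(right));
    }

    // Bounds-checked lookahead: never indexes past the end of the vector.
    bool peekIs(TokenType type, std::size_t offset) const {
        return pos_ + offset < tokens_.size() && tokens_[pos_ + offset].type == type;
    }

    static NodePtr leaf(const Token& t) { return branch(t, nullptr, nullptr); }

    static NodePtr branch(const Token& t, NodePtr left, NodePtr right) {
        return NodePtr(new Node{t, std::move(left), std::move(right)});
    }

    const std::vector<Token>& tokens_;
    std::size_t pos_;
};
```

The caller checks the result of statement() before dereferencing it, so a failed parse shows up as a null tree instead of a crash on Tree->Right.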