Stack overflow? Interesting behaviour during very deep recursion - c++

While working on an assignment about BSTs, linked lists and AVL trees, I noticed what the title describes.
I believe it is somehow related to stack overflow, but I could not find out why it happens.
Creation of the BST and Linked list
Searching for all elements in Linked list and BST
And probably most interesting...
Comparison of the height of BST and AVL
(based on array of unique random integers)
On every graph something interesting begins around 33k elements.
Optimization O2 in MS Visual Studio 2019 Community.
Search function of Linked list is not recursive.
Memory for each "link" was allocated with "new" operator.
The X axis ends at 40k elements because at about 43k elements a stack overflow error occurs.
Do you know why this happens? I'm curious what is going on. Looking forward to your answers! Stay healthy.
Here is some related code. It is not exactly the same as mine, but I can assure you it works the same way, and some of my code was based on it.
struct tree {
tree() {
info = 0; // info is an int, so use 0 rather than NULL
left = NULL;
right = NULL;
}
int info;
struct tree *left;
struct tree *right;
};
struct tree *insert(struct tree*& root, int x) {
if(!root) {
root= new tree;
root->info = x;
root->left = NULL;
root->right = NULL;
return(root);
}
if(root->info > x)
root->left = insert(root->left,x); else {
if(root->info < x)
root->right = insert(root->right,x);
}
return(root);
}
struct tree *search(struct tree*& root, int x) {
struct tree *ptr;
ptr=root;
while(ptr) {
if(x>ptr->info)
ptr=ptr->right;
else if(x<ptr->info)
ptr=ptr->left;
else
return ptr;
}
return NULL; // x is not in the tree
}
int bstHeight(tree*& tr) {
if (tr == NULL) {
return -1;
}
int lefth = bstHeight(tr->left);
int righth = bstHeight(tr->right);
if (lefth > righth) {
return lefth + 1;
} else {
return righth + 1;
}
}
The AVL tree is built by reading the BST in order into an array and then inserting the array's elements into a new tree object by bisection.
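To make the bisection step concrete, here is a minimal sketch of how a balanced tree could be built from the sorted (in-order) array. The names Node, buildBalanced, height and destroy are made up for this illustration and are not the asker's actual code. The result is a balanced BST, not a self-balancing AVL implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node type for this sketch (not the asker's struct).
struct Node {
    int info;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : info(v) {}
};

// Builds a balanced BST from sorted[lo, hi): the middle element becomes the
// root, and the two halves become the left and right subtrees.
Node* buildBalanced(const std::vector<int>& sorted, std::size_t lo, std::size_t hi) {
    if (lo >= hi) return nullptr;
    std::size_t mid = lo + (hi - lo) / 2;
    Node* root = new Node(sorted[mid]);
    root->left = buildBalanced(sorted, lo, mid);
    root->right = buildBalanced(sorted, mid + 1, hi);
    return root;
}

// Same convention as bstHeight above: an empty tree has height -1.
int height(const Node* n) {
    if (!n) return -1;
    int l = height(n->left);
    int r = height(n->right);
    return (l > r ? l : r) + 1;
}

// Frees every node of the tree.
void destroy(Node* n) {
    if (!n) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}
```

Because each call halves the range, the recursion depth stays around log2(n), so this step should not come anywhere near the stack limit even for 40k elements.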

The spikes in time could be, and I am nearly sure they are, caused by exhausting one of the CPU caches (L2, for example), so that some of the data ended up in slower memory.
Thanks to @David_Schwartz for this answer.
The spike in the height of the BST is actually my own fault. For the "array of unique random integers" I took an array of already sorted unique items and mixed it up by swapping elements at positions chosen with rand(). I had totally forgotten how badly that works once the random numbers need to be larger than rand() can return.
Thanks to @rici for pointing it out.
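For reference, RAND_MAX is 32767 with the MSVC runtime, which fits the roughly 33k threshold in the graphs. I am assuming here that the swap positions came straight from rand(), which the post does not show. A sketch of a shuffle that does not have this limit, using the standard <random> facilities (shuffledRange is a name made up for this example):

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Returns the values 0..n-1 in a uniformly random order.
// std::mt19937 produces 32-bit values, so every index of a 40k array is reachable.
std::vector<int> shuffledRange(int n, unsigned seed) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);        // 0, 1, 2, ..., n-1
    std::mt19937 gen(seed);                  // fixed seed keeps runs reproducible
    std::shuffle(v.begin(), v.end(), gen);   // unbiased shuffle over the whole range
    return v;
}
```

Inserting such an array into the BST should keep its height roughly logarithmic in n, instead of leaving a long, nearly sorted tail that degenerates into a deep chain.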

Related

Segmentation fault (core dumped) - Threaded Binary Search Tree

I keep getting the following error: Segmentation fault (core dumped). I found the line of code that is causing the problem (marked with a comment inside the program). Please tell me why this error is happening and how to fix it.
I've tried to dry run my code (on paper) and see no logical errors (as far as I understand it).
I have only recently got into coding and Stack Overflow, so please guide me on how I can further improve my question as well as my code. Thanks!
class tree
{
struct node // Creates a node for a tree
{
int data;
bool rbit,lbit; // rbit/lbit= defines if right/left child of root is present or not
node *left,*right;
};
public:
node *head,*root;
tree() // constructor initializes root and head
{
root=NULL;
head=createnode(10000);
}
node *createnode(int value)
{// Allocates memory for a node then initializes node with given value and returns that node
node *temp=new node ;
temp->data=value;
temp->lbit=0;
temp->rbit=0;
temp->left=NULL;
temp->right=NULL;
return temp;
}
void insert(node *temp,int value) // Creates binary search tree node by node
{
if(root==NULL) // Checking if tree is empty
{
root=createnode(value); //Root now points to new memory location
head->left=root;
head->lbit=1;
root->left=head;//this line gives the segmentation fault (what i thought before correction)
}
}
void inorder(node *root) // Inorder traversal of tree (this function is logically incorrect)
{
if(root==NULL)
return;
inorder(root->left);
cout<<root->data<<"\t";
inorder(root->right);
}
void getdata()//Accepts data , creates a node through insert() , displays result through inorder()
{
int data;
cout<<"Enter data"<<endl;
cin>>data;
insert(root,data);
inorder(root);
}
/*void inorder(node *root) // Working inorder code
{
if(root->lbit==1)
inorder(root->left);
cout<<root->data<<"\t";
if(root->rbit==1)
inorder(root->right);
}*/
};
int main()
{
tree t; // Tree Object
t.getdata(); // Calling getdata
return 0;
}
I think the comments section largely reflects a miscommunication. It's easy to believe that you are experiencing a crash ON that particular line.
This is not actually the case. Instead what you have done is created a loop in your tree which leads to infinite recursion by the inorder function. That causes a stack overflow which segfaults -- this would have been extremely easy to spot if you had just run your program with a debugger (such as gdb) attached.
temp = createnode(value);
if(root == NULL)
{
root = temp;
head->left = root;
head->lbit = 1;
temp->left = head;
}
Look at the loop you have just created:
head->left points to root
root->left == temp->left, which points to head
An inorder traversal will now visit:
root
head
root
head
root
head
...
Since it never gets to the end of the left-branch, the function never outputs anything before overflowing the stack and crashing.
So no, your code is not logically correct. There's a fundamental design flaw in it. You need to rethink what you are storing in your tree and why.
From the code,
root=temp; //Root now points to temp
head->left=root;
head->lbit=1;
temp->left=head;// this line gives the segmentation fault
root is not pointing to temp. temp(pointer) is assigned to root(pointer).
head's left pointer is root, and temp's left is head (which means root's left is head). So in the function "inorder",
void inorder(node *root) // Inorder traversal of tree
{
if(root==NULL) <<<<<<
return;
inorder(root->left);
cout<<root->data<<"\t";
inorder(root->right);
}
the argument node *root (the left child) is never NULL, so the function never returns.
There's not enough information on exactly how this should work (what is node.lbit for example).
The question's insert() function will not work. It's passing in a value which is immediately overwritten (among other issues). There's no explanation of what tree.head is for, so it's ignored. The fields node.lbit and node.rbit look to be superfluous flags of node.left != NULL (similarly for right). These are omitted too. The insert() is also not creating the tree properly.
void insert(int value) // Insert a value into the tree (at branch)
{
// Create a new node to insert
struct node *temp = createnode(value);
if (root == NULL) // Checking if tree is empty
{
root = temp; //Root now points to temp
}
else
{
insertAtBranch(root, temp);
}
}
// recursively find the leaf-node at which to insert the new node
void insertAtBranch(node *branch, node *new_node)
{
// to create a BST, less-than go left
if (new_node->data <= branch->data) // the node's field is named data
{
if (branch->left == NULL)
branch->left = new_node; // There's no left-branch, so it's the node
else
insertAtBranch(branch->left, new_node); // go deeper to find insertion point
}
else // greater-than go right
{
if (branch->right == NULL)
branch->right = new_node;
else
insertAtBranch(branch->right, new_node);
}
}
Imagine how a binary tree works. New nodes are only ever inserted at the edges. So you look at a given node, and decide if the new node is less or greater than the one you're looking at (unless the tree is empty, of course).
Say the new-node.value is less than the branch-node.value, you want to branch left. Still with the same node, if it doesn't have a left-branch (node.left == NULL), the new node is the left branch. Otherwise you need to travel down the left-branch and check again.
I would have made node a class, and used a constructor to at least set the default properties and value. But that's not a big deal.
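As a rough sketch of the constructor idea, assuming the head, lbit and rbit members really are dropped as described above: the node below sets its own defaults, and insert walks down to the empty slot in a loop instead of recursing. The free functions insert and inorder are written for this example only and are not the asker's class members.

```cpp
#include <vector>

// Node with a constructor, so every new node starts with null children.
struct node {
    int data;
    node* left;
    node* right;
    explicit node(int value) : data(value), left(nullptr), right(nullptr) {}
};

// Same descent as insertAtBranch above, written as a loop:
// less-than-or-equal goes left, greater goes right, until an empty slot is found.
// (Cleanup of the allocated nodes is omitted to keep the sketch short.)
void insert(node*& root, int value) {
    node** slot = &root;
    while (*slot) {
        slot = (value <= (*slot)->data) ? &(*slot)->left : &(*slot)->right;
    }
    *slot = new node(value);
}

// Inorder traversal that collects the values instead of printing them.
void inorder(const node* n, std::vector<int>& out) {
    if (!n) return;
    inorder(n->left, out);
    out.push_back(n->data);
    inorder(n->right, out);
}
```

Inserting 5, 3, 8, 1, 4 and then calling inorder yields the values in ascending order, which is a quick way to check that the tree is linked correctly and contains no cycle.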

Traversing and Printing a Binary Tree Level by Level

I am trying to traverse a binary tree built from keyboard input. Data is inserted into the binary tree successfully. I have a switch statement where 'case 4' should traverse (and print) the binary tree level by level. However, I get an EXC_BAD_ACCESS error. I would be more than happy if someone could help me out with this one.
(RootPtr is the top -Level 0- node of the binary tree defined globally; TreeDepth() is the function calculating "Depth" of the tree where Depth defined globally and root node has depth of 0; and GetNode is basically an initializer function (using malloc) for type TreePtr pointers.)
Thank you all in advance.
Here is the relevant code:
This is the struct definition;
typedef struct treeItem
{
int data;
struct treeItem *left;
struct treeItem *right;
}Tree , *TreePtr;
This is the switch case where I call Level by Level traversing function(s);
case 4:
TreePtr temp;
GetNode(&temp);
temp = RootPtr;
printLevelOrder(temp);
printf("\n\n");
break;
These are the functions used for traversing the tree level by level;
void printGivenLevel(TreePtr TemPtr, int level)
{
if (items == 0)
return;
else
{ if(level == 0 )
{
printf(" %d", (*TemPtr).data); //the line I got ERROR
}
else
{
printGivenLevel((*TemPtr).left, (level-1));
printGivenLevel((*TemPtr).right, (level-1));
}
}
}
void printLevelOrder(TreePtr TemPtr)
{
TreeDepth();
if (items == 0)
printf("\nTree is empty.\n");
else
{
printf("Traverse level by level:");
for (int i=0; i<=Depth; i++)
{
printGivenLevel(TemPtr, i);
}
}
}
It's an off by one error. In your for loop:
for (int i=0; i<=Depth; i++)
You're traversing this loop Depth + 1 times. This means you're trying to access one more level than there actually is. In particular, in the final call of printGivenLevel, at the point in the recursion where level == 1, you're already at the bottom of the tree. You then recurse one more time, but the pointers you pass into the next recursion level are garbage pointers (they aren't guaranteed to point to memory you're allowed to access, or to memory that even exists). So when you try to dereference them, you get an error.
One more thing: this implementation is pretty inefficient, since you're traversing the tree many times. It's better to do a breadth-first search, like kiss-o-matic mentioned. This way, you'll only traverse the tree once, which is much faster (although it does use more memory).
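A breadth-first version could look roughly like the sketch below. The question's code is C, but this uses C++ and std::queue for brevity. levelOrder is a hypothetical name, and it returns the values rather than printing them. Each child pointer is checked before it is queued, so a null pointer is never dereferenced.

```cpp
#include <queue>
#include <vector>

typedef struct treeItem {
    int data;
    struct treeItem *left;
    struct treeItem *right;
} Tree, *TreePtr;

// Visits the tree level by level in a single pass and returns the values in that order.
std::vector<int> levelOrder(TreePtr root) {
    std::vector<int> out;
    if (root == nullptr) return out;      // empty tree: nothing to visit
    std::queue<TreePtr> pending;          // nodes discovered but not yet visited
    pending.push(root);
    while (!pending.empty()) {
        TreePtr cur = pending.front();
        pending.pop();
        out.push_back(cur->data);
        if (cur->left)  pending.push(cur->left);   // only queue children that exist
        if (cur->right) pending.push(cur->right);
    }
    return out;
}
```

The queue holds at most one level of the tree at a time, which is the extra memory mentioned above.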

Merge sort causing stack to overflow?

I wrote mergesort() in C++ for linked lists. The issue is that my professor has provided test code with a very large list (length of 575,000). This causes a stack overflow error for my function since it is written recursively.
So it's possible my professor expects us to write it using iterations instead of recursion. I wanted to ask if there is anything wrong with my code that may be causing the stack to overflow?
My code:
typedef struct listnode {
struct listnode * next;
long value;
} LNode;
LNode* mergesort(LNode* data) {
if(data == NULL || data->next == NULL) {
return data;
}else {
LNode* s = split(data);
LNode* firstSortedHalf = mergesort(data);
LNode* secondSortedHalf = mergesort(s);
LNode* r = merge(firstSortedHalf, secondSortedHalf);
return r;
}
}
LNode* split(LNode* list) {
if(list) {
LNode* out = list->next;
if(out) {
list->next = out->next;
out->next = split(out->next);
}
return out;
}else {
return NULL;
}
}
LNode* merge(LNode* a, LNode* b) {
if(a == NULL)
return b;
else if(b == NULL)
return a;
if(a->value < b->value) {
a->next = merge(a->next,b);
return a;
}else {
b->next = merge(a, b->next);
return b;
}
}
So you have three recursive functions. Let's look at the maximum depth of each with the worst case of a list of 575000 elements:
merge(): This looks to iterate over the entire list. So 575000 stack frames.
split(): This looks to iterate over the entire list in pairs. So ~287,500 stack frames.
mergesort(): This looks to iterate in a splitting fashion. So log_2(575000) or about 20 stack frames.
So, when we run our programs, we're given a limited amount of stack space to fit all of our stack frames. On my computer, the default limit is about 10 megabytes.
A rough estimate would be that each of your stack frames takes up 32 bytes. For the case of merge(), this means that it would take up about 18 megabytes of space, which is well beyond our limit.
The mergesort() call itself though, is only 20 iterations. That should fit under any reasonable limit.
Therefore, my takeaway is that merge() and split() should not be implemented in a recursive manner (unless that manner is tail recursive and optimizations are on).
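As a sketch of what the iterative versions might look like (same LNode type and function names as in the question, same splitting and comparison rules, but loops instead of recursion):

```cpp
typedef struct listnode {
    struct listnode* next;
    long value;
} LNode;

// Iterative split: keeps nodes 0, 2, 4, ... in `list` and returns nodes 1, 3, 5, ...
LNode* split(LNode* list) {
    if (!list) return nullptr;
    LNode* out = list->next;
    LNode* even = list;
    LNode* odd = out;
    while (odd) {
        even->next = odd->next;   // unlink the odd node from the even chain
        even = even->next;
        if (!even) break;
        odd->next = even->next;   // unlink the even node from the odd chain
        odd = odd->next;
    }
    return out;
}

// Iterative merge: appends the smaller head to a growing tail.
LNode* merge(LNode* a, LNode* b) {
    LNode head;                   // dummy node; only its next pointer is used
    head.next = nullptr;
    LNode* tail = &head;
    while (a && b) {
        if (a->value < b->value) { tail->next = a; a = a->next; }
        else                     { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = a ? a : b;       // attach whatever is left over
    return head.next;
}

// mergesort() itself can stay recursive: its depth is only about log2(n).
LNode* mergesort(LNode* data) {
    if (data == nullptr || data->next == nullptr) return data;
    LNode* s = split(data);
    LNode* firstSortedHalf = mergesort(data);
    LNode* secondSortedHalf = mergesort(s);
    return merge(firstSortedHalf, secondSortedHalf);
}
```

With these two loops, the only recursion left is in mergesort(), which needs about 20 frames for 575,000 elements.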
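A bit late, but it's the recursive merge() that is the main cause of the stack overflow. The recursive split() shown above also recurses about n/2 levels deep, so it is at risk too; only mergesort() itself has a maximum depth of about log2(n).
So merge(), and preferably split() as well, should be converted to iteration.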
As commented a long time ago, a bottom-up approach using a small (25 to 32) array of pointers is simpler and faster, but I wasn't sure if this would count as getting too much help for the assignment. Link to wiki pseudo-code:
http://en.wikipedia.org/wiki/Merge_sort#Bottom-up_implementation_using_lists
Link to working C example:
http://code.geeksforgeeks.org/Mcr1Bf

Deep Copy Linked List - O(n)

I'm trying to deep copy a linked list. I need an algorithm that executes in linear time, O(n). This is what I have for now, but I can't figure out what's going wrong with it. My application crashes, and I suspect a memory leak that I haven't been able to find yet. This is what I have right now:
struct node {
struct node *next;
struct node *ref;
};
struct node *copy(struct node *root) {
struct node *i, *j, *new_root = NULL;
for (i = root, j = NULL; i; j = i, i = i->next) {
struct node *new_node;
if (!new_node)
{
abort();
}
if (j)
{
j->next = new_node;
}
else
{
new_root = new_node;
}
new_node->ref = i->ref;
i->ref = new_node;
}
if (j)
{
j->next = NULL;
}
for (i = root, j = new_root; i; i = i->next, j = j->next)
j->ref =i->next->ref;
return new_root;
}
Can anyone point out where I'm going wrong with this?
This piece alone:
struct node *new_node;
if (!new_node)
{
abort();
}
is enough to cause a random abort(). new_node is never assigned and will contain a random value. Evaluating the !new_node expression could already be fatal (on some systems).
As a general hint, you should only need one for-loop, plus some code up front to establish new_root.
But a truly deep copy would also require cloning whatever ref is pointing to. It seems to me the second loop assigns something from the original into the copy. But I'm not sure: what is ref?
One thing I immediately noticed was that you never allocate space for new_node. Since auto variables are not guaranteed to be initialized, new_node will be set to whatever value was in that memory before. You should probably start with something like:
struct node *new_node = (struct node *) malloc(sizeof(struct node));
in C, or if you're using C++:
node* new_node = new node;
Copying the list is simple enough to do. However, the requirement that the ref pointers point to the same nodes in the new list relative to the source list is going to be difficult to meet in any sort of efficient manner. First, you need some way to identify which node of the source list each ref points to. You could put some kind of identifier in each node, say an int which is set to 0 in the first node, 1 in the second, etc. Then, after you've copied the list, you could make another pass over it to set up the ref pointers. The problem with this approach (other than adding another variable to each node) is that it will make the time complexity of the algorithm jump from O(n) to O(n^2).
This is possible, but it takes some work. I'll assume C++, and omit the struct keyword in struct node.
You will need to do some bookkeeping to keep track of the "ref" pointers. Here, I'm converting them to numerical indices into the original list and then back to pointers into the new list.
node *copy_list(node const *head)
{
// maps "ref" pointers in old list to indices
std::map<node const *, size_t> ptr_index;
// maps indices into new list to pointers
std::map<size_t, node *> index_ptr;
size_t length = 0;
node *curn; // ptr into new list
node const *curo; // ptr into old list
node *copy = NULL;
for (curo = head; curo != NULL; curo = curo->next) {
ptr_index[curo] = length;
length++;
// construct copy, disregarding ref for now
curn = new node;
curn->next = copy;
copy = curn;
}
curn = copy;
for (size_t i=0; i < length; i++, curn = curn->next)
index_ptr[i] = curn;
// set ref pointers in copy
for (curo = head, curn = copy; curo != NULL; ) {
// a NULL ref stays NULL instead of being looked up in the map
curn->ref = curo->ref ? index_ptr[ptr_index[curo->ref]] : NULL;
curo = curo->next;
curn = curn->next;
}
return copy;
}
This algorithm runs in O(n lg n) because it stores all n list elements in an std::map, which has O(lg n) insert and retrieval complexity. It can be made linear by using a hash table instead.
NOTE: not tested, may contain bugs.
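For what it's worth, the hash-table variant might look like the following sketch. It maps each original node directly to its copy, so the index bookkeeping disappears. It assumes every ref is either null or points at a node of the same list; a ref pointing anywhere else is set to null here, which is my choice for the example, not something the question specifies.

```cpp
#include <unordered_map>

struct node {
    node* next;
    node* ref;
};

// Copies the list in expected O(n) time: std::unordered_map lookups are O(1) on average.
node* copy_list(const node* head) {
    std::unordered_map<const node*, node*> twin;   // original node -> its copy
    node* new_head = nullptr;
    node** tail = &new_head;                       // where the next copy gets linked in

    // First pass: create the copies in order and remember which copy belongs to which original.
    for (const node* cur = head; cur; cur = cur->next) {
        node* c = new node{nullptr, nullptr};
        twin[cur] = c;
        *tail = c;
        tail = &c->next;
    }

    // Second pass: translate each ref through the map.
    for (const node* cur = head; cur; cur = cur->next) {
        if (cur->ref) {
            auto it = twin.find(cur->ref);
            // Assumption: a ref outside the list becomes null in the copy.
            twin[cur]->ref = (it != twin.end()) ? it->second : nullptr;
        }
    }
    return new_head;
}
```

The caller owns the returned nodes and has to delete them when done.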

Calculate height of a tree

I am trying to calculate the height of a tree. I am doing it with the code written below.
#include <iostream>
using namespace std;
struct tree
{
int data;
struct tree * left;
struct tree * right;
};
typedef struct tree tree;
class Tree
{
private:
int n;
int data;
int l,r;
public:
tree * Root;
Tree(int x)
{
n=x;
l=0;
r=0;
Root=NULL;
}
void create();
int height(tree * Height);
};
void Tree::create()
{
//Creating the tree structure
}
int Tree::height(tree * Height)
{
if(Height->left==NULL && Height->right==NULL)
{return 0;
}
else
{
l=height(Height->left);
r=height(Height->right);
if (l>r)
{l=l+1;
return l;
}
else
{
r=r+1;
return r;
}
}
}
int main()
{
Tree A(10);//Initializing 10 node Tree object
A.create();//Creating a 10 node tree
cout<<"The height of tree"<<A.height(A.Root);
}
It gives me the correct result.
But some posts I found via Google suggest doing a postorder traversal and using this height method to calculate the height. Is there any specific reason?
But isn't a postorder traversal precisely what you are doing? Assuming left and right are both non-null, you first do height(left), then height(right), and then some processing in the current node. That's postorder traversal according to me.
But I would write it like this:
int Tree::height(tree *node) {
if (!node) return -1;
return 1 + max(height(node->left), height(node->right));
}
Edit: depending on how you define tree height, the base case (for an empty tree) should be 0 or -1.
The code will fail in trees where at least one of the nodes has only one child:
// code snippet (space condensed for brevity)
int Tree::height(tree * Height) {
if(Height->left==NULL && Height->right==NULL) { return 0; }
else {
l=height(Height->left);
r=height(Height->right);
//...
If the tree has two nodes (the root and either a left or right child) calling the method on the root will not fulfill the first condition (at least one of the subtrees is non-empty) and it will call recursively on both children. One of them is null, but still it will dereference the null pointer to perform the if.
A correct solution is the one posted by Hans here. At any rate you have to choose what your method invariants are: either you allow calls where the argument is null and you handle that gracefully or else you require the argument to be non-null and guarantee that you do not call the method with null pointers.
The first case is safer if you do not control all entry points (the method is public as in your code) since you cannot guarantee that external code will not pass null pointers. The second solution (changing the signature to reference, and making it a member method of the tree class) could be cleaner (or not) if you can control all entry points.
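A minimal sketch of the first option (accepting null and handling it), using the tree struct from the question and the convention that an empty tree has height -1. It is written as a free function for the example; the question's version is a member of class Tree.

```cpp
#include <algorithm>

struct tree {
    int data;
    tree* left;
    tree* right;
};

// Height counted in edges: empty tree -> -1, single node -> 0.
// The null check comes first, so a node with only one child is safe.
int height(const tree* node) {
    if (node == nullptr) return -1;
    return 1 + std::max(height(node->left), height(node->right));
}
```

For a root with only a left child this returns 1, whereas the original code would dereference the null right pointer.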
The height of the tree doesn't change with the traversal. It remains constant. It's the sequence of the nodes that changes depending on the traversal.
Definitions from wikipedia.
Preorder (depth-first):
Visit the root.
Traverse the left subtree.
Traverse the right subtree.
Inorder (symmetrical):
Traverse the left subtree.
Visit the root.
Traverse the right subtree.
Postorder:
Traverse the left subtree.
Traverse the right subtree.
Visit the root.
"Visit" in the definitions means "calculate the height of the node", which in your case is either zero (both left and right are null) or 1 + the larger of the children's heights.
In your implementation, the traversal order doesn't matter; it would give the same results. I can't really tell you anything more than that without a link to your source stating that postorder is preferable.
Here is the answer:
int Help :: heightTree (node *nodeptr)
{
if (!nodeptr)
return 0;
else
{
return 1 + max (heightTree (nodeptr->left), heightTree (nodeptr->right));
}
}