Merge sort causing stack to overflow? - c++

I wrote mergesort() in C++ for linked lists. The issue is that my professor has provided test code with a very large list (length of 575,000). This causes a stack overflow error for my function since it is written recursively.
So it's possible my professor expects us to write it using iterations instead of recursion. I wanted to ask if there is anything wrong with my code that may be causing the stack to overflow?
My code:
typedef struct listnode {
struct listnode * next;
long value;
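} LNode;
// forward declarations: mergesort() calls these before they are defined
LNode* split(LNode* list);
LNode* merge(LNode* a, LNode* b);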
LNode* mergesort(LNode* data) {
if(data == NULL || data->next == NULL) {
return data;
}else {
LNode* s = split(data);
LNode* firstSortedHalf = mergesort(data);
LNode* secondSortedHalf = mergesort(s);
LNode* r = merge(firstSortedHalf, secondSortedHalf);
return r;
}
}
LNode* split(LNode* list) {
if(list) {
LNode* out = list->next;
if(out) {
list->next = out->next;
out->next = split(out->next);
}
return out;
}else {
return NULL;
}
}
LNode* merge(LNode* a, LNode* b) {
if(a == NULL)
return b;
else if(b == NULL)
return a;
if(a->value < b->value) {
a->next = merge(a->next,b);
return a;
}else {
b->next = merge(a, b->next);
return b;
}
}

So you have three recursive functions. Let's look at the maximum depth of each with the worst case of a list of 575000 elements:
merge(): This looks to iterate over the entire list. So 575000 stack frames.
split(): This looks to iterate over the entire list in pairs. So 575000 / 2, or about 287500 stack frames.
mergesort(): This looks to iterate in a splitting fashion. So log_2(575000) or about 20 stack frames.
So, when we run our programs, we're given a limited amount of stack space to fit all of our stack frames. On my computer, the default limit is about 10 megabytes.
A rough estimate would be that each of your stack frames takes up 32 bytes. For the case of merge(), this means that it would take up about 18 megabytes of space, which is well beyond our limit.
The mergesort() call itself, though, is only about 20 levels deep. That should fit under any reasonable limit.
Therefore, my takeaway is that merge() and split() should not be implemented in a recursive manner (unless that manner is tail recursive and optimizations are on).
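As a minimal sketch of what a non-recursive merge() could look like, here is a loop-based version that uses a dummy head node. The name merge_iterative is my own, and the node layout is copied from the question; treat it as one possible rewrite, not the only one.

```cpp
#include <cstddef>

// Same node layout as in the question.
struct LNode {
    LNode* next;
    long value;
};

// Merges two sorted lists with a loop, so the stack depth stays constant
// no matter how long the lists are.
LNode* merge_iterative(LNode* a, LNode* b) {
    LNode head{nullptr, 0};  // dummy node; head.next becomes the result
    LNode* tail = &head;
    while (a != nullptr && b != nullptr) {
        // Same comparison as the recursive version in the question.
        if (a->value < b->value) {
            tail->next = a;
            a = a->next;
        } else {
            tail->next = b;
            b = b->next;
        }
        tail = tail->next;
    }
    // One list is exhausted; append whatever is left of the other.
    tail->next = (a != nullptr) ? a : b;
    return head.next;
}
```

It can be dropped in wherever mergesort() calls merge(), since it takes and returns the same types.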

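A bit late, but it's the recursive merge() that is causing the stack overflow: its depth can reach n. As posted, split() also recurses once per pair of nodes, so its maximum depth is about n/2 rather than log2(n); only mergesort() itself is limited to about log2(n).
So merge() needs to be converted to iteration, and it is safer to convert split() as well.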
As commented a long time ago, a bottom-up approach using a small (25 to 32) array of pointers is simpler and faster, but I wasn't sure if this would be an issue with getting too much help for the assignment. Link to wiki pseudo-code:
http://en.wikipedia.org/wiki/Merge_sort#Bottom-up_implementation_using_lists
Link to working C example:
http://code.geeksforgeeks.org/Mcr1Bf
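For readers who cannot follow the links, here is a rough sketch of the bottom-up idea with a small array of list pointers. It is my own condensed version under the assumption of the LNode layout from the question; the names merge_lists, mergesort_bottom_up and kSlots are made up for this example.

```cpp
#include <cstddef>

struct LNode {
    LNode* next;
    long value;
};

// Iterative merge; on ties it takes from a, which keeps the sort stable.
static LNode* merge_lists(LNode* a, LNode* b) {
    LNode head{nullptr, 0};
    LNode* tail = &head;
    while (a != nullptr && b != nullptr) {
        if (b->value < a->value) { tail->next = b; b = b->next; }
        else                     { tail->next = a; a = a->next; }
        tail = tail->next;
    }
    tail->next = (a != nullptr) ? a : b;
    return head.next;
}

// slot[i] is either null or a sorted list of 2^i nodes
// (the last slot may grow beyond that).
LNode* mergesort_bottom_up(LNode* list) {
    const int kSlots = 32;
    LNode* slot[kSlots] = {};
    while (list != nullptr) {
        // Detach the next node as a sorted list of length 1.
        LNode* node = list;
        list = list->next;
        node->next = nullptr;
        // Merge upwards like a carry in binary addition.
        int i = 0;
        while (i < kSlots && slot[i] != nullptr) {
            node = merge_lists(slot[i], node);
            slot[i] = nullptr;
            ++i;
        }
        if (i == kSlots) --i;  // the last slot absorbs any overflow
        slot[i] = node;
    }
    // Merge the remaining partial lists, smallest first.
    LNode* result = nullptr;
    for (int i = 0; i < kSlots; ++i)
        result = merge_lists(slot[i], result);
    return result;
}
```

There is no recursion anywhere, so the stack usage is a few fixed-size frames regardless of the list length.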

Related

Stack overflow? Interesting behaviour during very deep recursion

While working on my assignment on BSTs, linked lists, and AVL trees, I noticed... well, it is as in the title.
I believe it is somehow related to stack overflow, but could not find why it is happening.
[Graph: Creation of the BST and linked list]
[Graph: Searching for all elements in the linked list and BST]
And probably the most interesting...
[Graph: Comparison of the height of BST and AVL (based on an array of unique random integers)]
On every graph something interesting begins around 33k elements.
Optimization O2 in MS Visual Studio 2019 Community.
Search function of Linked list is not recursive.
Memory for each "link" was allocated with "new" operator.
The X axis ends at 40k elements because at about 43k a stack overflow error occurs.
Do you know why this happens? I'm curious about what is actually going on. Looking forward to your answers! Stay healthy.
Here is some related code. It is not exactly the same, but I can assure you it works the same way, and some of my code was based on it.
struct tree {
tree() {
info = 0; // info is an int, not a pointer
left = NULL;
right = NULL;
}
int info;
struct tree *left;
struct tree *right;
};
struct tree *insert(struct tree*& root, int x) {
if(!root) {
root= new tree;
root->info = x;
root->left = NULL;
root->right = NULL;
return(root);
}
if(root->info > x)
root->left = insert(root->left,x);
else if(root->info < x)
root->right = insert(root->right,x);
return(root);
}
struct tree *search(struct tree*& root, int x) {
struct tree *ptr;
ptr=root;
while(ptr) {
if(x>ptr->info)
ptr=ptr->right;
else if(x<ptr->info)
ptr=ptr->left;
else
return ptr;
}
return NULL; // not found
}
int bstHeight(tree*& tr) {
if (tr == NULL) {
return -1;
}
int lefth = bstHeight(tr->left);
int righth = bstHeight(tr->right);
if (lefth > righth) {
return lefth + 1;
} else {
return righth + 1;
}
}
The AVL tree is built by reading the BST in order and then inserting the resulting array of elements into a tree object through bisection.
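One way to read "inserted through bisection" is the following sketch: take the middle element of the sorted array as the root and recurse on the two halves. This is an assumption about what was meant, and BNode, buildBalanced and height are hypothetical names, not the asker's actual code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical plain node type for this sketch.
struct BNode {
    int info;
    BNode* left;
    BNode* right;
};

// Builds a balanced BST from sorted[lo, hi) by always picking the middle
// element as the subtree root. Recursion depth is about log2(n).
BNode* buildBalanced(const std::vector<int>& sorted, std::size_t lo, std::size_t hi) {
    if (lo >= hi) return nullptr;
    std::size_t mid = lo + (hi - lo) / 2;
    BNode* node = new BNode{sorted[mid], nullptr, nullptr};
    node->left = buildBalanced(sorted, lo, mid);
    node->right = buildBalanced(sorted, mid + 1, hi);
    return node;
}

// Height with the same convention as bstHeight(): an empty tree is -1.
int height(const BNode* n) {
    if (n == nullptr) return -1;
    int l = height(n->left);
    int r = height(n->right);
    return (l > r ? l : r) + 1;
}
```

Because the tree is balanced by construction, both functions stay shallow even for large inputs, unlike recursion over a degenerate BST. Freeing the nodes is left out to keep the sketch short.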
The spikes in time could be, and I am nearly sure they are, caused by exhausting some CPU cache (L2, for example), so that the leftover data ended up in slower memory.
Thanks to @David_Schwartz for that answer.
The spike in the height of the BST is actually my own fault. For the "array of unique random integers" I took an array of already sorted unique items and mixed it up by swapping elements with the rand() function. I had completely forgotten how badly that works once you need random numbers larger than rand() can return: in MSVC, RAND_MAX is only 32767, which matches the point around 33k elements where the graphs change.
Thanks to @rici for pointing it out.
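A minimal sketch of generating the unique random array without rand(), assuming C++11 or later is available. The function name uniqueRandom and the seed parameter are my own choices for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns the values 0 .. n-1 in random order.
std::vector<int> uniqueRandom(std::size_t n, unsigned seed) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);       // 0, 1, 2, ..., n-1 (sorted, unique)
    std::mt19937 gen(seed);                 // 32-bit Mersenne Twister engine
    std::shuffle(v.begin(), v.end(), gen);  // uniform shuffle over the whole range
    return v;
}
```

std::shuffle draws its indices from the engine you pass in, so the size of the array is not limited by RAND_MAX.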

What is the difference between user defined stack and built in stack in use of memory?

I want to use a user-defined stack for my program, which makes a large number of recursive calls. Will it be useful to define my own stack?
There are a few ways to do this.
Primarily, two:
(1) Use the CPU/processor stack. There are some variants, each with its own limitations.
(2) Or, recode your function(s) to use a "stack frame" struct that simulates a "stack". The actual function ceases to be recursive. This can be virtually limitless, up to whatever the heap will permit.
For (1) ...
(A) If your system permits, you can issue a syscall to extend the process's stack size. There may be limits on how much you can do this and collisions with shared library addresses.
(B) You can malloc a large area. With some [somewhat] intricate inline asm trickery, you can swap this area for the stack [and back again] and call your function with this malloc area as the stack. Doable, but not for the faint of heart ...
(C) An easier way is to malloc a large area. Pass this area to pthread_attr_setstack. Then, run your recursive function as a thread using pthread_create. Note, you don't really care about multiple threads, it's just an easy way to avoid the "messy" asm trickery.
With (A), assuming the stack extend syscall permits, the limit could be all of available memory permitted for stack [up to some system-wide or RLIMIT_* parameter].
With (B) and (C), you have to "guess" and make the malloc large enough before you start. After it has been done, the size is fixed and can not be extended further.
Actually, that's not quite true. Using the asm trickery repeatedly [when needed], you could simulate a near infinite stack. But, IMO, the overhead of keeping track of these large malloc areas is high enough that I'd opt for (2) below.
For (2) ...
This can literally expand/contract as needed. One of the advantages is that you don't need to guess beforehand at how much memory you'll need. The [pseudo] stack can just keep growing as needed [until malloc returns NULL :-)].
Here is a sample recursive function [treat loosely as pseudo code]:
int
myfunc(int a,int b,int c,int d)
{
int ret;
// do some stuff ...
if (must_recurse)
ret = myfunc(a + 5,b + 7,c - 6,d + 8);
else
ret = 0;
return ret;
}
Here is that function changed to use a struct as a stack frame [again, loose pseudo code]:
typedef struct stack_frame frame_t;
struct stack_frame {
frame_t *prev;
int a;
int b;
int c;
int d;
};
frame_t *free_pool;
#define GROWCOUNT 1000
frame_t *
frame_push(frame_t *prev)
{
frame_t *cur;
// NOTE: we can maintain a free pool ...
while (1) {
cur = free_pool;
if (cur != NULL) {
free_pool = cur->prev;
break;
}
// refill free pool from heap ...
free_pool = calloc(GROWCOUNT,sizeof(frame_t));
if (free_pool == NULL) {
printf("frame_push: no memory\n");
exit(1);
}
// link the new frames into a list; the last frame terminates it
cur = free_pool;
for (int count = GROWCOUNT - 1; count > 0; --count, ++cur)
cur->prev = cur + 1;
cur->prev = NULL;
}
if (prev != NULL) {
*cur = *prev;
cur->prev = prev;
cur->a += 5;
cur->b += 7;
cur->c -= 6;
cur->d += 8;
}
else
memset(cur,0,sizeof(frame_t));
return cur;
}
frame_t *
frame_pop(frame_t *cur)
{
frame_t *prev;
prev = cur->prev;
cur->prev = free_pool;
free_pool = cur;
return prev;
}
int
myfunc(void)
{
int ret = 0;
frame_t *cur;
cur = frame_push(NULL);
// set initial conditions in cur...
while (1) {
// do stuff ...
if (must_recurse) {
cur = frame_push(cur);
must_recurse = 0;
continue;
}
// pop stack
cur = frame_pop(cur);
if (cur == NULL)
break;
}
return ret;
}
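If C++ is available, the same idea can be sketched with std::vector doing the memory management. This is a toy example of my own, not a translation of the code above: Frame and sumDown are hypothetical names, and the function assumes n >= 0.

```cpp
#include <vector>

// Hypothetical frame: the arguments one recursive call would have received.
struct Frame {
    long long n;
};

// Iterative equivalent of the recursive definition
//   f(n) = (n == 0) ? 0 : n + f(n - 1)
long long sumDown(long long n) {
    std::vector<Frame> stack;  // lives on the heap and grows as needed
    stack.push_back(Frame{n});
    // "Call" phase: push one frame per nested call.
    while (stack.back().n > 0)
        stack.push_back(Frame{stack.back().n - 1});
    // "Return" phase: pop frames and combine results, innermost first.
    long long ret = 0;
    while (!stack.empty()) {
        ret += stack.back().n;
        stack.pop_back();
    }
    return ret;
}
```

A call such as sumDown(1000000) needs a million frames, which would be risky with real recursion on a small default stack but is only a few megabytes of heap here.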
All functions, objects, variables, and user-defined structures use memory that is controlled by the OS and the compiler. A user-defined stack therefore lives in the same process memory as everything else, so by itself it does not make a big difference. What you gain is control: you can define an optimized structure that uses that memory much more efficiently than the general call stack does.

LinkedList used in an interview's test

[EDIT] Fixed my code: the loop condition is while(temp != NULL), not while(temp->next != NULL). Sorry for posting the wrong code.
Today I took part in an online programming test. The interviewer used Codility to evaluate my code and that of the other interviewees.
At one point there was a question about linked lists: count how many items a linked list has.
I took the only possible approach, AFAIK:
//This is struct declaration
struct SomeStruct
{
int value;
SomeStruct* next;
};
int elementCount(SomeStruct* list)
{
int count = 0;
if(list != NULL)
{
SomeStruct* temp = list;
while(temp != NULL)
{
count++;
temp = temp->next;
}
}
return count;
}
I remember that when I submitted this code as my answer, Codility reported the solution as wrong because it takes too much time to execute.
In my mind, and according to this thread on SO, there is no simple way to get the size of a linked list without traversing it.
Is Codility wrong to reject this solution? Or are there other approaches?
PS: the test allowed use of the STL.
Your solution is incorrect, since it returns 1 less than the actual count. Just try applying it to a list with 1 element.
Why did you come up with this strange two-tiered structure with an if and a loop that checks temp->next? Why not just
unsigned elementCount(const SomeStruct *list)
{
unsigned count = 0;
for (const SomeStruct *temp = list; temp != NULL; temp = temp->next)
++count;
return count;
}
I suspect that you decided to treat the element pointed to by list as an unused, reserved "header" element. Indeed, sometimes it makes sense to implement lists that way. But I see nothing like that stated in your post. Did they tell you to treat it that way specifically?
Well, you don't have to evaluate the indirection temp->next twice per iteration.
You can simply do
int count( SomeStruct const* pNode )
{
int result = 0;
while( pNode != 0 )
{
++result;
pNode = pNode->next;
}
return result;
}
Also, as WhozCraig notes, your code was logically wrong (yielding an off by one result), not just potentially inefficient.
Codility may be using a circularly linked list as a test case; in that case, your code will never terminate.
Using the STL trivializes this, though, as it has std::list with a size() method.
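If a circular list really is among the test cases (that is a guess, nobody here knows what Codility feeds in), a count that cannot loop forever could be sketched with Floyd's cycle detection. hasCycle and elementCountChecked are made-up names, and returning -1 for a cyclic list is just one possible convention.

```cpp
#include <cstddef>

struct SomeStruct {
    int value;
    SomeStruct* next;
};

// Floyd's "tortoise and hare": fast advances two nodes per step, slow one.
// They can only meet again if the list loops back on itself.
bool hasCycle(const SomeStruct* list) {
    const SomeStruct* slow = list;
    const SomeStruct* fast = list;
    while (fast != nullptr && fast->next != nullptr) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) return true;
    }
    return false;
}

// Returns the number of nodes, or -1 if the list contains a cycle.
long elementCountChecked(const SomeStruct* list) {
    if (hasCycle(list)) return -1;
    long count = 0;
    for (const SomeStruct* p = list; p != nullptr; p = p->next)
        ++count;
    return count;
}
```

This still traverses the list, so it stays O(n); it only guarantees termination.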

Recursive to Iterative Transformation

I've gotten stuck on trying to re-write my code from a recursive function into an iterative function.
I thought I'd ask if there are any general things to think about/tricks/guidelines etc... in regards to going from recursive code to iterative code.
E.g., I can't really get my head around how to make the following code iterative, mainly due to the loop inside the recursion, which in turn depends on and calls the next recursion.
struct entry
{
uint8_t values[8];
int32_t num_values;
std::array<entry, 256>* next_table;
void push_back(uint8_t value) {values[num_values++] = value;}
};
struct node
{
node* children; // +0 right, +1 left
uint8_t value;
uint8_t is_leaf;
};
void build_tables(node* root, std::array<std::array<entry, 8>, 255>& tables, int& table_count)
{
int table_index = root->value; // root is always a non-leaf, thus value is the current table index.
for(int n = 0; n < 256; ++n)
{
auto current = root;
// Walk the Huffman tree bit by bit for this table entry
for(int i = 0; i < 8; ++i)
{
current = current->children + ((n >> i) & 1); // Travel to the next node. current->children[0] is the left child and current->children[1] is the right child. If current is a leaf, then current->children[0/1] point to the root.
if(current->is_leaf)
tables[table_index][n].push_back(current->value);
}
if(!current->is_leaf)
{
if(current->value == 0) // For non-leaves, the "value" is the sub-table index for this particular non-leaf node
{
current->value = table_count++;
build_tables(current, tables, table_count);
}
tables[table_index][n].next_table = &tables[current->value];
}
else
tables[table_index][n].next_table = &tables[0];
}
}
As tables and table_count always refer to the same objects, you might make a small performance gain by taking tables and table_count out of the argument list of build_tables by storing them as members of a temporary struct and then doing something like this:
struct build_tables_struct
{
build_tables_struct(std::array<std::array<entry, 8>, 255>& tables, int& table_count) :
tables(tables), table_count(table_count) {}
std::array<std::array<entry, 8>, 255>& tables;
int& table_count;
void build_tables_worker(node* root)
{
...
build_tables_worker(current); // instead of build_tables(current, tables, table_count);
...
}
};
void build_tables(node* root, std::array<std::array<entry, 8>, 255>& tables, int& table_count)
{
build_tables_struct(tables, table_count).build_tables_worker(root);
}
This applies, of course, only if your compiler is not smart enough to make this optimisation itself.
Otherwise, the only way to make this non-recursive is to manage the stack yourself. I doubt this would be much faster, if at all, than the recursive version.
All that being said, I doubt your performance issue here is the recursion. I don't think pushing three reference arguments onto the stack and calling a function is a huge burden compared to the work your function does.
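To give an idea of what "managing the stack yourself" can look like when the recursive call sits inside a loop, here is a sketch on a deliberately simplified tree. TNode, numberNodes and pending are hypothetical and stand in for the Huffman nodes and tables; the pattern is to record the work (assign the id) immediately and defer the nested call by pushing the node onto a std::vector.

```cpp
#include <vector>

// Hypothetical simplified node: any number of children plus an id to assign.
struct TNode {
    std::vector<TNode*> children;
    int id = -1;  // -1 means "not numbered yet"
};

// Gives every node reachable from root a unique id and returns how many
// ids were handed out. Where the recursive version would call itself on a
// child, this version pushes the child and handles it in a later iteration.
int numberNodes(TNode* root) {
    int next = 0;
    root->id = next++;
    std::vector<TNode*> pending;  // explicit stack of nodes still to expand
    pending.push_back(root);
    while (!pending.empty()) {
        TNode* current = pending.back();
        pending.pop_back();
        for (TNode* child : current->children) {
            if (child->id < 0) {
                child->id = next++;        // same bookkeeping as before the recursive call
                pending.push_back(child);  // deferred "recursive call"
            }
        }
    }
    return next;
}
```

Note that the ids come out in a different order than with true recursion, because siblings are numbered before any of their descendants. That only works when, as in build_tables, nothing after the recursive call depends on its result.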

Splitting linked list so many times puts into stack overflow c++

Oh dear; I seem to have misthought this.
I would like to split a singly-linked list 10,000 times, but evidently (and I didn't know this before you guys helped me) it causes a stack overflow.
I'm really new to this, so is there any way I could still do this and not cause a stack overflow? Using references or something?
Here's the method:
Node* Node::Split()
{
if(next == NULL)
{
return this;
}
Node *newNode = this->next;
if(this->next != NULL)
{
this->next = newNode->next;
}
if(newNode->next != NULL)
{
newNode->next = newNode->next->Split();
}
return newNode;
}
You'll have to write this as a loop rather than a recursive call. Keep track of your position in the original list, and both ends of the new lists, and append nodes alternately to each list.
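A minimal sketch of that loop, written as a free function on a hypothetical plain Node struct rather than as the member function from the question. splitAlternate is my own name, and unlike the original Split() it returns a null pointer when there is nothing to split off.

```cpp
#include <cstddef>

// Hypothetical node; the question's Node class is not shown in full.
struct Node {
    int value;
    Node* next;
};

// Moves every second node of list into a new list and returns its head.
// Afterwards list holds nodes 0, 2, 4, ... and the result holds 1, 3, 5, ...
Node* splitAlternate(Node* list) {
    if (list == nullptr) return nullptr;
    Node* first_tail = list;         // end of the list that keeps the head
    Node* second_head = list->next;  // head of the new list (may be null)
    Node* second_tail = second_head;
    while (second_tail != nullptr && second_tail->next != nullptr) {
        // The node after second_tail belongs to the first list.
        first_tail->next = second_tail->next;
        first_tail = first_tail->next;
        // The node after that (if any) belongs to the second list.
        second_tail->next = first_tail->next;
        second_tail = second_tail->next;
    }
    first_tail->next = nullptr;  // terminate the first list
    return second_head;
}
```

The loop uses a fixed number of local pointers, so the list length no longer affects stack usage.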
Make sure your recursion does stop at some point (try a small data set). If it does then you have no problems there and the next thing to do is ask your compiler to increase the stack size for you. The default is quite small (I think it is one megabyte on vc++ 10).