Tree traversal falls into infinite loop (with huffman algorithm implementation)

I am trying to implement the Huffman algorithm following the steps described in this tutorial: https://www.programiz.com/dsa/huffman-coding, and so far I have this code:
void encode(string filename) {
    List<HuffmanNode> priorityQueue;
    List<Node<HuffmanNode>> encodeList;
    BinaryTree<HuffmanNode> toEncode;
    //Map<char, string> encodeTable;
    fstream input;
    input.open(filename, ios_base::in);
    if (input.is_open()) {
        char c;
        while (!input.eof()) {
            input.get(c);
            HuffmanNode node;
            node.data = c;
            node.frequency = 1;
            int pos = priorityQueue.find(node);
            if(pos) {
                HuffmanNode value = priorityQueue.get(pos)->getData();
                value++;
                priorityQueue.update(pos, value);
            } else {
                priorityQueue.insert(node);
            }
        }
    }
    input.close();
    priorityQueue.sort();
    for(int i=1; i<=priorityQueue.size(); i++)
        encodeList.insert( priorityQueue.get(i) );
    while(encodeList.size() > 1) {
        Node<HuffmanNode> * left = new Node<HuffmanNode>(encodeList.get(1)->getData());
        Node<HuffmanNode> * right = new Node<HuffmanNode>(encodeList.get(2)->getData());
        HuffmanNode z;
        z.data = 0;
        z.frequency = left->getData().frequency + right->getData().frequency;
        Node<HuffmanNode> z_node;
        z_node.setData(z);
        z_node.setPrevious(left);
        z_node.setNext(right);
        encodeList.remove(1);
        encodeList.remove(1);
        encodeList.insert(z_node);
    }
    Node<HuffmanNode> node_root = encodeList.get(1)->getData();
    toEncode.setRoot(&node_root);
}
full code for the main.cpp here: https://pastebin.com/Uw5g9s7j.
When I run this, the program reads the bytes from the file, groups the characters by frequency and sorts the list. But when I generate the Huffman tree, I am unable to traverse it: the traversal always falls into an infinite loop (the method gets stuck in the nodes containing the first 2 items from the priorityQueue above).
I tried the tree class with BinaryTree<int>, and everything works fine in that case, but with the code above the issue happens. The code for the tree is this (in the code, previous == left and next == right; I am reusing the Node class already implemented for my List class): https://pastebin.com/ZKLjuBc8.
The code for the List used in this example is: https://pastebin.com/Dprh1Pfa. And the code for the Node class used for both the List and the BinaryTree classes is: https://pastebin.com/ATLvYyft. Can anyone tell me what I am missing here? What am I getting wrong?
UPDATE
I have tried a version using only the C++ STL (with no custom List or BinaryTree implementations), but the same problem happened. The code is here: https://pastebin.com/q0wrVYBB.

Too many things to mention as comments so I'm using an answer, sorry:
So going top to bottom through the code:
Why are you defining all methods outside the class? That just makes the code so much harder to read and is much more work to type.
Node::Node()
NULL is C code, use nullptr. And why not use member initialization in the class?
class Node {
private:
    T data{};
    Node * previous{nullptr};
    Node * next{nullptr};
    ...
Node::Node(Node * node) {
What is that supposed to be? You create a new node, copy the value and attach it to the existing list of Nodes like a Remora.
Is this supposed to replace the old Node? Be a move constructor?
Node::Node(T data)
Write
Node<T>::Node(T data_ = T{}) : data{data_} { }
and remove the default constructor. The member initialization from (1) initializes the remaining members.
Node::Node(T data, Node * previous, Node * next)
Again creating a Remora. This is not inserting into an existing list.
T Node::getData(), void Node::setData(T value)
If everyone can get and set data then just make it public. That will also mean it will work with const Node<T>. Your functions are not const correct because you lack all the const versions.
Same for previous and next. But those should actually do something when you set the member. The node you point to should point back to you, or be made to do so:
void Node::setPrevious(Node * previous) {
    // don't break an existing list
    assert(this->previous == nullptr);
    assert(previous->next == nullptr);
    this->previous = previous;
    previous->next = this;
}
Think about the copy and move constructors and assignment.
Follow the rule of 0/3/5: https://en.cppreference.com/w/cpp/language/rule_of_three . This goes for Node, List, ... all the classes.
List::List()
Simpler to use
Node<T> * first{nullptr};
List::~List()
You are deleting the elements of the list front to back, each time traversing the list from the front until you find index number i. Besides being horribly inefficient, this walks through front nodes that have already been deleted. This is "use after free".
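A minimal sketch of a destructor that walks the list exactly once and never reads a freed node. SimpleNode and SimpleList are hypothetical stand-ins for this illustration, not the classes from the pastebin links, and copying is simply disabled here rather than implemented.

```cpp
#include <cstddef>

// Hypothetical stand-in for the question's Node<T>; not the pastebin code.
template <typename T>
struct SimpleNode {
    T data{};
    SimpleNode* next{nullptr};
};

template <typename T>
class SimpleList {
public:
    SimpleList() = default;
    // Copying is disabled so that two lists can never own the same nodes.
    SimpleList(const SimpleList&) = delete;
    SimpleList& operator=(const SimpleList&) = delete;

    ~SimpleList() {
        // Remember the successor before deleting, so no freed node is read.
        SimpleNode<T>* current = first;
        while (current != nullptr) {
            SimpleNode<T>* following = current->next;
            delete current;
            current = following;
        }
    }

    // Adds a copy of value at the front of the list.
    void push_front(const T& value) {
        first = new SimpleNode<T>{value, first};
        ++count;
    }

    std::size_t size() const { return count; }

private:
    SimpleNode<T>* first{nullptr};
    std::size_t count{0};
};
```

Each node is visited once, so destruction is linear in the length of the list instead of re-walking from the front for every index.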
void List::insert(T data)
this->first = new Node<T>();
this->first->setData(data);
just write
first = new Node<T>(data);
And if insert will append to the tail of the list then why not keep track of the tail so the insert runs in O(1)?
void List::update(int index, T data)
If you need access to a list by index that is a clear sign that you are using the wrong data structure. Use a vector, not a list, if you need this.
void List::remove(int index)
As mentioned in comments there are 2 memory leaks here. Also aux->next->previous still points at the deleted aux likely causing "use after free" later on.
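A hedged sketch of a removal that re-points both neighbours before freeing the node. DNode and removeAt are hypothetical names for this illustration, and the index is zero-based here, whereas the question's List appears to count from 1.

```cpp
#include <cstddef>

// Hypothetical doubly linked node; field names mirror the question's Node class.
struct DNode {
    int data{0};
    DNode* previous{nullptr};
    DNode* next{nullptr};
};

// Unlinks and frees the node at zero-based position index.
// Returns the (possibly new) head; an out-of-range index changes nothing.
DNode* removeAt(DNode* head, std::size_t index) {
    DNode* victim = head;
    for (std::size_t i = 0; victim != nullptr && i < index; ++i) {
        victim = victim->next;
    }
    if (victim == nullptr) {
        return head;
    }
    // Re-point both neighbours first, so nothing refers to freed memory.
    if (victim->previous != nullptr) {
        victim->previous->next = victim->next;
    } else {
        head = victim->next;  // the first node is being removed
    }
    if (victim->next != nullptr) {
        victim->next->previous = victim->previous;
    }
    delete victim;
    return head;
}
```

The node is deleted exactly once and only after both the forward and the backward link around it have been repaired.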
int List::size()
Nothing wrong here, that's a first. But if you need this frequently you could keep track of the size of the list in the List class.
Node * List::get(int index)
Nothing wrong except the place where you use this has already freed the nodes so this blows up. Missing the const counterpart. And again a strong indication you should be using a vector.
void List::set(int index, Node * value)
What's this supposed to do? Replace the n-th node in a list with a new node? Insert the node at a specific position? What it actually does is follow the list for index steps and then assign the value of value to the local variable aux. Meaning it does absolutely nothing, slowly.
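To illustrate why assigning to a local pointer has no effect, here is a small sketch with hypothetical names (PNode, setLocalOnly, setData). The second function shows one possible meaning of "set", namely overwriting the payload of the node at a zero-based index; that interpretation is an assumption.

```cpp
// Hypothetical singly linked node for this illustration.
struct PNode {
    int data{0};
    PNode* next{nullptr};
};

// Mirrors the pattern described above: aux is a local copy of a pointer,
// so assigning to it changes nothing in the list.
void setLocalOnly(PNode* head, int index, PNode* value) {
    PNode* aux = head;
    for (int i = 0; i < index && aux != nullptr; ++i) {
        aux = aux->next;
    }
    aux = value;  // only the local variable changes
    (void)aux;    // silences "value never used" warnings
}

// Overwrites the payload of the node at index.
// Returns false when the index is out of range.
bool setData(PNode* head, int index, int value) {
    PNode* aux = head;
    for (int i = 0; i < index && aux != nullptr; ++i) {
        aux = aux->next;
    }
    if (aux == nullptr) {
        return false;
    }
    aux->data = value;  // writes through the pointer into the real node
    return true;
}
```

Writing through aux (aux->data = ...) modifies the node, while writing to aux itself only redirects the local variable.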
int List::find(T data)
Why return an index? Why not return a reference to the node? Also const and non-const version.
void List::sort()
This code looks like a bubblesort. Assuming it wasn't totally broken by all the previous issues, it would be O(n^4). I'm assuming the if(jMin != i) is supposed to swap the two elements in the list. Well, it doesn't.
I'm giving up now. This is all just the support classes to implement the BinaryTree, which itself is just support. 565 lines of code before you even start with your actual problem, and a lot of it seems broken one way or another. None of it can work with the state Node and List are in, especially with copy construction / copy assignment of lists.
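For comparison, here is a rough sketch of the tree-building step using only standard containers. It is not the asker's code: HNode, buildHuffmanTree and collectCodes are hypothetical names, and ownership is kept in std::unique_ptr so that merged subtrees are moved into their parent rather than copied.

```cpp
#include <algorithm>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node for this sketch; not the question's HuffmanNode.
struct HNode {
    char symbol{0};
    int frequency{0};
    std::unique_ptr<HNode> left;
    std::unique_ptr<HNode> right;
};
using HNodePtr = std::unique_ptr<HNode>;

// Heap comparator: with the std heap algorithms this keeps the lowest
// frequency at the front of the vector.
bool higherFrequency(const HNodePtr& a, const HNodePtr& b) {
    return a->frequency > b->frequency;
}

// Removes and returns the least frequent subtree from the heap.
HNodePtr popLowest(std::vector<HNodePtr>& heap) {
    std::pop_heap(heap.begin(), heap.end(), higherFrequency);
    HNodePtr lowest = std::move(heap.back());
    heap.pop_back();
    return lowest;
}

// Builds the tree by repeatedly merging the two least frequent subtrees.
HNodePtr buildHuffmanTree(const std::string& text) {
    std::map<char, int> counts;
    for (char c : text) {
        ++counts[c];
    }
    std::vector<HNodePtr> heap;
    for (const auto& entry : counts) {
        auto leaf = std::make_unique<HNode>();
        leaf->symbol = entry.first;
        leaf->frequency = entry.second;
        heap.push_back(std::move(leaf));
    }
    if (heap.empty()) {
        return nullptr;
    }
    std::make_heap(heap.begin(), heap.end(), higherFrequency);
    while (heap.size() > 1) {
        HNodePtr a = popLowest(heap);
        HNodePtr b = popLowest(heap);
        auto parent = std::make_unique<HNode>();
        parent->frequency = a->frequency + b->frequency;
        parent->left = std::move(a);   // the subtree is moved, children included
        parent->right = std::move(b);
        heap.push_back(std::move(parent));
        std::push_heap(heap.begin(), heap.end(), higherFrequency);
    }
    return std::move(heap.front());
}

// Walks the tree and records the bit string that leads to each leaf.
void collectCodes(const HNode* node, const std::string& prefix,
                  std::map<char, std::string>& codes) {
    if (node == nullptr) {
        return;
    }
    if (!node->left && !node->right) {
        codes[node->symbol] = prefix.empty() ? "0" : prefix;
        return;
    }
    collectCodes(node->left.get(), prefix + "0", codes);
    collectCodes(node->right.get(), prefix + "1", codes);
}
```

Because every node has exactly one owner and leaves have null children, a traversal such as collectCodes always terminates.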

Related

Why can't a node in a linked list be responsible for deleting itself? (C++)

My idea is that once the first node is deleted from the linked list class then the rest will follow. My implementation does not work in practice. Is there a solution to deleting a whole list node by node instead of the linked list deleting all of them? Should my nodeT have a destructor?
Linked list implementation:
#include <iostream>
#include <cassert>
#include "nodeT.h"
class linkedListSort
{
public:
int print();
//Function to output the elements of the list
//Postcondition: Elements of the list are output on the
// standard output device. Member current is reset to beginning node
// and currentIndex is reset to 0
//Returns: number of items printed in the list
// also outputs error if number of items printed
// does not equal the length of the list
void insertAt(int location, elemType& insertItem);
//Function to insert an item in the list at the
//position specified by location. The item to be inserted
//is passed as a parameter to the function.
//Postcondition: Starting at location, the elements of the
// list are shifted down, list[location] = insertItem;,
// and length++;. If the list is full or location is
// out of range, an appropriate message is displayed.
linkedListSort(int size = 100);
~linkedListSort();
protected:
//consider making const
nodeT<elemType> *beginningNode; // handle to the beginning of the list
nodeT<elemType> *current; // pointer to current node
int currentIndex; //int representing which node in the list current is pointing to
int length; //to store the length of the list
int maxSize; //to store the maximum size of the list
};
template <class elemType>
linkedListSort<elemType>::linkedListSort(int size)
{
if (size < 0)
{
cerr << "The array size must be positive. Creating "
<< "an array of size 100. " << endl;
maxSize = 100;
}
else
maxSize = size;
beginningNode = NULL;
current = NULL; // initialize to empty linked list
length = 0;
currentIndex = -1; // there is no item that current points to
}
template <class elemType>
linkedListSort<elemType>::~linkedListSort()
{
delete beginningNode; // this should delete all linked list items ( see nodeT destructor )
}
linked list node implementation:
template <class elemType>
class nodeT {
public:
nodeT(elemType& infoParam, nodeT<elemType> *linkParam); //standard
nodeT(elemType& infoParam); //if unlinked node (ex. last item)
nodeT();
//copy constructor
nodeT(nodeT<elemType>& node);
~nodeT();
elemType *info;
nodeT *link;
};
template<class elemType>
nodeT<elemType>::nodeT(elemType& infoParam, nodeT<elemType> *linkParam) {
info = &infoParam;
link = linkParam;
}
//when link is null (last item and uncircular)
template<class elemType>
nodeT<elemType>::nodeT(elemType& infoParam) {
info = &infoParam;
link = NULL;
}
//in case node is needed before info or link is known (default)
template<class elemType>
nodeT<elemType>::nodeT() {
info = NULL;
link = NULL;
}
template<class elemType>
nodeT<elemType>::nodeT(nodeT<elemType>& node) {
info = new elemType();
if (node.link != NULL)
link = new nodeT();
*info = *(node.info); // copy by value
if (node.link != NULL)
*link = *(node.link);
else
link = NULL;
}
template<class elemType>
nodeT<elemType>::~nodeT() {
delete info;
if (link != NULL)
delete link;
}
The last part of the node implementation is the node destructor. If the member of nodeT link is of type nodeT then the code delete link will call the same destructor but just on another instance. Therefore each node should destroy itself once the first node is destroyed. The first node is destroyed in the linked list implementation as such: delete beginningNode where beginningNode always points to the first node in the linked list.
Am I close to a solution? Or am I just going down a rabbit hole that C++ doesn't want you to go down?
The actual error has to do with an assertion failing. Then eventually I can copy this to my clipboard: "Unhandled exception at 0x553056E8 (msvcr120d.dll) in chap10Ex1.exe: 0xC0000005: Access violation reading location 0x00000002."

Technically your nodes aren't responsible for deleting themselves, they're responsible for deleting the next node in the list.
This may seem attractive, but there are some implications here that you may not have considered. First, as @WhozCraig said in a comment, you're going to end up building quite a deep call stack of destructors for a big list.
Secondly, if you managed to build yourself a circular link chain you're going to go all the way round it and then hit undefined behaviour when you try to delete the first node for the second time. Nothing in your code prevents that kind of misuse - these kinds of guarantees are one of the advantages of using a container class for the list which hides the operation of the nodes themselves from the clients.
I think there's also an issue here about ownership. Nodes don't allocate each other, but they are responsible for deleting each other, which means that each node owns the next node in the list. This may not be obvious from the API you provide, which requires the user of the list to create new nodes, but then when you add them to the list the list takes responsibility for deleting them. This means in client code there's no balance between new and delete, which is going to look a bit odd.
However, much worse in ownership terms is that the list node destructor calls delete info, which is created in one of the constructors as the address of a reference passed in which might not even be on the heap. You can't tell, nobody's made any promises there. At the very least you need to accept a pointer instead of a reference, as that's a hint that an ownership transfer is happening. Even better would be to accept a std::unique_ptr<elemType>, which makes the transfer of ownership very explicit (you can still provide access to the contents via a raw pointer).
In general I would advise that if a data structure is going to take responsibility for deleting something that data structure should also be responsible for creating it. Otherwise, you should leave it alone. STL containers don't delete contained pointers - if you delete a std::vector<int *> you have to delete all the int * members yourself first. This gives the user flexibility - they don't have to store pointers to things which are on the heap - and it means it's consistent - those responsible for creating something should, in general, also be responsible for disposing of it. It also means the std::list<T> can contain any T - including pointers. What happens if you try to instantiate a nodeT<int *>? What happens when its destructor runs?
So I'd say if you're going to have nodes deleting each other you should also have nodes creating each other (and don't let the user do it). And if you're going to have nodes deleting their data items you should most definitely also be creating those data items. Or better, just leave that alone and don't touch the lifecycle of something passed to you by reference.
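A sketch of that ownership model, assuming a hypothetical OwningList class: the list both creates and destroys its nodes, stores its own copy of each value, and tears the chain down in a loop so that a long list does not produce a deep stack of nested destructor calls.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical list that creates and destroys its own nodes.
template <typename T>
class OwningList {
public:
    OwningList() = default;
    OwningList(const OwningList&) = delete;
    OwningList& operator=(const OwningList&) = delete;

    ~OwningList() {
        // Detach one node at a time: each step frees the old head after its
        // link has been emptied, so destruction is a loop, not recursion.
        while (head) {
            head = std::move(head->link);
        }
    }

    // Stores a copy of value; the caller's object is never deleted by the list.
    void pushFront(const T& value) {
        auto node = std::make_unique<Node>(value);
        node->link = std::move(head);
        head = std::move(node);
        ++length;
    }

    std::size_t size() const { return length; }

    // Precondition: the list is not empty.
    const T& front() const { return head->info; }

private:
    struct Node {
        explicit Node(const T& v) : info(v) {}
        T info;
        std::unique_ptr<Node> link;
    };
    std::unique_ptr<Node> head;
    std::size_t length{0};
};
```

With this layout an OwningList<int *> would store the pointers and leave the pointed-to objects alone, which matches the behaviour described above for the standard containers.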

reverse linked list using recursion

I am trying to reverse a linked list using recursion. I made the reverse() function to reverse the list. I created a linked list in main() and also defined print() method.
I don't know what mistake I am making. Please help me correct it. The code snippets are given below.
struct node
{
int data;
struct node *next;
}*head;
void reverse(node **firstnode,node *n)
{
if(n==NULL)
{
head=n;
return;
}
reverse(&head,n->next);
struct node *q=n->next;
n->next=q;
q->next=NULL;
}
void main()
{
......
head=first;
reverse(&first,first);
print(head);
}

It may not address your question directly. However, you mentioned C++11 in the tags, so take a look at std::forward_list. It is a standard container that is based on a singly linked list.
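As a small illustration, std::forward_list already provides a reverse() member function. The helper name reversedCopy is made up for this sketch.

```cpp
#include <forward_list>

// Takes the list by value and reverses that copy with the container's own
// reverse() member function, leaving the caller's list untouched.
std::forward_list<int> reversedCopy(std::forward_list<int> values) {
    values.reverse();
    return values;
}
```

Calling reversedCopy on a list holding 2, 3, 4, 5, 6 yields a list holding 6, 5, 4, 3, 2.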

#include <iostream>

// Reverses the list recursively and returns the new head.
List* recur_rlist(List* head)
{
    List* result;
    // An empty list or a single node is already reversed.
    if(!(head && head->next))
        return head;
    result = recur_rlist(head->next);
    head->next->next = head;
    head->next = NULL;
    return result;
}
void printList(List* head)
{
    while(head != NULL) {
        std::cout<<head->data<<" ";
        head = head->next;
    }
}
int main()
{
    List* list = createNode(2);
    append(list, createNode(3));
    append(list, createNode(4));
    append(list, createNode(5));
    append(list, createNode(6));
    List* revlist = recur_rlist(list);
    printList(revlist);
}

I think you mixed up your addressing at the end of the reverse function, it should probably look like:
q->next=n;
n->next=NULL;
Also, I am not sure if you need the "firstnode" argument.
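A hedged sketch of how that swap could look once the recursion also stops at the last node and the new head is returned instead of being written to a global. The function name reverseList is hypothetical; the node struct follows the question.

```cpp
struct node {
    int data;
    node* next;
};

// Reverses the list that starts at n and returns its new first node.
node* reverseList(node* n) {
    if (n == nullptr || n->next == nullptr) {
        return n;  // empty list or last node: nothing to relink
    }
    node* q = n->next;               // q ends up as the tail of the reversed rest
    node* newHead = reverseList(q);  // reverse everything after n
    q->next = n;                     // append n behind that tail
    n->next = nullptr;               // n is now the last node
    return newHead;
}
```

A caller would write head = reverseList(head); so the extra firstnode parameter is no longer needed.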

Since you want to understand the code, and you have several great resources with finished code already, more finished code examples aren't needed. I'll just answer with some concepts and point you at the errors you need to fix.
First, some background concepts.
Linked lists: first and rest
Any linked list is either empty, or can be broken down into first (a node) and rest (a smaller linked list, or empty). This makes recursion much easier.
if (head){
    node * first = head;
    node * rest = head->next;
}
Invariant (simplified): A guarantee that is always true at the start and end of your function.
In a linked list, you expect that head points to a node, which points to another node, and so forth, until you get to the end, which is signaled by a nullptr. All of the nodes are different. All of the nodes are valid. These are the guarantees that must be true before you call your function and when your function returns.
In a recursive function, the invariants must hold on the sublist you are reversing at every step, because you return from the function at every step. But this makes recursion much easier because all you have to do is make sure that if the input is good, then your function will return a good value at the current step.
End of recursion:
You can prove that your recursive function never gets in an infinite loop by combining the previous concepts. If the invariants hold, then each step will work, and because each recursive call will take rest, which is guaranteed to be either nullptr or a shorter list, eventually we have to reach the end. And of course show that you handle the end.
Okay, on to the actual problems:
You don't handle end of recursion correctly. You just set head=nullptr at the end, and I'm pretty sure that's not what you want for head. You may want to handle the end if (nullptr == n->next), because then you know that is the last node. Of course, you still have to correctly handle the trivial case where nullptr==head.
You don't preserve invariants. You tried, but it looks like your bookkeeping is just all wrong. I suggest using the debugger or 3x5 notecards to step through what you're actually doing to fix the actual work of swapping things around. For example, it looks like you just confused which node is which in this code snippet:
struct node *q=n->next; // n is first, q is rest
// what if nullptr == q?
n->next=q; // n->next = n->next doesn't actually change anything
q->next=NULL; // this must already be true if reverse(rest) did its job
// q and n were swapped?
Also, your function takes "firstnode" but does not use it, but instead sets the global variable "head" as a side effect.

How to add a Node pointer to a Vector pointer?

I am trying to create a maze that consists of Node objects. Each Node object has a member variable Node *attachedNodes[4] that essentially contains all of the attached Nodes that will later tell the program the options it has when it is doing a breadth-first search. Every time I think that I understand pointers, another issue like this comes up, and I feel lost all over again. Especially since it was working fine (as far as I knew) until I changed something that I thought was unrelated. Anyways, here is where the issues are:
My Node object looks like this
class Node {
public:
...
void attachNewNode(Node *newNode, int index);
...
private:
...
Node *attachedNodes[4];
...
};
My function to attach the Nodes looks like this:
void Node::attachNewNode(Node *newNode, int index) {
*attachedNodes[index] = *newNode;
}
And then lastly, the part of the other function that is calling the attachNewNode function looks like this:
int mazeIndex = 0;
while (inStream.peek() != EOF) {
int count = 0;
Node n;
Node m;
...
if (System::isNode(name2)) {
m = System::findNode(name2);
}
else {
m = Node(name2);
maze[mazeIndex] = m;
mazeIndex++;
}
Node *temp;
*temp = m;
n.attachNewNode(temp, count); //The error usually happens here, but I added the rest of the code because through debugging it is only consistently in this whole area.
count++;
}
n.setNumberUsed(count);
}
Sorry that this got a little lengthy, but I've been searching all over this portion that I have provided trying to figure out what is wrong, but it would be nice to have someone that knows a little more about pointers give their input on the matter. The Node class was given to me, but everything else I made, so basically any of that could be changed. Thanks in advance for the help.

Your class contains a property:
Node *attachedNodes[4];
The above says that attachedNodes is an array that contains 4 pointers to Nodes. In your attachNewNode function, you do:
*attachedNodes[index] = *newNode;
This means that you are trying to assign value of newNode (as * dereferences the pointer) to the value of the element under attachedNodes[index]. What you probably want is:
attachedNodes[index] = newNode;
This means that you just want to store the address (as pointer is just an address to some place in memory) in the array of addresses.
There is also another error here:
Node *temp;
*temp = m;
n.attachNewNode(temp, count);
Again, you are interested in storing the address of node m. In order to do that, you need to get the said address:
Node *temp;
temp = &m;
n.attachNewNode(temp, count);
These are the most obvious problems with the above code, but there might be more.
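One further thing to watch, assuming the attached pointers are meant to outlive the loop iteration: &m is the address of a loop-local variable, which no longer exists once the iteration ends. The sketch below uses a hypothetical MazeNode class and addNode helper, and lets a std::deque own the nodes so that the stored addresses stay valid while the maze grows.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical, simplified version of the Node class from the question.
class MazeNode {
public:
    explicit MazeNode(std::string name) : name_(std::move(name)) {}

    // Stores the address itself; nothing is dereferenced or copied.
    void attachNewNode(MazeNode* newNode, std::size_t index) {
        if (index < 4) {
            attachedNodes_[index] = newNode;
        }
    }

    // Returns the attached node, or nullptr for an empty or invalid slot.
    MazeNode* attached(std::size_t index) const {
        return index < 4 ? attachedNodes_[index] : nullptr;
    }

    const std::string& name() const { return name_; }

private:
    std::string name_;
    MazeNode* attachedNodes_[4] = {nullptr, nullptr, nullptr, nullptr};
};

// The maze owns every node. Adding at the end of a std::deque does not
// invalidate pointers to existing elements, so returned addresses stay usable.
MazeNode* addNode(std::deque<MazeNode>& maze, const std::string& name) {
    maze.emplace_back(name);
    return &maze.back();
}
```

The pointers stored through attachNewNode then refer to the nodes inside the maze itself rather than to temporary copies.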

Inserting Before/After Node in Linked List

I am trying to insert nodes in a list based on the value of a data member. Basically, if the member isVip evaluates to true, that node gets precedence, and should be inserted ahead of any regular node (but behind any existing VIP nodes). Regular nodes simply get added at the end of the list.
I'm pretty sure I have a good idea of how to use two pointers to step through the list and insert elements for n > 2 where n is the number of current list members, but I'm sort of conceptually stuck for the case when there's only one node.
Here is my working version of code below:
void SelfStorageList::rentLocker(Locker e) {
int count = 0;
LockerNode *p = head;
if (isEmpty()) {
head = new LockerNode(e);
tail = head;
}
for(;p!=0;count++, p=p->next) {
if(count == 1) {
if (e.isVip) {
if(p->objLocker.isVip) {
LockerNode*p = new LockerNode(e, p->next);
}
}
}
}
As you can see, I'm checking to see if the passed in object is VIP, and then whether the current one is. Here, I've hit some trouble. Assuming both are VIP, will this line:
LockerNode*p = new LockerNode(e, p->next);
put the passed-in locker object in the correct place (i.e. after the current VIP one)? If so, would:
LockerNode*p = new LockerNode(e, p);
equivalently place it before? Is the use or absence of the 'next' member of the node what defines the placement location, or is it something entirely different?
Hope someone can clear my doubts, and sorry if it seems a foolish question! Thanks!

Simply iterate over the list while the next node has isVip set (current->next->isVip). After the iteration, the last node visited will be the last with isVip set, and you should insert the new node after that one.
It can be implemented in fewer lines, without the explicit isEmpty check, and without any counter. Even less than that if you use a standard container instead.
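A possible shape for this, as a sketch only: Locker and LockerNode are reduced to hypothetical minimal structs, and rentLocker is written as a free function that receives head and tail by reference. Note that a line such as LockerNode*p = new LockerNode(e, p->next); declares a new local p and never stores the node in the list, whereas here the new node is written into the link that should point to it.

```cpp
// Hypothetical minimal types; the real Locker and LockerNode have more members.
struct Locker {
    bool isVip{false};
    int id{0};
};

struct LockerNode {
    Locker objLocker;
    LockerNode* next{nullptr};
};

// Inserts e behind the last leading VIP node when e is a VIP, otherwise at
// the end of the list. head and tail are updated when necessary.
void rentLocker(LockerNode*& head, LockerNode*& tail, const Locker& e) {
    // link points at the pointer that should receive the new node:
    // first at head, then at the next member of each skipped node.
    LockerNode** link = &head;
    while (*link != nullptr && (!e.isVip || (*link)->objLocker.isVip)) {
        link = &(*link)->next;
    }
    LockerNode* node = new LockerNode{e, *link};
    *link = node;
    if (node->next == nullptr) {
        tail = node;  // the new node is now the last one
    }
}
```

The empty list, the one-node list and longer lists all go through the same loop, so neither an isEmpty check nor a counter is required. A regular locker could also be appended directly through tail to skip the walk.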

Pointing to the first object in a linked list, inside or outside class?

Which of these is a more correct way to store the first object in a linked list?
Or could someone please point out the advantages/disadvantages of each. Thanks.
class Node
{
int var;
Node *next;
static Node *first;
Node()
{
if (first == NULL)
{
first = this;
next = NULL;
}
else
//next code here
}
}
}
Node* Node::first = NULL;
new Node();
-- OR --
class Node
{
int var;
Node *next;
Node()
{
//next code here
}
}
Node* first = new Node();

The latter is definitely preferable. By making the first node pointer a static class member, you are basically saying that there will only be a single linked list in your whole program.
The second example lets you create several lists.

The first example has the definite drawback of only being able to create a single linked list in your entire program, so I wouldn't do that.
The second works fine, but doesn't shield the user of the class from how the linked list works. It would be better to add a second class, for example named LinkedList, that stores the 'first' pointer and performs list management.
Even better, you could use std::list instead.

It's most usual to have a separate List class and a separate Node class. Node is usually very simple. List holds a pointer to the first Node and implements the various list operations (add, remove, find and so on).
Something like the following
class List
{
public:
    List()
    {
        first = new Node();
    }
    void insert(int val);
    void remove(int val);
    // ... and so on
    ~List()
    {
        // ... clean up
    }
private:
    struct Node
    {
        int val;
        Node* next;
        Node(int val_ = 0, Node* next_ = 0)
            : val(val_), next(next_)
        {}
    };
    Node* first;
};
Note that you can place Node outside List if you want to, but this usually doesn't make much sense.

Presumably you may have more than one list? In which case, the static option is a non-starter.

You definitely don't want the "first" to be a static. This implies there's only one linked list in your entire program. static means that every Node in every linked list in your entire program has the same beginning.
That being said, you want your Node to have the fewest responsibilities. It makes sense for it to store its value and be able to get to the next Node. It adds complexity to add the job of maintaining the "first" pointer. For example, what happens when you insert a new element at the beginning? Then you'd have to update everyone's "first" pointer. Given the two choices above, I'd choose the second choice.
I would furthermore add a third choice. Add a "linked list" wrapper that gives you easy access to "first" and "last" and allows you to easily insert into and iterate through the list. Something like:
class LinkedList
{
    Node* First;
    Node* Last;
public:
    Node* GetFirst() {return First;}
    Node* GetLast() {return Last;}
    // insert after "where"
    void Insert(Node* where, Node* newNode);
    ...
};
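One way Insert might be filled in, as a sketch under stated assumptions: Node is a hypothetical minimal struct, a null where is taken to mean "insert at the front", and the wrapper does not take ownership of the nodes it links.

```cpp
// Hypothetical minimal node for this sketch.
struct Node {
    int value{0};
    Node* next{nullptr};
};

class LinkedList {
public:
    Node* GetFirst() const { return First; }
    Node* GetLast() const { return Last; }

    // Inserts newNode after where. A null where means "insert at the front"
    // (a convention assumed for this sketch).
    void Insert(Node* where, Node* newNode) {
        if (where == nullptr) {
            newNode->next = First;
            First = newNode;
        } else {
            newNode->next = where->next;
            where->next = newNode;
        }
        if (newNode->next == nullptr) {
            Last = newNode;  // the new node ended up at the end
        }
    }

private:
    Node* First{nullptr};
    Node* Last{nullptr};
};
```

Since First and Last are ordinary members, every LinkedList object keeps track of its own list, and several lists can exist side by side.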

Not uselessly limiting your code to a single list instance is one very good argument for code variant 2.
However, just from superficially looking at the two examples, the sheer number of lines of code also gives a good indication that variant 2 has merits over variant 1 by being significantly shorter.
