Recursive struct without pointers? (Huffman)

I defined a binary tree this way:
struct btree {
int x;
btree* left_child = nullptr;
btree* right_child = nullptr;
};
Then I have a vector of floats (prob_distr). I turn each float into a leaf and put all the leaves into a priority queue (with a custom comparison function, but it doesn't matter here).
auto comp = [] (btree &a, btree &b) -> bool { return a.x > b.x; };
priority_queue<btree, vector<btree>, decltype(comp)> q(comp);
for(auto t: prob_distr)
{
btree leaf;
leaf.x = t;
q.push(leaf);
}
For the Huffman algorithm, I then loop on the priority queue until it has only 1 element left: I take the 2 nodes at the top out of the queue, I create a new node with these two nodes as children, and I put this new node into the queue.
while(q.size() >= 2)
{
btree left = q.top(); q.pop();
btree right = q.top(); q.pop();
btree root;
root.x = left.x + right.x;
root.left_child = &left;
root.right_child = &right;
q.push(root);
}
The problem is that I have pointers to the left and right subtrees, and these trees are destroyed at the end of each loop iteration, so the pointers end up pointing to nothing. Is there any way to solve this, for example by having a struct which stores the child trees themselves and not just pointers, or by storing the nodes somewhere else, without it getting too complicated?

You need to use pointers here, because the size of the structure has to be known at compile time. If the struct contained itself as a member variable, its size would recurse to infinity.
You need to allocate the memory dynamically with new. Memory allocated with new stays allocated until you explicitly release it with delete. Just write
root.left_child = new btree(left);
root.right_child = new btree(right);
As already said, you also need to use delete to free the memory of the children. Be careful not to make copies of children which do not get deleted, and not to delete children which are still in use somewhere else. Consider using smart pointers.
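One way to follow the smart-pointer suggestion, as a minimal sketch rather than the definitive fix: every node is owned by a std::vector of std::unique_ptr (called node_pool here), and the priority queue only holds non-owning raw pointers. The names node_pool, build_huffman and make_node are made up for this illustration, and the weight is stored as a double instead of the question's int so that probabilities are not truncated.

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <vector>

struct btree {
    double x = 0.0;               // weight (probability) of this subtree
    btree* left_child = nullptr;  // non-owning; the pool owns every node
    btree* right_child = nullptr;
};

// Owns all nodes; they are freed automatically when the pool is destroyed.
using node_pool = std::vector<std::unique_ptr<btree>>;

// Hypothetical helper: builds the tree and returns its root (nullptr if empty).
btree* build_huffman(const std::vector<double>& prob_distr, node_pool& pool) {
    auto comp = [](const btree* a, const btree* b) { return a->x > b->x; };
    std::priority_queue<btree*, std::vector<btree*>, decltype(comp)> q(comp);

    // Allocates a node on the heap, hands ownership to the pool, and returns
    // a raw pointer that stays valid for as long as the pool lives.
    auto make_node = [&pool](double w, btree* l, btree* r) {
        pool.push_back(std::make_unique<btree>());
        btree* n = pool.back().get();
        n->x = w;
        n->left_child = l;
        n->right_child = r;
        return n;
    };

    for (double p : prob_distr) {
        assert(p >= 0.0);  // assumption: weights are non-negative
        q.push(make_node(p, nullptr, nullptr));
    }

    while (q.size() >= 2) {
        btree* left = q.top(); q.pop();
        btree* right = q.top(); q.pop();
        q.push(make_node(left->x + right->x, left, right));
    }
    return q.empty() ? nullptr : q.top();
}
```

The caller declares a node_pool, passes it to build_huffman, and uses the returned root only while the pool is alive. No explicit delete is needed. Note that the pool may reallocate its array of unique_ptr, but the nodes themselves never move, so the raw pointers stay valid.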

Related

How to avoid using new operator in C++?

I have a C++ program that creates Huffman codes for all characters in a file. It works well, but I want to create nodes without using the new operator because I know that you shouldn't use it. I tried using a global vector variable for saving nodes, but that doesn't work.
std::vector<Node> nodes;
Node* create_node(unsigned char value, unsigned long long counter, Node* left, Node* right) {
Node temp;
temp.m_value = value;
temp.m_counter = counter;
temp.m_left = left;
temp.m_right = right;
nodes.push_back(temp);
return &nodes[nodes.size() - 1];
}
Edit: I added more code, since I didn't really explain what doesn't work. The problem is in generate_code(): it never reaches nullptr. I also tried using Node and not Node*, but the same thing happened.
void generate_code(Node* current, std::string code, std::map<unsigned char, std::string>& char_codes) {
if (current == nullptr) {
return;
}
if (!current->m_left && !current->m_right) {
char_codes[current->m_value] = code;
}
generate_code(current->m_left, code + "0", char_codes);
generate_code(current->m_right, code + "1", char_codes);
}
void huffman(std::ifstream& file) {
std::unordered_map<unsigned char, ull> char_frequency;
load_data(file, char_frequency);
std::priority_queue<Node*, std::vector<Node*>, Comparator> queue;
for (auto& node : char_frequency) {
queue.push(create_node(node.first, node.second, nullptr, nullptr));
}
while (queue.size() != 1) {
Node* left = queue.top();
queue.pop();
Node* right = queue.top();
queue.pop();
auto counter = left->m_counter + right->m_counter;
queue.push(create_node('\0', counter, left, right));
}
std::map<unsigned char, std::string> char_codes;
Node* root = queue.top();
generate_code(root, "", char_codes);
for (auto& i : char_codes) {
std::cout << +i.first << ": " << i.second << "\n";
}
}

The general answer is of course to use smart pointers, like std::shared_ptr<Node>.
That said, using regular pointers is not that bad, especially if you hide all pointers from the outside. I wouldn't agree with "you shouldn't use new", more like "you should realize that you have to make sure not to create a memory leak if you do".
In any case, for something like what you are doing, especially with your vector, you don't need actual pointers at all. Simply store an index into your vector and replace every occurrence of Node* by int, somewhat like:
class Node
{
public:
// constructors and accessors
private:
ValueType value;
int index_left;
int index_right;
};
I used a signed integer as index here in order to allow storing -1 for a non-existent reference, similar to a null pointer.
Note that this only works if nothing gets erased from the vector, at least not before everything is destroyed. If flexibility is the key, you need pointers of some sort.
Also note that you should not have a vector as a global variable. Instead, have a wrapping class, of which Node is an inner class, somewhat like this:
class Tree
{
public:
class Node
{
...
};
// some methods here
private:
vector<Node> nodes;
};
With such an approach, you can encapsulate your Node class better. Tree should most likely be a friend. Each Node would store a reference to the Tree it belongs to.
Another possibility would be to make the vector a static member for Node, but I would advise against that. If the vector is a static member of Node or a global object, in both cases, you have all trees you create being in one big container, which means you can't free your memory from one of them when you don't need it anymore.
While this would technically not be a memory leak, in practice, it could easily work as one.
On the other hand, if it is stored as a member of a Tree object, the memory is automatically freed as soon as that object is removed.
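A likely reason the question's create_node misbehaves is that push_back may reallocate the vector's storage, which invalidates every Node* returned earlier; indices do not have that problem. Below is a minimal sketch of the index-based layout described above, wrapped in a class. The class name HuffmanTree and the members add_node, collect, nodes_ and root_ are invented for illustration, and a std::map stands in for the question's unordered_map to keep the example deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

class HuffmanTree {
public:
    // Builds the tree from (character, frequency) pairs.
    explicit HuffmanTree(const std::map<unsigned char, unsigned long long>& freq) {
        // Min-heap of (counter, node index); indices stay valid when nodes_ grows.
        using Entry = std::pair<unsigned long long, int>;
        std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> queue;
        for (const auto& kv : freq)
            queue.push({kv.second, add_node(kv.first, kv.second, -1, -1)});
        while (queue.size() > 1) {
            Entry left = queue.top(); queue.pop();
            Entry right = queue.top(); queue.pop();
            unsigned long long counter = left.first + right.first;
            queue.push({counter, add_node('\0', counter, left.second, right.second)});
        }
        root_ = queue.empty() ? -1 : queue.top().second;
    }

    // Returns the bit string assigned to every leaf character.
    std::map<unsigned char, std::string> codes() const {
        std::map<unsigned char, std::string> out;
        collect(root_, "", out);
        return out;
    }

private:
    struct Node {
        unsigned char value;
        unsigned long long counter;
        int left;   // index into nodes_, or -1 for "no child"
        int right;
    };

    int add_node(unsigned char v, unsigned long long c, int l, int r) {
        nodes_.push_back(Node{v, c, l, r});
        return static_cast<int>(nodes_.size()) - 1;
    }

    void collect(int i, const std::string& code,
                 std::map<unsigned char, std::string>& out) const {
        if (i == -1) return;  // -1 plays the role of a null pointer
        assert(i >= 0 && static_cast<std::size_t>(i) < nodes_.size());
        const Node& n = nodes_[i];
        if (n.left == -1 && n.right == -1) { out[n.value] = code; return; }
        collect(n.left, code + "0", out);
        collect(n.right, code + "1", out);
    }

    std::vector<Node> nodes_;
    int root_ = -1;
};
```

All nodes of one tree live in that tree's own vector, so destroying the HuffmanTree object releases them together, which is the behaviour argued for above.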

but I want to create nodes without using new operator because I know that you shouldn't use it.
The reason it is discouraged to use new directly is that the semantics of ownership (i.e. who is responsible for the corresponding delete) isn't clear.
The C++ standard library provides dynamic memory management utilities for this, the smart pointers in particular.
So I think your create function should look as follows:
std::unique_ptr<Node> create_node(unsigned char value, unsigned long long counter, Node* left, Node* right) {
std::unique_ptr<Node> temp = std::make_unique<Node>();
temp->m_value = value;
temp->m_counter = counter;
temp->m_left = left;
temp->m_right = right;
return temp;
}
This way it's clear that the caller takes ownership of the newly created Node instance.
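If the tree itself should own its nodes, the children can be unique_ptr members as well, so that ownership moves into the parent. This is only a sketch under the assumption that Node has just the four members used in the question; the default arguments are an addition for convenience.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Node {
    unsigned char m_value = 0;
    unsigned long long m_counter = 0;
    std::unique_ptr<Node> m_left;   // each parent owns its children
    std::unique_ptr<Node> m_right;
};

// Takes ownership of `left` and `right` and returns ownership of the new node.
std::unique_ptr<Node> create_node(unsigned char value, unsigned long long counter,
                                  std::unique_ptr<Node> left = nullptr,
                                  std::unique_ptr<Node> right = nullptr) {
    auto temp = std::make_unique<Node>();
    temp->m_value = value;
    temp->m_counter = counter;
    temp->m_left = std::move(left);
    temp->m_right = std::move(right);
    return temp;
}
```

For example, create_node('\0', 3, create_node('a', 1), create_node('b', 2)) builds a three-node tree, and destroying the returned pointer frees all three nodes. One caveat: std::priority_queue::top() returns a const reference, so a unique_ptr cannot be moved out of the queue directly; the queue would have to hold something else (raw pointers, indices) or be replaced by a container that allows moving elements out.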

How to delete dynamically allocated struct consisting dynamically allocated array member in C++?

I am doing my homework, which requires me to implement a Trie Tree without using vector. I have a struct defined as follows:
typedef struct{
char _name;
int32_t * _children;
int32_t _children_size;
int32_t _children_capacity;
int32_t * _location;
int32_t _location_size;
int32_t _location_capacity;
} TrieTreeNode;
To reduce the memory use, I store all the pointers of TrieTreeNode into a global variable TrieTreeNode ** nodes_array. Then the _children member of each TrieTreeNode is just an array whose elements are int32_t indices to nodes_array.
For example, say we have TrieTreeNode * parent. To access its first child, we use nodes_array[parent -> _children[0]].
My question is how to delete the whole Trie Tree? I have tried the following approach (tail is the number of pointers nodes_array has):
void delete_node(TrieTreeNode *node){
delete [] node -> _children;
delete [] node -> _location;
}
void delete_tree(){
for (int i = 0; i < tail; i++){
delete_node(nodes_array[i]);
}
delete [] nodes_array;
nodes_array = NULL;
}
However, when I used both the ps -l command and GDB to monitor the memory use of my program before and after deleting a tree, the memory only decreased a little. The RSS goes from 13744 to 13156, while it is only 1072 before I build the tree.
Any suggestions will be appreciated!

You are not deleting the nodes, only the pointers within each node.
Consider this:
void delete_tree(){
for (int i = 0; i < tail; i++){
delete_node(nodes_array[i]);
delete nodes_array[i]; // Delete the node itself.
}
delete [] nodes_array;
nodes_array = NULL;
}
After calling delete_node to free the two arrays in each node, you should then delete the node itself with delete nodes_array[i] to free the remaining memory for each node.
Personally though, I am a fan of defining constructors and destructors for structures so that I don't have to remember to initialize everywhere I create them or do the extra deletion everywhere I might dispose of one.
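A rough sketch of that constructor/destructor idea for the struct from the question. The default capacity of 4 is an arbitrary assumption, and nodes_array and tail are passed as parameters here instead of being globals.

```cpp
#include <cassert>
#include <cstdint>

struct TrieTreeNode {
    char _name;
    int32_t* _children;
    int32_t _children_size;
    int32_t _children_capacity;
    int32_t* _location;
    int32_t _location_size;
    int32_t _location_capacity;

    // Allocates both arrays up front so every node starts in a valid state.
    explicit TrieTreeNode(char name, int32_t capacity = 4)
        : _name(name),
          _children(new int32_t[capacity]), _children_size(0), _children_capacity(capacity),
          _location(new int32_t[capacity]), _location_size(0), _location_capacity(capacity) {}

    // Runs automatically on `delete node`, so the arrays cannot be forgotten.
    ~TrieTreeNode() {
        delete[] _children;
        delete[] _location;
    }

    // Copying would make two nodes delete the same arrays, so forbid it.
    TrieTreeNode(const TrieTreeNode&) = delete;
    TrieTreeNode& operator=(const TrieTreeNode&) = delete;
};

// With the destructor in place, freeing the tree is one delete per node.
void delete_tree(TrieTreeNode**& nodes_array, int32_t& tail) {
    for (int32_t i = 0; i < tail; i++) {
        delete nodes_array[i];   // frees the two arrays, then the node itself
    }
    delete[] nodes_array;
    nodes_array = nullptr;
    tail = 0;
}
```

Even with every node freed, the resident size reported by ps may not fall back to its starting value, because the allocator can keep freed memory for reuse instead of returning it to the operating system right away. A tool such as Valgrind is usually a more reliable way to check for leaks.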

Using the destructor to free linked objects

I have a class called "node". I link a bunch of node objects together to form a linked list. When the "node" destructor is called, it only deletes the first node. How do I iterate through the entire linked list of nodes and delete each node object?
Here is the class definition:
class Node
{
private:
double coeff;
int exponent;
Node *next;
public:
Node(double c, int e, Node *nodeobjectPtr)
{
coeff = c;
exponent = e;
next = nodeobjectPtr;
}
~Node()
{
printf("Node Destroyed");
}
};
The destructor is called by invoking delete on the pointer to the first node of the linked node list.

Since you don't know how many nodes there are in a list, invoking destructors recursively is not a good idea unless you have firm bounds on its length: each call uses some stack space, and when the available stack space is exhausted you get Undefined Behavior, such as a crash.
So if you absolutely want to deallocate the following nodes in a node's destructor, the destructor has to unlink each node before destroying it.
It can go like this:
Node* unlink( Node*& p )
{
Node* result = p;
p = p->next;
result->next = nullptr;
return result;
}
Node::~Node()
{
while( next != nullptr )
{
delete unlink( next );
}
}
But better, make a List object that has ownership of the nodes in a linked list.
Of course, unless this is for learning purposes or there is a really good reason to roll your own linked list, just use a std::vector (and yes I mean that, not std::list).

How do I iterate through the entire linked list of nodes and delete each node object?
It would be cleaner if you had a separate class to manage the entire list, so that nodes can be simple data structures. Then you just need a simple loop in the list's destructor:
while (head) {
Node * victim = head;
head = victim->next; // Careful: read this before deleting
delete victim;
}
If you really want to delegate list management to the nodes themselves, you'll need to be a bit more careful:
while (next) {
Node * victim = next;
next = victim->next;
victim->next = nullptr; // Careful: avoid recursion
delete victim;
}
Under this scheme, you'll also need to be careful when deleting a node after removing it from the list - again, make sure you reset its pointer so it doesn't delete the rest of the list. That's another reason to favour a separate "list" class.
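A minimal sketch of the separate "list" class both answers recommend. The class name List and its members push_front, size, head_ and size_ are invented for this example, and Node is reduced to a plain struct without a destructor.

```cpp
#include <cassert>
#include <cstddef>

// Plain data node: no destructor logic, so deleting one never touches its neighbours.
struct Node {
    double coeff;
    int exponent;
    Node* next;
};

// Hypothetical owner class for the whole list.
class List {
public:
    List() = default;
    List(const List&) = delete;             // copying would lead to double deletes
    List& operator=(const List&) = delete;

    // Iterative cleanup: constant stack usage no matter how long the list is.
    ~List() {
        while (head_) {
            Node* victim = head_;
            head_ = victim->next;   // read the link before deleting
            delete victim;
        }
    }

    // Inserts a new term at the front of the list.
    void push_front(double c, int e) {
        head_ = new Node{c, e, head_};
        ++size_;
    }

    std::size_t size() const { return size_; }
    const Node* head() const { return head_; }

private:
    Node* head_ = nullptr;
    std::size_t size_ = 0;
};
```

Because the loop lives in one place, removing a single node from the middle of the list no longer risks taking the rest of the list with it.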

Pointers and reference issue

I'm creating something similar to structure list. At the beginning of main I declare a null pointer. Then I call insert() function a couple of times, passing reference to that pointer, to add new elements.
However, something seems to be wrong. I can't display the list's elements; std::cout just breaks the program, even though it compiles without a warning.
#include <iostream>
struct node {
node *p, *left, *right;
int key;
};
void insert(node *&root, const int key)
{
node newElement = {};
newElement.key = key;
node *y = NULL;
std::cout << root->key; // this line
while(root)
{
if(key == root->key) exit(EXIT_FAILURE);
y = root;
root = (key < root->key) ? root->left : root->right;
}
newElement.p = y;
if(!y) root = &newElement;
else if(key < y->key) y->left = &newElement;
else y->right = &newElement;
}
int main()
{
node *root = NULL;
insert(root, 5);
std::cout << root->key; // works perfectly if I delete cout in insert()
insert(root, 2);
std::cout << root->key; // program breaks before this line
return 0;
}
As you can see, I create new structure element in insert function and save it inside the root pointer. In the first call, while loop isn't even initiated so it works, and I'm able to display root's element in the main function.
But in the second call, while loop already works, and I get the problem I described.
There's something wrong with root->key syntax because it doesn't work even if I place this in the first call.
What's wrong, and what's the reason?
Also, I've always seen inserting new list's elements through pointers like this:
node newElement = new node();
newElement->key = 5;
root->next = newElement;
Is this code equal to:
node newElement = {};
newElement.key = 5;
root->next = &newElement;
? It would be a bit cleaner, and there wouldn't be need to delete memory.

The problem is because you are passing a pointer to a local variable out of a function. Dereferencing such pointers is undefined behavior. You should allocate newElement with new.
This code
node newElement = {};
creates a local variable newElement. Once the function is over, the scope of newElement ends, and its memory gets destroyed. However, you are passing the pointer to that destroyed memory to outside the function. All references to that memory become invalid as soon as the function exits.
This code, on the other hand
node *newElement = new node(); // Don't forget the asterisk
allocates an object on free store. Such objects remain available until you delete them explicitly. That's why you can use them after the function creating them has exited. Of course since newElement is a pointer, you need to use -> to access its members.

The key thing you need to learn here is the difference between stack allocated objects and heap allocated objects. In your insert function your node newElement = {} is stack allocated, which means that its lifetime is determined by the enclosing scope. In this case that means that when the function exits your object is destroyed. That's not what you want. You want the root of your tree to be stored in your node *root pointer. To do that you need to allocate memory from the heap. In C++ that is normally done with the new operator. That allows you to pass the pointer from one function to another without having its lifetime determined by the scope that it's in. This also means you need to be careful about managing the lifetime of heap allocated objects.

Well you have got one problem with your Also comment. The second may be cleaner but it is wrong. You have to new memory and delete it. Otherwise you end up with pointers to objects which no longer exist. That's exactly the problem that new solves.
Another problem
void insert(node *&root, const int key)
{
node newElement = {};
newElement.key = key;
node *y = NULL;
std::cout << root->key; // this line
On the first insert root is still NULL, so this code will crash the program.
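Putting the points from the answers above together, a corrected insert might look like the sketch below. It allocates with new and never dereferences a null root. It also walks the tree with a local cursor: the question's loop advances root itself, and because root is a reference, the caller's pointer ends up null after any insert into a non-empty tree. The destroy function is an addition that is not in the question; something like it is needed because nothing else owns the nodes.

```cpp
#include <cassert>
#include <cstdlib>

struct node {
    node *p, *left, *right;
    int key;
};

void insert(node*& root, const int key)
{
    node* y = nullptr;
    node* x = root;              // local cursor; `root` itself is left untouched
    while (x)
    {
        if (key == x->key) std::exit(EXIT_FAILURE);   // duplicate key, as in the question
        y = x;
        x = (key < x->key) ? x->left : x->right;
    }
    node* newElement = new node();   // value-initialized; lives until explicitly deleted
    newElement->key = key;
    newElement->p = y;
    if (!y) root = newElement;       // empty tree: the new node becomes the root
    else if (key < y->key) y->left = newElement;
    else y->right = newElement;
}

// Frees every node of the subtree rooted at n (children first, then n).
void destroy(node* n)
{
    if (!n) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}
```

In main, the calls stay the same as in the question, with a final destroy(root) once the tree is no longer needed.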

It's already been explained that you would have to allocate objects dynamically (with new), however doing so is fraught with perils (memory leaks).
There are two (simple) solutions:
Have an ownership scheme.
Use an arena to put your nodes, and keep references to them.
1 Ownership scheme
In C and C++, there are two ways of obtaining memory in which to store an object: automatic storage and dynamic storage. Automatic storage is what you use when you declare a variable within a function; such objects only live for the duration of the function (and thus you have issues when using them afterward, because the memory is probably overwritten by something else). Therefore you often must use dynamic memory allocation.
The issue with dynamic memory allocation is that you have to explicitly give it back to the system, lest it leaks. In C this is pretty difficult and requires rigor. In C++ though it's made easier by the use of smart pointers. So let's use those!
struct Node {
Node(Node* p, int k): parent(p), key(k) {}
Node* parent;
std::unique_ptr<Node> left, right;
int key;
};
// Note: I added a *constructor* to the type to initialize `parent` and `key`
// without proper initialization they would have some garbage value.
Note the different declarations of parent and left: a parent owns its children (unique_ptr), whereas a child just refers to its parent.
void insert(std::unique_ptr<Node>& root, const int key)
{
if (root.get() == nullptr) {
root.reset(new Node{nullptr, key});
return;
}
Node* parent = root.get();
Node* y = nullptr;
while(parent)
{
if(key == parent->key) exit(EXIT_FAILURE);
y = parent;
parent = (key < parent->key) ? parent->left.get() : parent->right.get();
}
if (key < y->key) { y->left.reset(new Node{y, key}); }
else { y->right.reset(new Node{y, key}); }
}
In case you don't know what unique_ptr is: it just holds an object allocated with new, and its get() method returns a pointer to that object. You can also reset its content (in which case it properly disposes of the object it already contained, if any).
I would note I am not too sure about your algorithm, but hey, it's yours :)
2 Arena
If this dealing with memory got your head all mushy, that's pretty normal at first, and that's why sometimes arenas might be easier to use. The idea of using an arena is pretty general; instead of bothering with memory ownership on a piece by piece basis you use "something" to hold onto the memory and then only manipulate references (or pointers) to the pieces. You just have to keep in mind that those references/pointers are only ever alive as long as the arena is.
struct Node {
Node(): parent(nullptr), left(nullptr), right(nullptr), key(0) {}
Node* parent;
Node* left;
Node* right;
int key;
};
void insert(std::list<Node>& arena, Node *&root, const int key)
{
arena.push_back(Node{}); // add a new node
Node& newElement = arena.back(); // get a reference to it.
newElement.key = key;
Node *y = NULL;
Node *x = root; // walk with a local cursor so that root keeps pointing at the top of the tree
while(x)
{
if(key == x->key) exit(EXIT_FAILURE);
y = x;
x = (key < x->key) ? x->left : x->right;
}
newElement.parent = y; // the member is called parent in this struct, not p
if(!y) root = &newElement;
else if(key < y->key) y->left = &newElement;
else y->right = &newElement;
}
Just remember two things:
as soon as your arena dies, all your references/pointers are pointing into the ether, and bad things happen should you try to use them
if you ever only push things into the arena, it'll grow until it consumes all available memory and your program crashes; at some point you need cleanup!
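One way to guard against the first of these pitfalls is to keep the arena and the root pointer inside the same object, so that neither can outlive the other. The following is only a sketch: the class name Bst is invented, and insert returns false on a duplicate key instead of calling exit as the code above does.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

struct Node {
    Node* parent = nullptr;
    Node* left = nullptr;
    Node* right = nullptr;
    int key = 0;
};

// Hypothetical wrapper: the arena and the root live and die together.
class Bst {
public:
    Bst() = default;
    Bst(const Bst&) = delete;             // a copy's pointers would aim into the old arena
    Bst& operator=(const Bst&) = delete;

    // Returns false (and inserts nothing) if the key is already present.
    bool insert(int key) {
        Node* y = nullptr;
        for (Node* x = root_; x != nullptr; ) {
            if (key == x->key) return false;
            y = x;
            x = (key < x->key) ? x->left : x->right;
        }
        arena_.emplace_back();   // std::list never moves existing elements
        Node& n = arena_.back();
        n.key = key;
        n.parent = y;
        if (!y) root_ = &n;
        else if (key < y->key) y->left = &n;
        else y->right = &n;
        return true;
    }

    const Node* root() const { return root_; }
    std::size_t size() const { return arena_.size(); }

private:
    std::list<Node> arena_;   // owns every node
    Node* root_ = nullptr;    // points into arena_
};
```

std::list is a reasonable arena here because pushing new elements never invalidates pointers to existing ones; a std::vector would not give that guarantee when it grows.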

How to delete a binary search tree from memory?

I have a BST which is a linked list in C++. How would I delete the whole thing from memory? Would it be done from a class function?

Just delete the children:
struct TreeNode {
TreeNode *l, *r, *parent;
Data d;
TreeNode( TreeNode *p ) { l = nullptr; r = nullptr; parent = p; }
TreeNode( TreeNode const & ) = delete;
~TreeNode() {
delete l; // delete does nothing if ptr is 0
delete r; // or recurses if there's an object
}
};
or if you're using unique_ptr or some such, that's not even needed:
struct TreeNode {
unique_ptr< TreeNode > l, r;
TreeNode *parent;
Data d;
TreeNode( TreeNode *p ) { l = nullptr; r = nullptr; parent = p; }
TreeNode( TreeNode const & ) = delete;
~TreeNode() = default;
};

If you have access to the linked list itself, it's a piece of cake:
// Making liberal assumptions about the kind of naming / coding conventions that might have been used...
ListNode *currentNode = rootNode;
while(currentNode != NULL)
{
ListNode *nextNode = currentNode->Next;
delete currentNode;
currentNode = nextNode;
}
rootNode = NULL;
If this is a custom implementation of a BST, then this may well be how it works internally, if it has tied itself to a particular data structure.
If you don't have access to the internals, then Potatoswatter's answer should be spot on. Assuming the BST is setup as they suggest, then simply deleting the root node should automatically delete all the allocated memory as each parent down the tree will delete its children.
If you are asking how to go about iterating across a binary tree manually, then you would do the following recursive step:
void DeleteChildren(BSTNode *node)
{
// Recurse left down the tree...
if(node->HasLeftChild()) DeleteChildren(node->GetLeftChild());
// Recurse right down the tree...
if(node->HasRightChild()) DeleteChildren(node->GetRightChild());
// Clean up the data at this node.
node->ClearData(); // assume deletes internal data
// Free memory used by the node itself.
delete node;
}
// Call this from external code.
DeleteChildren(rootNode);
I hope I've not missed the point here and that something of this helps.

Perform a post-order traversal of the tree (i.e. visiting children before parents), and delete each node as you visit it.
Whether or not this has anything to do with classes depends entirely on your implementation.
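As a concrete illustration of the post-order idea, here is a small free function for a plain node struct. BstNode and delete_post_order are names made up for this sketch, and the returned count exists only to make the behaviour easy to check.

```cpp
#include <cassert>
#include <cstddef>

struct BstNode {
    int key;
    BstNode* left;
    BstNode* right;
};

// Deletes the subtree rooted at `node` in post-order and
// returns how many nodes were freed.
std::size_t delete_post_order(BstNode* node) {
    if (node == nullptr) return 0;
    std::size_t freed = delete_post_order(node->left);   // left subtree first
    freed += delete_post_order(node->right);             // then the right subtree
    delete node;                                         // the parent goes last
    return freed + 1;
}
```

The recursion depth equals the height of the tree, so this is fine for balanced trees but can use a lot of stack on a tree that has degenerated into a long chain.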

With the limited information provided ....
If you allocated the nodes with new or malloc (or related functions) then you need to traverse all the nodes and delete or free them.
An alternative is to use shared_ptr (and weak_ptr to break cycles) for your allocations; provided you do it correctly, you won't have to free the nodes manually.
If you used a quality implementation that you picked up on the internet then, provided the classes don't leak, you don't have to worry about anything.
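The shared_ptr/weak_ptr option could look roughly like this. TreeNode and add_left are illustrative names, not part of any particular library.

```cpp
#include <cassert>
#include <memory>

struct TreeNode {
    int key = 0;
    std::shared_ptr<TreeNode> left, right;  // owning links, parent to child
    std::weak_ptr<TreeNode> parent;         // non-owning back link, breaks the cycle
};

// Hypothetical helper: attaches a new left child to `parent` and returns it.
std::shared_ptr<TreeNode> add_left(const std::shared_ptr<TreeNode>& parent, int key) {
    assert(parent != nullptr);
    auto child = std::make_shared<TreeNode>();
    child->key = key;
    child->parent = parent;   // weak reference: does not keep the parent alive
    parent->left = child;     // strong reference: keeps the child alive
    return child;
}
```

If parent were a shared_ptr as well, parent and child would keep each other alive forever; with weak_ptr, releasing the last shared_ptr to the root frees the whole tree.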

Use smart pointers and forget about it.
