Binary Tree Copy Constructor - c++

I have a school assignment which requires me to create a binary tree in c++, complete with all the usual overloaded operators (=, ==, !=, copy, destroy, etc). I am currently struggling to program the copy constructor method. In this assignment, I am being asked NOT to use the binary tree class' insert method in either the operator= or copy constructor methods. Here's the binary tree class header file:
#ifndef BINTREE_H
#define BINTREE_H
#include <iostream>
#include "nodedata.h"
using namespace std;
class BinTree
{
public:
BinTree(); // constructor
BinTree(const BinTree &); // copy constructor, calls copyHelper
~BinTree(); // destructor, calls makeEmpty
bool insert(NodeData*); // insert method, inserts new nodes
bool isEmpty() const; // true if tree is empty, otherwise false
private:
struct Node
{
NodeData* data; // pointer to data object
Node* left; // left subtree pointer
Node* right; // right subtree pointer
};
Node* root; // root of the tree
Node& copyHelper(const Node*); // copies the tree recursively
void makeEmpty(Node*); // deletes nodes recursively
};
#endif
And here's what I've got so far for the copy constructor:
BinTree::BinTree(const BinTree &otherTree)
{
root = copyHelper(otherTree.root); // 1
}
Node& BinTree::copyHelper(const Node* other) // 2
{
if(other == NULL)
{
return NULL; // 3
}
Node newNode; // 4
if(other.data == NULL)
{
newNode.data = NULL; // 5
}
else
{
NodeData newNodeData(*other->data); // 6
newNode.data = newNodeData; // 7
}
newNode.left = copyHelper(other.left); // 8
newNode.right = copyHelper(other.right); // 9
return newNode; // 10
}
This is causing all manner of compile errors, and I don't understand any of them. Here's what I THINK should be happening:
• Overview: The entire tree will be copied from the ground up recursively. Each node should already contain data and links to all subsidiary nodes (if they exist) when it is returned to the node above it.
1. Since root is, by definition, a pointer to a Node object, there should be no issue with assigning it to a function which returns a Node object. However, the compiler is giving me a conversion error here.
2. The copyHelper function, which is called recursively, takes a pointer to one of the original nodes as an argument, and returns a copy of this node.
3. If there is no original node, then there's no point in building a copy of it, so a NULL value is returned.
4. newNode will eventually become a copy of "other".
5. If "other" does not have any NodeData linked to it, then newNode's data pointer will link to NULL instead.
6. Otherwise, a new NodeData called "newNodeData" will be created, using the NodeData copy constructor, which is called by dereferencing other's NodeData pointer.
7. newNode's data pointer now points to newNodeData, which now contains the same string as other's NodeData object.
8. newNode's left pointer will point to another node, which is created recursively by calling copyHelper on whatever Node other's left pointer is assigned to.
9. newNode's right pointer will point to another node, which is created recursively by calling copyHelper on whatever Node other's right pointer is assigned to.
10. Only once this node and all the nodes beneath it have been copied and returned to it will it be returned itself.
Here are my existing questions:
1. I'm assuming that we only need to use -> instead of . when we're dereferencing a pointer. Is this correct?
2. I know sometimes we need to use the "new" statement when creating objects (see part 4), but I've typically only seen this done when creating pointers to objects. When, specifically are we supposed to be using the "new" statement?
3. If I were to create newNode as a pointer (IE: Node* newNode = new Node;), wouldn't this cause a pointer to a pointer when it was returned? Wouldn't that be less efficient than simply having the first pointer point directly to the returned Node object? Is there some technical reason I can't do it this way?
4. When is it advisable to take a pointer by reference as a parameter, instead of simply taking the pointer, or even the object itself? Is this relevant here?
5. The compiler seems to think I'm declaring a Node named BinTree::copyHelper instead of defining a function that returns a Node. How can I prevent this?
In general, I think I have the concepts down, but the syntax is completely killing me. I literally spent all day yesterday on this, and I'm ready to admit defeat and ask for help. What mistakes do you guys see in my code here, and how can I fix them?

This line:
Node newNode; // 4
Allocates a Node on the stack. When the function returns, the Node has gone out of scope and is no longer valid. Using it afterwards is undefined behavior; it may appear to work for a while, until another function call overwrites that part of the stack.
You need to allocate the node with new.
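A minimal sketch of the heap allocation described above, using a simplified Node with an int payload (an assumption; the struct in the question holds a NodeData*). makeNode is a hypothetical helper, not part of the assignment:

```cpp
#include <cassert>

// Simplified stand-in for BinTree::Node (assumption: int payload for brevity).
struct Node {
    int value;
    Node* left;
    Node* right;
};

// Hypothetical helper: the node is created on the heap, so it is still
// valid after makeNode returns. Returning the address of a local Node
// object instead would hand the caller a dangling pointer.
Node* makeNode(int value) {
    return new Node{value, nullptr, nullptr};
}

// Builds a tiny tree through the helper and sums its values.
int sumOfThreeNodes() {
    Node* root = makeNode(1);
    root->left = makeNode(2);
    root->right = makeNode(3);
    assert(root->value == (*root).value);  // -> is shorthand for (*ptr).member
    int sum = root->value + root->left->value + root->right->value;
    delete root->left;                     // every new is paired with one delete
    delete root->right;
    delete root;
    return sum;
}
```

Calling sumOfThreeNodes() exercises three allocations that all outlive the makeNode calls that created them.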

Node& copyHelper(const Node*); // copies the tree recursively
Must that be a reference? It looks like it should be Node* instead.
In your definition of copyHelper, the return type Node appears before the compiler has seen BinTree::, so it does not know that Node is the type nested within BinTree; the return type has to be qualified. It should instead be
BinTree::Node* BinTree::copyHelper(const BinTree::Node* other)

1. Yes, -> dereferences a pointer. pointer->member is shorthand for (*pointer).member.
2. new creates an object on the heap and returns a pointer to it; it does not create a pointer. Every time you create an object on the heap, you must use new.
3. Node* newNode = new Node; assigns a pointer to the newly created Node to newNode. Node* is a pointer to an object, not a pointer to a pointer, which would be Node**. You cannot write Node newNode = new Node; because Node* (a pointer) is not convertible to Node (an object).
4. When you take a parameter by reference, you are (generally) guaranteed that the parameter is not null. You also do not need to dereference the reference; you can just use . instead of ->.
5. "The compiler seems to think I'm declaring a Node named BinTree::copyHelper instead of defining a function that returns a Node. How can I prevent this?" - how did you come to such a conclusion?

After much fretting and pulling of hair, I have rewritten my copy constructor and gotten it to compile. There may well still be errors in it, but I'm feeling a lot better about it now thanks in part to the tips posted here. Thanks again! Here's the modified code:
BinTree::BinTree(const BinTree &otherTree)
{
root = copyHelper(otherTree.root); // Starts recursively copying Nodes from
// otherTree, starting with root Node.
}
BinTree::Node* BinTree::copyHelper(const Node* other) // Needed BinTree at beginning
// due to nested Node struct
// in BinTree class.
{
if(other == NULL)
{
return NULL; // If there's no Node to copy, return NULL.
}
Node* newNode = new Node; // Dynamically allocated memory will remain after
// function is no longer in scope. Previous newNode
// object here was destroyed upon function return.
if(other->data == NULL) // other is a pointer to a Node, whose data member is also a pointer.
// -> dereferences other, and provides the item at the
// memory address which other normally points to. It
// just so happens that since data is a pointer itself,
// it can still be treated as such. I had stupidly been
// attempting to use . instead of ->, because I was afraid
// the -> would dereference data as well. If I actually
// wanted to DO this, I'd have to use *other->data, which
// would dereference all of other->data, as if there were
// parentheses around it like this: *(other->data).
// Misunderstanding this was the source of most of the
// problems with my previous implementation.
{
newNode->data = NULL; // The other Node doesn't contain data,
// so neither will this one.
}
else
{
NodeData* newNodeData = new NodeData; // This needed to be dynamically
// allocated as well.
*newNodeData = *other->data; // Copies data from other node.
newNode->data = newNodeData;
}
newNode->left = copyHelper(other->left); // Recursive call to left node.
newNode->right = copyHelper(other->right); // Recursive call to right node.
return newNode; // Returns after child nodes have been linked to newNode.
}
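As a self-contained check of the approach above, here is a minimal sketch that copies a small tree and confirms the copy shares nothing with the original. NodeData here is a stand-in holding a std::string (an assumption; the real class lives in nodedata.h), the helpers are free functions rather than BinTree members, and copyIsIndependent is a hypothetical test function:

```cpp
#include <cassert>
#include <string>

// Stand-in for the assignment's NodeData class (assumption: it is copyable).
struct NodeData {
    std::string text;
};

struct Node {
    NodeData* data;
    Node* left;
    Node* right;
};

// Recursively builds an independent copy of the subtree rooted at 'other'.
Node* copyHelper(const Node* other) {
    if (other == nullptr) {
        return nullptr;
    }
    Node* copy = new Node;
    // Copy-construct the payload so the two trees never share a NodeData.
    copy->data = other->data ? new NodeData(*other->data) : nullptr;
    copy->left = copyHelper(other->left);
    copy->right = copyHelper(other->right);
    return copy;
}

// Post-order deletion: children first, then the payload, then the node.
void makeEmpty(Node* node) {
    if (node == nullptr) {
        return;
    }
    makeEmpty(node->left);
    makeEmpty(node->right);
    delete node->data;
    delete node;
}

// Builds a two-node tree, copies it, mutates the original, and reports
// whether the copy kept its own data.
bool copyIsIndependent() {
    Node* leaf = new Node{new NodeData{"leaf"}, nullptr, nullptr};
    Node* root = new Node{new NodeData{"root"}, leaf, nullptr};
    Node* copy = copyHelper(root);
    assert(copy != nullptr);
    root->left->data->text = "changed";
    bool ok = copy != root
           && copy->left != root->left
           && copy->left->data->text == "leaf"
           && copy->right == nullptr;
    makeEmpty(root);
    makeEmpty(copy);
    return ok;
}
```

Because each NodeData is copy-constructed, changing the original tree after the copy leaves the copy untouched, and each tree can be destroyed separately without a double delete.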

Related

C++: Will my pointer be dangling? How to assign a pointer a variable that exists in a function?

I'm having difficulty describing this problem succinctly, so be kind.
I have a Tree object that has an attribute root, which is a pointer to a Node object. When I initialize the Tree object the root is unknown, so I assign it nullptr.
In a function, after some computation, I find the root node of a complete binary tree. I now want to hand this value over to my Tree.root pointer. However, the function's local variables are removed from the stack after it executes, and the Tree.root pointer appears empty when I use it later.
class Tree{
public:
Node *root;
Tree(){
root = nullptr;
}
};
void worker(Tree *t){
// Perform some computation
// Since the var rootFound only exists in this function.
// After executing doesn't the memory address reallocated
// and therefore the root points to an unknown memory address?
t-> root = &rootFound;
}
int main(){
Tree t{};
Tree *ptr = &t;
worker(ptr);
// t.root appears to be null here
return 0;
}
I was thinking I could assign the root pointer found by the function to the heap (use new) and then assign my Tree pointer to it, but I'm not sure how to go about deleting this value. Also, since any node has left and right Node pointers, I'm not sure if the pointers lose the memory address they are pointing to or if they too will be added to the heap.
I could also just be overthinking this.
Since you posted incomplete code in your question, I am forced to assume some things.
I will assume your implementation looks like this:
void worker(Tree *t){
Node rootFound;
// do stuff where eventually rootFound = something
t->root = &rootFound;
}
In this case, yes, t->root will point to a dead object once worker is finished.
I was thinking I could assign the root pointer found by the function to the heap (use new) and then assign my Tree pointer to it, but I'm not sure how to go about deleting this value.
There are two kinds of raw pointers in C++: owning pointers and non-owning pointers.
If t->root is an owning pointer, you are responsible for calling delete on it.
If t->root is a non-owning pointer, you must not call delete on it.
So if it is owning, you can simply do new Node{...} and assign the result to it.
If, on the contrary, it is not, and you want to create a new tree in this function and delete it later, you will need to give an owning pointer back to the caller, something like this:
Node* worker(Tree* t) {
Node* rootFound = new Node{}; // create a whole new tree here
t->root = rootFound; // assign it to the tree
return rootFound; // return an owning pointer to the caller
}
Then, in your main:
int main(){
Tree t{};
Tree *ptr = &t;
Node* owning = worker(ptr);
// do stuff with t
// delete the owning pointer.
delete owning;
return 0;
}
Of course, there is a better way to separate owning and non-owning pointers.
In modern C++, an owning pointer is declared as std::unique_ptr<T>, while a non-owning pointer is written T* and is assumed not to need deleting.
If we change your data structure just a little bit to express which pointer is which, it would look something like this:
#include <memory>
class Tree{
public:
// Here! Root is owning.
std::unique_ptr<Node> root;
Tree(){
root = nullptr;
}
};
// Tree is non-owning, so we write it just like before.
void worker(Tree* t){
// ...
}
The neat thing about std::unique_ptr is that it clears the memory in its destructor, so you don't have to worry about deleting:
int main() {
// make_unique will call new
std::unique_ptr<Node> node = std::make_unique<Node>();
// Here, unique_ptr will call delete
}
So in the end, Tree will clear up itself at the end of main:
int main(){
Tree t{};
Tree *ptr = &t;
// worker can do t->root = std::make_unique<Node>();
worker(ptr);
return 0;
// Here, t.root will call delete if not null
}
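Putting the pieces above together, a minimal compilable sketch might look like the following. The Node layout (an int value plus two owned children) and the function rootValueAfterWorker are assumptions made for illustration, and std::make_unique requires C++14:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Assumed node layout: each node owns its children.
struct Node {
    int value = 0;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

struct Tree {
    std::unique_ptr<Node> root;   // default-constructed to nullptr
};

// 't' is a non-owning pointer: worker fills the tree but never deletes it.
void worker(Tree* t) {
    auto found = std::make_unique<Node>();   // heap allocation, outlives worker
    found->value = 7;
    found->left = std::make_unique<Node>();
    found->left->value = 3;
    t->root = std::move(found);              // ownership moves into the tree
}

// Returns the root value observed after worker has returned.
int rootValueAfterWorker() {
    Tree t;
    assert(t.root == nullptr);               // nothing assigned yet
    worker(&t);
    return t.root ? t.root->value : -1;
}   // t.root's destructor deletes the root, which in turn deletes its children
```

No delete appears anywhere: the node created inside worker stays alive because the tree owns it, and it is released when the Tree goes out of scope.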

How to write destructor for deleting a tree where each node is a dynamically allocated structure containing several arrays?

I have a class that contains a structure called Node, and the memory for it is dynamically allocated. Using the add function I am creating more Nodes and connecting them through the next array pointers. I am only saving my head pointer, which points to my first node. I am trying to write a destructor like the one below. Is it OK?
struct Node{
bool arr[30];
bool end[30];
Node* next[30];
};
class ClassName{
Node *head;
Node* newNode(){
Node * cur = (Node*)malloc(sizeof(Node));
return cur;
}
public:
ClassName(){
head = newNode();
}
~ClassName(){
free(head);
}
void add(string s,int pos,Node *cur){// 1 base index
// adding new node and next array pointers will connect them
// So after adding some nodes it will form like a tree
}
};
~ClassName(){
free(head);
}
This frees the head node. It doesn't also free any of the nodes referred to by head->next[0..29]. So no, it's not OK if you actually allocated those nodes - you will have a memory leak.
Next problem, your next array is uninitialized, so unless it's always all populated (which is obviously impossible as your tree would have no leaves), there's no way to figure out which entries are real pointers and which are garbage values. So it's impossible to fix this leak with the code as shown.
We could fix the existing malloc code to properly initialize your objects, but it brings us on to the next oddity, which is using malloc and free for objects in C++ at all.
Using new and delete would be a modest improvement (at least Node could have a constructor and destructor to properly initialize and destroy itself), but switching to owning smart pointers instead of raw pointers would be best: they initialize and destroy themselves automatically with no extra work on your part.
#include <array>
#include <memory>
struct Node{
// value-initialize all those bools to false
bool arr[30] {};
bool end[30] {};
// this will initialize all entries to nullptr, and
// also takes care of deleting them on destruction
std::array<std::unique_ptr<Node>, 30> next;
};
std::unique_ptr<Node> ClassName::newNode() {
return std::make_unique<Node>();
}
The best way is to use smart pointers. But if you would like to stick to built-in pointers, it is also very easy to achieve the goal, provided every node is allocated with new (not malloc) and every entry of next is initialized to nullptr. You only need to add a simple destructor to every class and make sure every destructor frees the memory at its own level. The ownership relationship between the classes then guarantees that the memory is freed in a recursive postorder manner. The beauty of this approach is that you don't have to write a recursive postorder traversal yourself: the destructors recurse by themselves.
Add the following destructor for Node class:
~Node() {
for (Node *p : next) {
delete p;
}
}
Use the following destructor for ClassName class:
~ClassName(){
delete head;
}
A note: To verify that the nodes on the tree are freed in a postorder manner, you may add a print statement after each delete statement.
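Instead of print statements, a live-object counter can confirm that every node is released. The sketch below is one way to do that under stated assumptions: nodes are allocated with new, next is zero-initialized, and the names Trie, addChild, root and liveNodesAfterScope are hypothetical stand-ins for the question's ClassName and add:

```cpp
#include <cassert>

struct Node {
    static int live;          // number of Node objects currently alive
    Node* next[30] = {};      // every entry starts as nullptr
    Node() { ++live; }
    ~Node() {
        for (Node* p : next) {
            delete p;         // deleting nullptr is a no-op; children go first
        }
        --live;
    }
};
int Node::live = 0;

class Trie {
    Node* head;
public:
    Trie() : head(new Node) {}
    ~Trie() { delete head; }                 // triggers the recursive cleanup
    Trie(const Trie&) = delete;              // copying would double-delete head
    Trie& operator=(const Trie&) = delete;

    Node* root() { return head; }

    // Hypothetical helper: creates the child in 'slot' if it does not exist.
    Node* addChild(Node* parent, int slot) {
        if (parent->next[slot] == nullptr) {
            parent->next[slot] = new Node;
        }
        return parent->next[slot];
    }
};

// Builds a four-node tree in an inner scope and reports how many nodes
// are still alive once that scope has ended.
int liveNodesAfterScope() {
    {
        Trie t;
        Node* a = t.addChild(t.root(), 0);
        t.addChild(t.root(), 5);
        t.addChild(a, 29);
        assert(Node::live == 4);
    }
    return Node::live;
}
```

The copy operations are deleted because a class that owns a raw pointer and deletes it in its destructor would otherwise free the same nodes twice after a copy.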

Is using of delete operator correct in here I'm confused

I'm just a beginner at C++, and today, while we were learning about linked lists, my teacher showed us how to delete from the front of a linked list. The problem is that I didn't understand why we are deleting the pointer p, which is static memory. Can't we just do it with the second code, which uses a dynamic memory pointer, temp? head is a dynamic memory pointer to an object of type Node.
//My teacher code
template <class T>
void DeleteFront(Node<T>* & head)
{
// save the address of node to be deleted
Node<T> *p = head;
// make sure list is not empty
if (head != NULL)
{
// move head to second node and delete original
head = head->NextNode();
delete p;//I didn't understand this line because our p declaration is static
}
}
//My suggestion
template <class T>
void DeleteFront(Node<T>* & head)
{
// save the address of node to temporary dynamic pointer
Node<T> *temp ;
temp=new Node<T>(head);
// make sure list is not empty
if (head != NULL)
{
// move temp to second node which will be showed by head
temp = head->NextNode();
delete head;//delete front item
head=temp;//assign the address of second node to head
}
}
p is not declared static. In fact, it's a local variable. Static variables are marked with the keyword static, whether they are declared at namespace, function, or class scope.
A static data member of a class is accessible independently of any objects and belongs to the class itself: it has the same value for every object and exists even without any objects. This is not the case here; p is a perfectly normal local variable, which resides on the stack and holds a pointer to a Node object.
When you call delete on that pointer, you delete what the pointer points to, not the pointer variable on the stack itself. The pointer itself of course also has an address, which you would get with &p and could save into a Node<T>** variable, though there is no need to do that. Calling delete on the address of an object that lives on the stack is undefined behavior.
Now regarding the code:
You are creating a new node with
Node<T> *temp ;
temp=new Node<T>(head);
which is not only unnecessary but in fact leaks memory, since you overwrite the pointer value, or exit the function, without calling delete.
Every new should have a corresponding delete, at the latest before the pointer goes out of scope, such as at the end of your function. The line is completely superfluous. A better way to initialize a pointer is:
Node<T> *temp = nullptr;
Other than that, your code does the same thing. Your teacher saves what is to be deleted in p, which works but might seem unintuitive, while you save what is to be kept (head->NextNode()). Both work.
Also, NextNode() is a function, and many style guides prefer function names that start with a lower-case letter. A function's name should also imply what it does. While this is easy enough to work out in this case, NextNode isn't really a verb or action; getNextNode() would be a better name.
To improve on your teacher's code, you could move the declaration of p into the if-block, which saves the p = head assignment and its stack operations when the list is empty, but I'm sure the compiler would do that for you.
Node<T> *p = head;
// make sure list is not empty
if (head != NULL)
{
// move head to second node and delete original
head = head->NextNode();
delete p;
}
This section of code copies the head pointer that was passed in into p. It then checks whether head is NULL, and if it is not, it moves head to the next node before deleting p, which is not static. It is local, but not static. This removes the original head node from memory, and because head has already been moved on, the caller is not left holding a dangling pointer.
Node<T> *temp ;
temp=new Node<T>(head);
if (head != NULL)
{
temp = head->NextNode();
delete head;
head=temp;
}
This code allocates a brand-new Node with new and stores its address in temp, then overwrites temp with head->NextNode(), deletes the original head, and assigns temp back to head. The deletion itself works, but the node allocated by new is never deleted, and once temp is overwritten nothing points to it any more. That is a memory leak rather than a dangling reference, and it happens on every call, whether or not the list is empty.
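To make the point concrete, here is a minimal sketch of the teacher's approach that compiles on its own. The Node<T> below is a stand-in (an assumption: the course's class exposes NextNode() as in the question), and frontAfterOneDelete is a hypothetical test function:

```cpp
#include <cassert>

// Minimal stand-in for the course's Node<T>.
template <class T>
struct Node {
    T data;
    Node* next;
    Node* NextNode() const { return next; }
};

template <class T>
void DeleteFront(Node<T>*& head) {
    if (head == nullptr) {
        return;                  // empty list: nothing to delete
    }
    Node<T>* old = head;         // 'old' is a local pointer variable on the stack
    head = head->NextNode();     // the caller's head now names the second node
    delete old;                  // frees the heap node, not the variable 'old'
}

// Builds 1 -> 2, removes the front, and returns the new front value.
int frontAfterOneDelete() {
    Node<int>* head = new Node<int>{2, nullptr};
    head = new Node<int>{1, head};   // list is now 1 -> 2
    DeleteFront(head);               // list is now 2
    assert(head != nullptr);
    int front = head->data;
    DeleteFront(head);               // list is now empty
    DeleteFront(head);               // safe to call on an empty list
    return head == nullptr ? front : -1;
}
```

No new appears inside DeleteFront: the function only needs a second name for the node it is about to free, so nothing extra is allocated and nothing is leaked.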

Clarification on passing a pointer by reference

This is kind of silly, but I can't really explain why this is happening. As an exercise, I wanted to reverse a singly-linkedlist and I did this by defining the method:
class solution {
public:
void reverseLinkedList(Node*& head) {
Node* curr = head;
Node* prev = NULL;
while (curr != NULL) {
Node* _next = curr->next;
curr->next = prev;
prev = curr;
curr = _next;
}
head = prev;
}
};
In my main function, I make the call
solution s;
s.reverseLinkedList(head);
Node* iterator = head;
while (iterator != NULL) {
std::cout<<iterator->data<<std::endl;
iterator = iterator->next;
}
where I previously defined my head pointer to point to some linked list. The while loop is for printing my linked list, and the function does its job. This only worked after I passed the head node by reference; I initially tried to pass Node* head instead of Node*& head, and it only printed the first element of my linked list (and without reversing it). For example, if I didn't pass by reference for a list 1->2->3, I would print out just 1.
I thought passing a pointer would be enough? Why did I get such weird behaviour without passing by reference?
Local variables in C++ (stored in the stack) have block scope, i.e., they run out of scope after the block in which they are defined is executed.
When you pass a pointer to the function, a copy of the pointer is created, and that copy is what the function receives. Once the function has executed, the variables in its scope go out of scope: any automatic (non-static) variables created within the function, including that copy, are destroyed.
When you pass in by reference you don't pass in a copy of the variable but you pass in the actual variable, thereby any changes made to the variable are reflected on the actual variable passed into the function(by reference).
I would like to point out that the pointer to the next node is stored in memory and has an address to the location it is stored. So if you want to not pass in by reference you can do this:
Use a pointer to the pointer, which points to the memory location at which pointer variable(address) to the next node is stored
Pass in this to the function (not by reference)
Dereference the pointer and store the new address you want to point to.
I know this is a bit confusing, but look into this small piece of code that adds a node to a linked list.
void addNode(Node** head, int newData)
{
Node* newNode = new Node;
newNode->data = newData; // Can also be done using (*newNode).data
newNode->next = *head;
*head = newNode;
}
I thought passing a pointer would be enough?
void reverseLinkedList(Node* head) // pass pointer by value
// head is a copy here
Passing a pointer by value creates a copy to be used inside the function.
Any changes made to that pointer inside the function are only visible within the function, since they are applied to the copy and not to the original pointer.
Once the pointer (copy) goes out of scope, those changes are "discarded" due to end of life.
Thus, you need a reference.
void reverseLinkedList(Node*& head) // changes made in head will be
// reflected in original head pointer
When you pass a pointer regularly (i.e. by value), it creates a copy of the pointer. Any changes made to this copy do not affect the original pointer.
Passing a pointer by reference sends a reference to that pointer (very similar to passing a pointer to a pointer), and therefore any changes made to that pointer affect the original.
For example:
//WRONG: does not modify the original pointer, causes a memory leak.
void init(Object* ptr, int sz) {
ptr = new Object[sz];
}
vs
//Correct: the original pointer is set to a new block of memory.
void init(Object*& ptr, int sz) {
ptr = new Object[sz];
}
In
void reverseLinkedList(Node* head)
The pointer is passed by value.
This sounds silly, it's a freaking pointer, right? Kind-of the definition of pass by reference. Well, the Node that's being pointed at is passed by reference. The pointer itself, head is just another variable that happens to contain the address of some other variable, and it's not being passed by reference.
So head contains a copy of the Node pointer used to call reverseLinkedList, and as with all parameters passed by value, any modifications to the copy, pointing head somewhere else, are not represented in the calling function.
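The three ways of passing the head discussed in this thread can be compared side by side. This is a minimal sketch with a hypothetical Node holding an int; the helper names (reversed, makeList, destroy, headAfter...) are made up for the example:

```cpp
#include <cassert>

struct Node {
    int data;
    Node* next;
};

// Reverses the links in place and returns the new first node.
Node* reversed(Node* head) {
    Node* prev = nullptr;
    while (head != nullptr) {
        Node* next = head->next;
        head->next = prev;
        prev = head;
        head = next;
    }
    return prev;
}

void reverseByValue(Node* head)      { head = reversed(head); }   // only the copy changes
void reverseByReference(Node*& head) { head = reversed(head); }   // caller's pointer changes
void reverseByPointer(Node** head)   { *head = reversed(*head); } // same effect, explicit address

Node* makeList() {   // builds 1 -> 2 -> 3
    return new Node{1, new Node{2, new Node{3, nullptr}}};
}

void destroy(Node* head) {
    while (head != nullptr) {
        Node* next = head->next;
        delete head;
        head = next;
    }
}

// Each function returns the value the caller's head points at afterwards.
int headAfterByValue() {
    Node* head = makeList();
    Node* last = head->next->next;     // remember node 3 so the list can be freed
    reverseByValue(head);              // links are 3 -> 2 -> 1, head still names node 1
    assert(head->next == nullptr);     // node 1 is now the tail: only "1" would print
    int seen = head->data;
    destroy(last);                     // free starting from the real first node
    return seen;
}

int headAfterByReference() {
    Node* head = makeList();
    reverseByReference(head);
    int seen = head->data;
    destroy(head);
    return seen;
}

int headAfterByPointer() {
    Node* head = makeList();
    reverseByPointer(&head);
    int seen = head->data;
    destroy(head);
    return seen;
}
```

In the by-value case the nodes really are relinked, which is why the caller's stale head ends up pointing at the tail and a traversal from it prints only the first element.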

Destructor for a doubly-linked list that points to its value

Suppose I have a doubly-linked list defined by the class
class list
{
/*...*/
private:
struct node
{
node* prev;
node* next;
int* value;
};
node* first; //NULL if none
node* last; //NULL if none
/*...*/
};
If I wanted to make a destructor for this list do I have to explicitly delete the value?
list::~list()
{
node* move = first;
while(first)
{
first = move->next;
delete move;
move = first;
}
}
Would the above work to ensure that no memory is leaked? Or do I have to do:
list::~list()
{
node* move = first;
while(first)
{
first = move->next;
delete move->value;
delete move->prev;
delete move;
move = first;
}
}
I'm confused as to how I can make sure that no memory is leaked in this case. How do I deal with the pointers in the nodes specifically? If I delete move does it automatically take care of these?
You need to pair each new with exactly one delete. That is, you probably don't want to delete prev (this node already was deleted) but you want to delete value. Well, I'd embed the value into the object and not point to it:
struct node
{
node* prev;
node* next;
int value;
};
If the value absolutely needs to be a pointer, I'd use a std::unique_ptr<int> (or, if you need to use C++03, a std::auto_ptr<int>, which was deprecated in C++11 and removed in C++17).
For each successful new expression, call delete exactly once on that object.
For each successful new[] expression, call delete[] exactly once on that object.
That means that neither of your cleanup functions is OK:
The first function forgets to delete the value, which means a memory leak.
The second function, by deleting both move and move->prev at each node in the list, risks deleting most nodes twice, which is Undefined Behavior.
To avoid the memory leak for the first function, simply store the integer directly instead of allocating it dynamically.
Whether you have to delete the memory pointed to by the value member - only you can know. It is a question of memory ownership, a question of your design. If the list owns the data memory pointed to by the value members, then you have to delete it in the list destructor (i.e. when the list dies, the data it owned dies with it).
If the list does not own the value memory, then you are not supposed to delete it. Again, only you can answer the question of whether your list is supposed to own the value memory. It is a matter of your intent.
Now, as for the memory occupied by node objects, it is obviously owned by the list, so it has to be carefully deallocated in the destructor. The first version of your destructor is close to being correct (the second makes no sense at all), except that it is written in a slightly obfuscated fashion. This should be sufficient:
list::~list()
{
while (first)
{
node *move = first;
first = first->next;
delete move;
}
}
(Again, if you have to delete value, then delete move->value should be added to your cycle before delete move.)
P.S. Once you get this, you might want to look into various smart pointer classes, which allow you to explicitly express memory ownership relationships, thus making them known to the compiler. When used properly, they will make memory management almost automatic.
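For the case where the list does own the values, the destructor discussed above could be sketched as follows. The push_back member and the freed counter are hypothetical additions made only so the cleanup can be observed; they are not part of the class in the question:

```cpp
#include <cassert>

class list {
    struct node {
        node* prev;
        node* next;
        int* value;   // assumption: the list owns this allocation
    };
    node* first = nullptr;
    node* last = nullptr;

public:
    static int freed;   // hypothetical counter of nodes released so far

    list() = default;
    list(const list&) = delete;              // copying would double-delete nodes
    list& operator=(const list&) = delete;

    void push_back(int v) {
        node* n = new node{last, nullptr, new int(v)};
        if (last != nullptr) {
            last->next = n;
        } else {
            first = n;
        }
        last = n;
    }

    ~list() {
        while (first != nullptr) {
            node* move = first;
            first = first->next;
            delete move->value;   // owned payload: one delete per new
            delete move;          // prev and next are links, never deleted via the node
            ++freed;
        }
        last = nullptr;
    }
};
int list::freed = 0;

// Creates a three-element list in an inner scope and returns how many
// nodes its destructor released.
int nodesFreedByDestructor() {
    list::freed = 0;
    {
        list l;
        l.push_back(1);
        l.push_back(2);
        l.push_back(3);
        assert(list::freed == 0);   // nothing is released while the list is alive
    }
    return list::freed;
}
```

Each node and each value is deleted exactly once, and move->prev is never touched because the node it names has already been released on the previous iteration.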