Custom iterator on a tree structure in c++ - c++

I am implementing a tree structure in c++ with a node class like this:
class Node {
protected:
    // relations
    Node *_parent;
    std::vector<Node*> _children;
public:
    // some example method
    void someMethod(Node *node) {
        // do something with *node
        for (std::size_t i = 0; i < node->_children.size(); i++) {
            someMethod(node->_children[i]); // recurse into each child
        }
    }
};
Now, to work on the nodes in my tree I am implementing recursive functions like someMethod in my example.
It works, but I end up writing the same recursion code over and over again for every new function that works on my tree.
Is there a generic way to iterate a tree structure like I would on a plain array?
Some method that returns the next object, until I'm done with the whole branch.
EDIT:
Thanks to everybody who has commented so far, with your help I could narrow down the problem.
From my understanding (I'm new to c++), I need an iterator class that encapsulates the code for traversing my tree.
Accessing all tree members should be as simple as that:
for (Node::iterator it = _node.begin(); it != _node.end(); ++it) {
Node *node = *it;
// do something with *node
}
Now the question is:
How do I implement such an iterator?

Pass a function pointer to the recursive function that returns the node that you are seeking.
This is the power of function pointers and function pointer arrays in C/C++.
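A minimal sketch of the function-pointer idea, assuming a simplified Node with public members. The names `Visitor`, `visit`, `collectName` and the `name` field are made up for illustration and are not part of the question's code. The recursion is written once, and each operation is a small function passed in by pointer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the question's Node; members are public for brevity.
struct Node {
    std::string name;             // hypothetical payload
    Node* parent = nullptr;
    std::vector<Node*> children;  // non-owning here; ownership is out of scope
};

// Plain function pointer type: called once per node with a caller-supplied context.
using Visitor = void (*)(Node& node, void* context);

// Pre-order traversal: the recursion lives here and nowhere else.
void visit(Node& node, Visitor fn, void* context) {
    assert(fn != nullptr);
    fn(node, context);
    for (Node* child : node.children) {
        visit(*child, fn, context);
    }
}

// Example operation: append each node's name to a string passed via context.
void collectName(Node& node, void* context) {
    auto* out = static_cast<std::string*>(context);
    *out += node.name;
}
```

A call then looks like `visit(root, collectName, &result);`. In modern C++ a template parameter or `std::function` would usually replace the `void*` context, but the plain function pointer matches what this answer describes.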

Many functions do not simply iterate over all nodes. If the tree is sorted (as it normally is), then to find the largest value you only need to look in the right subtree.
If you are searching for the minimum, it is in the leftmost subtree.
So it does not always make sense to have an iterator that walks the whole tree.
But if you do need to iterate over all nodes, you can use function pointers or the Visitor pattern (Erich Gamma et al., Design Patterns).
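For the case where a full traversal is what you want, here is one possible sketch of the iterator asked about in the question. It assumes a pre-order walk and keeps the nodes still to visit on an explicit stack. The `value()` accessor and `addChild` are hypothetical helpers added so the example is self-contained, and the nodes are not owned by the tree here.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

class Node {
public:
    explicit Node(int value) : value_(value) {}

    // Non-owning link, for brevity; the caller keeps the nodes alive.
    void addChild(Node* child) {
        assert(child != nullptr);
        child->parent_ = this;
        children_.push_back(child);
    }

    int value() const { return value_; }

    // Iterator that walks this node's subtree in pre-order.
    class iterator {
    public:
        using iterator_category = std::forward_iterator_tag;
        using value_type = Node*;
        using difference_type = std::ptrdiff_t;
        using pointer = Node* const*;
        using reference = Node* const&;

        iterator() = default;  // end iterator: nothing left to visit
        explicit iterator(Node* root) { pending_.push_back(root); }

        reference operator*() const { return pending_.back(); }

        iterator& operator++() {
            Node* current = pending_.back();
            pending_.pop_back();
            // Push children in reverse so the first child is visited next.
            for (auto it = current->children_.rbegin(); it != current->children_.rend(); ++it) {
                pending_.push_back(*it);
            }
            return *this;
        }
        iterator operator++(int) { iterator old = *this; ++*this; return old; }

        bool operator==(const iterator& other) const { return pending_ == other.pending_; }
        bool operator!=(const iterator& other) const { return !(*this == other); }

    private:
        std::vector<Node*> pending_;  // nodes still to visit; back() is the current one
    };

    iterator begin() { return iterator(this); }
    iterator end() { return iterator(); }

private:
    int value_;
    Node* parent_ = nullptr;
    std::vector<Node*> children_;
};
```

With this in place the loop from the question works as written: `for (Node::iterator it = root.begin(); it != root.end(); ++it) { Node* node = *it; }`. Copying such an iterator copies its stack, which is acceptable for a sketch but worth knowing for deep trees.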

Related

Tree traversal falls into infinite loop (with huffman algorithm implementation)

I am trying to implement the Huffman algorithm following the steps described in this tutorial: https://www.programiz.com/dsa/huffman-coding, and so far I have this code:
void encode(string filename) {
List<HuffmanNode> priorityQueue;
List<Node<HuffmanNode>> encodeList;
BinaryTree<HuffmanNode> toEncode;
//Map<char, string> encodeTable;
fstream input;
input.open(filename, ios_base::in);
if (input.is_open()) {
char c;
while (!input.eof()) {
input.get(c);
HuffmanNode node;
node.data = c;
node.frequency = 1;
int pos = priorityQueue.find(node);
if(pos) {
HuffmanNode value = priorityQueue.get(pos)->getData();
value++;
priorityQueue.update(pos, value);
} else {
priorityQueue.insert(node);
}
}
}
input.close();
priorityQueue.sort();
for(int i=1; i<=priorityQueue.size(); i++)
encodeList.insert( priorityQueue.get(i) );
while(encodeList.size() > 1) {
Node<HuffmanNode> * left = new Node<HuffmanNode>(encodeList.get(1)->getData());
Node<HuffmanNode> * right = new Node<HuffmanNode>(encodeList.get(2)->getData());
HuffmanNode z;
z.data = 0;
z.frequency = left->getData().frequency + right->getData().frequency;
Node<HuffmanNode> z_node;
z_node.setData(z);
z_node.setPrevious(left);
z_node.setNext(right);
encodeList.remove(1);
encodeList.remove(1);
encodeList.insert(z_node);
}
Node<HuffmanNode> node_root = encodeList.get(1)->getData();
toEncode.setRoot(&node_root);
}
full code for the main.cpp here: https://pastebin.com/Uw5g9s7j.
When I run this, the program reads the bytes from the file, groups the characters by frequency and sorts the list, but when I try to generate the Huffman tree, I am unable to traverse it: the traversal always falls into an infinite loop (the method gets stuck in the nodes containing the first 2 items from the priorityQueue above).
I tried the tree class with BinaryTree<int>, and everything works fine in that case, but with the code above the issue happens. The code for the tree is this (in the code, previous == left and next == right; I am reusing the Node class already implemented for my List class): https://pastebin.com/ZKLjuBc8.
The code for the List used in this example is: https://pastebin.com/Dprh1Pfa. And the code for the Node class used for both the List and the BinaryTree classes is: https://pastebin.com/ATLvYyft. Can anyone tell me what I am missing here? What am I getting wrong?
UPDATE
I have tried a version using only the C++ STL (with no custom List or BinaryTree implementations), but the same problem happened. The code is here: https://pastebin.com/q0wrVYBB.
Too many things to mention as comments so I'm using an answer, sorry:
So going top to bottom through the code:
Why are you defining all methods outside the class? That just makes the code so much harder to read and is much more work to type.
Node::Node()
NULL is C code, use nullptr. And why not use member initialization in the class?
class Node {
private:
T data{};
Node * previous{nullptr};
Node * next{nullptr};
...
Node::Node(Node * node) {
What is that supposed to be? You create a new node, copy the value and attach it to the existing list of Nodes like a Remora.
Is this supposed to replace the old Node? Be a move constructor?
Node::Node(T data)
Write
Node<T>::Node(T data_ = T{}) : data{data_} { }
and remove the default constructor. The member initialization from (1) initializes the remaining members.
Node::Node(T data, Node * previous, Node * next)
Again creating a Remora. This is not inserting into an existing list.
T Node::getData(), void Node::setData(T value)
If everyone can get and set data then just make it public. That also means it will work with const Node<T>. Your functions are not const-correct because all the const versions are missing.
Same for previous and next. But those should actually do something when you set the member. The node you point to should point back to you, or be made to do so:
void Node::setPrevious(Node * previous) {
// don't break an existing list
assert(this->previous == nullptr);
assert(previous->next == nullptr);
this->previous = previous;
previous->next = this;
}
Think about the copy and move constructors and assignment.
Follow the rule of 0/3/5: https://en.cppreference.com/w/cpp/language/rule_of_three . This goes for Node, List, ... all the classes.
List::List()
Simpler to use
Node<T> * first{nullptr};
List::~List()
You are deleting the elements of the list front to back, each time traversing the list from the front until you reach index i. Besides being horribly inefficient, this walks through front nodes that have already been deleted. This is "use after free".
void List::insert(T data)
this->first = new Node<T>();
this->first->setData(data);
just write
first = new Node<T>(data);
And if insert will append to the tail of the list then why not keep track of the tail so the insert runs in O(1)?
void List::update(int index, T data)
If you need access to a list by index that is a clear sign that you are using the wrong data structure. Use a vector, not a list, if you need this.
void List::remove(int index)
As mentioned in comments there are 2 memory leaks here. Also aux->next->previous still points at the deleted aux likely causing "use after free" later on.
int List::size()
Nothing wrong here, that's a first. But if you need this frequently you could keep track of the size of the list in the List class.
Node * List::get(int index)
Nothing wrong except the place where you use this has already freed the nodes so this blows up. Missing the const counterpart. And again a strong indication you should be using a vector.
void List::set(int index, Node * value)
What's this supposed to do? Replace the n-th node in a list with a new node? Insert the node at a specific position? What it actually does is follow the list for index steps and then assign value to the local variable aux. Meaning it does absolutely nothing, slowly.
int List::find(T data)
Why return an index? Why not return a reference to the node? Also const and non-const version.
void List::sort()
This code looks like a bubblesort. Assuming it wasn't totally broken by all the previous issues, it would be O(n^4). I'm assuming the if(jMin != i) is supposed to swap the two elements in the list. Well, it doesn't.
I'm giving up now. This is all just the support classes to implement the BinaryTree, which itself is just support. 565 lines of code before you even start with your actual problem, and it seems a lot of it is broken one way or another. None of it can work with the state Node and List are in. Especially with copy construction / copy assignment of lists.
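For comparison, a rough sketch of the tree-building step using only standard containers, with every node owning its children through std::unique_ptr so nothing points at a local variable or a freed node. This is not a fix of the posted code. The names `buildTree`, `popLowest`, `higherFrequency` and `collectCodes` are made up for this example, and it takes the input as a std::string rather than reading a file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct HuffmanNode {
    char data = 0;
    std::size_t frequency = 0;
    std::unique_ptr<HuffmanNode> left;   // each node owns its children
    std::unique_ptr<HuffmanNode> right;
};

using NodePtr = std::unique_ptr<HuffmanNode>;

// Heap order: the node with the lowest frequency ends up at the front.
bool higherFrequency(const NodePtr& a, const NodePtr& b) {
    return a->frequency > b->frequency;
}

// Removes and returns the lowest-frequency node from the heap.
NodePtr popLowest(std::vector<NodePtr>& heap) {
    assert(!heap.empty());
    std::pop_heap(heap.begin(), heap.end(), higherFrequency);
    NodePtr node = std::move(heap.back());
    heap.pop_back();
    return node;
}

NodePtr buildTree(const std::string& text) {
    std::map<char, std::size_t> counts;
    for (char c : text) ++counts[c];

    std::vector<NodePtr> heap;
    for (const auto& entry : counts) {
        NodePtr leaf(new HuffmanNode);
        leaf->data = entry.first;
        leaf->frequency = entry.second;
        heap.push_back(std::move(leaf));
    }
    std::make_heap(heap.begin(), heap.end(), higherFrequency);

    // Repeatedly merge the two rarest nodes under a new parent.
    while (heap.size() > 1) {
        NodePtr parent(new HuffmanNode);
        parent->left = popLowest(heap);
        parent->right = popLowest(heap);
        parent->frequency = parent->left->frequency + parent->right->frequency;
        heap.push_back(std::move(parent));
        std::push_heap(heap.begin(), heap.end(), higherFrequency);
    }
    if (heap.empty()) return nullptr;  // empty input
    return std::move(heap.front());
}

// Recursive traversal that terminates at the leaves and records each code.
void collectCodes(const HuffmanNode* node, const std::string& prefix,
                  std::map<char, std::string>& codes) {
    if (node == nullptr) return;
    if (!node->left && !node->right) {
        codes[node->data] = prefix.empty() ? std::string("0") : prefix;
        return;
    }
    collectCodes(node->left.get(), prefix + "0", codes);
    collectCodes(node->right.get(), prefix + "1", codes);
}
```

Because the merged parent takes ownership of the two popped nodes, the subtrees survive each merge, and the traversal stops as soon as it reaches a node without children.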

C++ Tree Data Structure

Background:
So I've been porting some of my older Java code to C++, and I've come across an issue that's making proceeding quite difficult. My project uses a tree data-structure to represent the node hierarchy for 3D animation.
Java:
public final class Node {
private final Node mParent;
private final ArrayList<Node> mChildren;
//private other data, add/remove children / parents, etc ...
}
In Java, it's quite simple to create a tree that allows for modification etc.
Problem:
I'm running into issues with C++: arrays cannot easily be grown without manually allocating a new chunk of memory and moving the existing elements over, so I switched to std::vector. Vectors have the issue of doing exactly that internally, which makes any pointers to their elements invalid. So basically, if you want to use pointers you need a way to back them so that the memory holding the actual nodes doesn't move. I heard you can use std::shared_ptr/std::unique_ptr to wrap the nodes in the std::vector, and I tried to play around with that approach, but it becomes quite unwieldy. Another option would be to have a "tree" class that wraps the node class and is the interface to manipulate it, but then (for my use case) it would be quite annoying to deal with cutting branches off and making them into their own trees and possibly attaching different branches.
Most examples I see online are Binary trees that have 2 nodes rather than being dynamic, or they have many comments about memory leaks / etc. I'm hoping there's a good C++ alternative to the java code shown above (without memory leak issues etc). Also I won't be doing ANY sorting, the purpose of the tree is to maintain the hierarchy not to sort it.
Honestly I'm really unsure of what direction to go, I've spent the last 2 days trying different approaches but none of them "feel" right, and are usually really awkward to manage, any help would be appreciated!
Edit:
An edit as to why shared_ptrs are unwieldy:
class tree : std::enable_shared_from_this<tree> {
std::shared_ptr<tree> parent;
std::vector<std::shared_ptr<tree>> children;
public:
void set_parent(tree& _tree) {
auto this_shared_ptr = shared_from_this();
if (parent != nullptr) {
auto vec = parent->children;
auto begin = vec.begin();
auto end = vec.end();
auto index = std::distance(begin, std::find_if(begin, end, [&](std::shared_ptr<tree> const& current) -> bool {
return *current == this_shared_ptr;
}));
vec.erase(std::remove(begin, end, index), end);
}
parent = std::shared_ptr<tree>(&_tree);
if (parent != nullptr) {
parent->children.push_back(this_shared_ptr);
}
}
};
Working with pointers like above becomes really quite verbose, and I was hoping for a simpler solution.
You could store your nodes in a single vector and use relative pointers that are not changed when the vectors are resized:
typedef int32_t Offset;
struct Node {
Node(Offset p) : parent(p) {}
Offset parent = 0; // 0 means no parent, so root node
std::vector<Offset> children;
};
std::vector<Node> tree;
std::vector<uint32_t> free_list;
To add a node:
uint32_t index;
if (free_list.empty()) {
index = tree.size();
tree.emplace_back(parent_index - tree.size());
} else {
index = free_list.back();
free_list.pop_back();
tree[index].parent = parent_index - index;
}
tree[parent_index].children.push_back(index - parent_index);
To remove a node:
assert(node.children.empty());
if (node.parent) {
Node* parent = &node + node.parent;
auto victim = find(parent->children.begin(), parent->children.end(), -node.parent);
swap(*victim, parent->children.back()); // more efficient than erase from middle
parent->children.pop_back();
}
free_list.push_back(&node - tree.data());
The only reason for the difference you're seeing is that in C++ you can put the objects directly in the vector itself (which you cannot do in Java). Their addresses are then bound to the buffer currently allocated by the vector. In Java, all the objects are allocated individually, so only an "object reference" is actually in the array. The equivalent in C++ would be a vector of pointers (preferably wrapped in smart pointer objects), so that the vector elements are only addresses while the objects live in fixed memory. It adds an extra pointer hop, but it behaves more like what you expect from Java.
struct X {
char buf[30];
};
std::vector<X> myVec{ X() };
Given the above, the X elements in myVec are stored contiguously in the vector's allocation, and sizeof(myVec[0]) == sizeof(X). But if you put pointers in the vector:
std::vector<std::unique_ptr<X>> myVec2;
myVec2.push_back(std::make_unique<X>());
This should behave more like what you want: pointers to the X objects do not become invalid when the vector resizes. Only the unique_ptr elements are moved to the new buffer; the objects they own stay where they are.
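Applied to the tree from the question, this could look roughly like the sketch below: children are owned through std::unique_ptr, the parent link is a plain non-owning pointer, and a branch can be cut off and re-attached elsewhere by moving its unique_ptr. The member names (`addChild`, `detachChild`, `childCount`) are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Node {
public:
    explicit Node(std::string name) : name_(std::move(name)) {}

    // Takes ownership of the child and returns a stable raw pointer to it.
    Node* addChild(std::unique_ptr<Node> child) {
        assert(child != nullptr);
        child->parent_ = this;
        children_.push_back(std::move(child));
        return children_.back().get();
    }

    // Cuts a child (and its whole branch) off and hands ownership to the caller.
    std::unique_ptr<Node> detachChild(Node* child) {
        for (auto it = children_.begin(); it != children_.end(); ++it) {
            if (it->get() == child) {
                std::unique_ptr<Node> branch = std::move(*it);
                children_.erase(it);
                branch->parent_ = nullptr;
                return branch;
            }
        }
        return nullptr;  // not a child of this node
    }

    Node* parent() const { return parent_; }
    std::size_t childCount() const { return children_.size(); }
    const std::string& name() const { return name_; }

private:
    std::string name_;
    Node* parent_ = nullptr;                       // non-owning back pointer
    std::vector<std::unique_ptr<Node>> children_;  // owning; the nodes themselves never move
};
```

A detached branch is simply a `std::unique_ptr<Node>`, so it can live on as its own tree or be passed to `addChild` on another node. One caveat of this sketch: the root must not be moved or copied after children are attached, because their back pointers refer to its address.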
Another way you could do this would be to change things a little in your design. Consider an alternate to pointers entirely, where your tree contains a vector of elements, and your nodes contain vectors of integers, which are the index into that vector.
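A small sketch of that index-based design, assuming absolute indices into one std::vector and a sentinel value for "no parent". `Tree`, `kNoNode`, `add` and `depth` are names invented for this example, and node removal is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t kNoNode = static_cast<std::size_t>(-1);  // plays the role of a null pointer

struct Node {
    std::string name;
    std::size_t parent = kNoNode;
    std::vector<std::size_t> children;  // indices into Tree's node vector
};

class Tree {
public:
    // Adds a node under `parent` (or a root if parent == kNoNode) and returns its index.
    std::size_t add(const std::string& name, std::size_t parent = kNoNode) {
        assert(parent == kNoNode || parent < nodes_.size());
        const std::size_t index = nodes_.size();
        Node node;
        node.name = name;
        node.parent = parent;
        nodes_.push_back(node);
        if (parent != kNoNode) {
            nodes_[parent].children.push_back(index);
        }
        return index;
    }

    const Node& at(std::size_t index) const { return nodes_.at(index); }

    // Depth of a node: number of parent links up to its root.
    std::size_t depth(std::size_t index) const {
        std::size_t d = 0;
        while (nodes_.at(index).parent != kNoNode) {
            index = nodes_[index].parent;
            ++d;
        }
        return d;
    }

private:
    std::vector<Node> nodes_;
};
```

Indices stay valid however often the vector reallocates, which is the point of the design. References returned by `at` do not, so they should not be held across a call to `add`.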
vector, forward_list, ..., any std container class (other than built-in array or std::array) may be used.
Your trouble seems to be that Java classes are reference types, while C++ classes are value types. The snippet below triggers an "infinite recursion" or "use of incomplete type" error at compile time:
class node{
node mParent;//trouble
std::vector<node> children;
//...
};
The mParent member must have reference semantics. To get them you can make it a raw pointer:
node* mParent;
You may also use a pointer as the element type of the container, but for a C++ beginner that would most probably lead to memory leaks and weird runtime errors. We should try to stay away from manual memory management for now. So I modify your snippet to:
class node{
private:
node* const mParent;
std::vector<node> children;
public:
//node(node const&)=delete;//do you need copies of nodes? you have to properly define this if yes.
node(node *parent):
mParent{parent}{};
void addChild(/*???*/){
children.emplace_back(this);
//...
};
//...
};

[c++ / pointers]: having objects A and B (B has vector member, which stores pointer to A), knowing A is it possible to retrieve pointer to B?

While trying to learn c++, I tried to implement class representing very basic trie. I came up with the following:
class Trie {
public:
char data;
vector<Trie* > children;
Trie(char data);
Trie* addChild(Trie* ch); // adds child node
(skipped others members/methods)
};
Method addChild checks if child ch with the same data is present in vector children, if not then it inserts it there, if yes - returns pointer to already existing child.
Now, considering this code snippet:
Trie t('c');
Trie* firstchild = new Trie('b');
Trie* secondchild = new Trie('a');
firstchild->addChild(secondchild);
t.addChild(firstchild);
If I only have a pointer to secondchild, is it possible to somehow get pointers to firstchild or maybe even t?
I would like to know if it is possible to do so, because the logic of my working code needs to traverse the trie "up" (from lower nodes to upper ones), to the parent of the current object. Currently I am just using a recursive function to travel down, but I am wondering if there is any other way.
I am sorry if above is unclear or if I messed up somewhere, I am rather inexperienced and writing from my memory, without the working code.
You need to add something like
Trie* parent;
or
Trie* previoussibling;
Trie* nextsibling;
to the class to get directly from firstchild to secondchild or vice-versa, or to go up from one of the children to t.
Note that if you need this kind of relationship then you will require more maintenance when adding and removing nodes to keep all the links correct.
The Trie object does not keep track of its parent object.
It's basically similar to a singly linked list: you cannot traverse back unless you "know" the parent.
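class Trie {
public:
    char data;
    vector<Trie* > children;
    Trie* parent;
    Trie(char data) : data(data), parent(NULL) {}
    Trie* addChild(Trie* ch)
    {
        // (check for an existing child with the same data skipped)
        ch->parent = this; // set the parent
        children.push_back(ch);
        return ch;
    }
    // (other members/methods skipped)
};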
Then traverse would look something like:
void traverse(Trie* pPtr)
{
    Trie* currentPtr = pPtr;
    while(currentPtr)
    {
        // do something with *currentPtr, then step up to its parent
        currentPtr = currentPtr->parent;
    }
}
"I only have pointer to secondchild, is it possible to somehow return pointers to firstchild or maybe even t?"
No. You have to establish that relationship yourself by passing firstchild as the parent of secondchild.
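Putting the answers together, a self-contained sketch might look as follows. It assumes the trie creates and owns its children itself, so `addChild` takes a letter instead of a pointer. That signature and the `path` helper differ from the question's interface and are only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class Trie {
public:
    explicit Trie(char data) : data_(data) {}

    // Returns the existing child with this letter, or creates one and links it back to us.
    Trie* addChild(char letter) {
        for (const auto& child : children_) {
            if (child->data_ == letter) return child.get();
        }
        children_.push_back(std::make_unique<Trie>(letter));
        children_.back()->parent_ = this;
        return children_.back().get();
    }

    Trie* parent() const { return parent_; }
    char data() const { return data_; }

    // Walks up the parent chain and returns the letters from the root down to this node.
    std::string path() const {
        std::string reversed;
        for (const Trie* node = this; node != nullptr; node = node->parent_) {
            reversed += node->data_;
        }
        return std::string(reversed.rbegin(), reversed.rend());
    }

private:
    char data_;
    Trie* parent_ = nullptr;  // non-owning; set by the parent's addChild
    std::vector<std::unique_ptr<Trie>> children_;
};
```

Because the parent link is set in exactly one place, there is no way to end up with a child whose `parent_` disagrees with the vector it is stored in.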

How could I create a list in c++?

How can I create a list in C++? I need it to create a linked list. How would I go about doing that? Are there good tutorials or examples I could follow?
I take it that you know that C++ already has a linked list class, and you want to implement your own because you want to learn how to do it.
First, read Why do we use arrays instead of other data structures?, which contains a good overview of basic data structures. Then think about how to model them in C++:
struct Node {
int data;
Node * next;
};
Basically that's all you need to implement a list (a very simple one)! Yet it has no abstractions; you have to link the items by hand:
Node a={1}, b={20, &a}, c={35, &b}, d={42, &c};
Now you have a linked list of nodes, all allocated on the stack:
d -> c -> b -> a
42 35 20 1
Next step is to write a wrapper class List that points to the start node, and allows to add nodes as needed, keeping track of the head of the list (the following is very simplified):
class List {
struct Node {
int data;
Node * next;
};
Node * head;
public:
List() {
head = NULL;
}
~List() {
while(head != NULL) {
Node * n = head->next;
delete head;
head = n;
}
}
void add(int value) {
Node * n = new Node;
n->data = value;
n->next = head;
head = n;
}
// ...
};
Next step is to make the List a template, so that you can stuff other values (not only integers).
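As a rough sketch of that step, here is the same class with the element type turned into a template parameter. The `front` and `size` members are additions made for this example, and copying is disabled only to keep it short.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class List {
    struct Node {
        T data;
        Node* next;
    };
    Node* head = nullptr;
    std::size_t count = 0;

public:
    List() = default;
    // Copying would need a deep copy; it is disabled to keep the sketch short.
    List(const List&) = delete;
    List& operator=(const List&) = delete;

    ~List() {
        while (head != nullptr) {
            Node* n = head->next;
            delete head;
            head = n;
        }
    }

    // Inserts at the front, as in the int-only version.
    void add(const T& value) {
        head = new Node{value, head};
        ++count;
    }

    // Most recently added element; the list must not be empty.
    const T& front() const {
        assert(head != nullptr);
        return head->data;
    }

    std::size_t size() const { return count; }
};
```

`List<int>` then behaves like the original class, and `List<double>` or `List<std::string>` work without further changes.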
If you are familiar with smart pointers, you can then replace the raw pointers with smart pointers. I often find people recommending smart pointers to beginners. But in my opinion you should first understand why you need smart pointers, and then use them. That requires first understanding raw pointers. Otherwise you are using some magic tool without knowing why you need it.
You should really use the standard List class. Unless, of course, this is a homework question, or you want to know how lists are implemented by STL.
You'll find plenty of simple tutorials via google, like this one. If you want to know how linked lists work "under the hood", try searching for C list examples/tutorials rather than C++.
If you are going to use std::list, you need to pass a type parameter:
list<int> intList;
list<int>* intListPtr = new list<int>;
If you want to know how lists work, I recommend googling for some C/C++ tutorials to gain an understanding of that subject. The next step would then be learning enough C++ to create a list class, and finally a list template class.
If you have more questions, ask back here.
Why reinvent the wheel? Just use the STL list container.
#include <list>
// in some function, you now do...
std::list<int> mylist; // integer list
More information...
I'm guessing this is a homework question, so you probably want to go here. It has a tutorial explaining linked lists, gives good pseudocode and also has a C++ implementation you can download.
I'd recommend reading through the explanation and understanding the pseudocode before blindly using the implementation. This is a topic that you really should understand in depth if you want to continue on in CS.
Boost ptr_list
http://www.boost.org/doc/libs/1_37_0/libs/ptr_container/doc/ptr_list.html
HTH
Create list using C++ templates
i.e
template <class T> struct Node
{
T data;
Node * next;
};
template <class T> class List
{
Node<T> *head,*tail;
public:
void push(T const&); // push element
void pop(); // pop element
bool empty(); // return true if empty.
};
Then you can write the code like:
List<MyClass> myList;
The type T is not dynamic at run time. It is fixed at compile time.
For complete example click here.
For C++ templates tutorial click here.
We are already in 21st century!!
Don't try to implement the already existing data structures.
Try to use the existing data structures.
Use STL or Boost library

Stuck on a Iterator Implementation of a Trie

I have to implement a homemade Trie and I'm stuck on the Iterator part. I can't seem to figure out the increment method for the trie.
I hope someone can help me clear things out.
Here's the code for the Iterator:
template <typename T> class Trie<T>::IteratorPrefixe{
friend class Trie<T>;
public:
IteratorPrefixe() : tree(NULL), currentNode(NULL), currentKey("") {};
pair<string, T*> operator*() {return make_pair(currentKey, currentNode -> element);} ;
IteratorPrefixe operator++()throw(runtime_error);
void operator=(IteratorPrefixe iter) {tree = iter.tree; currentNode = iter.currentNode; currentKey = iter.currentKey;};
bool operator==(IteratorPrefixe iter) {return tree == iter.tree && currentNode == iter.currentNode;};
bool operator!=(IteratorPrefixe iter) {return tree != iter.tree || currentNode != iter.currentNode;};
private:
Trie<T> * tree;
Trie<T> * currentNode;
string currentKey;
};
And here's my Trie:
template <typename T> class Trie {
friend class IteratorPrefixe;
public:
// Create a Trie<T> from the alphabet of nbletters, where nbletters must be
// between 1 and NBLETTERSMAX inclusively
Trie(unsigned nbletters) throw(runtime_error);
// Add a key element of which is given in the first argument and content second argument
// The content must be defined (different from NULL pointer)
// The key is to be composed of valid letters (the letters between A + inclusive and exclusive nbletters
// Eg if nblettres is 3, a, b and c are the only characters permitted;
// If nblettres is 15, only the letters between a and o inclusive are allowed.
// Returns true if the insertion was achieved, returns false otherwise.
bool addElement(string, T*) throw(runtime_error);
// Deletes a key element of which is given as an argument and returns the contents of the node removed
// The key is to be composed of letters valid (see above)
// Can also delete at the same time the reference of the ancestors, if these ancestors are no longer used.
// Returns NULL if the item has no delete
T* removeElement(string cle) throw(runtime_error);
// Find a key element of which is given as an argument and returns the associated content
// The key is to be composed of letters valid (see above)
// Returns NULL if the key does not exist
T* searchElement(string cle) throw();
// Iterator class to browse the Trie <T> in preorder mode
class IteratorPrefixe;
// Returns an iterator pointing to the first element
IteratorPrefixe pbegin() throw(runtime_error);
// Returns an iterator pointing beyond the last item
IteratorPrefixe pend() throw();
private:
unsigned nbLetters;
T* element;
vector<Trie<T> *> childs;
Trie<T> * parent;
// This function removes a node and its ancestors if became unnecessary. It is essentially the same work
// as deleteElement that is how to designate remove a node that is changing. Moreover, unlike
// deleteElement, it does not return any information on the node removed.
void remove(Trie<T> * node) throw();
// This function is seeking a node based on a given key. It is essentially the same work
// searchElement but that returns a reference to the node found (or null if the node does not exist)
// The key is to be composed of letters valid (see above)
Trie<T>* search(string key) throw(runtime_error);
};
I'm glad to see Tries are still taught, they're an important data structure that is often neglected.
There may be a design problem in your code, since you should probably have a Trie class and a Node class. The way you wrote it, it looks like each node in your Trie is its own trie, which can work, but will make some things complicated.
It's not really clear from your question what it is that you are having the problem with: figuring the order, or figuring the actual code?
From the name of the iterator, it sounds like it would have to work in prefix order. Since your trie stores words and its child nodes are organized by letters, then you are essentially expected to go over all the words in an alphabetic order. Every incrementation will bring you to the next word.
The invariant of your iterator is that at any point (as long as it is valid), it should be pointing at a node with a "terminator character" for a valid word. Figuring out that word merely involves scanning upwards through the parent chain till you have your entire string. Moving to the next word means doing a DFS: go up once, scan for links in later "brothers", see if you find a word, if not recursively go up, etc.
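One way to sketch that increment step, under some assumptions of my own: a separate TrieNode struct with one child slot per letter, a stored `indexInParent` so that later "brothers" can be found without searching, and an `isWord` flag standing in for a non-null element. None of these names come from the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct TrieNode {
    explicit TrieNode(std::size_t letters) : children(letters) {}

    TrieNode* parent = nullptr;
    std::size_t indexInParent = 0;                    // which slot of parent->children holds us
    bool isWord = false;                              // stands in for "element != NULL"
    std::vector<std::unique_ptr<TrieNode>> children;  // one slot per letter, may be null
};

// Inserts a key made of letters 'a', 'b', ... and marks its last node as a word.
void insert(TrieNode& root, const std::string& key) {
    TrieNode* node = &root;
    for (char c : key) {
        const std::size_t i = static_cast<std::size_t>(c - 'a');
        assert(i < node->children.size());
        if (!node->children[i]) {
            node->children[i] = std::make_unique<TrieNode>(root.children.size());
            node->children[i]->parent = node;
            node->children[i]->indexInParent = i;
        }
        node = node->children[i].get();
    }
    node->isWord = true;
}

// First existing child at slot >= from, or nullptr.
TrieNode* firstChildFrom(const TrieNode& node, std::size_t from) {
    for (std::size_t i = from; i < node.children.size(); ++i) {
        if (node.children[i]) return node.children[i].get();
    }
    return nullptr;
}

// One pre-order step: go down if possible, otherwise climb until a later sibling exists.
TrieNode* nextNode(TrieNode* node) {
    if (TrieNode* child = firstChildFrom(*node, 0)) return child;
    while (node->parent != nullptr) {
        if (TrieNode* sibling = firstChildFrom(*node->parent, node->indexInParent + 1)) {
            return sibling;
        }
        node = node->parent;
    }
    return nullptr;  // walked past the last node
}

// What operator++ would do: advance until the next node that ends a word.
TrieNode* nextWord(TrieNode* node) {
    do {
        node = nextNode(node);
    } while (node != nullptr && !node->isWord);
    return node;
}

// Rebuilds the key by scanning the parent chain.
std::string keyOf(const TrieNode* node) {
    std::string reversed;
    for (; node->parent != nullptr; node = node->parent) {
        reversed += static_cast<char>('a' + node->indexInParent);
    }
    return std::string(reversed.rbegin(), reversed.rend());
}
```

In an iterator class, `operator++` would call `nextWord` on the current node, `pbegin()` would return the first word reachable from the root, and a null current node would represent `pend()`.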
You may want to see my modified trie implementations at:
jdkoftinoff's trie
Specifically, you may find useful the discussion I had on comp.lang.c++.moderated about implementing iterators for tries in an STL-compliant way, which is a problem since the STL associative containers are unfortunately forced to use std::pair<>, and the iterator therefore must contain the value instead of just a reference to the single node in the trie.
For one thing, the code shown does not actually describe a trie. Rather, it appears to be a tree containing a pair of elements in each node (T* and unsigned). You can by discipline use a tree of tuples as a trie, but it's only by convention, not enforcement. This is part of why you're having such a hard time implementing operator++.
What you need to do is have each Trie contain a left-right disjoint ADT, rather than just the raw elements. It's a layer of abstraction which is more commonly found in functional languages (e.g. Scala's Either). Unfortunately, C++'s type system isn't quite powerful enough to do something that elegant. However, there's nothing preventing you from doing this:
template <class L, class R>
class Either
{
public:
Either(L *l) : left(l), right(0)
{}
Either(R *r) : left(0), right(r)
{}
L *get_left() const
{
return left;
}
R *get_right() const
{
return right;
}
bool is_left() const
{
return left != 0;
}
bool is_right() const
{
return right != 0;
}
private:
L *left;
R *right;
};
Then your Trie's data members would be defined as follows:
private:
Either<unsigned, T*> disjoint;
vector<Trie<T> *> children; // english pluralization
Trie<T> * parent;
I'm playing fast and loose with your pointers, but you get the gist of what I'm saying. The important bit is that no given node can contain both an unsigned and a T*.
Try this, and see if that helps. I think you'll find that being able to easily determine whether you are on a leaf or a branch will help you tremendously in your attempt to iterate.