Generalized (non-slicing) pointer to templated tree nodes? (C++)

I'm working on an octree implementation where the tree nodes are templated with their dimensional length (as a power of 2):
template<long N>
struct node_t {
    enum { DIM = 1 << N };
    node_t<N+1> * parent;
    node_t<N-1> * children[8];
    long count;
};
And specialized for N = 0 (leaves) to point to data.
template<>
struct node_t<0> {
    enum { DIM = 1 };
    node_t<1> * parent;
    data_t data;
    long count;
};
(Aside: I suppose I probably also need a specialization for N_MAX that excludes a parent pointer, or else C++ will generate types of increasing N ad infinitum? But that's not really relevant to my question.)
I'd like to create a function that steps along a ray in the 3D space that my octree occupies, so ostensibly I could just keep a pointer to the root node (which has a known type) and traverse the octree from the root at every step. However, I would prefer a more 'local' option, in which I can keep track of the current node so that I can start lower in the tree when possible and thus avoid unnecessarily traversing the upper nodes of the octree.
But I don't know what type that pointer could be (or any other way of implementing this) so that I don't experience slicing.
I'm not tied down to templates, as the dimension can simply be implemented as a long const. But then I don't know how to make it so that the leaves have a different child type than inodes.
Thanks for your help!
Update
The reason I'd like to do it this way rather than something similar to this is because of the count variable in each node: if the count is 0, I'd like to jump through the whole cube, rather than wasting time going through leaves that I know to be empty. (This is for a raytracing voxel engine.)

As much as I love templates, your code might actually be simpler with:
class node {
public:
    node* parent;   // NULL for root node
    long dim;
    long count;
    virtual ~node() {}
    virtual void rayTrace(Ray) = 0;
};
class leafNode : public node {
    data_t data;
    virtual void rayTrace(Ray);
};
class nonLeafNode : public node {
    vector<node*> children;
    virtual void rayTrace(Ray);
};
This has the advantage that the tree can be whatever depth you want, and some subtrees can even be deeper than others. It has the downside that dim must be computed at runtime, but even that has the silver lining that you can make it a double if your tree gets really big.
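For illustration (this is not part of the answer above), here is a minimal sketch of how a single node* can serve as the 'current node' while stepping along a ray with this runtime-polymorphic design, skipping any cube whose count is 0; Ray, contains() and childIndexFor() are assumed placeholders, not code from the question:

struct Ray { /* origin, direction, etc. -- details assumed */ };
struct data_t { /* voxel payload -- details assumed */ };

// Runtime-polymorphic octree node, along the lines of the answer above.
struct node {
    node* parent = nullptr;   // nullptr for the root
    long dim = 1;             // edge length of this node's cube
    long count = 0;           // number of filled voxels in this subtree
    virtual ~node() {}
};

struct leafNode : public node {
    data_t data;
};

struct innerNode : public node {
    node* children[8] = {};   // nullptr where a child was never allocated
};

// Hypothetical helpers -- real implementations depend on how the octree maps
// world coordinates to cubes.  Placeholder bodies only.
bool contains(const node*, const Ray&)            { return true; }
int  childIndexFor(const innerNode*, const Ray&)  { return 0; }

// Given the node we were in at the previous ray step, find the node for the
// current ray position without restarting from the root: climb only as far
// up as needed, then descend, refusing to enter any cube whose count is 0.
node* locate(node* current, const Ray& ray)
{
    while (current->parent != nullptr && !contains(current, ray))
        current = current->parent;
    while (auto* inner = dynamic_cast<innerNode*>(current)) {
        node* child = inner->children[childIndexFor(inner, ray)];
        if (child == nullptr || child->count == 0)
            break;            // empty or absent cube: stop so the caller can skip it whole
        current = child;
    }
    return current;
}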


Binary tree depth

Trying to find the depth of a binary search tree. I have put something together with a bit of Google searching, but my code is crashing:
int treeDepth(int depth) const
{
int l = this->left->treeDepth(depth);
int d = this->right->treeDepth(depth);
return depth + std::max(l, d);
}
calling this function with: root->treeDepth(1);
First of all, I think that you may be confusing the depth of a tree with the height of a tree. Refer to What is the difference between tree depth and height? for an explanation.
For example, following are the depth and height of the nodes in this (binary) tree ('0' being the root):
        0        -- depth 0, height 2
       / \
      1   2      -- depth 1; heights 1 and 0, respectively, for nodes 1 and 2
     / \
    3   4        -- depth 2, height 0
As you can see, the depth of the tree root is '0', which is an O(1) computation. The height of the tree, however, is more interesting with regards to recursion and can be computed using the following:
struct TreeNode {
    T _value;
    typedef TreeNode* node_ptr;
    node_ptr _left, _right;
};
...
int treeHeight(TreeNode* node)
{
    return std::max(node->_left  ? 1 + treeHeight(node->_left)  : 0,
                    node->_right ? 1 + treeHeight(node->_right) : 0);
}
...
std::cout << treeHeight(root);
...
The fundamental idea is this:
During the recursive (depth-first) traversal, if a leaf node is reached, return a value of '0' (which is the height of every leaf node). Otherwise, compute the height of the tree rooted at the non-leaf node (which is the max of the heights of the left subtree and the right subtree of that node + 1 for the node itself). Start from the root of the tree.
The second aspect that I want to address in your question is regarding what I gather from your function signature:
int treeDepth(int depth) const
and the way you are calling it:
root->treeDepth(1);
The depth of a tree is not naturally a property of the root. Instead, it is a property of the tree, which is composed of tree nodes, one of which is the root node. So, I would define the class (C++ struct shown here) as follows:
template<class T>
struct BinaryTree {
    struct TreeNode {
        T _value;
        typedef TreeNode* node_ptr;
        node_ptr _left, _right;
    };
    TreeNode _root_node;
    typedef typename TreeNode::node_ptr node_ptr;
    node_ptr _root;
    ...
    int height(node_ptr node) const;
    ...
};
Finally, finding the depth of a given node in a tree:
int depth(node_ptr node) const;
// returns depth of node if it is in the tree, else -1
is a problem where you may apply recursion. However, a recursive approach is not natural and a breadth-first (or level-order) traversal will be more suited for this.
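For illustration (this is not from the answer's code), a level-order search for a node's depth might look like the following sketch; it assumes the TreeNode layout shown above, with _left/_right members:

#include <queue>
#include <utility>

// Sketch only: returns the depth of 'target' if it is reachable from 'root',
// else -1.  Node is any type with _left and _right child pointers.
template<class Node>
int depthOf(const Node* root, const Node* target)
{
    std::queue<std::pair<const Node*, int>> q;   // node paired with its depth
    if (root) q.push({root, 0});
    while (!q.empty()) {
        auto [node, d] = q.front();
        q.pop();
        if (node == target) return d;
        if (node->_left)  q.push({node->_left,  d + 1});
        if (node->_right) q.push({node->_right, d + 1});
    }
    return -1;                                   // target is not in the tree
}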
There are actually several problems in your code.
First of all, you need to implement a recursion termination, that is, a condition that stops your function from calling itself forever.
In your case you need to write something like
if (left == nullptr && right == nullptr)
    return depth;
The recursion termination is VERY IMPORTANT! You always need it when you're writing a recursive function. Without it you're 100% going into a never-ending loop (in the best case).
Then, when you re-call your function, you need to change the depth value.
I mean, if you're going down to another node of the tree, it means that the tree is at least "depth + 1" tall, so you need to pass depth + 1, not just depth.
And at the end of the function just write
return std::max(l, d);
Also, since you're using pointers, always write a condition to check that you're really trying to access a well-defined address; that is, before even trying to access an address (e.g. this->left) you need to write a condition like if (this->left != nullptr) (nullptr from C++11; otherwise NULL will get the job done as well).
You need to read up a little bit about recursion.
One of the fundamental tenets of recursion is that there must be a "stop condition" which will at some point terminate the recursion.
Otherwise, you have what is known as "runaway recursion", your stack fills up, and your program crashes and burns.
In your case, a "stop condition" would be reaching a this->left or this->right which happens to be NULL.
So, for the future readers (as suggested by Barmar in a comment)
int l = left == NULL? 0 : left->treeDepth(depth);
int d = right == NULL? 0 : right->treeDepth(depth);
try something like this:
int treeDepth() const
{
int l = left == NULL? 0 : left->treeDepth();
int d = right== NULL? 0 : right->treeDepth();
return 1 + std::max(l, d);
}
In this case you don't need the additional parameter "depth".

Overloading [] operator : Must be non-static member function [duplicate]

This question already has answers here:
Rationale of enforcing some operators to be members
(2 answers)
What are the basic rules and idioms for operator overloading?
(8 answers)
Closed 7 years ago.
I am working with Graphs, and writing code for some well-known algorithms. Currently I am working on Dijkstra's algorithm.
So, I have made my own Heap class which works as a min-priority queue for the implementation of Dijkstra's algorithm. In Dijkstra's algorithm you need to update the distance of a vertex (which is already in the heap) from the source vertex in the Graph if its new distance is less than its current distance, and then adjust the values in the Heap accordingly.
For this, I need to keep an int position[] array to keep track of the positions at which the elements currently sit in the Heap.
This is my Vertex Class ::
class Node {
public:
    int data;
    int weight;
    .....
    friend int& operator [](int *a, Node i) {
        return a[i.data];
    }
};
My minPriorityQueue class ::
template <typename t>
class minPriorityQueue {
    int size, currentPosition;
    int *position;
    t *data;
    bool (*compare)(t data1, t data2);
public:
    minPriorityQueue(int size, bool (*func1)(t data1, t data2), int *position) {
        this->size = size;
        currentPosition = 1;
        data = new t[size];
        compare = func1;
        this->position = position;
    }
    ........
    void swapPositionValue(int parent, int temp) {
        int tempPosition = position[data[parent]];
        position[data[parent]] = position[data[temp]];
        position[data[temp]] = tempPosition;
    }
    ......
};
Since my vertices are 0, 1, 2, 3, ..., I try to overload the [] operator of my Vertex class so that it returns the data of the current vertex (which is one of 0, 1, 2, 3, ...), so I can use it to access that index of the position array.
I get the compilation error :: error: 'int operator[](int*, graph::Vertex)' must be a nonstatic member function
Well, since I get this error I assume it must be specified in the standard that I cannot overload the [] operator using a friend function, but why can't I do it? Does it lead to any ambiguity? I don't see what could be ambiguous in what I am currently doing.
Secondly, is there any way I can swap the values in my position array? My minPriorityQueue class is a generic class which I am using in several other places in my code as well. If, in the function swapPositionValue, I change my swap statements to this ::
int tempPosition = position[data[parent].data];
position[data[parent].data] = position[data[temp].data];
position[data[temp].data] = tempPosition;
Then the whole idea of a "generic" priority queue will be sacrificed, since it won't work with other classes!
Is there a way that I can achieve this functionality??
EDIT1 :: The complete Code :: http://ideone.com/GRQHHZ
(Using Ideone to paste the code, because the code is still very large, containing 2 classes.)
This is what I am trying to achieve :: http://www.geeksforgeeks.org/greedy-algorithms-set-7-dijkstras-algorithm-for-adjacency-list-representation/
(I am just using the algorithm)
Explanation of the functionality of operator[] ::
In my Dijkstra implementation all the nodes are initially inserted into the Heap, with the start node having weight = 0 and all the other nodes having weight = INFINITY (which means I cannot reach those vertices yet), so the start vertex is the topmost element of the heap! Now when I remove the topmost element, all the Nodes that have an edge from the removed Node get modified: their weight changes from INFINITY to some finite value. So, I need to update the Nodes in the Heap, and then I need to move them to their correct positions according to their new weights!! To update their weights, I need to know at what position the Nodes are located in the Heap, and that position is determined by the data of the Node. So, overloading the [] operator was just a small way out for me, so that when I write position[Node], I access position[Node.data].
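For illustration, a minimal sketch of the decrease-key step that this position array is meant to enable (not code from the question; decreaseKey is a made-up name, the heap is assumed to be a 1-based array of the Node class above, and the sift-up mirrors a standard binary min-heap):

#include <utility>

// Sketch only: 'heap' holds Node objects ordered by weight, and
// position[vertexId] records where that vertex currently sits in 'heap'.
void decreaseKey(Node heap[], int position[], int vertexId, int newWeight)
{
    int i = position[vertexId];          // O(1) lookup instead of scanning the heap
    heap[i].weight = newWeight;
    while (i > 1 && heap[i / 2].weight > heap[i].weight) {   // sift up
        std::swap(heap[i / 2], heap[i]);
        std::swap(position[heap[i / 2].data], position[heap[i].data]);
        i /= 2;
    }
}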
Why this is not a duplicate :: The linked question is a broad operator-overloading post; it just mentions one point stating that the [] operator can only be overloaded as a member function, and it does not explain why. And this is a specific problem I am facing, where I do not want to sacrifice the generic property of my self-made Heap and want to use it for Dijkstra as well.
EDIT2 :: While writing this explanation I realize I had made a big mistake in my overloaded function. I have changed it! Please check it. Probably it makes more sense now. Apologies!!
The overloaded function now looks like ::
friend int& operator [](int *a, Node i) {
return a[i.data];
}
EDIT3 :: I implement my Graph class with an adjacency matrix, which is a boolean 2D array, because my current implementation is for unweighted graphs; accordingly, the shortest path becomes the least number of edges traversed! (Just in case that mattered!)
Thanks for reading all of this huge question, and for any help! :)

separate chaining in hashing

I am reading about hashing in Robert Sedgewick's book Algorithms in C++.
We might be using a header node to streamline the code for insertion into an ordered list, but we might not want to use M header nodes for individual lists in separate chaining. Indeed, we could even eliminate the M links to the lists by having the first nodes in the lists comprise the table.
class ST
{
    struct node
    {
        Item item;
        node* next;
        node(Item x, node* t)
        { item = x; next = t; }
    };
    typedef node *link;
private:
    link* heads;
    int N, M;
    Item searchR(link t, Key v)
    {
        if (t == 0) return nullItem;
        if (t->item.key() == v) return t->item;
        return searchR(t->next, v);
    }
public:
    ST(int maxN)
    {
        N = 0; M = maxN/5;
        heads = new link[M];
        for (int i = 0; i < M; i++) heads[i] = 0;
    }
    Item search(Key v)
    { return searchR(heads[hash(v, M)], v); }
    void insert(Item item)
    {
        int i = hash(item.key(), M);
        heads[i] = new node(item, heads[i]); N++;
    }
};
My two questions on the above text are about what the author means by the following:
"We could even eliminate the M links to the lists by having the first nodes in the lists comprise the table." How can we modify the above code for this?
"We might not want to use M header nodes for individual lists in separate chaining." What does this statement mean?
"We could even eliminate the M links to the lists by having the first nodes in the lists comprise the table."
Consider Node* x[n] vs Node x[n]: the former needs an extra pointer, a memory allocation on insertion for the head Node of every non-empty element, and an extra indirection for every hash table operation. The latter eliminates the n pointers but requires that any unused elements can be put in some discernible not-in-use state (tracking of which may or may not require extra memory), and if sizeof(Node) is greater than sizeof(Node*), it may be more wasteful of memory anyway. The difference in memory use can also affect cache efficiency: if the table has a high element-to-bucket ratio, then a Node[] gets the Node data into fewer contiguous memory pages, and if you're iterating (in unsorted order) it's very cache efficient, whereas Node*[] will jump to separate memory allocations that might be all over the place (or, on the other hand, might actually be quite close together in some actually useful way, e.g. if both access patterns and dynamic memory allocation addresses correlate with chronological time of object creation).
How can we modify above code for this?
First, your existing code has a problem: heads[i] = new node(item, heads[i]); overwrites an entry in the hash table without first checking if it's empty... if there's anything there then you should be adding to the list, not overwriting the array.
The design change discussed needs:
link* heads;
...changed to...
node* head;
You'd initialise it like this:
head = new node[M];
Which needs an extra node constructor (if item has an equivalent default constructor, you can leave out its initialisation below)
node() : item(nullItem), next(nullptr) { }
Then there are some knock-on changes to the rest of your code that are easy to work through. Basically, you're getting rid of a layer of pointers.
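For illustration, a minimal sketch of how the class might look after that change; this assumes the book's Item, Key, hash() and nullItem are available, and that comparing against nullItem's key is a usable "empty bucket" test:

class ST
{
    struct node
    {
        Item item;
        node* next;
        node() : item(nullItem), next(0) { }          // "empty bucket" sentinel
        node(Item x, node* t) : item(x), next(t) { }
    };
    node* head;        // M embedded first nodes instead of M pointers
    int N, M;
    Item searchR(node* t, Key v)
    {
        if (t == 0) return nullItem;
        if (t->item.key() == v) return t->item;
        return searchR(t->next, v);
    }
public:
    ST(int maxN)
    {
        N = 0; M = maxN/5;
        head = new node[M];                           // all buckets start empty
    }
    Item search(Key v)
    { return searchR(&head[hash(v, M)], v); }
    void insert(Item item)
    {
        int i = hash(item.key(), M);
        if (head[i].item.key() == nullItem.key())
            head[i].item = item;                      // first item lives in the table itself
        else
            head[i].next = new node(item, head[i].next);  // later items are chained behind it
        N++;
    }
};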
"we might not want to use M header nodes for individual lists in separate chaining." What does this statement mean.
I didn't write it so can't say authoritatively, but it appears to be saying that when designing the list code, a decision might have been made to have an initial Node even in an empty list, as this simplifies the code for several list operations. While the extra data-less Node might seem a reasonable price when contemplating "usual" uses of a list, hash tables are unusual in that you want most of the lists chained off the buckets to have 0 or 1 element, and exponentially fewer should be longer and longer. So, such a list implementation is poorly suited to use in a hash table.

Stackoverflow exception when traversing BST

I have implemented a link-based BST (binary search tree) in C++ for one of my assignments. I have written my whole class and everything works well, but my assignment asks me to plot the run-times for:
a. A sorted list of 50000, 75000, and 100000 items
b. A random list of 50000, 75000, and 100000 items
That's fine, I can insert the numbers, but it also asks me to call the FindHeight() and CountLeaves() methods on the tree. My problem is that I've implemented the two functions using recursion. Since I have such a big list of numbers, I'm getting a stack overflow exception.
Here's my class definition:
template <class TItem>
class BinarySearchTree
{
public:
struct BinarySearchTreeNode
{
public:
TItem Data;
BinarySearchTreeNode* LeftChild;
BinarySearchTreeNode* RightChild;
};
BinarySearchTreeNode* RootNode;
BinarySearchTree();
~BinarySearchTree();
void InsertItem(TItem);
void PrintTree();
void PrintTree(BinarySearchTreeNode*);
void DeleteTree();
void DeleteTree(BinarySearchTreeNode*&);
int CountLeaves();
int CountLeaves(BinarySearchTreeNode*);
int FindHeight();
int FindHeight(BinarySearchTreeNode*);
int SingleParents();
int SingleParents(BinarySearchTreeNode*);
TItem FindMin();
TItem FindMin(BinarySearchTreeNode*);
TItem FindMax();
TItem FindMax(BinarySearchTreeNode*);
};
FindHeight() Implementation
template <class TItem>
int BinarySearchTree<TItem>::FindHeight()
{
return FindHeight(RootNode);
}
template <class TItem>
int BinarySearchTree<TItem>::FindHeight(BinarySearchTreeNode* Node)
{
if(Node == NULL)
return 0;
return 1 + max(FindHeight(Node->LeftChild), FindHeight(Node->RightChild));
}
CountLeaves() implementation
template <class TItem>
int BinarySearchTree<TItem>::CountLeaves()
{
return CountLeaves(RootNode);
}
template <class TItem>
int BinarySearchTree<TItem>::CountLeaves(BinarySearchTreeNode* Node)
{
if(Node == NULL)
return 0;
else if(Node->LeftChild == NULL && Node->RightChild == NULL)
return 1;
else
return CountLeaves(Node->LeftChild) + CountLeaves(Node->RightChild);
}
I tried to think of how I can implement the two methods without recursion but I'm completely stumped. Anyone have any ideas?
Recursion on a tree with 100,000 nodes should not be a problem if it is balanced. The depth would only be maybe 17, which would not use very much stack in the implementations shown. (log2(100,000) = 16.61). So it seems that maybe the code that is building the tree is not balancing it correctly.
I found this page very enlightening because it talks about the mechanics of converting a function that uses recursion to one that uses iteration.
It has examples showing code as well.
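For illustration (this code is not from the linked page), here is one way the recursive FindHeight from the question could be rewritten with an explicit stack; CountLeaves can be converted the same way:

#include <algorithm>
#include <stack>
#include <utility>

template <class TItem>
int BinarySearchTree<TItem>::FindHeight(BinarySearchTreeNode* Node)
{
    int height = 0;
    // Each work item is a node together with its depth (the root has depth 1,
    // matching the 1 + max(...) convention of the recursive version).
    std::stack<std::pair<BinarySearchTreeNode*, int>> work;
    if (Node != NULL) work.push(std::make_pair(Node, 1));
    while (!work.empty()) {
        BinarySearchTreeNode* n = work.top().first;
        int depth = work.top().second;
        work.pop();
        height = std::max(height, depth);
        if (n->LeftChild  != NULL) work.push(std::make_pair(n->LeftChild,  depth + 1));
        if (n->RightChild != NULL) work.push(std::make_pair(n->RightChild, depth + 1));
    }
    return height;
}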
Maybe you need to calculate this while doing the insert. Store the heights of nodes, i.e. add an integer field like height in the Node object. Also keep counters height and leaves for the tree. When you insert a node, if its parent is (was) a leaf, the leaf count doesn't change; if not, increase the leaf count by 1. Also, the height of the new node is the parent's height + 1, so if that is greater than the current height of the tree, update it. It's homework, so I won't help with the actual code.
Balance your tree occasionally. If your tree is getting stackoverflow on FindHeight(), that means your tree is way unbalanced. If the tree is balanced it should only have a depth of about 20 nodes for 100000 elements.
The easiest (but fairly slow) way of re-balancing an unbalanced binary tree is to allocate an array of TItem big enough to hold all of the data in the tree, insert all of your data into it in sorted order, and delete all of the nodes. Then rebuild the tree from the array recursively. The root is the node in the middle, root->left is the middle of the left half, root->right is the middle of the right half. Repeat recursively. This is the easiest way to rebalance, but it is slowish and takes lots of memory temporarily. On the other hand, you only have to do this when you detect that the tree is very unbalanced (depth on insert is more than 100).
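For illustration (not part of this answer's code), a sketch of that rebuild using the question's node type; the recursion here is only about log2(N) deep, so it does not risk the same overflow:

#include <vector>

// Build a balanced subtree from sorted[lo..hi]; the middle element becomes the root.
template <class TItem>
typename BinarySearchTree<TItem>::BinarySearchTreeNode*
buildBalanced(const std::vector<TItem>& sorted, int lo, int hi)
{
    if (lo > hi) return nullptr;
    int mid = lo + (hi - lo) / 2;
    auto* n = new typename BinarySearchTree<TItem>::BinarySearchTreeNode;
    n->Data = sorted[mid];
    n->LeftChild  = buildBalanced<TItem>(sorted, lo, mid - 1);   // left half goes below-left
    n->RightChild = buildBalanced<TItem>(sorted, mid + 1, hi);   // right half goes below-right
    return n;
}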
The other (better) option is to balance during inserts. The most intuitive way to do this is to keep track of how many nodes are beneath the current node. If the right child has more than twice as many "child" nodes as the left child, "rotate" left, and vice versa. There are instructions on how to do tree rotations all over the internet. This makes inserts slightly slower, but then you don't have the occasional massive stalls that the first option creates. On the other hand, you have to constantly update all of the "children" counts as you do the rotations, which isn't trivial.
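For illustration only, a sketch of a left rotation with that subtree-size bookkeeping; the CountedNode type and its subtreeCount field are made up here, since the question's node has no such field:

// Stand-alone sketch of a node that knows the size of its own subtree.
struct CountedNode {
    int value = 0;
    int subtreeCount = 1;                  // nodes in the subtree rooted here
    CountedNode* left = nullptr;
    CountedNode* right = nullptr;
};

int count(const CountedNode* n) { return n ? n->subtreeCount : 0; }

// Rotate the subtree rooted at *root to the left: root's right child comes up.
// The caller is responsible for only calling this when root->right is non-null.
void rotateLeft(CountedNode*& root)
{
    CountedNode* pivot = root->right;
    root->right = pivot->left;
    pivot->left = root;
    // Recompute sizes bottom-up: the demoted node first, then the new root.
    root->subtreeCount  = 1 + count(root->left)  + count(root->right);
    pivot->subtreeCount = 1 + count(pivot->left) + count(pivot->right);
    root = pivot;
}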
In order to count the leaves without recursion, use the concept of an iterator like the STL uses for the RB-tree underlying std::set and std::map ... Create begin() and end() functions for your tree that identify the ordered first and last nodes (in this case the left-most node and the right-most node). Then create a function called
BinarySearchTreeNode* increment(const BinarySearchTreeNode* current_node)
that, for a given current_node, will return a pointer to the next node in the tree. Keep in mind that for this implementation to work, you will need an extra parent pointer in your node type to aid in the iteration process.
Your algorithm for increment() would look something like the following (a code sketch follows the list):
1. Check to see if there is a right child on the current node.
2. If there is a right child, use a while loop to find the left-most node of that right subtree. This will be the "next" node. Otherwise go to step #3.
3. If there is no right child on the current node, then check to see if the current node is the left child of its parent node.
4. If step #3 is true, then the "next" node is the parent node, so you can stop at this point; otherwise go to the next step.
5. If step #3 was false, then the current node is the right child of its parent. Thus you will need to keep moving up to the next parent node using a while loop until you come across a node that is a left child of its parent node. The parent of this left-child node will then be the "next" node, and you can stop.
6. Finally, if step #5 returns you to the root, then the current node is the last node in the tree, and the iterator has reached the end of the tree.
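For illustration, a sketch of such an increment() (this is not from the answer); it assumes the node type has been given the extra parent pointer described above:

// Sketch only: in-order successor for a node type that carries a Parent link.
struct NodeWithParent {
    int Data;
    NodeWithParent* LeftChild;
    NodeWithParent* RightChild;
    NodeWithParent* Parent;                // the extra pointer this approach needs
};

NodeWithParent* increment(NodeWithParent* current)
{
    if (current->RightChild != nullptr) {
        // Steps #1/#2: left-most node of the right subtree is the successor.
        NodeWithParent* n = current->RightChild;
        while (n->LeftChild != nullptr) n = n->LeftChild;
        return n;
    }
    // Steps #3-#6: climb until we leave a left subtree; that parent is the successor.
    NodeWithParent* n = current;
    while (n->Parent != nullptr && n == n->Parent->RightChild)
        n = n->Parent;
    return n->Parent;                      // nullptr means we were at the last node
}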
Finally, you'll need a bool leaf(const BinarySearchTreeNode* current_node) function that will test whether a given node is a leaf node. Thus your counter function can simply iterate through the tree, find all the leaf nodes, and return a final count once it's done.
If you want to measure the maximum depth of an unbalanced tree without recursion, you will, in your tree's insert() function, need to keep track of the depth that a node was inserted at. This can simply be a variable in your node type that is set when the node is inserted in the tree. You can then iterate through the tree and find the maximum depth of a leaf node.
BTW, the complexity of this method is unfortunately going to be O(N) ... nowhere near as nice as O(log N).

Binary Search Tree Implementation in C++ STL?

Do you know, please, if C++ STL contains a Binary Search Tree (BST) implementation, or if I should construct my own BST object?
In case the STL contains no implementation of a BST, are there any libraries available?
My goal is to be able to find the desired record as quickly as possible: I have a list of records (it should not be more than a few thousand), and I do a per-frame (it's a computer game) search in that list. I use an unsigned int as the identifier of the record of interest. Whichever way is fastest will work best for me.
What you need is a way to look up some data given a key. With the key being an unsigned int, this gives you several possibilities. Of course, you could use a std::map:
typedef std::map<unsigned int, record_t> my_records;
However, there's other possibilities as well. For example, it's quite likely that a hash map would be even faster than a binary tree. Hash maps are called unordered_map in C++, and are a part of the C++11 standard, likely already supported by your compiler/std lib (check your compiler version and documentation). They were first available in C++TR1 (std::tr1::unordered_map)
If your keys are rather closely distributed, you might even use a simple array and use the key as an index. When it comes to raw speed, nothing would beat indexing into an array. OTOH, if your key distribution is too random, you'd be wasting a lot of space.
If you store your records as pointers, moving them around is cheap, and an alternative might be to keep your data sorted by key in a vector:
typedef std::vector< std::pair<unsigned int, record_t*> > my_records;
Due to its better data locality, which presumably plays nice with processor cache, a simple std::vector often performs better than other data structures which theoretically should have an advantage. Its weak spot is inserting into/removing from the middle. However, in this case, on a 32bit system, this would require moving entries of 2*32bit POD around, which your implementation will likely perform by calling CPU intrinsics for memory move.
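For illustration, a small sketch of that sorted-vector lookup; record_t and the sample data here are made up:

#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

struct record_t { const char* name; };    // made-up record type

typedef std::vector< std::pair<unsigned int, record_t*> > my_records;

// Binary search on the key; assumes 'table' is kept sorted by key.
record_t* find_record(const my_records& table, unsigned int key)
{
    auto it = std::lower_bound(table.begin(), table.end(), key,
        [](const std::pair<unsigned int, record_t*>& entry, unsigned int k)
        { return entry.first < k; });
    return (it != table.end() && it->first == key) ? it->second : nullptr;
}

int main()
{
    record_t orc = { "orc" }, elf = { "elf" };
    my_records table = { { 3, &elf }, { 7, &orc } };   // already sorted by key
    if (record_t* r = find_record(table, 7))
        std::printf("%s\n", r->name);                  // prints "orc"
}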
std::set and std::map are usually implemented as red-black trees, which are a variant of binary search trees. The specifics are implementation dependent, though.
A clean and simple BST implementation in CPP:
#include <iostream>
using std::cin;

struct node {
    int val;
    node* left;
    node* right;
};

node* createNewNode(int x)
{
    node* nn = new node;
    nn->val = x;
    nn->left = nullptr;
    nn->right = nullptr;
    return nn;
}

void bstInsert(node*& root, int x)
{
    if (root == nullptr) {
        root = createNewNode(x);
        return;
    }
    if (x < root->val)
    {
        if (root->left == nullptr) {
            root->left = createNewNode(x);
            return;
        } else {
            bstInsert(root->left, x);
        }
    }
    if (x > root->val)
    {
        if (root->right == nullptr) {
            root->right = createNewNode(x);
            return;
        } else {
            bstInsert(root->right, x);
        }
    }
}

int main()
{
    node* root = nullptr;
    int x;
    while (cin >> x) {
        bstInsert(root, x);
    }
    return 0;
}
STL's set class is typically implemented as a BST. It's not guaranteed (the only thing that is guaranteed is its signature, template <class Key, class Compare = less<Key>, class Allocator = allocator<Key>> class set;), but it's a pretty safe bet.
Your post says you want speed (presumably for a tighter game loop).
So why waste time on these slow-as-molasses O(lg n) structures? Go for a hash map implementation instead.