Tree traversal with fixed size stack in C/C++

Is it possible to traverse a tree structure (specifically an octree, the 3-D analogue of a binary tree) using a fixed-size stack? I do not want to use recursion, since my octree is quite deep.
I am traversing the tree to solve a range search problem: finding all the points close to a query point. During the traversal, I skip any subtree whose root node does not intersect my search region.

If your octree has parent pointers, I think you can traverse it without a stack at all (see this thread, for example). Without that, you will need a stack that is as deep as the depth of your tree, regardless of how many branches are skipped.
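
A minimal sketch of the parent-pointer idea, assuming each node stores a pointer to its parent and eight child slots. The PNode type and the traverse_with_parents name are made up for illustration and are not taken from the linked thread.

```cpp
#include <array>
#include <cassert>

// Hypothetical node layout: eight child slots plus a parent pointer.
struct PNode {
    PNode* parent = nullptr;
    std::array<PNode*, 8> child{};  // nullptr marks an empty slot
};

// Pre-order traversal that uses O(1) extra memory instead of a stack.
template <class Visit>
void traverse_with_parents(PNode* root, Visit visit) {
    if (!root) return;
    PNode* cur = root;
    unsigned next = 0;  // next child slot of cur to try
    visit(*cur);
    for (;;) {
        while (next < 8 && !cur->child[next]) ++next;  // skip empty slots
        if (next < 8) {
            cur = cur->child[next];  // descend into the child
            visit(*cur);
            next = 0;
        } else {
            if (cur == root) return;  // every subtree is finished
            PNode* p = cur->parent;
            assert(p && "every non-root node needs a parent pointer");
            unsigned i = 0;
            while (i < 8 && p->child[i] != cur) ++i;  // which slot did we come from?
            assert(i < 8 && "parent does not list this node as a child");
            cur = p;
            next = i + 1;  // resume with the following sibling
        }
    }
}
```

For the range search, the intersection test would presumably go into the loop that skips empty slots, so that pruned children are skipped the same way.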

Of course you can traverse a tree without a deep native call stack: use continuation-passing style, or (and this is roughly the same thing) build a small virtual machine whose call stack lives in heap data, or (yet another point of view) code a stack automaton whose stack is an explicit heap data structure (e.g. a std::stack).
Look at it another way: your naive C++ code could in principle run on a Turing machine, and those beasts don't have any stack.
As Ted Hopp's answer suggests, you might take inspiration from the Deutsch-Schorr-Waite garbage collection technique to get a "stack-less" traversal, but you need a few additional bits in each node to temporarily flip the reference direction and remember that you did. I believe that keeping your own stack in a std::stack or std::vector is probably simpler.

Yes, you can traverse the octree with a fixed-size stack.
The stack just needs to be as large as the maximum possible depth of the octree.
Bear in mind that with an octree, each level of the descent can be recorded in only 3 bits: for each of the three dimensions, you only need to record whether you went in the positive or the negative direction.
So even if your octree is 1000 levels deep, you can store the whole path in 375 bytes (3000 bits).
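
Here is a rough sketch of a traversal with a fixed-size explicit stack, one frame per tree level. The Node layout, the kMaxDepth bound and the keep/visit callbacks are assumptions made for the example. The keep callback stands in for the "does my search region intersect this node" test from the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical octree node: eight child slots and a payload.
struct Node {
    std::array<Node*, 8> child{};  // nullptr marks an empty slot
    int value = 0;
};

constexpr std::size_t kMaxDepth = 64;  // assumed upper bound on the tree depth

// Visits every node whose whole ancestor chain (and itself) passes keep().
// Returns false if the tree turns out to be deeper than kMaxDepth.
template <class Keep, class Visit>
bool traverse(Node* root, Keep keep, Visit visit) {
    assert(root != nullptr);  // precondition of this sketch
    struct Frame { Node* node; unsigned next; };  // next = child slot to try
    std::array<Frame, kMaxDepth> stack;
    std::size_t top = 0;

    if (!keep(*root)) return true;
    visit(*root);
    stack[top++] = {root, 0};

    while (top > 0) {
        Frame& f = stack[top - 1];
        if (f.next == 8) { --top; continue; }  // all children done, go up
        Node* c = f.node->child[f.next++];
        if (!c || !keep(*c)) continue;         // empty or pruned subtree
        if (top == kMaxDepth) return false;    // stack would overflow
        visit(*c);
        stack[top++] = {c, 0};
    }
    return true;
}
```

Each frame here holds a pointer plus a slot counter, so it is larger than 3 bits per level. Storing only the 3-bit child choices would need some other way to get back to the parent, such as parent pointers or re-walking the path from the root.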

Related

Implement a heap not using an array

I'm prepping for a Google developer interview and have gotten stuck on a question about heaps. I need to implement a heap as a dynamic binary tree (not array) where each node has a pointer to the parent and two children and there is a global pointer to the root node. The book asks "why won't this be enough?"
How can the standard tree implementation be extended to support heap operations add() and deleteMin()? How can these operations be implemented in this data structure?

Can you keep a count of the total number of nodes? If so, it's easy to know where to add a new element, because the tree is almost full (complete).
As for deleteMin, I think it will be less efficient, because you can't directly access the leaves the way you can in an array (where they are the last N/2 elements).
You would have to walk every path down to a leaf and then compare them, which would probably cost O(n).
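
To illustrate the node-count idea: if the nodes of a complete binary tree are numbered in level order starting at 1, the binary digits of a node's number after the leading 1 spell out the path from the root (0 = left, 1 = right). The sketch below uses that to find the insertion point. HNode, node_at and add are names invented for the example. The same lookup with k equal to the current count would locate the last node, which may make deleteMin cheaper than the full scan suggested above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical pointer-based heap node, as described in the question.
struct HNode {
    int key;
    HNode* parent = nullptr;
    HNode* left = nullptr;
    HNode* right = nullptr;
    explicit HNode(int k) : key(k) {}
};

// Returns the node at 1-based level-order position k (root = 1).
HNode* node_at(HNode* root, std::size_t k) {
    assert(k >= 1);
    std::size_t bit = 1;
    while (bit <= k / 2) bit <<= 1;  // highest set bit of k
    HNode* cur = root;
    for (bit >>= 1; bit != 0 && cur; bit >>= 1)
        cur = (k & bit) ? cur->right : cur->left;  // follow the remaining bits
    return cur;
}

// Min-heap insert: attach at position count + 1, then sift the key up.
void add(HNode*& root, std::size_t& count, HNode* fresh) {
    ++count;
    if (count == 1) { root = fresh; return; }
    HNode* p = node_at(root, count / 2);  // parent of the new position
    fresh->parent = p;
    (count % 2 == 0 ? p->left : p->right) = fresh;
    for (HNode* n = fresh; n->parent && n->key < n->parent->key; n = n->parent)
        std::swap(n->key, n->parent->key);  // swap keys, not links
}
```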

augmenting/index priority_queue in STL

I am using the STL priority_queue as a data structure in my graph application. You can safely think of it as an advanced version of Prim's spanning tree algorithm.
Within the algorithm I want to find a node in the priority queue (not just the minimum node) efficiently. [This is needed because the cost of a node might change and then has to be fixed in the priority_queue.]
All I have to do is augment the priority_queue and also index it by my node keys. I can't find any way to do this in the STL. Does anyone have a better idea of how to do it in the STL?

The std::priority_queue<T> doesn't support efficient look-up of nodes: it uses a d-ary heap, typically with d == 2. This representation doesn't keep nodes put. If you really want to use a std::priority_queue<T> with Prim's algorithm, the only way is to just add nodes with their current shortest distance and possibly add each node multiple times. This turns the size of the queue into O(E) instead of O(N), though, i.e., for graphs with many edges it will result in a much higher complexity.
You can use something like std::map<...> but that really suffers from pretty much the same problem: you can either locate the next node to extract efficiently or you can locate the nodes to update efficiently.
The "proper" approach is to use a node-based priority queue, e.g., a Fibonacci heap: since the nodes stay put, you can get a handle from the heap when inserting a node and efficiently update the distance of a node through the handle. Access to the closest node is efficient using the few top nodes in the heap's set of trees. The overall performance of the basic heap operations (push(), top(), and pop()) is slower for Fibonacci heaps than for d-ary heaps, but the efficient update of individual nodes makes their use worthwhile. I seem to recall that Prim's algorithm actually required Fibonacci heaps anyway to achieve the tight complexity bound.
I know that there is an implementation of Fibonacci heaps in Boost. An efficient implementation of Fibonacci heaps isn't entirely trivial, but they are efficient enough to be more than just of theoretical interest.
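
For reference, the "add each node multiple times" workaround with a plain std::priority_queue could look roughly like this. The adjacency-list layout and the prim_lazy name are assumptions for the example. Stale queue entries are simply skipped when they are popped.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // (weight, neighbor)

// Total weight of a minimum spanning tree of a connected undirected graph.
long long prim_lazy(const std::vector<std::vector<Edge>>& adj) {
    assert(!adj.empty());
    std::vector<bool> done(adj.size(), false);
    // Min-heap on weight; a vertex may be queued several times.
    std::priority_queue<Edge, std::vector<Edge>, std::greater<Edge>> pq;
    pq.push(Edge(0, 0));  // start from vertex 0 at cost 0
    long long total = 0;
    while (!pq.empty()) {
        const Edge top = pq.top();
        pq.pop();
        const int w = top.first;
        const std::size_t v = static_cast<std::size_t>(top.second);
        if (done[v]) continue;  // stale entry: v was reached more cheaply
        done[v] = true;
        total += w;
        for (const Edge& e : adj[v])
            if (!done[static_cast<std::size_t>(e.second)]) pq.push(e);
    }
    return total;
}
```

The queue can grow to O(E) entries, which is the cost mentioned above.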

Kd tree: data stored only in leaves vs stored in leaves and nodes

I am trying to implement a Kd tree to perform the nearest neighbor and approximate nearest neighbor search in C++. So far I came across 2 versions of the most basic Kd tree.
The one, where data is stored in nodes and in leaves, such as here
The one, where data is stored only in leaves, such as here
They seem to be fundamentally the same, having the same asymptotic properties.
My question is: are there reasons to choose one over the other?
I have come up with two reasons so far:
The tree which stores data in nodes too is shallower by 1 level.
The tree which stores data only in leaves has an easier-to-implement delete function.
Are there other reasons I should consider before deciding which one to build?

You can just mark nodes as deleted, and postpone any structural changes to the next tree rebuild. k-d-trees degrade over time, so you'll need to do frequent tree rebuilds. k-d-trees are great for low-dimensional data sets that do not change, or where you can easily afford to rebuild an (approximately) optimal tree.
As for implementing the tree, I recommend using a minimalistic structure. I usually do not use nodes. I use an array of data object references. The axis is defined by the current search depth, so there is no need to store it anywhere. The left and right children are given implicitly by treating the array as a binary search tree. (Otherwise, just add a byte array, half the size of your dataset, for storing the axes you used.) Loading the tree is done by a specialized QuickSort. In theory it's O(n^2) worst-case, but with a good heuristic such as median-of-5 you can get O(n log n) quite reliably and with minimal constant overhead.
While it doesn't hold as much for C/C++, in many other languages you will pay quite a price for managing a lot of objects. A type*[] is the cheapest data structure you'll find, and in particular it does not require a lot of management effort. To mark an element as deleted, you can null it, and search both sides when you encounter a null. For insertions, I'd first collect them in a buffer. And when the modification counter reaches a threshold, rebuild.
And that's the whole point of it: if your tree is really cheap to rebuild (as cheap as re-sorting an almost pre-sorted array!) then it does no harm to rebuild the tree frequently.
Linear scanning over a short "insertion list" is very CPU cache friendly. Skipping nulls is very cheap, too.
If you want a more dynamic structure, I recommend looking at R*-trees. They are actually designed to balance on inserts and deletions, and organize the data in a disk-oriented block structure. But even for R-trees, there have been reports that keeping an insertion buffer etc. to postpone structural changes improves performance. And bulk loading in many situations helps a lot, too!
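
A small sketch of the node-free array layout described in the answer above, under a few assumptions: points are 2-D, the middle element of each sub-range acts as the node, and std::nth_element stands in for the specialized QuickSort. The names build, nearest and find_nearest are invented here. Deletion markers and the insertion buffer are left out.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::array<double, 2>;  // 2-D points, an assumption for brevity

// Rearranges pts[lo, hi) so that the median along the current axis sits in
// the middle, then recurses into both halves. The axis follows the depth.
void build(std::vector<Point>& pts, std::size_t lo, std::size_t hi,
           std::size_t depth = 0) {
    if (hi - lo <= 1) return;
    const std::size_t axis = depth % 2;
    const std::size_t mid = lo + (hi - lo) / 2;
    auto it = [&](std::size_t i) {
        return pts.begin() + static_cast<std::ptrdiff_t>(i);
    };
    std::nth_element(it(lo), it(mid), it(hi),
                     [axis](const Point& a, const Point& b) {
                         return a[axis] < b[axis];
                     });
    build(pts, lo, mid, depth + 1);
    build(pts, mid + 1, hi, depth + 1);
}

// Nearest-neighbor search over the implicit tree built above.
void nearest(const std::vector<Point>& pts, std::size_t lo, std::size_t hi,
             std::size_t depth, const Point& q,
             std::size_t& best, double& best_d2) {
    if (lo >= hi) return;
    const std::size_t axis = depth % 2;
    const std::size_t mid = lo + (hi - lo) / 2;
    const double dx = q[0] - pts[mid][0];
    const double dy = q[1] - pts[mid][1];
    const double d2 = dx * dx + dy * dy;
    if (d2 < best_d2) { best_d2 = d2; best = mid; }
    const double diff = q[axis] - pts[mid][axis];
    if (diff < 0) {  // query lies on the left side: search it first
        nearest(pts, lo, mid, depth + 1, q, best, best_d2);
        if (diff * diff < best_d2)  // the other side can still hold a closer point
            nearest(pts, mid + 1, hi, depth + 1, q, best, best_d2);
    } else {
        nearest(pts, mid + 1, hi, depth + 1, q, best, best_d2);
        if (diff * diff < best_d2)
            nearest(pts, lo, mid, depth + 1, q, best, best_d2);
    }
}

// Returns the index (into the rearranged array) of the point closest to q.
std::size_t find_nearest(const std::vector<Point>& pts, const Point& q) {
    assert(!pts.empty());
    std::size_t best = 0;
    double best_d2 = std::numeric_limits<double>::infinity();
    nearest(pts, 0, pts.size(), 0, q, best, best_d2);
    return best;
}
```

Rebuilding is just another call to build on the same array, which is what makes frequent rebuilds affordable in this layout.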

STL Implementation of reheapify

In a graph algorithm, I need to find the node with the smallest value.
In each step of the algorithm, the value of this node or of its neighbors can be decreased, and a few of its neighbors can be removed depending on their value.
Also, I don't want to search the whole graph for this node each time (although it is not that big: fewer than 1000 nodes).
So I looked at the STL and found the heap functions, which almost do what I want. I can insert and delete nodes very fast, but is there a way to update the heap quickly when I have changed the value of only one node, without re-sorting the whole heap? I feel that would be a huge bottleneck in the program.

First the conceptual part:
If you use the heap insertion method with the element that decreased its value as the starting point for insertion, instead of starting at the back of the collection, everything just works.
I haven't done that in C++ yet, but std::push_heap looks fine for that purpose.
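
The conceptual part can be sketched as a hand-written sift-up that starts at the changed element. This version assumes a min-heap of ints stored in a std::vector in the layout that std::make_heap with std::greater produces, and that the caller already knows the index of the changed element. decrease_key is a name chosen for the example. It writes the loop out by hand rather than relying on how std::push_heap behaves on a prefix of the range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Lowers h[i] to new_value and restores the min-heap property by moving the
// element up towards the root. Runs in O(log n).
void decrease_key(std::vector<int>& h, std::size_t i, int new_value) {
    assert(i < h.size() && new_value <= h[i]);
    h[i] = new_value;
    while (i > 0) {
        const std::size_t parent = (i - 1) / 2;
        if (h[parent] <= h[i]) break;  // heap property holds again
        std::swap(h[parent], h[i]);
        i = parent;
    }
}
```

Finding the index of the node to update is a separate problem. One option would be to keep a position table that is updated on every swap.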

Indexing: Implementing Tree data structures with Arrays/Vectors

I have been implementing a heap in C++ using a vector. Since I have to access the children of a node (2n, 2n+1) easily, I had to start at index 1. Is it the right way? As per my implementation, there is always a dummy element at zeroth location.

Your way works. Alternatively, you can put the root at index 0 and have the children at 2n+1 and 2n+2.
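
Side by side, the two indexing schemes look like this. The helper names are made up for the comparison.

```cpp
#include <cstddef>

// Root at index 1 (slot 0 is the unused dummy), as in the question.
constexpr std::size_t left1(std::size_t n)   { return 2 * n; }
constexpr std::size_t right1(std::size_t n)  { return 2 * n + 1; }
constexpr std::size_t parent1(std::size_t n) { return n / 2; }

// Root at index 0, as in the alternative.
constexpr std::size_t left0(std::size_t n)   { return 2 * n + 1; }
constexpr std::size_t right0(std::size_t n)  { return 2 * n + 2; }
constexpr std::size_t parent0(std::size_t n) { return (n - 1) / 2; }  // needs n > 0

// Both schemes describe the same tree, shifted by one slot.
static_assert(left0(4) + 1 == left1(5), "left children line up");
static_assert(right0(4) + 1 == right1(5), "right children line up");
static_assert(parent0(9) + 1 == parent1(10), "parents line up");
```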

While this works well for heaps, you end up using a huge amount of redundant memory for other tree data structures that do not necessarily have a full and complete Binary tree. For example, this means that if you have a Binary search tree of 20 nodes with a depth of 5, you end up having to use an array of 2^5=32 instead of 20. Now imagine if you need a tree of 25 nodes with a depth of 22. You end up using a huge array of 4194304, whereas you could have used a linked representation to store just the 25 nodes.
You can still use an array and not incur such a memory hit. Just allocate a large block of memory as an array and use array indices as pointers to the children.
Thus, where you had
node.left = (node.index*2)
node.right = (node.index*2+1)
You simply use
node.left = <index of left child>
node.right = <index of right child>
Or you can just use pointers/references instead of integer indices to an array if your language supports it.
Edit:
It might not be obvious to everyone that a complete binary search tree takes up O(2^d) memory. There are d levels, and every level has twice as many nodes as the level above it (because every node except those at the bottom has exactly two children, never one). A binary heap is a binary tree (but not a binary search tree) that is always complete by definition, so the array-based implementation outlined by the OP does not incur any real memory overhead. For a heap, that is the best way to implement it in code. OTOH, most other binary trees (esp. binary search trees) are not guaranteed to be complete. So trying to use this approach on them would need O(2^depth) memory, where depth can be as large as n, whereas a linked implementation needs only O(n) memory.
So my answer is: yes, this is the best way for a heap. Just don't try it for other binary trees (unless you're sure they will always be complete).
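
For the non-heap case, the "array indices as pointers" variant could be sketched like this: an unbalanced binary search tree whose nodes live in one std::vector and refer to each other by index. Bst, BstNode and kNone are illustrative names. Memory stays proportional to the number of nodes, however deep the tree gets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kNone = -1;  // index value meaning "no child"

struct BstNode {
    int key;
    int left;   // index of the left child in the pool, or kNone
    int right;  // index of the right child in the pool, or kNone
};

class Bst {
public:
    void insert(int key) {
        const int fresh = static_cast<int>(pool_.size());
        pool_.push_back(BstNode{key, kNone, kNone});
        if (fresh == 0) return;  // first node becomes the root at index 0
        int cur = 0;
        for (;;) {
            BstNode& n = pool_[static_cast<std::size_t>(cur)];
            int& next = key < n.key ? n.left : n.right;
            if (next == kNone) { next = fresh; return; }
            cur = next;
        }
    }

    bool contains(int key) const {
        int cur = pool_.empty() ? kNone : 0;
        while (cur != kNone) {
            // Child indices always point into the pool.
            assert(static_cast<std::size_t>(cur) < pool_.size());
            const BstNode& n = pool_[static_cast<std::size_t>(cur)];
            if (key == n.key) return true;
            cur = key < n.key ? n.left : n.right;
        }
        return false;
    }

    std::size_t size() const { return pool_.size(); }

private:
    std::vector<BstNode> pool_;  // one contiguous block for all nodes
};
```

Inserting 25 keys in ascending order gives a chain 25 levels deep, yet the pool holds exactly 25 nodes.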
