Maintaining Heap Property - C++

Brief Background: I am studying the steps for maintaining a heap property when insertion occurs. Here is an interesting problem:
Question: There are two general strategies that could be used to maintain the heap properties:
1. Make sure that the tree is complete and then fix the ordering, or
2. Make sure the ordering is correct first and then check for completeness.
Which is better (1 or 2)?
Reference: http://www.cs.sfu.ca/CourseCentral/225/johnwill/cmpt225_09heaps.pdf (Heap Insertion - slide 16) written by Dr. John Edgar.
It would be great if you could clarify why one of the methods above is better.

With a binary heap implemented as an array, there are in general two ways to implement insertion: top-down or bottom-up. Slide 17 of the linked PDF describes the bottom-up way of doing things. It adds the new item at the end of the heap (bottom-most, left-most position), and bubbles it up. That is an implementation of strategy 1 shown on Slide 16.
From a performance standpoint, this is the better method simply because, on average, it requires fewer iterations to fix the ordering. See "Argument for O(1) average-case complexity of heap insertion" for a detailed explanation of why that is.
The top-down approach, which corresponds to strategy 2 on Slide 16, requires that every insertion make O(log n) comparisons. This strategy starts at the root and sifts the item down through the heap. If the new item is smaller (in a min-heap) than the node it's being compared against, it replaces the item at that node, and the just-replaced item has to be pushed down. This continues until you reach the bottom of the heap. There is no "early out" possible because you have to end up putting a new item in the heap at the leaf level.
I never really thought of it as making sure of the ordering first, and then ensuring completeness, but that's essentially what the top-down method is doing.
The second strategy requires more iterations per insertion, and also does more work during each iteration to maintain the ordering.
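For illustration, here is a minimal sketch of strategy 1 (the bottom-up method) for a min-heap stored in a std::vector<int>; the function name heapInsert is made up for this example.

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Strategy 1: keep the tree complete first (append at the next free slot),
    // then fix the ordering by bubbling the new item up toward the root.
    void heapInsert(std::vector<int>& heap, int value) {
        heap.push_back(value);                // completeness is preserved
        std::size_t i = heap.size() - 1;
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (heap[i] >= heap[parent])      // ordering already holds
                break;                        // early out
            std::swap(heap[i], heap[parent]); // bubble the new item up one level
            i = parent;
        }
    }

On average the loop runs only a constant number of times (the O(1) average-case argument referenced above); the worst case is still O(log n) swaps.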


What is the most efficient data structure for designing a PRIM algorithm?

I am designing a graph in C++ using a hash table for its elements. The hash table uses open addressing, and the graph has no more than 50,000 edges. I also implemented Prim's algorithm to find the minimum spanning tree of the graph. My implementation creates storage for the following data:
A table named Q that holds all the nodes at the beginning. In every loop a node is visited, and at the end of the loop it is deleted from Q.
A table named Key, with one entry per node. A key is changed when necessary (at least once per loop).
A table named Parent, with one entry per node. In each loop, a new element is inserted into this table.
A table named A. The program stores the final edges of the minimum spanning tree here; this is the table that is returned.
What would be the most efficient data structure to use for creating these tables, assuming the graph has 50,000 edges?
Can I use arrays?
I fear that the elements of every array will be far too many. I don't even consider using linked lists, of course, because accessing each element would take too much time. Could I use hash tables?
But again, the elements are far too many. My algorithm works well for graphs consisting of a few nodes (10 or 20), but I am sceptical about the situation where the graphs consist of 40,000 nodes. Any suggestion is much appreciated.
(Since comments were getting a bit long:) The only part of the problem that seems to get ugly at very large sizes is that every node not yet selected has a cost, and you need to find the one with the lowest cost at each step, but executing each step reduces the cost of a few effectively random nodes.
A priority queue is perfect when you want to keep track of lowest cost. It is efficient for removing the lowest cost node (which you do at each step). It is efficient for adding a few newly reachable nodes, as you might on any step. But in the basic design, it does not handle reducing the cost of a few nodes that were already reachable at high cost.
So (having frequent need for a more functional priority queue), I typically create a heap of pointers to objects and in each object have an index of its heap position. The heap methods all do a callback into the object to inform it whenever its index changes. The heap also has some external calls into methods that might normally be internal only, such as the one that is perfect for efficiently fixing the heap when an existing element has its cost reduced.
I just reviewed the documentation for the standard one, std::priority_queue (http://en.cppreference.com/w/cpp/container/priority_queue), to see if the features I always want to add were there in some form I hadn't noticed before (or had been added in some recent C++ version). So far as I can tell, no. Most real-world uses of a priority queue (certainly all of mine) need minor extra features that I have no clue how to tack onto the standard version, so I have needed to rewrite it from scratch, including the extra features. But that isn't actually hard.
The method I use has been reinvented by many people (I was doing this in C in the 70's, and wasn't first). A quick google search found one of many places my approach is described in more detail than I have described it.
http://users.encs.concordia.ca/~chvatal/notes/pq.html#heap
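For illustration, here is a simplified sketch of that idea: the heap stores pointers to items and writes each item's current heap position back into the item, so a cost decrease (as in Prim's algorithm) can be fixed without searching the heap. It stores the index directly rather than going through a callback, and all names (Item, IndexedMinHeap, decreaseCost) are made up for this example.

    #include <cstddef>
    #include <vector>

    struct Item {
        double cost;
        std::size_t heapIndex; // maintained by the heap below
    };

    class IndexedMinHeap {
        std::vector<Item*> data_;

        // Put an item at position i and record that position inside the item.
        void place(std::size_t i, Item* it) { data_[i] = it; it->heapIndex = i; }

        void siftUp(std::size_t i) {
            Item* it = data_[i];
            while (i > 0) {
                std::size_t p = (i - 1) / 2;
                if (data_[p]->cost <= it->cost) break;
                place(i, data_[p]); // move the parent down
                i = p;
            }
            place(i, it);
        }

        void siftDown(std::size_t i) {
            Item* it = data_[i];
            for (;;) {
                std::size_t c = 2 * i + 1;
                if (c >= data_.size()) break;
                if (c + 1 < data_.size() && data_[c + 1]->cost < data_[c]->cost) ++c;
                if (it->cost <= data_[c]->cost) break;
                place(i, data_[c]); // move the smaller child up
                i = c;
            }
            place(i, it);
        }

    public:
        bool empty() const { return data_.empty(); }

        void push(Item* it) {
            data_.push_back(it);
            it->heapIndex = data_.size() - 1;
            siftUp(it->heapIndex);
        }

        Item* popMin() { // precondition: heap is not empty
            Item* min = data_.front();
            place(0, data_.back());
            data_.pop_back();
            if (!data_.empty()) siftDown(0);
            return min;
        }

        // The extra feature std::priority_queue lacks: after lowering an item's
        // cost, restore the ordering starting from the stored index.
        void decreaseCost(Item* it, double newCost) {
            it->cost = newCost;
            siftUp(it->heapIndex);
        }
    };

With 40,000-50,000 nodes this stays small in memory, and every operation Prim's algorithm needs (pop the cheapest node, push newly reachable nodes, decrease a cost) is O(log n).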

Is this how I combine two min-heaps together?

I am currently writing code to combine two heaps that satisfy the min-heap property with the shape invariant of a complete binary tree. However, I'm not sure if what I'm doing is the correct, accepted method of merging two heaps satisfying the requirements I laid out.
Here is what I think:
Given two priority queues represented as min-heaps, I insert the nodes of the second tree one by one into the first tree and fix the heap property, and I continue this until all of the nodes of the second tree are in the first tree.
From what I see, this feels like an O(n log n) algorithm, since I have to go through all the elements of the second tree, and each insert takes about O(log n) time because the height of a complete binary tree is at most log n. But I think there is a faster way; however, I'm not sure what other method is possible.
I was thinking that I could just insert the entire tree at once, but that breaks the shape invariant and the order invariant. Is my method the only way?
In fact, building a heap is possible in linear time, and the standard function std::make_heap guarantees linear time. The method is explained in the Wikipedia article on binary heaps.
This means that you can merge the heaps simply by calling std::make_heap on a range containing the elements from both heaps. This is asymptotically optimal if the heaps are of similar size. There might be a way to exploit the preexisting structure to reduce the constant factor, but I find it unlikely.
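For example, here is a minimal sketch of this approach for two min-heaps already stored in std::vector<int>; the function name mergeMinHeaps is made up for this example.

    #include <algorithm>
    #include <functional>
    #include <vector>

    // Concatenate the two underlying arrays, then rebuild the min-heap
    // property over the whole range in O(n) with std::make_heap.
    std::vector<int> mergeMinHeaps(std::vector<int> a, const std::vector<int>& b) {
        a.insert(a.end(), b.begin(), b.end());
        std::make_heap(a.begin(), a.end(), std::greater<int>());
        return a;
    }

The smallest element is then a.front(), and it can be removed with std::pop_heap(a.begin(), a.end(), std::greater<int>()) followed by a.pop_back().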

How to improve Boost Fibonacci Heap performance

I am implementing the Fast Marching algorithm, which is a kind of continuous Dijkstra. As I have read in many papers, the Fibonacci heap is the most suitable heap for this purpose.
However, when profiling my code with callgrind, I see that the following function is taking 58% of the execution time:
    int popMinIdx () {
        const int idx = heap_.top()->getIndex();
        heap_.pop();
        return idx;
    }
Concretely, the pop() is taking 57.67% of the whole execution time.
heap_ is defined as follows:
boost::heap::fibonacci_heap<const FMCell *, boost::heap::compare<compare_cells>> heap_;
Is it normal that it takes "that much" time or is there something I can do to improve performance?
Sorry if not enough information is given. I tried to be as brief as possible. I will add more info if needed.
Thank you!
The other answers aren't mentioning the big part: of course pop() takes the majority of your time: it's the only function that performs any real work!
As you may have read, the bounds on the operations of a Fibonacci Heap are amortized bounds. This means that if you perform enough operations in a good sequence, the bounds will average out to that. However, the actual costs are completely hidden.
Every time you insert an element, nothing happens: it is just thrown into the root list. Boom, O(1) time. Every time you merge two heaps, their root lists are simply linked together. Boom, O(1) time. But hold on, your structure is no longer a tidy Fibonacci heap! That's where pop() (or extract-min) comes in: every time this operation is called, the entire heap is restructured back into a proper shape. The root is removed, its children are cut into the root list, and then trees in the root list are merged until no two trees with the same degree (number of children) remain.
So all of the work of Insert(e) and Merge(t) is actually delayed until Pop() is called, which then does all the work. What about the other operations?
Delete(e) is beautiful. We perform Decrease-Key(e, -inf) to make the element e become the root. And now we perform Pop()! Again, the work is done by Pop().
Decrease-Key(e, v) does some work by itself: it cuts e from its parent and moves it to the root list, and this may trigger a cascading cut that moves some of e's ancestors to the root list as well. So Decrease-Key can put a whole lot of nodes into the root list. Can you guess which function has to fix that?
TL;DR: Pop() is the workhorse of the Fibonacci Heap. All other operations are done efficiently because they create work for the Pop() operation. Pop() gathers the work and performs it in one go (which can take up to O(n)). This is actually really efficient because the "grouped" work can be done faster than each operation separately.
So yes, it is natural that Pop() takes up the majority of your time!
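For what it's worth, here is a tiny, hypothetical usage sketch with boost::heap::fibonacci_heap over plain ints (not the questioner's FMCell setup) that reflects this division of labour: the pushes only link elements into the root list, and the consolidation is deferred until pop().

    #include <boost/heap/fibonacci_heap.hpp>

    // Order ints so that the smallest element is on top (a min-ordered heap),
    // mirroring the question's use of boost::heap::compare with a custom functor.
    struct MinOrder {
        bool operator()(int a, int b) const { return a > b; }
    };

    int main() {
        boost::heap::fibonacci_heap<int, boost::heap::compare<MinOrder>> heap;
        for (int i = 1000; i > 0; --i)
            heap.push(i);              // O(1): just thrown into the root list
        while (!heap.empty()) {
            int smallest = heap.top(); // O(1)
            heap.pop();                // amortized O(log n): the deferred restructuring happens here
            (void)smallest;            // a real program would use the value
        }
        return 0;
    }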
The Fibonacci heap's pop() has an amortized runtime of O(log n) and a worst case of O(n). If your heap is large, it could easily be consuming the majority of the CPU time in your algorithm, especially since most of the other operations you're likely using have O(1) runtimes (insert, top, etc.).
One thing I'd recommend is to run callgrind on a build at your preferred optimization level (such as -O3) with debug info (-g), because templated data structures/containers such as fibonacci_heap rely heavily on inlined functions. It could be that most of the CPU cycles you're measuring don't even exist in your optimized executable.

Algorithms on merging two heaps

As I know, there exists the binomial heap, a so-called mergeable heap, which supports merging two heaps. My question is: instead of merging these heaps into one heap dynamically, if I copy the two heaps into one big array and then run a heap-building procedure, would that be a good approach or not?
I ask because I don't know how to create one heap from two heaps using just the basic heap operations. Please tell me if it is not a good way, or, if you can, give me a link to where a binomial heap with a merge operation is implemented.
If you think about it, creating one heap by throwing away all the info embedded in the ordering of the other heaps can't possibly be optimal. Worst case, you should add all the items in heap 2 to heap 1, and that will be just half the work of creating a brand new heap from scratch.
But in fact, you can do way better than that. Merging two well-formed heaps involves little more than finding the insertion point for one of the roots in the other heap's tree, and inserting it at that point. No further work is necessary, and you've done no more than ln N work! See here for the detailed algorithm.
It will solve the problem, and it will give you a correct heap - but it will not be efficient.
Creating a [binary] heap of n elements from scratch is O(n), while merging 2 existing binomial heaps is O(log n).
The process of merging two binomial heaps is quite similar to the merge operation in merge sort. If not knowing the merge procedure is the problem, the following steps might help (a code sketch follows the list).
Repeat steps 1 through 4 until one of the heaps is empty:
1. If the heads (which are binomial trees) of the two heaps have the same degree, make the head with the greater key a child of the head with the smaller key. Consequently, the degree of the head of the latter heap increases by 1; make the head of the former heap the next element after its current head, and go to step 2. If the heads have different degrees, go to step 4.
2. If the head and the next binomial tree in the latter heap from step 1 have the same degree, go to step 3; otherwise go to step 1.
3. Combine the head and its next element in the heap, in the same manner as in step 1, make the newly combined binomial tree the head, and go to step 2.
4. See which of the two heaps has the head with the lower degree. Make that head the head of the other heap and delete it from the heap where it was originally present.
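For illustration, here is a minimal, CLRS-style sketch of this union operation for a pointer-based binomial min-heap; the node layout and the names (Node, link, mergeRootLists, heapUnion) are made up for this example.

    // Each root list is kept sorted by increasing degree.
    struct Node {
        int key;
        int degree = 0;
        Node* child = nullptr;    // leftmost child
        Node* sibling = nullptr;  // next tree in the root list (or next sibling)
    };

    // Make the tree rooted at `larger` the first child of `smaller`
    // (both trees must have the same degree).
    static void link(Node* smaller, Node* larger) {
        larger->sibling = smaller->child;
        smaller->child = larger;
        ++smaller->degree;
    }

    // Splice the two root lists into one list ordered by degree,
    // exactly like the merge step of merge sort.
    static Node* mergeRootLists(Node* a, Node* b) {
        Node dummy{};
        Node* tail = &dummy;
        while (a && b) {
            if (a->degree <= b->degree) { tail->sibling = a; tail = a; a = a->sibling; }
            else                        { tail->sibling = b; tail = b; b = b->sibling; }
        }
        tail->sibling = a ? a : b;
        return dummy.sibling;
    }

    // Union of two binomial min-heaps: merge the root lists, then walk the
    // merged list linking pairs of equal-degree trees so that at most one
    // tree of each degree remains.
    Node* heapUnion(Node* h1, Node* h2) {
        Node* head = mergeRootLists(h1, h2);
        if (!head) return nullptr;
        Node* prev = nullptr;
        Node* cur = head;
        Node* next = cur->sibling;
        while (next) {
            if (cur->degree != next->degree ||
                (next->sibling && next->sibling->degree == cur->degree)) {
                prev = cur;                     // degrees differ (or three in a row): keep walking
                cur = next;
            } else if (cur->key <= next->key) {
                cur->sibling = next->sibling;   // next becomes a child of cur
                link(cur, next);
            } else {
                if (prev) prev->sibling = next; // cur becomes a child of next
                else      head = next;
                link(next, cur);
                cur = next;
            }
            next = cur->sibling;
        }
        return head;
    }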
Brodal queues and Brodal-Okasaki queues (bootstrapped skew binomial heaps) give the best worst-case asymptotic bounds for mergeable heaps, supporting O(1) insert, merge, and findMin, and O(log n) deleteMin. Brodal queues are ephemeral, and support efficient delete and decreaseKey. Brodal-Okasaki queues are confluently persistent (in fact purely functional), but don't support delete or decreaseKey. Unfortunately, Brodal and Okasaki say both these implementations are inefficient in practice, and Brodal considers his queues too complicated to be practical in any case.
Fibonacci heaps give similar amortized (but not worst-case) bounds, and are likely more efficient and practical in an amortized context. Pairing heaps are another good option: according to Wikipedia, their exact bounds are unknown, but they perform very well in practice.

How to keep a large priority queue with the most relevant items?

In an optimization problem, I keep a lot of candidate solutions in a queue and examine them according to their priority.
Each time I handle one candidate, it is removed from the queue, but it produces several new candidates, making the number of candidates grow exponentially. To handle this, I assign a relevancy to each candidate; whenever a candidate is added to the queue and there is no more space available, I replace (if appropriate) the least relevant candidate currently in the queue with the new one.
In order to do this efficiently I keep a large (fixed size) array with the candidates and two linked indirect binary heaps: one handles the candidates in decreasing priority order, and the other in ascending relevancy.
This is efficient enough for my purposes, and the supplementary space needed is about 4 ints per candidate, which is also reasonable. However, it is complicated to code, and it doesn't seem optimal.
My question is whether you know of a more suitable data structure, or a more natural way, to perform this task without losing efficiency.
Here's an efficient solution that doesn't change the time or space complexity over a normal heap:
In a min-heap, every node is less than both its children. In a max-heap, every node is greater than its children. Let's alternate between a min and a max property on each level: every node on an odd row is less than its children and its grandchildren, and the inverse holds for even rows. Then finding the smallest node is the same as usual, while finding the largest node requires looking at the children of the root and taking the larger one. Bubbling nodes up (for insertion) becomes a bit trickier, but it's still the same O(log N) complexity.
Keeping track of capacity and popping the smallest (least relevant) node is the easy part.
EDIT: This appears to be a standard min-max heap! See here for a description. There's a C implementation: header, source and example.
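For illustration, here is a minimal sketch of min-max-heap insertion over a plain std::vector<int>; the class and method names (MinMaxHeap, bubbleUp, findMin, findMax) are made up for this example and are not taken from the linked C implementation.

    #include <algorithm>
    #include <cstddef>
    #include <utility>
    #include <vector>

    // Even levels (the root is level 0) are "min" levels, odd levels are
    // "max" levels. findMin() reads the root; findMax() looks at its children.
    class MinMaxHeap {
        std::vector<int> a_;

        static bool isMinLevel(std::size_t i) {
            std::size_t level = 0;
            for (std::size_t n = i + 1; n > 1; n >>= 1) ++level; // floor(log2(i+1))
            return level % 2 == 0;
        }

        // Once a node is on the right kind of level, it only needs to be
        // compared with its grandparents (which are on the same kind of level).
        void bubbleUp(std::size_t i, bool minLevel) {
            while (i >= 3) {
                std::size_t gp = (i - 3) / 4;
                bool swapNeeded = minLevel ? (a_[i] < a_[gp]) : (a_[i] > a_[gp]);
                if (!swapNeeded) break;
                std::swap(a_[i], a_[gp]);
                i = gp;
            }
        }

    public:
        void push(int v) {
            a_.push_back(v);
            std::size_t i = a_.size() - 1;
            if (i == 0) return;
            std::size_t parent = (i - 1) / 2;
            if (isMinLevel(i)) {
                if (a_[i] > a_[parent]) {            // actually belongs on a max level
                    std::swap(a_[i], a_[parent]);
                    bubbleUp(parent, /*minLevel=*/false);
                } else {
                    bubbleUp(i, /*minLevel=*/true);
                }
            } else {
                if (a_[i] < a_[parent]) {            // actually belongs on a min level
                    std::swap(a_[i], a_[parent]);
                    bubbleUp(parent, /*minLevel=*/true);
                } else {
                    bubbleUp(i, /*minLevel=*/false);
                }
            }
        }

        int findMin() const { return a_.front(); }   // assumes non-empty

        int findMax() const {                        // assumes non-empty
            if (a_.size() <= 2) return a_.back();
            return std::max(a_[1], a_[2]);
        }
    };

Deleting the min or the max also stays O(log N) with an analogous trickle-down, which is why the answer above suggests it as a replacement for the two linked heaps described in the question.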
"Optimal" is hard to judge (near impossible) without profiling.
Sometimes a 'dumb' algorithm can be the fastest, because Intel CPUs are incredibly fast at dumb array scans over contiguous blocks of memory, especially if the loop and the data can fit on-chip. By contrast, jumping around following pointers in a larger block of memory that doesn't fit on-chip can be tens or hundreds of times slower.
You may also have issues when you try to parallelize your code if the 'clever' data structure introduces locking, thus preventing multiple threads from progressing simultaneously.
I'd recommend profiling your current approach, the min-max approach, and a simple array scan (no linked lists = less memory) to see which performs best. Odd as it may seem, I have often seen 'clever' algorithms with linked lists beaten by simple array scans in practice, because the simpler approach uses less memory, has a tighter loop, and benefits more from CPU optimizations. You also potentially avoid memory allocations and garbage-collection issues with a fixed-size array holding the candidates.
One option you might want to consider, whatever the solution, is to prune less frequently and remove more elements each time. For example, removing 100 elements on each prune operation means you only need to prune a hundredth as often. That may allow a more asymmetric approach to adding and removing elements.
But overall, just bear in mind that the computer-science approach to optimization isn't always the practical approach to the highest performance on today's and tomorrow's hardware.
If you use skip lists instead of heaps, you'll have O(1) time for dequeuing elements while still doing searches in O(log n).
On the other hand, a skip list is harder to implement and uses more space than a binary heap.