I'm working on an assignment for one of my courses, and one question asks to show that the decrease-key operation, for a pairing heap, takes O(1) time.
Obviously, if you have a pointer to the key you want to decrease, then the operation will take O(1) time (just delete link, change key value, then merge).
However, nowhere in the assignment does it say that we are given a pointer to the key. If we're not given a pointer, then there is no way decrease-key would take O(1) time (you'd have to look for the key in the heap first, and this doesn't take constant time). I looked at the literature, and every source says that decrease-key takes O(log n) time.
Am I missing something here?
The amortized cost of a decrease-key in a pairing heap is not O(1) even if you have a pointer to the element in question. It's been proven that there is an Ω(log log n) lower-bound on the amortized cost of a decrease-key operation in a pairing heap. This isn't easy to prove; see this paper for details.
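For reference, the "delete link, change key value, then merge" step from the question can be sketched as below. This is a minimal sketch, assuming a min-heap in the common child/sibling layout; `Node`, `meld` and `decrease_key` are illustrative names, not taken from any library. It shows why the pointer work itself is constant, while the amortized bound discussed above comes from the extra work these cuts leave for later delete-min operations.

```cpp
#include <cassert>
#include <utility>

// Illustrative pairing-heap node (min-heap), child/sibling representation.
struct Node {
    int key;
    Node* child = nullptr;    // leftmost child
    Node* sibling = nullptr;  // next sibling to the right
    Node* prev = nullptr;     // left sibling, or the parent if this is the leftmost child
    explicit Node(int k) : key(k) {}
};

// Merge two roots: the root with the larger key becomes the leftmost child of the other.
Node* meld(Node* a, Node* b) {
    if (!a) return b;
    if (!b) return a;
    if (b->key < a->key) std::swap(a, b);
    b->prev = a;
    b->sibling = a->child;
    if (a->child) a->child->prev = b;
    a->child = b;
    return a;
}

// decrease-key given a pointer to the node: change the key, cut the subtree, meld it back.
// Every step here is a constant number of pointer updates.
Node* decrease_key(Node* root, Node* x, int new_key) {
    assert(new_key <= x->key);
    x->key = new_key;
    if (x == root) return root;
    // Unlink x (together with its subtree) from its parent's child list.
    if (x->prev->child == x) x->prev->child = x->sibling;  // x was the leftmost child
    else x->prev->sibling = x->sibling;                    // x had a left sibling
    if (x->sibling) x->sibling->prev = x->prev;
    x->sibling = nullptr;
    x->prev = nullptr;
    return meld(root, x);
}
```

Without a pointer to the node, a search would have to come first, which is the linear-time step the question worries about.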
Related
I implemented an algorithm where I make use of a priority queue.
I was motivated by this question:
Transform a std::multimap into std::priority_queue
I am going to store up to 10 million elements with their specific priority value.
I then want to iterate until the queue is empty.
Every time an element is retrieved it is also deleted from the queue.
After this I recalculate the element's priority value, because it can change as a result of previous iterations.
If the value did increase, I insert the element into the queue again.
How often this happens depends on the progress (in the first 25% it does not happen, in the next 50% it does happen, and in the last 25% it happens multiple times).
After retrieving the next element and not reinserting it, I process it. For this I do not need the priority value of the element, only its technical ID.
This was the reason I intuitively had chosen a std::multimap to achieve this, using .begin() to get the first element, .insert() to insert it and .erase() to remove it.
Also, I did not choose std::priority_queue directly because answers to other questions on this topic suggest that std::priority_queue is mostly used for single values and not for mapped values.
After reading the link above I reimplemented it using a priority queue, analogous to the other question from the link.
My runtimes do not differ by much (about an hour for 10 million elements).
Now I am wondering why std::priority_queue is faster at all.
I would actually expect std::multimap to be faster because of the many reinsertions.
Maybe the problem is that there are too many reorganizations of the multimap?
To summarize: your runtime profile involves both removing and inserting elements from your abstract priority queue, with you trying to use both a std::priority_queue and a std::multimap as the actual implementation.
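The runtime profile being summarized can be sketched roughly as follows. This is only an illustration under assumptions: the smallest priority is served first (matching the use of .begin() on the multimap), and `recompute` and `process` are hypothetical callbacks standing in for the asker's own logic.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Entry = std::pair<double, int>;  // (priority value, technical ID)
using MinQueue = std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>>;

// Pops the smallest entry, recomputes its priority, and reinserts it if the
// value increased; otherwise the ID is handed to `process`.
void drain(MinQueue& queue,
           const std::function<double(int)>& recompute,
           const std::function<void(int)>& process) {
    assert(recompute && process);  // both callbacks must be set
    while (!queue.empty()) {
        const Entry entry = queue.top();
        queue.pop();
        const double current = recompute(entry.second);
        if (current > entry.first) queue.emplace(current, entry.second);
        else process(entry.second);
    }
}
```

Only the ID reaches `process`, which mirrors the point in the question that the priority value is not needed once an element is final.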
Both the insertion into a priority queue and into a multimap have roughly equivalent complexity: logarithmic.
However, there's a big difference between removing the next element from a multimap and removing it from a priority queue. With a priority queue, reading the top element is a constant-complexity operation, and removing it is logarithmic: the top is swapped to the end of the underlying vector, the heap is sifted back into shape, and the last element of the vector is dropped. All of that happens inside one contiguous array with no deallocation, so in practice it is cheap.
But with a multimap you're removing the element from one of the extreme ends of the multimap.
The typical underlying implementation of a multimap is a balanced red/black tree. Repeated element removals from one of the extreme ends of a multimap keep unbalancing that side of the tree, so the tree has to rebalance frequently (rotations and recoloring along the path towards the root), and every erase also frees a separately allocated node. This is going to be a comparatively expensive operation.
This is likely to be the reason why you're seeing a noticeable performance difference.
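To make the priority-queue side concrete, here is a small sketch of what push and pop amount to on the underlying vector, written with the standard heap algorithms. The helper names `heap_push` and `heap_pop` are made up for this example. Note that removal is a sift-down (logarithmic) followed by dropping the last element, all within one contiguous array.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Roughly what std::priority_queue<int>::push does on its vector.
void heap_push(std::vector<int>& heap, int value) {
    heap.push_back(value);                     // amortized O(1)
    std::push_heap(heap.begin(), heap.end());  // O(log n): sift the new element up
}

// Roughly what std::priority_queue<int>::pop does on its vector.
void heap_pop(std::vector<int>& heap) {
    assert(!heap.empty());
    std::pop_heap(heap.begin(), heap.end());   // O(log n): move the max to the back, sift down
    heap.pop_back();                           // O(1): drop the last element, no deallocation
}
```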
I think the main difference comes from two facts:
A priority queue has a weaker constraint on the order of elements. It doesn't have to keep the whole range of keys/priorities sorted. A multimap has to provide that. A priority queue only has to guarantee that the first / top element is the largest.
So, while the theoretical time complexities for the operations on both are the same, O(log(size)), I would argue that erasing from a multimap and rebalancing the RB-tree performs more operations; it simply has to move around more elements. (NOTE: an RB-tree is not mandatory, but it is very often chosen as the underlying structure for multimap.)
The underlying container of priority queue is contiguous in memory (it's a vector by default).
I suspect the rebalancing is also slower because an RB-tree relies on nodes (vs. the contiguous memory of a vector), which makes it prone to cache misses, although one has to remember that operations on a heap are not done in an iterative manner either; they hop through the vector. I guess to be really sure one would have to profile it.
The above points are true for both insertions and erasures. I would say the difference is in the constant factors lost in the big-O notation. This is intuitive thinking.
The abstract, high level explanation for map being slower is that it does more. It keeps the entire structure sorted at all times. This feature comes at a cost. You are not paying that cost if you use a data structure that does not keep all elements sorted.
Algorithmic explanation:
To meet the complexity requirements, a map must be implemented as a node based structure, while priority queue can be implemented as a dynamic array. The implementation of std::map is a balanced (typically red-black) tree, while std::priority_queue is a heap with std::vector as the default underlying container.
Heap insertion is usually quite fast. The average complexity of insertion into a heap is O(1), compared to O(log n) for balanced tree (worst case is the same, though). Creating a priority queue of n elements has worst case complexity of O(n) while creating a balanced tree is O(n log n). See more in depth comparison: Heap vs Binary Search Tree (BST)
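As a rough illustration of the construction costs mentioned above: the range constructor of std::priority_queue heapifies the whole input in one O(n) pass, while filling a std::multimap from unsorted input is O(n log n) and allocates one node per element. The `Entry` layout and the function names are assumptions for this sketch.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <utility>
#include <vector>

using Entry = std::pair<double, int>;  // (priority, id), hypothetical layout

// O(n): the range constructor copies the items and heapifies them in one pass.
std::priority_queue<Entry> build_queue(const std::vector<Entry>& items) {
    std::priority_queue<Entry> q(items.begin(), items.end());
    assert(q.size() == items.size());
    return q;
}

// O(n log n) for unsorted input, with one node allocation per element.
std::multimap<double, int> build_multimap(const std::vector<Entry>& items) {
    return std::multimap<double, int>(items.begin(), items.end());
}
```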
Additional, implementation detail:
Arrays usually use the CPU cache much more efficiently than node-based structures such as trees or lists. This is because adjacent elements of an array are adjacent in memory (high memory locality) and therefore may fit within a single cache line. Nodes of a linked structure, however, exist in arbitrary locations in memory (low memory locality), and usually only one or very few are within a single cache line. Modern CPUs are very fast at calculations, but memory speed is a bottleneck. This is why array-based algorithms and data structures tend to be significantly faster than node-based ones.
While I agree with both @eerorika and @luk32, it is worth mentioning that in the real world, when using the default STL allocator, memory management cost easily outweighs a few data structure maintenance operations such as updating pointers to perform a tree rotation. Depending on the implementation, the memory allocation itself could involve tree maintenance operations and could potentially trigger a system call, where it would become even more costly.
In a multimap, there is a memory allocation and deallocation associated with each insert() and erase() respectively, which often contributes more to the slowness than the extra steps in the algorithm.
A priority queue, however, by default uses a vector, which only triggers a memory allocation (a much more expensive one though, which involves moving all stored objects to the new memory location) once the capacity is exhausted. In your case pretty much all allocation happens in the first iteration for the priority queue, whereas the multimap keeps paying a memory management cost with each insert and erase.
The memory management downside of map could be mitigated by using a memory-pool-based custom allocator. This also gives you a cache hit rate comparable to the priority queue. It might even outperform the priority queue when your object is expensive to move or copy.
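One way to try the pool idea without writing an allocator by hand is the polymorphic memory resources from C++17. The sketch below assumes a standard library that ships <memory_resource>; `drain_with_pool` is a made-up name, and whether this actually beats the default allocator for a given workload has to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory_resource>

// Inserts n (priority, id) pairs and then removes them from the front,
// with all tree nodes taken from a pool instead of individual operator new calls.
std::size_t drain_with_pool(int n) {
    assert(n >= 0);
    std::pmr::unsynchronized_pool_resource pool;   // single-threaded pool
    std::pmr::multimap<int, int> queue(&pool);     // nodes are carved out of `pool`
    for (int i = 0; i < n; ++i) queue.emplace(i % 7, i);
    std::size_t processed = 0;
    while (!queue.empty()) {
        queue.erase(queue.begin());  // the freed node goes back to the pool for reuse
        ++processed;
    }
    return processed;
}
```

The pool is declared before the container so that it outlives it.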
I need a priority queue that will store a value for every key, not just the key. I think the viable options are std::multimap<K,V> since it iterates in key order, or std::priority_queue<std::pair<K,V>> since it sorts on K before V. Is there any reason I should prefer one over the other, other than personal preference? Are they really the same, or did I miss something?
A priority queue is heapified initially (not fully sorted), in O(N) time, and then extracting all the elements in decreasing order takes O(N log N) time. It is stored in a std::vector behind the scenes, so there's only a small coefficient after the big-O behavior. Part of that, though, is moving the elements around inside the vector. If sizeof (K) or sizeof (V) is large, it will be a bit slower.
std::map is a red-black tree (in universal practice), so it takes O(N log N) time to insert the elements, keeping them sorted after each insertion. They are stored as linked nodes, so each item incurs malloc and free overhead. Then it takes O(N) time to iterate over them and destroy the structure.
The priority queue overall should usually have better performance, but it's more constraining on your usage: the data items will move around during iteration, and you can only iterate once.
If you don't need to insert new items while iterating, you can use std::sort with a std::vector, of course. This should outperform the priority_queue by some constant factor.
As with most things in performance, the only way to judge for sure is to try it both ways (with real-world testcases) and measure.
By the way, to maximize performance, you can define a custom comparison function to ignore the V and compare only the K within the pair<K,V>.
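A comparison function that looks only at K could be sketched like this. The concrete types (int keys, std::string values) and the names `KeyLess`, `KeyedQueue` and `drain` are chosen just for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

using Item = std::pair<int, std::string>;  // (K, V), types chosen for illustration

// Orders by the key only, so V is never compared and needs no operator<.
struct KeyLess {
    bool operator()(const Item& a, const Item& b) const { return a.first < b.first; }
};

using KeyedQueue = std::priority_queue<Item, std::vector<Item>, KeyLess>;

// Pops everything and returns the values in decreasing key order.
std::vector<std::string> drain(KeyedQueue& q) {
    const std::size_t n = q.size();
    std::vector<std::string> out;
    out.reserve(n);
    while (!q.empty()) {
        out.push_back(q.top().second);
        q.pop();
    }
    assert(out.size() == n);
    return out;
}
```

With equal keys the relative order of the values is unspecified, which is the price of ignoring V in the comparison.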
So I am trying to understand the data types and Big O notation of some functions for a BST and Hashing.
So first off, how are BSTs and Hashing stored? Are BSTs usually arrays, or are they linked lists because they have to point to their left and right leaves?
What about Hashing? I've had the most trouble finding clear information regarding Hashing in terms of computation-based searching. I understand that Hashing is best implemented with an array of chains. Is this for faster searching or to decrease overhead on creating the allocated data type?
This following question might be just bad interpretation on my part, but what makes a traversal function different from a search function in BSTs, Hashing, and STL containers?
Is traversal O(N) for BSTs because you're actually visiting each node/data member, whereas search() can reduce its time by eliminating half of the search space at each step?
And somewhat related, why is it that in the STL, list.insert() and list.erase() have a Big O(1) whereas the vector and deque counterparts are O(N)?
Lastly, why would vector.push_back() be O(N)? I thought the function could be done in O(1) with something along the lines of the following, but I've come across text saying it is O(N):
vector<int> vic(2,3);
vector<int>::const_iterator IT = vic.end();
//wanna insert 4 to the end using push_back
IT++;
(*IT) = 4;
hopefully this works. I'm a bit tired but I would love any explanations why something similar to that wouldn't be efficient or plausible. Thanks
BSTs (ordered binary trees) are a series of nodes where a parent node points to its two children, which in turn point to their at most two children, and so on. They're traversed in O(n) time because traversal visits every node. Lookups take O(log n) time in a balanced tree. Inserts also take O(log n) time, because the insertion point has to be found first; the linking step itself is O(1), because it doesn't need to move a bunch of existing nodes: just allocate some memory and re-aim the pointers. :)
Hashes (unordered_map) use a hashing algorithm to assign elements to buckets. Usually buckets contain a linked list so that hash collisions just result in several elements in the same bucket. Traversal will again be O(n), as expected. Lookups will be O(1) on average, and inserts will be amortized O(1). Amortized means O(1) on average, even though an individual insert might result in a rehashing (redistribution of buckets to minimize collisions); over time the average complexity is still O(1). Note, however, that big-O notation doesn't really deal with the "constant" aspect, only the order of growth. The constant overhead in the hashing algorithms can be high enough that for some data sets the O(log n) binary trees outperform the hashes. Nevertheless, the hash's advantage is that its operations have constant average time complexity.
Search functions take advantage (in the case of binary trees) of the notion of "order"; a search through a BST has the same characteristics as a basic binary search over an ordered array. O(log n) growth. Hashes don't really "search". They compute the bucket, and then quickly run through the collisions to find the target. That's why lookups are constant time.
As for insert and erase: in array-based sequence containers, all elements that come after the target have to be bumped over by one position. Move semantics in C++11 can improve the performance, but the operation is still O(n). For linked sequence containers (list, forward_list), inserting or erasing at a position you already hold an iterator to just means fiddling with some pointers internally. It's a constant-time process.
push_back() will be O(1) until you exceed the existing allocated capacity of the vector. Once the capacity is exceeded, a new allocation takes place to produce a buffer that is large enough to accept more elements. All the elements then need to be moved into the larger memory region, which is an O(n) process. Move semantics can make each element cheaper to relocate, but it's still going to be O(n). Vectors and strings are implemented such that as they allocate space for a growing data set, they allocate more than they need, in anticipation of additional growth. This is an efficiency safeguard; it means that the typical push_back() won't trigger a new allocation and a move of the entire data set into a larger buffer. But eventually, after enough push_backs, the limit will be reached, and the vector's elements will be moved into a larger buffer, which again has some extra headroom left over for more efficient push_backs. Because the capacity grows geometrically, the cost averages out to amortized O(1) per push_back.
Traversal refers to visiting every node, whereas search is only to find a particular node, so your intuition is spot on there. O(N) complexity because you need to visit N nodes.
std::vector::insert is for inserting in the middle, and it involves shifting all subsequent elements over by one slot in order to make room for the element being inserted, hence O(N). A linked list doesn't have this issue, hence O(1). The same logic applies to erase. deque behaves like vector for insertions and erasures in the middle, but supports O(1) insertion and removal at both ends.
std::vector::push_back is an O(1) operation for the most part (amortized O(1) overall); it only deviates when the capacity is exceeded and a reallocation plus copy is needed.
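The "only deviates if capacity is exceeded" behaviour is easy to observe by counting how often the capacity changes. This is a small sketch with a made-up helper name; the exact count depends on the growth factor of the standard library in use. It also shows the supported way to append: writing through end(), as in the snippet from the question, is undefined behavior because that slot is not part of the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times push_back had to reallocate while appending n ints.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t before = v.capacity();
        v.push_back(static_cast<int>(i));        // grows the vector; never write through end()
        if (v.capacity() != before) ++reallocations;
    }
    assert(v.size() == n);
    return reallocations;
}
```

With geometric growth the number of reallocations is logarithmic in n, which is where the amortized O(1) bound comes from.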
I am trying to compare stl map and stl unordered_map for certain operations. I looked on the net and it only increases my doubts regarding which one is better as a whole. So I would like to compare the two on the basis of the operation they perform.
Which one performs faster in
Insert, Delete, Look-up
Which one takes less memory and less time to clear it from the memory. Any explanations are heartily welcomed !!!
Thanks in advance
Which one performs faster in Insert, Delete, Look-up? Which one takes less memory and less time to clear it from the memory. Any explanations are heartily welcomed !!!
For a specific use, you should try both with your actual data and usage patterns and see which is actually faster... there are enough factors that it's dangerous to assume either will always "win".
implementation and characteristics of unordered maps / hash tables
Academically - as the number of elements increases towards infinity, those operations on a std::unordered_map (which is the C++ library offering for what Computing Science terms a "hash map" or "hash table") will tend to continue to take the same amount of time, O(1) (ignoring memory limits/caching etc.), whereas with a std::map (a balanced binary tree) each time the number of elements doubles it will typically need to do an extra comparison operation, so it gets gradually slower, O(log n).
std::unordered_map implementations necessarily use open hashing: the fundamental expectation is that there'll be a contiguous array of "buckets", each logically a container of any values hashing thereto.
It generally serves to picture the hash table as a vector<list<pair<key,value>>> where getting from the vector elements to a value involves at least one pointer dereference as you follow the list-head-pointer stored in the bucket to the initial list node; the insert/find/delete operations' performance depends on the size of the list, which on average equals the unordered_map's load_factor.
If the max_load_factor is lowered (the default is 1.0), then there will be fewer collisions but more reallocation/rehashing during insertion and more wasted memory (which can hurt performance through increased cache misses).
The memory usage for this most-obvious of unordered_map implementations involves both the contiguous array of bucket_count() list-head-iterator/pointer-sized buckets and one doubly-linked list node per key/value pair. Typically, bucket_count() + 2 * size() extra pointers of overhead, adjusted for any rounding-up of dynamic memory allocation request sizes the implementation might do. For example, if you ask for 100 bytes you might get 128 or 256 or 512. An implementation's dynamic memory routines might use some memory for tracking the allocated/available regions too.
Still, the C++ Standard leaves room for real-world implementations to make some of their own performance/memory-usage decisions. They could, for example, keep the old contiguous array of buckets around for a while after allocating a new larger array, so rehashing values into the latter can be done gradually to reduce the worst-case performance at the cost of average-case performance as both arrays are consulted during operations.
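The relationship between size(), bucket_count() and the load factor described above can be checked directly. The two helpers below are illustrative only; exact bucket counts differ between standard library implementations, so the sketch relies only on the invariant that the load factor never exceeds max_load_factor() after an insertion.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Inserts n keys and reports whether load_factor() stayed within max_load_factor().
bool load_factor_stays_bounded(int n, float max_load) {
    assert(max_load > 0.0f);
    std::unordered_map<int, int> m;
    m.max_load_factor(max_load);
    for (int i = 0; i < n; ++i) {
        m.emplace(i, i);  // may rehash into a larger bucket array
        if (m.load_factor() > m.max_load_factor()) return false;
    }
    return true;
}

// Number of buckets the table ends up with for n keys at the given max load factor.
std::size_t buckets_for(int n, float max_load) {
    assert(max_load > 0.0f);
    std::unordered_map<int, int> m;
    m.max_load_factor(max_load);
    for (int i = 0; i < n; ++i) m.emplace(i, i);
    return m.bucket_count();
}
```

Lowering the maximum load factor trades memory for shorter bucket lists, as the larger bucket count shows.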
implementation and characteristics of maps / balanced binary trees
A map is a binary tree, and can be expected to employ pointers linking distinct heap memory regions returned by different calls to new. As well as the key/value data, each node in the tree will need parent, left, and right pointers (see wikipedia's binary tree article if lost).
comparison
So, both unordered_map and map need to allocate nodes for key/value pairs with the former typically having two-pointer/iterator overhead for prev/next-node linkage, and the latter having three for parent/left/right. But, the unordered_map additionally has the single contiguous allocation for bucket_count() buckets (== size() / load_factor()).
For most purposes that's not a dramatic difference in memory usage, and the deallocation time difference for one extra region is unlikely to be noticeable.
another alternative
For those occasions when the container is populated up front and then repeatedly searched without further inserts/erases, it can sometimes be fastest to use a sorted vector, searched using the Standard algorithms binary_search, equal_range, lower_bound, upper_bound. This has the advantage of a single contiguous memory allocation, which is much more cache friendly. It usually outperforms map for lookups, but unordered_map may still be faster - measure if you care.
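A sorted-vector lookup along those lines might look like the following sketch. The `KV` alias and the function names are assumptions for the example; the vector must be sorted once before any lookups.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using KV = std::pair<int, int>;  // (key, value), types chosen for illustration

// Sort once up front; afterwards the vector is only searched.
void prepare(std::vector<KV>& table) {
    std::sort(table.begin(), table.end());
}

// Binary search by key; returns nullptr when the key is absent.
const int* find_value(const std::vector<KV>& table, int key) {
    assert(std::is_sorted(table.begin(), table.end()));  // precondition, checked in debug builds
    auto it = std::lower_bound(table.begin(), table.end(), key,
                               [](const KV& kv, int k) { return kv.first < k; });
    return (it != table.end() && it->first == key) ? &it->second : nullptr;
}
```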
The reason there is both is that neither is better as a whole.
Use either one. Switch if the other proves better for your usage.
std::map provides better space for worse time.
std::unordered_map provides better time for worse space.
The answer to your question is heavily dependent on the particular STL implementation you're using. Really, you should look at your STL implementation's documentation – it'll likely have a good amount of information on performance.
In general, though, according to cppreference.com, maps are usually implemented as red-black trees and support operations with time complexity O(log n), while unordered_maps usually support constant-time operations. cppreference.com offers little insight into memory usage; however, another StackOverflow answer suggests maps will generally use less memory than unordered_maps.
For the STL implementation Microsoft packages with Visual Studio 2012, it looks like map supports these operations in amortized O(log n) time, and unordered_map supports them in amortized constant time. However, the documentation says nothing explicit about memory footprint.
Map:
Insertion:
For the first version ( insert(x) ), logarithmic.
For the second version ( insert(position,x) ), logarithmic in general, but amortized constant if x is inserted right after the element pointed by position.
For the third version ( insert (first,last) ), Nlog(size+N) in general (where N is the distance between first and last, and size the size of the container before the insertion), but linear if the elements between first and last are already sorted according to the same ordering criterion used by the container.
Deletion:
For the first version ( erase(position) ), amortized constant.
For the second version ( erase(x) ), logarithmic in container size.
For the last version ( erase(first,last) ), logarithmic in container size plus linear in the distance between first and last.
Lookup:
Logarithmic in size.
Unordered map:
Insertion:
Single element insertions:
Average case: constant.
Worst case: linear in container size.
Multiple elements insertion:
Average case: linear in the number of elements inserted.
Worst case: N*(size+1): number of elements inserted times the container size plus one.
Deletion:
Average case: Linear in the number of elements removed ( constant when you remove just one element )
Worst case: Linear in the container size.
Lookup:
Average case: constant.
Worst case: linear in container size.
Knowing these, you can decide which container to use according to the type of the implementation.
Source: www.cplusplus.com
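The hinted insertion case from the map list above can be used when the input is already sorted. The sketch below passes end() as the hint; note that since C++11 the amortized-constant case is the one where the new element lands just before the hint, which is what happens here with ascending keys. `from_sorted` is a made-up helper name and the stored values are arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Fills a map from keys that are already in ascending order.
// Each element is inserted just before end(), so the hint is always correct.
std::map<int, int> from_sorted(const std::vector<int>& sorted_keys) {
    assert(std::is_sorted(sorted_keys.begin(), sorted_keys.end()));
    std::map<int, int> m;
    for (int k : sorted_keys) m.emplace_hint(m.end(), k, k * 10);
    return m;
}
```

A wrong hint is still correct, it just falls back to the logarithmic path.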
I'm trying to implement a priority queue as an array-backed minimum binary heap. I'm trying to get the update_key function to run in logarithmic time, but to do this I have to know the position of the item in the array. Is there any way to do this without the use of a map? If so, how? Thank you
If you really want to be able to change the key of an arbitrary element, a heap is not the best choice of data structure. What it gives you is the combination of:
1) compact representation (no pointers, just an array and an implicit indexing scheme)
2) logarithmic insertion, rebalancing
3) logarithmic removal of the smallest (largest) element
4) O(1) access to the value of the smallest (largest) element
A side benefit of 1. is that the lack of pointers means you do substantially fewer calls to malloc/free (new/delete).
A map (represented in the standard library as a balanced binary tree) gives you the middle two of these, adding in
logarithmic find() on any key.
So while you could attach another data structure to your heap, storing pointers in the heap and then making the comparison operator dereference through the pointer, you'd pretty soon find yourself with the complexity in time and space of just using a map in the first place.
Your find key function should operate in log(n) time. Your updating (changing the key) should be constant time. Your remove function should run in log(n) time. Your insert function should be log(n) time.
If these assumptions are true try this:
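1) Find your item in your heap (e.g. by binary search, if the backing array really is kept fully sorted; a general binary heap array is only partially ordered, so finding an arbitrary item there takes a linear scan).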
2) Update your key (you're just changing a value, constant time)
3) Remove the item from the heap, which takes log(n) to reheapify.
4) Insert your item into the heap log(n).
So, you'd have log(n) + 1 + log(n) + log(n) which reduces to log(n).
Note: this is amortized, because if you have to realloc your array, etc... that adds overhead. But you shouldn't do that very often anyway.
That's the tradeoff of the array-backed heap: you get excellent memory use (good locality and minimal overhead), but you lose track of the elements. To solve it, you have to add back some overhead.
One solution would be this. The heap contains objects of type C*. C is a class with an int member heap_index, which is the index of the object in the heap array. Whenever you move an element inside the heap array, you'll have to update its heap_index to set it to the new index.
Update_key (as well as removal of an arbitrary element) is then log(n) time because it takes constant time to find the element (via heap_index), and log(n) time to bubble it into the correct position.
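A minimal sketch of that heap_index scheme follows, assuming a min-heap of C* over a std::vector. The class name `IndexedMinHeap` and its member names are made up for the example; the key point is that every move inside the array goes through one helper that also updates heap_index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct C {
    int key;
    std::size_t heap_index = 0;  // position of this object in the heap array
    explicit C(int k) : key(k) {}
};

class IndexedMinHeap {
    std::vector<C*> a_;

    // The only place elements are written, so heap_index is always in sync.
    void place(std::size_t i, C* c) { a_[i] = c; c->heap_index = i; }

    void sift_up(std::size_t i) {
        C* c = a_[i];
        while (i > 0) {
            const std::size_t parent = (i - 1) / 2;
            if (a_[parent]->key <= c->key) break;
            place(i, a_[parent]);
            i = parent;
        }
        place(i, c);
    }

    void sift_down(std::size_t i) {
        C* c = a_[i];
        const std::size_t n = a_.size();
        for (;;) {
            std::size_t child = 2 * i + 1;
            if (child >= n) break;
            if (child + 1 < n && a_[child + 1]->key < a_[child]->key) ++child;
            if (c->key <= a_[child]->key) break;
            place(i, a_[child]);
            i = child;
        }
        place(i, c);
    }

public:
    bool empty() const { return a_.empty(); }
    C* top() const { return a_.front(); }

    void push(C* c) {
        a_.push_back(c);
        sift_up(a_.size() - 1);
    }

    void pop() {
        assert(!a_.empty());
        C* last = a_.back();
        a_.pop_back();
        if (!a_.empty()) { a_[0] = last; sift_down(0); }
    }

    // O(1) to locate the element via heap_index, O(log n) to restore heap order.
    void update_key(C* c, int new_key) {
        const int old_key = c->key;
        c->key = new_key;
        if (new_key < old_key) sift_up(c->heap_index);
        else sift_down(c->heap_index);
    }
};
```

The heap stores pointers, so the objects must outlive it, and the heap_index of an element that has been popped is stale and should not be used.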