Moving minheap.top to maxheap.top where maxheap.top <= minheap.top - c++

I have a maxheap and a minheap where the maximum element of the maxheap is less than or equal to the minimum element of the minheap.
I now want to move the minimum element of the minheap to become the maximum element of the maxheap.
One way to do this would be to pop the top element of the minheap and push it onto the maxheap.
Is there a more efficient way to do this?
Here is what I ended up doing. I actually needed to insert an element into the minheap first and then perform the operation described above, so I did the following:
// place value to insert at end of minheap
mintoph[mintoph_size] = R;
// use std::pop_heap, minimum element now at end
pop_heap(mintoph.begin(), mintoph.begin() + mintoph_size + 1, greater<int>());
// (*) make room in maxheap at top
for (int pos = maxboth_size++; pos > 0;)
{
    int parent = (pos - 1) / 2;
    maxboth[pos] = maxboth[parent];
    pos = parent;
}
// move element from back of minheap to maxheap head
maxboth[0] = mintoph[mintoph_size];
There is a waste of already paid-for comparisons at step (*) above, as parents are demoted to children, but I think this is unavoidable.

What you really need is an efficient way to insert into a priority queue when you know that the element being inserted is smaller/bigger than the min/max, depending on whether this is a min-heap or a max-heap. For the traditional "heap" data structure, this takes O(log n) time.
But if you are willing to use a different representation for your priority queues than the traditional "heap" data structure, then such an insert can trivially be made to run in O(1) time. Many different kinds of priority queues can do this, such as leftist heaps, skew heaps, or pairing heaps.
Edit: Of course, you'll still need to pay the cost of removing from the original priority queue, which will likely be O(log n) anyway, although there are approaches that may help there as well, such as "lazy deletes".
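For illustration, here is a minimal skew-heap sketch (hypothetical, simplified code, not from the original answer) of why inserting an element known to be no larger than the current minimum is O(1): insertion is a merge with a one-node heap, and that merge terminates after a single step when the new key is the smallest.
#include <utility>
struct Node {
    int key;
    Node* left;
    Node* right;
};
// Standard skew-heap merge: O(log n) amortized in general.
Node* merge(Node* a, Node* b) {
    if (!a) return b;
    if (!b) return a;
    if (b->key < a->key) std::swap(a, b);   // keep the smaller root in a
    a->right = merge(a->right, b);
    std::swap(a->left, a->right);           // the "skew" step
    return a;
}
// Insert a key known to be <= the current minimum: the recursion bottoms out
// after one level, so the cost is O(1) regardless of the heap's size.
Node* push_new_min(Node* root, int key) {
    Node* n = new Node{key, nullptr, nullptr};
    return merge(n, root);
}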

You'd probably be best served by using a min-max-heap or treap. min-max-heap seems tailor-made for what you're doing, but treaps are so dang well-rounded that they might work well too - especially if you need a little more than just looking up minimums and maximums and adding values.
http://en.wikipedia.org/wiki/Min-max_heap
http://en.wikipedia.org/wiki/Treap

Use a min-max heap; it is a double-ended priority queue and lets you do this in O(log n). You cannot do it in less than O(log n) in the worst case, but you can use structures with amortized O(1) insertion, such as a Fibonacci heap.

Related

Multiple insertions and deletion in priority_queue

Problem
In my algorithm I have to implement an "ordered queue" (I chose this name to distinguish the idea in my mind from existing implementations). I have to insert some values into a queue, where the value represents the order in the queue, and then I have to digest the queue following that order. I have the feeling that the best data structure for what I need is std::priority_queue, but I have some concerns about the efficiency of my program, in particular due to:
Interface which does not provide methods for insertions/deletions of multiple elements
(Possibly) Internal design of the class and its algorithms
From the documentation, both priority_queue::push and priority_queue::pop internally call std::push_heap and std::pop_heap, which both have complexity O(log(N)). I think it is very inefficient to insert/delete one element at a time, resorting the underlying container at every call.
Why is it implemented in this way? Or maybe, when you call std::push_heap and std::pop_heap repeatedly, is the underlying heap structure already in a favourable state, so that the complexity ends up lower than O(log(N))?
Otherwise, is there a better data structure which fits my needs that I have not considered? I also thought std::forward_list could fulfil my needs on deletion (through forward_list::pop_front), but I fear that insertion becomes too expensive, as I would have to find the iterator for the correct insertion position, which is O(N).
I would prefer not to rely on any external library (Boost included) because the project must be lightweight and dependency-free.
Implementation
The program is equivalent to:
struct MyType{
    double when;
    int who;
    MyType(double t, int i) : when(t), who(i) {}
    bool operator<(const MyType & other) const { return when < other.when; }
    bool operator>(const MyType & other) const { return when > other.when; }
};
// std::greater makes this a min-heap on `when`, so the earliest entry is at the
// top, which is what remove() below relies on.
using OrderedQueue = std::priority_queue<MyType, std::vector<MyType>, std::greater<MyType>>;
const double TMax = 1e9; // some BIG stopping condition
double some_time(){/*routine to generate the time*/ return TMax * rand() / RAND_MAX; }
int some_number(){/*routine to generate the number*/ return rand() % 100; }
void populate(OrderedQueue & q){
    unsigned Ni = 10; // number of insertions: it is not fixed in the real program
    for (unsigned i = 0; i < Ni; ++i){
        q.emplace(some_time(), some_number());
    }
}
void use_MyType(MyType m){/*routine that uses the top value*/ return; }
void remove(double t, OrderedQueue & q){
    while(!q.empty() && q.top().when < t){
        use_MyType(q.top());
        q.pop();
    }
}
int main(){
    double t = 0;
    OrderedQueue q;
    while(t < TMax){
        populate(q);
        remove(t, q);
        t += 1;
    }
}
I am particularly interested in the efficiency of populate() and remove() because the loop when they are called has very many iterations.
std::priority_queue is an adaptor for a heap structure. Given your requirements of consuming elements in order one by one, a heap is the most efficient structure.
Heap insertions are worst case O(log(N)), but are on average O(1). This is faster than e.g. a binary tree (std::map) insertion, which is always O(log(N)). Similarly, removing the top element from a heap is worst case O(log(N)), but on average much faster since a heap is partially sorted.
With that said, the effects of branch prediction and caching in modern computers cannot be neglected. The best way to answer a performance question is to benchmark it with your actual data and a representative number of elements. I would suggest to benchmark using these 3 queue structures:
std::priority_queue<MyType, std::vector<MyType>>
std::priority_queue<MyType, std::deque<MyType>>
std::multiset<MyType>
std::deque as the backing store may offer improved pop_front performance, at the expense of slower random access. So it should be benchmarked.
I would disregard std::list (std::forward_list) at this point - inserting into a linked list at the right place is O(N), plus a linked list isn't cache-friendly, so is definitely going to be a much slower solution.
For more details on Heap vs Binary Tree performance, see this related question.
To address your concerns:
Interface which does not provide methods for insertions/deletions of multiple elements
Inserting an element into a heap involves appending the element at the end and "repairing" the heap structure. This is what the std::push_heap algorithm does. It is entirely feasible to implement an algorithm to insert multiple elements this way and/or simply invoke std::make_heap after appending multiple elements to repair the entire heap.
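A rough sketch of both strategies, operating directly on the vector that backs the heap (the helper names are illustrative):
#include <algorithm>
#include <vector>
// Repair after every single element: k insertions cost O(k log N).
void bulk_push_one_by_one(std::vector<int>& heap, const std::vector<int>& items) {
    for (int x : items) {
        heap.push_back(x);
        std::push_heap(heap.begin(), heap.end());
    }
}
// Append everything, then rebuild once: O(N + k) in total, independent of k.
void bulk_push_rebuild(std::vector<int>& heap, const std::vector<int>& items) {
    heap.insert(heap.end(), items.begin(), items.end());
    std::make_heap(heap.begin(), heap.end());
}
Which one wins depends on how many elements you add at once: for a handful of new elements the push_heap loop is cheaper, while for bulk loads the single rebuild is.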
Removing multiple elements from a heap isn't possible, since a heap is only sorted with respect to the first (top) element. After removing it, the heap structure needs to be adjusted to find the next top element. This is what the std::pop_heap algorithm does.
Internal design of the class and its algorithms
std::priority_queue is just an adapter around the heap algorithms. It's a convenience class that wraps a sequential container and invokes the heap algorithms on it. You don't have to use it; you can use std::vector, std::push_heap and std::pop_heap with exactly the same results (though the code might be less readable and more error-prone).
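For example, the push and pop that the adaptor performs could be written directly against the backing vector like this (a sketch that reuses MyType, the generators and the min-heap comparator from the question):
std::vector<MyType> heap;
// what q.emplace(...) does internally
heap.emplace_back(some_time(), some_number());
std::push_heap(heap.begin(), heap.end(), std::greater<MyType>());
// what q.top() and q.pop() do internally
MyType next = heap.front();
std::pop_heap(heap.begin(), heap.end(), std::greater<MyType>());
heap.pop_back();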

Which container is most efficient for multiple insertions / deletions in C++?

I was set a homework challenge as part of an application process (I was rejected, by the way; I wouldn't be writing this otherwise) in which I was to implement the following functions:
// Store a collection of integers
class IntegerCollection {
public:
    // Insert one entry with value x
    void Insert(int x);
    // Erase one entry with value x, if one exists
    void Erase(int x);
    // Erase all entries, x, from <= x < to
    void Erase(int from, int to);
    // Return the count of all entries, x, from <= x < to
    size_t Count(int from, int to) const;
The functions were then put through a bunch of tests, most of which were trivial. The final test was the real challenge as it performed 500,000 single insertions, 500,000 calls to count and 500,000 single deletions.
The member variables of IntegerCollection were not specified and so I had to choose how to store the integers. Naturally, an STL container seemed like a good idea and keeping it sorted seemed an easy way to keep things efficient.
Here is my code for the four functions using a vector:
    // Previous bit of code shown goes here
private:
    std::vector<int> integerCollection;
};
void IntegerCollection::Insert(int x) {
    /* using lower_bound to find the right place for x to be inserted
       keeps the vector sorted and makes life much easier */
    auto it = std::lower_bound(integerCollection.begin(), integerCollection.end(), x);
    integerCollection.insert(it, x);
}
void IntegerCollection::Erase(int x) {
    // find the location of the first element containing x and delete it if it exists
    auto it = std::find(integerCollection.begin(), integerCollection.end(), x);
    if (it != integerCollection.end()) {
        integerCollection.erase(it);
    }
}
void IntegerCollection::Erase(int from, int to) {
    if (integerCollection.empty()) return;
    // lower_bound points to the first element of integerCollection >= from/to
    auto fromBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), from);
    auto toBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), to);
    /* std::vector::erase deletes entries between the two iterators
       fromBound (included) and toBound (not included) */
    integerCollection.erase(fromBound, toBound);
}
size_t IntegerCollection::Count(int from, int to) const {
    if (integerCollection.empty()) return 0;
    size_t count = 0;
    // lower_bound points to the first element of integerCollection >= from/to
    auto fromBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), from);
    auto toBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), to);
    // increment the iterator until fromBound == toBound (we don't count elements of value = to)
    while (fromBound != toBound) {
        ++count; ++fromBound;
    }
    return count;
}
The company got back to me saying that they wouldn't be moving forward because my choice of container meant the runtime complexity was too high. I also tried using list and deque and compared the runtimes. As I expected, I found that list was dreadful and that vector had the edge over deque. So as far as I was concerned I had made the best of a bad situation, but apparently not!
I would like to know what the correct container to use in this situation is? deque only makes sense if I can guarantee insertion or deletion to the ends of the container and list hogs memory. Is there something else that I'm completely overlooking?
We cannot know what would make the company happy. If they reject std::vector without concrete reasoning, I wouldn't want to work for them anyway. Moreover, we don't really know the precise requirements. Were you asked to provide one reasonably well-performing implementation? Did they expect you to squeeze out the last percent of the provided benchmark by profiling a bunch of different implementations?
The latter is probably too much for a homework challenge as part of an application process. If it is the former, you can either
roll your own. It is unlikely that you can implement the given interface more efficiently than one of the std containers does... unless your requirements are so specific that you can write something that performs well under that specific benchmark.
std::vector for data locality. See eg here for Bjarne himself advocating std::vector rather than linked lists.
std::set for ease of implementation. It seems like you want the container sorted and the interface you have to implement fits that of std::set quite well.
Let's compare only insertion and erasure, assuming the container needs to stay sorted:
operation    std::set    std::vector
insert       log(N)      N
erase        log(N)      N
Note that the log(N) for the binary_search to find the position to insert/erase in the vector can be neglected compared to the N.
Now you have to consider that the asymptotic complexity listed above completely neglects the non-linearity of memory access. In reality data can be far away in memory (std::set) leading to many cache misses or it can be local as with std::vector. The log(N) only wins for huge N. To get an idea of the difference 500000/log(500000) is roughly 26410 while 1000/log(1000) is only ~100.
I would expect std::vector to outperform std::set for reasonably small container sizes, but at some point the log(N) wins over the cache. The exact location of this turning point depends on many factors and can only be reliably determined by profiling and measuring.
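For reference, a minimal sketch of the std::multiset variant suggested above (multiset rather than set, since "erase one entry with value x" implies duplicates must be kept; the Count caveat is in the comment):
#include <cstddef>
#include <iterator>
#include <set>
class IntegerCollection {
public:
    void Insert(int x) { values.insert(x); }                            // O(log N)
    void Erase(int x) {
        auto it = values.find(x);                                       // O(log N)
        if (it != values.end()) values.erase(it);                       // erase one copy only
    }
    void Erase(int from, int to) {
        values.erase(values.lower_bound(from), values.lower_bound(to));
    }
    std::size_t Count(int from, int to) const {
        // std::distance over node-based iterators walks the range, so Count is
        // O(result), not O(log N); an order-statistic tree would be needed for that.
        return std::distance(values.lower_bound(from), values.lower_bound(to));
    }
private:
    std::multiset<int> values;
};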
Nobody knows which container is MOST efficient for multiple insertions / deletions. That is like asking what the most fuel-efficient design for a car engine is; people are always innovating on car engines and make more efficient ones all the time. However, I would recommend a splay tree. The time required for an insertion or deletion in a splay tree is not constant: some insertions take a long time and some take only a very short time. However, the average time per insertion/deletion is always guaranteed to be O(log n), where n is the number of items stored in the splay tree. Logarithmic time is extremely efficient and should be good enough for your purposes.
The first thing that comes to mind is to hash the integer value so single lookups can be done in constant time.
The integer value can be hashed to compute an index into an array of bools or bits, used to tell whether the integer value is in the container or not.
Counting and deleting large ranges could be sped up from there by using multiple hash tables for specific integer ranges.
If you had 0x10000 hash tables, each storing ints from 0 to 0xFFFF, and were using 32-bit integers, you could then mask and shift the upper half of the int value and use that as an index to find the correct hash table to insert / delete values from.
IntHashTable containers[0x10000];
uint32_t hashIndex = (uint32_t)value / 0x10000;                    // upper 16 bits select the table
uint32_t valueInTable = (uint32_t)value - (hashIndex * 0x10000);   // lower 16 bits go into it
containers[hashIndex].insert(valueInTable);
Count, for example, could be implemented like so, if each hash table kept a count of the elements it contains (the first and last tables may only be partially covered by the range, so they would need a range-limited count):
int indexStart = startRange / 0x10000;
int indexEnd = endRange / 0x10000;
int countTotal = 0;
for (int i = indexStart; i <= indexEnd; ++i) {
    countTotal += containers[i].count();
}
I'm not sure sorting really is a requirement for removing the range; it might be based on position instead. Anyway, here is a link with some hints on which STL container to use.
In which scenario do I use a particular STL container?
Just FYI.
Vector may be a good choice, but it does a lot of reallocation, as you know. I prefer deque instead, as it doesn't require one big chunk of memory to allocate all items. For requirements like yours, a list would probably fit better.
A basic solution for this problem might be std::map<int, int>,
where the key is the integer you are storing and the value is the number of occurrences.
The problem with this is that you cannot quickly remove/count ranges. In other words, the complexity is linear.
For quick counting you would need to implement your own complete binary tree where you can compute the number of nodes between two nodes (the upper- and lower-bound nodes), because you know the size of the tree and how many left and right turns you took to reach those two nodes. Note that we are talking about a complete binary tree; in a general binary tree you cannot make this calculation quickly.
For quick range remove I do not know how to make it faster than linear.
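To make the "tree that can count nodes between two bounds" idea concrete, one common structure is a Fenwick (binary indexed) tree keyed by value. This is a hypothetical sketch, assuming the stored values fit in a known range [0, maxValue]; insert/erase of a single value and range counts are O(log M), but removing a whole range still costs one update per removed element.
#include <cstddef>
#include <vector>
class FenwickCounter {
public:
    explicit FenwickCounter(std::size_t maxValue) : tree(maxValue + 2, 0) {}
    void Insert(int x) { update(x, +1); }
    void Erase(int x)  { update(x, -1); }          // caller ensures x is present
    // number of stored entries v with from <= v < to
    long long Count(int from, int to) const { return prefix(to) - prefix(from); }
private:
    std::vector<long long> tree;                   // 1-based Fenwick array
    void update(int value, int delta) {
        for (std::size_t i = value + 1; i < tree.size(); i += i & (~i + 1))
            tree[i] += delta;
    }
    long long prefix(int value) const {            // count of stored entries < value
        long long sum = 0;
        for (std::size_t i = value; i > 0; i -= i & (~i + 1))
            sum += tree[i];
        return sum;
    }
};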

Single 'Heapify' call when removing max and adding a new element c++ std::make_heap

Is it possible to pop the max, push a new element, and then call push_heap, maintaining the heap and only calling the bubble down algorithm once?
// What I think is correct:
heap.push_back(new_element);
std::push_heap(heap.begin(), heap.end());
std::pop_heap(heap.begin(), heap.end());
heap.pop_back();
// What I hope is correct:
heap.pop_back();
heap.push_back(new_element);
std::push_heap(heap.begin(), heap.end());
Is it 'safe' to remove the max and push a new element to a heap? I have tested it and it seems to be the case. Is there a reason I should not do this?
Very related question:
How to change max element in a heap in C++ standard library?
In a max heap created via make_heap, the maximum element will be on the front, and the appropriate way to remove it is with std::pop_heap, period.
A pop_front (not pop_back, as you posit) would indeed remove the maximum element from the heap, but you are no longer guaranteed to have a heap. That is, popping the first element shifts the whole array and effectively moves the heap's left child into the root. If the right child was bigger, the heap property is lost (and even if it held for the first pair of children, the shift can invalidate any subtree further down).
Additionally, if you're using a contiguous container like a std::vector, then a pop_front (calling std::vector::erase on vector::begin) has the misfortune to cost you O(N) time, which is much worse than the O(log N) complexity of pop_heap.
The one scenario that you may (prematurely) optimize for is if you simply want to swap the maximum element with another element that is at least as big as the existing maximum element, or failing that, >= to the maximum element's largest child. In that case you may achieve O(1) complexity, but this is unlikely in the general case.
It's better to just have two O(log N) operations. The function grows so slowly that there will hardly be a difference between 1000 and 1 million element heaps.
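For completeness, a hand-rolled "replace the top, then sift down once" looks roughly like this (a sketch for a non-empty max-heap of int; the standard library does not expose a sift-down directly):
#include <cstddef>
#include <utility>
#include <vector>
// Precondition: heap is a non-empty max-heap.
void replace_top(std::vector<int>& heap, int new_element) {
    std::size_t pos = 0, n = heap.size();
    heap[0] = new_element;                                   // overwrite the old max
    for (;;) {
        std::size_t child = 2 * pos + 1;                     // left child
        if (child >= n) break;
        if (child + 1 < n && heap[child + 1] > heap[child]) ++child;  // larger child
        if (heap[pos] >= heap[child]) break;                 // heap property restored
        std::swap(heap[pos], heap[child]);
        pos = child;
    }
}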
What you are asking for is nearly an exact duplicate of the linked question in your post body.
// Precondition is that v is already a heap.
int get_max_element_add_new (std::vector<int> &v, int value) {
    v.push_back(value);
    // pop_heap swaps the front (old max) with the back and sifts the new value
    // down into place. Strictly, the standard requires the whole range to
    // already be a heap here, so this leans on the usual swap-then-sift-down
    // implementation.
    std::pop_heap(v.begin(), v.end());
    value = v.back();
    v.pop_back();
    return value;
}
Since pop_heap will swap the first and last values, the back will contain the max that you want to remove.

Is there any array-like data structure that can grow in size on both sides?

I'm a student working on a small project for a high performance computing course, hence efficiency is a key issue.
Let say that I have a vector of N floats and I want to remove the smallest n elements and the biggest n elements. There are two simple ways of doing this:
A
sort in ascending order // O(NlogN)
remove the last n elements // O(1)
reverse the element order // O(N)
remove the last n elements // O(1)
B
sort in ascending order // O(NlogN)
remove the last n elements // O(1)
remove the first n elements // O(N)
In A, reversing the element order requires swapping all the elements, while in B removing the first n elements requires moving all the others to occupy the positions left empty. Using std::remove would give the same problem.
If I could remove the first n elements for free, then solution B would be cheaper. That should be easy to achieve if, instead of a vector, i.e. an array with some empty space after vector::end(), I had a container with some free space also before vector::begin().
So the question is: does there already exist an array-like container (i.e. contiguous memory, no linked lists) in some library (STL, Boost) that allows O(1) insertion/removal on both sides of the array?
If not, do you think that there are better solutions than creating such a data structure?
Have you thought of using std::partition with a custom functor like the example below:
#include <iostream>
#include <vector>
#include <algorithm>
template<typename T>
class greaterLess {
    T low;
    T up;
public:
    greaterLess(T const &l, T const &u) : low(l), up(u) {}
    bool operator()(T const &e) { return !(e < low || e > up); }
};
int main()
{
    std::vector<double> v{2.0, 1.2, 3.2, 0.3, 5.9, 6.0, 4.3};
    auto it = std::partition(v.begin(), v.end(), greaterLess<double>(2.0, 5.0));
    v.erase(it, v.end());
    for(auto i : v) std::cout << i << " ";
    std::cout << std::endl;
    return 0;
}
This way you would erase elements from your vector in O(N) time.
Try boost::circular_buffer:
It supports random access iterators, constant time insert and erase operations at the beginning or the end of the buffer and interoperability with std algorithms.
Having looked at the source, it seems (and is only logical) that data is kept as a contiguous memory block.
The one caveat is that the buffer has fixed capacity and after exhausting it elements will get overwritten. You can either detect such cases yourself and resize the buffer manually, or use boost::circular_buffer_space_optimized with a humongous declared capacity, since it won't allocate it if not needed.
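A minimal usage sketch (the capacity of 8 is an arbitrary choice for the example):
#include <boost/circular_buffer.hpp>
#include <iostream>
int main()
{
    boost::circular_buffer<double> cb(8);                 // fixed capacity
    for (double v : {2.0, 1.2, 3.2, 0.3, 5.9, 6.0, 4.3})
        cb.push_back(v);
    cb.pop_front();   // O(1) removal at the front
    cb.pop_back();    // O(1) removal at the back
    for (double v : cb) std::cout << v << " ";
    std::cout << "\n";
}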
To shrink and grow a vector at both ends, you can use the idea of slices, reserving extra memory ahead of time at the front and back to expand into, if efficient growth is needed.
Simply make a class with not only a length but also indices for the first and last elements, plus a suitably sized vector, to create a window of data onto the underlying block of stored floats. A C++ class can provide inlined functions for things like deleting items, addressing into the array, finding the nth largest value, and shifting the slice values down or up to insert new elements while maintaining sorted order. Should no spare elements be available, dynamic allocation of a new, larger float store permits continued growth at the cost of an array copy.
A circular buffer is designed as a FIFO, with new elements added at the end, removal at the front, and no insertion in the middle; a self-defined class can also (trivially) support array subscript values different from 0..N-1.
Due to memory locality (avoiding excessive indirection from pointer chains) and the pipelining of subscript calculations on a modern processor, a solution based on an array (or a vector) is likely to be most efficient, despite element copying on insertion. Deque would be suitable, but it does not guarantee contiguous storage.
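A rough sketch of that window idea (hypothetical names; the growth policy is left out for brevity):
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>
// Elements live in one contiguous block; removing from either end only moves an
// index, so it is O(1).
class SlackVector {
public:
    explicit SlackVector(std::vector<float> sorted)
        : store(std::move(sorted)), first(0), last(store.size()) {}
    float* begin() { return store.data() + first; }
    float* end()   { return store.data() + last; }
    std::size_t size() const { return last - first; }
    void drop_front(std::size_t n) { assert(n <= size()); first += n; }
    void drop_back(std::size_t n)  { assert(n <= size()); last  -= n; }
private:
    std::vector<float> store;
    std::size_t first, last;
};
// Usage for the question: sort once, then drop the n smallest and n largest in O(1):
//   std::sort(v.begin(), v.end());
//   SlackVector s(std::move(v));
//   s.drop_front(n);
//   s.drop_back(n);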
As additional supplementary info, researching classes providing slices turns up some plausible alternatives to evaluate:
A) std::slice, which works with std::valarray (via slice_array)
B) Boost.Range
Hope this is the kind of specific information you were hoping for. In general a simpler, clearer solution is more maintainable than a tricky one. I would expect slices and ranges on sorted data sets to be quite common, for example when filtering experimental data where "outliers" are excluded as faulty readings.
I think a good solution should actually be O(N log N) for the sort plus two O(1) removals, with O(log N + 1) binary searches if you filter on outlying values instead of deleting a fixed number of small or large values. It also matters that the constant behind the "O" is small: sometimes an O(1) algorithm can in practice be slower than an O(N) one for practical values of N.
As a complement to #40two's answer: before partitioning the array, you will need to find the partitioning pivots, which means finding the nth smallest and the nth greatest number in an unsorted array.
There is a discussion on that on SO: How to find the kth largest number in unsorted array
There are several algorithms to solve this problem. Some are deterministic O(N) - one of them is a variation on finding the median (median of medians). There are also non-deterministic algorithms with O(N) average-case complexity. A good source book for these algorithms is Introduction to Algorithms.
So eventually, your code will run in O(N) time.
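A sketch of that O(N) approach using std::nth_element (the helper name is illustrative; note the kept elements end up in unspecified order):
#include <algorithm>
#include <cstddef>
#include <vector>
// Remove the n smallest and the n largest values from v in linear time.
void trim_extremes(std::vector<float>& v, std::size_t n)
{
    if (2 * n >= v.size()) { v.clear(); return; }
    // put the n smallest values in the first n slots...
    std::nth_element(v.begin(), v.begin() + n, v.end());
    // ...and the n largest values in the last n slots of the remainder
    std::nth_element(v.begin() + n, v.end() - n, v.end());
    v.erase(v.end() - n, v.end());        // cheap: removal at the back
    v.erase(v.begin(), v.begin() + n);    // one O(N) shift, still linear overall
}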

Sorted data structure for in-order iteration, ordered push, and removal (N elements only from top)

What is considered an optimal data structure for pushing something in order (so inserts at any position, able to find correct position), in-order iteration, and popping N elements off the top (so the N smallest elements, N determined by comparisons with threshold value)? The push and pop need to be particularly fast (run every iteration of a loop), while the in-order full iteration of the data happens at a variable rate but likely an order of magnitude less often. The data can't be purged by the full iteration, it needs to be unchanged. Everything that is pushed will eventually be popped, but since a pop can remove multiple elements there can be more pushes than pops. The scale of data in the structure at any one time could go up to hundreds or low thousands of elements.
I'm currently using a std::deque and binary search to insert elements in ascending order. Profiling shows it taking up the majority of the time, so something has got to change. std::priority_queue doesn't allow iteration, and hacks I've seen to do it won't iterate in order. Even on a limited test (no full iteration!), the std::set class performed worse than my std::deque approach.
None of the classes I'm messing with seem to be built with this use case in mind. I'm not averse to making my own class, if there's a data structure not to be found in STL or boost for some reason.
edit:
There's two major functions right now, push and prune. push uses 65% of the time, prune uses 32%. Most of the time used in push is due to insertion into the deque (64% out of 65%). Only 1% comes from the binary search to find the position.
template<typename T, size_t Axes>
void Splitter<T, Axes>::SortedData::push(const Data& data) //65% of processing
{
    size_t index = find(data.values[(axis * 2) + 1]);
    this->data.insert(this->data.begin() + index, data); //64% of all processing happens here
}
template<typename T, size_t Axes>
void Splitter<T, Axes>::SortedData::prune(T value) //32% of processing
{
    auto top = data.begin(), end = data.end(), it = top;
    for (; it != end; ++it)
    {
        Data& data = *it;
        if (data.values[(axis * 2) + 1] > value) break;
    }
    data.erase(top, it);
}
template<typename T, size_t Axes>
size_t Splitter<T, Axes>::SortedData::find(T value)
{
    size_t start = 0;
    size_t end = this->data.size();
    if (!end) return 0;
    size_t diff;
    while ((diff = (end - start) >> 1))
    {
        size_t mid = diff + start;
        if (this->data[mid].values[(axis * 2) + 1] <= value)
        {
            start = mid;
        }
        else
        {
            end = mid;
        }
    }
    return this->data[start].values[(axis * 2) + 1] <= value ? end : start;
}
With your requirements, a hybrid data structure tailored to your needs will probably perform best. As others have said, contiguous memory is very important, but I would not recommend keeping the array sorted at all times. I propose you use 3 buffers (1 std::array and 2 std::vectors):
1 (constant-size) Buffer for the "insertion heap". Needs to fit into the cache.
2 (variable-sized) Buffers (A+B) to maintain and update sorted arrays.
When you push an element, you add it to the insertion heap via std::push_heap. Since the insertion heap is constant size, it can overflow. When that happens, you std::sort it backwards and std::merge it with the already sorted-sequence buffer (A) into the third (B), resizing them as needed. That will be the new sorted buffer and the old one can be discarded, i.e. you swap A and B for the next bulk operation. When you need the sorted sequence for iteration, you do the same. When you remove elements, you compare the top element in the heap with the last element in the sorted sequence and remove that (which is why you sort it backwards, so that you can pop_back instead of pop_front).
For reference, this idea is loosely based on sequence heaps.
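A condensed sketch of that scheme (hypothetical names, a made-up heap capacity, and plain double standing in for your Data type; the prune follows the question's threshold-based removal):
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>
struct HybridQueue {
    static constexpr std::size_t HeapCapacity = 256;   // should fit in cache
    std::vector<double> heap;                          // min-heap of recent pushes
    std::vector<double> sorted;                        // older data, sorted descending
    std::vector<double> scratch;                       // merge target, swapped with sorted
    void push(double v) {
        if (heap.size() == HeapCapacity) flush();
        heap.push_back(v);
        std::push_heap(heap.begin(), heap.end(), std::greater<double>());
    }
    // Merge the insertion heap into the sorted buffer in bulk. Both are kept in
    // descending order so the smallest element sits at the back and can be
    // removed with pop_back.
    void flush() {
        std::sort(heap.begin(), heap.end(), std::greater<double>());
        scratch.clear();
        std::merge(heap.begin(), heap.end(), sorted.begin(), sorted.end(),
                   std::back_inserter(scratch), std::greater<double>());
        sorted.swap(scratch);
        heap.clear();
    }
    // Pop every element <= threshold (the "prune" from the question).
    void prune(double threshold) {
        while (!heap.empty() && heap.front() <= threshold) {
            std::pop_heap(heap.begin(), heap.end(), std::greater<double>());
            heap.pop_back();
        }
        while (!sorted.empty() && sorted.back() <= threshold)
            sorted.pop_back();
    }
};
For the in-order full iteration you would call flush() first and then walk sorted from back to front, which gives ascending order.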
Have you tried messing around with std::vector? As weird as it may sound, it could actually be pretty fast because it uses contiguous memory. If I remember correctly Bjarne Stroustrup was talking about this at Going Native 2012 (http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/Keynote-Bjarne-Stroustrup-Cpp11-Style but I'm not 100% sure that it's in this video).
You save time with the binary search, but the insertion in random positions of the deque is slow. I would suggest an std::map instead.
From your edit, it sounds like the delay is in copying - is it a complex object? Can you heap-allocate the objects and store pointers in the structure, so each entry is created only once? You'll need to provide a custom comparator that takes pointers, as the object's operator<() wouldn't be called otherwise. (The custom comparator can simply call operator<().)
EDIT:
Your own figures show it's the insertion that takes the time, not the 'sorting'. While some of that insertion time is creating a copy of your object, some (possibly most) is creation of the internal structure that will hold your object - and I don't think that will change between list/map/set/queue etc. IF you can predict the likely eventual/maximum size of your data set, and can write or find your own sorting algorithm, and the time is being lost in allocating objects, then vector might be the way to go.