I created a program that finds the median of a list of numbers. The list of numbers is dynamic in that numbers can be removed and inserted (duplicate numbers can be entered) and during this time, the new median is re-evaluated and printed out.
I created this program using a multimap because
1) the benefit of it already being sorted,
2) easy insertion, deletion, and searching (since multimap supports logarithmic search),
3) duplicate entries are allowed.
The constraints for the number of entries + deletions (represented as N) are: 0 < N <= 100,000.
The program I wrote works and prints out the correct median, but it isn't fast enough. I know that unordered_multimap is faster than multimap, but the problem with unordered_multimap is that I would have to sort it. I have to sort it because to find the median you need a sorted list. So my question is, would it be practical to use an unordered_multimap and then quicksort the entries, or would that just be ridiculous? Would it be faster to just use a vector, quicksort the vector, and use a binary search? Or maybe I am forgetting some fabulous solution out there that I haven't even thought of.
Though I'm not new to C++, I will admit that my skills with time complexity are somewhat mediocre.
The more I look at my own question, the more I'm beginning to think that just using a vector with quicksort and binary search would be better since the data structures basically already implement vectors.
If you have only a few updates, use an unsorted std::vector plus the std::nth_element algorithm, which is O(N). You don't need a full sort, which is O(N*ln(N)).
live demo of nth_element:
#include <algorithm>
#include <iterator>
#include <iostream>
#include <ostream>
#include <vector>
using namespace std;
template<typename RandomAccessIterator>
RandomAccessIterator median(RandomAccessIterator first, RandomAccessIterator last)
{
    // returns the lower middle; handle the even-sized case separately if needed
    RandomAccessIterator m = first + distance(first, last) / 2;
    nth_element(first, m, last);
    return m;
}

int main()
{
    vector<int> values = {5, 1, 2, 4, 3};
    cout << *median(begin(values), end(values)) << endl;
}
Output is:
3
If you have many updates and only remove from the middle, use two heaps as comocomocomocomo suggests. If you used a fibonacci_heap, you would also get O(N) removal from an arbitrary position (if you don't have a handle to it).
If you have many updates and need O(ln(N)) removal from arbitrary places, then use two multisets as ipc suggests.
If your purpose is to keep track of the median on the fly, as elements are inserted/removed, you should use a min-heap and a max-heap. Each one would contain one half of the elements... There was a related question a couple of days ago: How to implement a Median-heap
Though, if you need to search for specific values in order to remove elements, you still need some kind of map.
You said that it is slow. Are you iterating from the beginning of the map to the (N/2)'th element every time you need the median? You don't need to. You can keep track of the median by maintaining an iterator pointing to it at all times and a counter of the number of elements less than that one. Every time you insert/remove, compare the new/old element with the median and update both iterator and counter.
Another way of seeing it is as two multimaps containing half the elements each. One holds the elements less than the median (or equal) and the other holds those greater. The heaps do this more efficiently, but they don't support searches.
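For illustration, here is a minimal sketch of that bookkeeping using a std::multiset (the same idea applies to a multimap keyed on the values); the class name and the choice of the lower median are mine, not part of the answer:
#include <cstddef>
#include <set>

// Sketch only: tracks an iterator to the lower median, i.e. the element at
// index (size - 1) / 2, and nudges it left or right on each insertion.
class tracked_median {
    std::multiset<int> data;
    std::multiset<int>::iterator med; // current (lower) median
public:
    void insert(int x) {
        if (data.empty()) { med = data.insert(x); return; }
        std::size_t n = data.size();  // size before this insert
        data.insert(x);               // equal values go after existing ones
        if (x < *med) {
            if (n % 2 != 0) --med;    // everything from med onward shifted right
        } else {
            if (n % 2 == 0) ++med;    // median index grows by one on even sizes
        }
    }
    // erase() would follow the same case analysis: compare the erased value
    // with *med and step the iterator left or right based on the old parity.
    int median() const { return *med; }
};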
If you only need the median a few times you can use the "select" algorithm. It is described in Sedgewick's book. It takes O(n) time on average. It is similar to quick sort but it does not sort completely. It just partitions the array with random pivots until, eventually, it gets to "select" on one side the smaller m elements (m=(n+1)/2). Then you search for the greatest of those m elements, and this is the median.
Here is how you could implement that in O(log N) per update:
#include <cassert>
#include <set>

template <typename T>
class median_set {
public:
    std::multiset<T> below, above;

    // O(log N)
    void rebalance()
    {
        // cast to int so the difference can go negative
        int diff = static_cast<int>(above.size()) - static_cast<int>(below.size());
        if (diff > 0) {
            below.insert(*above.begin());
            above.erase(above.begin());
        } else if (diff < -1) {
            above.insert(*below.rbegin());
            below.erase(below.find(*below.rbegin()));
        }
    }

public:
    // O(1)
    bool empty() const { return below.empty() && above.empty(); }

    // O(1)
    T const& median() const
    {
        assert(!empty());
        return *below.rbegin();
    }

    // O(log N)
    void insert(T const& value)
    {
        if (!empty() && value > median())
            above.insert(value);
        else
            below.insert(value);
        rebalance();
    }

    // O(log N)
    void erase(T const& value)
    {
        if (value > median())
            above.erase(above.find(value));
        else
            below.erase(below.find(value));
        rebalance();
    }
};
(Work in action with tests)
The idea is the following:
Keep track of the values above and below the median in two sets
If a new value is added, add it to the corresponding set. Always ensure that the set below has exactly 0 or 1 more elements than the other.
If a value is removed, remove it from the set and make sure that the condition still holds.
You can't use priority_queues because they won't let you remove an arbitrary item.
Can anyone help me with the space and time complexity of the following C# program, with details?
// Passing an integer array to find the extreme from that integer array
public int extreme(int[] A)
{
    int N = A.Length;
    if (N == 0)
    {
        return -1;
    }
    else
    {
        int average = CalculateAverage(A);
        return FindExtremes(A, average);
    }
}

// Calculate the average of the integer array
private int CalculateAverage(int[] integerArray)
{
    int sum = 0;
    foreach (int value in integerArray)
    {
        sum += value;
    }
    return Convert.ToInt32(sum / integerArray.Length);
}

// Find the extreme from the integer array
private int FindExtremes(int[] integerArray, int average)
{
    int Index = -1;
    int ExtremeElement = integerArray[0];
    for (int i = 0; i < integerArray.Length; i++)
    {
        int absolute = Math.Abs(integerArray[i] - average);
        if (absolute > ExtremeElement)
        {
            ExtremeElement = integerArray[i];
            Index = i;
        }
    }
    return Index;
}
You are almost certainly better off using a vector. Possibly maintain an auxiliary vector of indexes to be removed between median calculations so you can delete them in batches. New additions can also be put into an auxiliary vector, sorted, then merged in.
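As a rough sketch of that batched idea (illustrative only, not the answerer's code): buffer new values, sort the buffer, and std::inplace_merge it into the sorted main vector just before a median query; removals could be batched the same way through an index vector.
#include <algorithm>
#include <cstddef>
#include <vector>

struct batched_median {
    std::vector<int> data;     // kept sorted at all times
    std::vector<int> pending;  // new values not merged in yet

    void insert(int x) { pending.push_back(x); }

    double median() {          // assumes at least one value was inserted
        if (!pending.empty()) {
            std::sort(pending.begin(), pending.end());
            std::size_t mid = data.size();
            data.insert(data.end(), pending.begin(), pending.end());
            std::inplace_merge(data.begin(), data.begin() + mid, data.end());
            pending.clear();
        }
        std::size_t n = data.size();
        return n % 2 ? data[n / 2]
                     : (data[n / 2 - 1] + data[n / 2]) / 2.0;
    }
};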
I am solving a problem on LeetCode:
Given an unsorted array of integers nums, return the length of the longest consecutive elements sequence. You must write an algorithm that runs in O(n) time. So for nums = [100,4,200,1,3,2], the output is 4.
The Union Find solution to solve this is as below:
class Solution {
public:
    vector<int> parent, sz;

    int find(int i) {
        if (parent[i] == i) return i;
        return parent[i] = find(parent[i]);
    }

    void merge(int i, int j) {
        int p1 = find(i);
        int p2 = find(j);
        if (p1 == p2) return;
        if (sz[p1] > sz[p2]) {
            sz[p1] += sz[p2];
            parent[p2] = p1;
        } else {
            sz[p2] += sz[p1];
            parent[p1] = p2;
        }
    }

    int longestConsecutive(vector<int>& nums) {
        sz.resize(nums.size(), 1);
        parent.resize(nums.size(), 0);
        iota(begin(parent), end(parent), 0);
        unordered_map<int, int> m;
        for (int i = 0; i < nums.size(); i++) {
            int n = nums[i];
            if (m.count(n)) continue;
            if (m.count(n - 1)) merge(i, m[n - 1]);
            if (m.count(n + 1)) merge(i, m[n + 1]);
            m[n] = i;
        }
        int res = 0;
        for (int i = 0; i < parent.size(); i++) {
            if (parent[i] == i && sz[i] > res) {
                res = sz[i];
            }
        }
        return res;
    }
};
This gets accepted by the OJ (Runtime: 80 ms, faster than 76.03% of C++ online submissions for Longest Consecutive Sequence), but is this really O(n), as claimed by many answers, such as this one? My understanding is that Union Find is an O(NlogN) algorithm.
Are they right? Or, am I missing something?
They are right. A properly implemented Union Find with path compression and union by rank has linear run time complexity as a whole, while any individual operation has an amortized constant run time complexity. The exact complexity of m operations of any type is O(m * alpha(n)), where alpha is the inverse Ackermann function. For any possible n in the physical world, the inverse Ackermann function doesn't exceed 4. Thus, we can state that individual operations are constant and the algorithm as a whole is linear.
The key part for path compression in your code is here:
return parent[i]=find(parent[i])
vs the following that doesn't employ path compression:
return find(parent[i])
What this part of the code does is flatten the structure of the nodes in the hierarchy and link each node directly to the final root. Only in the first run of find will you traverse the whole structure. The next time you'll get a direct hit, since you set the node's parent to its ultimate root. Notice that the second code snippet works perfectly fine, but it just does redundant work when you are not interested in the path itself and only in the final root.
Union by rank is evident here:
if(sz[p1]>sz[p2]) {...
It makes sure that the root of the larger tree becomes the root of the smaller one (strictly speaking this is union by size, which gives the same guarantees). Therefore, fewer nodes need to be reassigned a new parent, hence less work.
Note: The above was updated and corrected based on feedback from @Matt-Timmermans and @kcsquared.
I was writing a program to input the marks of n students in four subjects and then find the rank of one of them based on the total scores (from codeforces.com: https://codeforces.com/problemset/problem/1017/A). I thought storing the marks in a structure would help keeping track of the various subjects.
Now, what I did is simply implement a bubble sort on the vector while checking the total value. I want to know, is there a way that I can sort the vector based on just one of the members of the struct using std::sort()? Also, how do we make it descending?
Here is what the code looks like right now:
//The Structure
struct scores
{
    int eng, ger, mat, his, tot, rank;
    bool tommyVal;
};

//The Sort (present inside the main function)
bool sorted = false;
while (!sorted)
{
    sorted = true;
    for (int i = 0; i < n - 1; i++)
    {
        if (stud[i].tot < stud[i + 1].tot)
        {
            std::swap(stud[i], stud[i + 1]);
            sorted = false;
        }
    }
}
Just in case you're interested, I need to find the rank of a student named Thomas. So, for that, I set the value of tommyVal to true for his element and to false for the others. This way, I can easily locate Thomas' marks even though his position in the vector has changed after sorting by total marks.
Also nice to know that std::swap() works for swapping entire structs as well. I wonder what other data structures it can swap.
std::sort() allows you to give it a predicate so you can perform comparisons however you want, eg:
std::sort(
    stud.begin(),
    stud.begin() + n, // <-- use stud.end() instead if n == stud.size() ...
    [](const scores &a, const scores &b){ return a.tot < b.tot; }
);
Simply use return b.tot < a.tot to reverse the sorting order.
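If you then need Thomas' rank, one hypothetical follow-up (not part of the answer, just an illustration using the tommyVal flag from your struct, and requiring <algorithm>) is:
auto thomas = std::find_if(stud.begin(), stud.begin() + n,
                           [](const scores &s){ return s.tommyVal; });
int rank = static_cast<int>(thomas - stud.begin()) + 1; // 1-based rank after the descending sort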
Given a sequence of data (it may have duplicates) and a fixed-size moving window, move the window at each iteration from the start of the data sequence, such that
(1) the oldest data element is removed from the window and a new data element is pushed into the window,
(2) the median of the data inside the window is found at each move.
The following posts are not helpful.
Effectively to find the median value of a random sequence
joining data based on a moving time window in R
My idea:
Use two heaps to hold the median. Inside the window, sort the data in the first iteration; the min-heap holds the larger half and the max-heap holds the smaller half. If the window has an odd number of elements, the max-heap returns the median; otherwise the arithmetic mean of the top elements of the two heaps is the median.
When a new data element is pushed into the window, remove the oldest element from one of the heaps and compare the new element with the tops of the max-heap and min-heap to decide which heap it should go into. Then find the median just like in the first iteration.
But how to find a data element in a heap is a problem. A heap is a binary tree, not a binary search tree.
Is it possible to solve it in O(n) or O(n * lg m) time, where m is the window size, and in O(1) space?
Any help is really appreciated.
Thanks
O(n*lg m) is easy:
Just maintain your window as two std::sets (std::multisets if duplicates must be kept), one for the lower half, one for the upper half. Insertion of a new element costs O(lg m), finding and removal of an old element costs the same. Determining the median using the method you described in your question costs O(1).
As you slide the window over your sequence, in each iteration you remove the item falling out of the window (O(lg m)), insert the new item (O(lg m)) and compute the median (O(1)), resulting in a total of O(n lg m).
This solution uses O(m) space, of course, but I don't think you can get away without storing the window's contents.
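A minimal sketch of that two-container window, written with std::multiset so duplicate values survive (the struct and member names are illustrative, not from the answer):
#include <cassert>
#include <iterator>
#include <set>

struct window_median {
    std::multiset<double> low, high; // low holds the smaller half (plus any extra element)

    void rebalance() {
        while (low.size() > high.size() + 1) {
            high.insert(*low.rbegin());
            low.erase(std::prev(low.end()));
        }
        while (high.size() > low.size()) {
            low.insert(*high.begin());
            high.erase(high.begin());
        }
    }
    void insert(double x) {                        // O(lg m)
        if (low.empty() || x <= *low.rbegin()) low.insert(x);
        else                                   high.insert(x);
        rebalance();
    }
    void erase(double x) {                         // O(lg m), x must be in the window
        auto it = low.find(x);
        if (it != low.end()) low.erase(it);
        else                 high.erase(high.find(x));
        rebalance();
    }
    double median() const {                        // O(1)
        assert(!low.empty());
        if (low.size() > high.size()) return *low.rbegin();
        return (*low.rbegin() + *high.begin()) / 2.0;
    }
};
Sliding the window is then erase(oldest); insert(newest); median(); per step.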
I have implemented almost exactly the algorithm you describe here: http://ideone.com/8VVEa, and described it here: Rolling median in C - Turlach implementation
The way to get around the "find oldest" problem is to keep the values in a circular buffer, so you always have a pointer to the oldest. What you store in the heap are buffer indexes.
So the space requirement is 2M, and each update is O(lg M).
Same answer as hc_, but instead of using a stock BST, use a version where every node stores the count of elements in its sub-tree. This way finding the median is O(log(m)).
I gave this answer for the "rolling median in C" question.
I couldn't find a modern implementation of a C++ data structure with order statistics, so I ended up implementing both ideas in the top coders link (Match Editorial: scroll down to FloatingMedian).
Two multisets
The first idea partitions the data into two data structures (heaps, multisets, etc.) with O(ln N) per insert/delete, but does not allow the quantile to be changed dynamically without a large cost. I.e. we can have a rolling median, or a rolling 75th percentile, but not both at the same time.
Segment tree
The second idea uses a segment tree which is O(ln N) for insert/deletions/queries but is more flexible. Best of all the "N" is the size of your data range. So if your rolling median has a window of a million items, but your data varies from 1..65536, then only 16 operations are required per movement of the rolling window of 1 million!! (And you only need 65536 * sizeof(counting_type) bytes, e.g. 65536*4).
GNU Order Statistic Trees
Just before giving up, I found that libstdc++ contains order statistic trees!!!
These have two critical operations:
iter = tree.find_by_order(k)       // iterator to the k-th smallest element (0-based)
order = tree.order_of_key(value)   // number of elements strictly less than value
See libstdc++ manual policy_based_data_structures_test (search for "split and join").
I have wrapped the tree for use in a convenience header for compilers supporting c++0x/c++11 style partial typedefs:
#if !defined(GNU_ORDER_STATISTIC_SET_H)
#define GNU_ORDER_STATISTIC_SET_H
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
// A red-black tree table storing ints and their order
// statistics. Note that since the tree uses
// tree_order_statistics_node_update as its update policy, then it
// includes its methods by_order and order_of_key.
template <typename T>
using t_order_statistic_set = __gnu_pbds::tree<
T,
__gnu_pbds::null_type,
std::less<T>,
__gnu_pbds::rb_tree_tag,
// This policy updates nodes' metadata for order statistics.
__gnu_pbds::tree_order_statistics_node_update>;
#endif //GNU_ORDER_STATISTIC_SET_H
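Hypothetical usage of that wrapper (the include name is just whatever you saved the snippet above as; this only builds with GCC's pb_ds extensions):
#include "gnu_order_statistic_set.h" // the header above, name is illustrative
#include <initializer_list>
#include <iostream>

int main()
{
    t_order_statistic_set<int> s;
    for (int x : {5, 1, 4, 2, 3}) s.insert(x);

    // k-th smallest (0-based): the median of 5 elements has order 2
    std::cout << *s.find_by_order(2) << '\n';  // prints 3

    // number of elements strictly less than 4
    std::cout << s.order_of_key(4) << '\n';    // prints 3
}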
I attach my segment tree (see my other post) which allows the frequency distribution of counts to be queried very efficiently.
This implements the following data structure:
|-------------------------------|
|---------------|---------------|
|-------|-------|-------|-------|
|---|---|---|---|---|---|---|---|
  0   1   2   3   4   5   6   7
Each segment keeps the count of items in the range it covers.
I use 2N segments for a range of values from 1..N.
These are placed in a single rolled-out vector rather than the tree format shown figuratively above.
So if you are calculating rolling medians over a set of integers which vary from 1..65536, then you only need 128kb to store them, and can insert/delete/query using O(ln N) where N = the size of the range, i.e. 2**16 operations.
This is a big win if the data range is much smaller than your rolling window.
#if !defined(SEGMENT_TREE_H)
#define SEGMENT_TREE_H
#include <cassert>
#include <array>
#include <algorithm>
#include <set>
#ifndef NDEBUG
#include <set>
#endif
template<typename COUNTS, unsigned BITS>
class t_segment_tree
{
static const unsigned cnt_elements = (1 << BITS);
static const unsigned cnt_storage = cnt_elements << 1;
std::array<COUNTS, cnt_elements * 2 - 1> counts;
unsigned count;
#ifndef NDEBUG
std::multiset<unsigned> elements;
#endif
public:
//____________________________________________________________________________________
// constructor
//____________________________________________________________________________________
t_segment_tree(): count(0)
{
std::fill_n(counts.begin(), counts.size(), 0);
}
//~t_segment_tree();
//____________________________________________________________________________________
// size
//____________________________________________________________________________________
unsigned size() const { return count; }
//____________________________________________________________________________________
// constructor
//____________________________________________________________________________________
void insert(unsigned x)
{
#ifndef NDEBUG
elements.insert(x);
assert("...............This element is too large for the number of BITs!!..............." && cnt_elements > x);
#endif
unsigned ii = x + cnt_elements;
while (ii)
{
++counts[ii - 1];
ii >>= 1;
}
++count;
}
//____________________________________________________________________________________
// erase
// assumes erase is in the set
//____________________________________________________________________________________
void erase(unsigned x)
{
#ifndef NDEBUG
// if the assertion failed here, it means that x was never "insert"-ed in the first place
assert("...............This element was not 'insert'-ed before it is being 'erase'-ed!!..............." && elements.count(x));
elements.erase(elements.find(x));
#endif
unsigned ii = x + cnt_elements;
while (ii)
{
--counts[ii - 1];
ii >>= 1;
}
--count;
}
//
//____________________________________________________________________________________
// kth element
//____________________________________________________________________________________
unsigned operator[](unsigned k)
{
assert("...............The kth element: k needs to be smaller than the number of elements!!..............." && k < size());
unsigned ii = 1;
while (ii < cnt_storage)
{
if (counts[ii - 1] <= k)
k -= counts[ii++ - 1];
ii <<= 1;
}
return (ii >> 1) - cnt_elements;
}
};
#endif
I have a list of randomly ordered unique closed-end ranges R_0 ... R_(n-1) where
R_i = [r1_i, r2_i] (r1_i <= r2_i)
Subsequently some of the ranges overlap (partially or completely) and hence require merging.
My question is, what are the best-of-breed algorithms or techniques used for merging such ranges. Examples of such algorithms or links to libraries that perform such a merging operation would be great.
What you need to do is:
Sort items lexicographically where range key is [r_start,r_end]
Iterate the sorted list and check if the current item overlaps with the next. If it does, extend the current item to cover [r[i].start, max(r[i].end, r[i+1].end)] and go to the next item. If it doesn't overlap, add the current item to the result list and move to the next item.
Here is sample code:
vector<pair<int, int> > ranges;
vector<pair<int, int> > result;
sort(ranges.begin(), ranges.end());
vector<pair<int, int> >::iterator it = ranges.begin();
pair<int,int> current = *(it)++;
while (it != ranges.end()) {
    if (current.second > it->first) { // you might want to change it to >=
        current.second = std::max(current.second, it->second);
    } else {
        result.push_back(current);
        current = *(it);
    }
    it++;
}
result.push_back(current);
Boost.Icl might be of use for you.
The library offers a few templates that you may use in your situation:
interval_set — Implements a set as a set of intervals - merging adjoining intervals.
separate_interval_set — Implements a set as a set of intervals - leaving adjoining intervals separate
split_interval_set — implements a set as a set of intervals - on insertion overlapping intervals are split
There is an example of merging intervals with the library:
interval<Time>::type night_and_day(Time(monday, 20,00), Time(tuesday, 20,00));
interval<Time>::type day_and_night(Time(tuesday, 7,00), Time(wednesday, 7,00));
interval<Time>::type next_morning(Time(wednesday, 7,00), Time(wednesday,10,00));
interval<Time>::type next_evening(Time(wednesday,18,00), Time(wednesday,21,00));
// An interval set of type interval_set joins intervals that overlap or touch each other.
interval_set<Time> joinedTimes;
joinedTimes.insert(night_and_day);
joinedTimes.insert(day_and_night); //overlapping in 'day' [07:00, 20.00)
joinedTimes.insert(next_morning); //touching
joinedTimes.insert(next_evening); //disjoint
cout << "Joined times :" << joinedTimes << endl;
and the output of this algorithm:
Joined times :[mon:20:00,wed:10:00)[wed:18:00,wed:21:00)
And here is the documentation about the complexity of their algorithms:
Time Complexity of Addition
A simple algorithm would be:
Sort the ranges by starting values
Iterate over the ranges from beginning to end, and whenever you find a range that overlaps with the next one, merge them
O(n*log(n)+2n):
Make a mapping of r1_i -> r2_i,
QuickSort upon the r1_i's,
go through the list to select for each r1_i-value the largest r2_i-value,
with that r2_i-value you can skip over all subsequent r1_i's that are smaller than r2_i
jethro's answer contains an error.
It should be
if (current.second > it->first){
current.second = std::max(current.second, it->second);
} else {
My algorithm does not use extra space and is lightweight as well. I have used a 2-pointer approach: 'i' keeps increasing while 'j' keeps track of the current element being updated.
Here is my code:
bool cmp(Interval a, Interval b)
{
    // use a strict comparison: '<=' would violate std::sort's
    // strict-weak-ordering requirement
    return a.start < b.start;
}

vector<Interval> Solution::insert(vector<Interval> &intervals, Interval newInterval)
{
    int i, j;
    sort(intervals.begin(), intervals.end(), cmp);
    i = 1, j = 0;
    while (i < intervals.size())
    {
        if (intervals[j].end >= intervals[i].start) // if overlaps
        {
            intervals[j].end = max(intervals[i].end, intervals[j].end); // extend the merged interval
        }
        else
        {
            j++;
            intervals[j] = intervals[i]; // update it in the same list
        }
        i++;
    }
    intervals.erase(intervals.begin() + j + 1, intervals.end());
    return intervals;
}
Interval can be a public class or structure with data members 'start' and 'end'.
Happy coding :)
I know that this is a long time after the original accepted answer, but in C++11 we can now construct a priority_queue in the following manner:
priority_queue( const Compare& compare, const Container& cont )
in O(n) comparisons.
Please see https://en.cppreference.com/w/cpp/container/priority_queue/priority_queue
for more details.
So we can create a priority_queue (min-heap) of pairs in O(n) time, get the lowest interval in O(1), and pop it in O(log(n)) time.
So the overall time complexity is close to O(n log(n) + 2n) = O(n log n).
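A minimal sketch of that constructor in use (the pair contents are illustrative, and std::greater<> needs C++14):
#include <functional>
#include <iostream>
#include <queue>
#include <utility>
#include <vector>

int main()
{
    // intervals as (start, end) pairs
    std::vector<std::pair<int, int>> v{{5, 7}, {1, 3}, {2, 6}};

    // min-heap on the pairs, heapified in O(n) comparisons by the
    // priority_queue(compare, container) constructor
    std::priority_queue<std::pair<int, int>,
                        std::vector<std::pair<int, int>>,
                        std::greater<>> pq(std::greater<>{}, v);

    std::cout << pq.top().first << '\n'; // 1: lowest interval start, O(1)
    pq.pop();                            // O(log n)
}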
This is not homework.
I'm using a small "priority queue" (implemented as an array at the moment) for storing the last N items with the smallest values. This is a bit slow - O(N) item insertion time. The current implementation keeps track of the largest item in the array and discards any items that wouldn't fit, but I would still like to reduce the number of operations further.
I'm looking for a priority queue algorithm that matches the following requirements:
The queue can be implemented as an array, which has a fixed size and _cannot_ grow. Dynamic memory allocation during any queue operation is strictly forbidden.
Anything that doesn't fit into the array is discarded, but the queue keeps the smallest elements ever encountered.
O(log(N)) insertion time (i.e. adding an element into the queue should take up to O(log(N))).
(optional) O(1) access for the *largest* item in the queue (the queue stores the *smallest* items, so the largest item will be discarded first and I'll need it to reduce the number of operations)
Easy to implement/understand. Ideally - something similar to binary search - once you understand it, you remember it forever.
Elements need not be sorted in any way. I just need to keep the N smallest values ever encountered. When I need them, I'll access all of them at once. So technically it doesn't have to be a queue, I just need the N smallest values to be stored.
I initially thought about using binary heaps (they can be easily implemented via arrays), but apparently they don't behave well when the array can't grow anymore. Linked lists and arrays will require extra time for moving things around. The STL priority queue grows and uses dynamic allocation (I may be wrong about it, though).
So, any other ideas?
--EDIT--
I'm not interested in the STL implementation. The STL implementation (suggested by a few people) works a bit slower than the currently used linear array due to the high number of function calls.
I'm interested in priority queue algorithms, not implementations.
Array-based heaps seem ideal for your purpose. I am not sure why you rejected them.
You use a max-heap.
Say you have an N element heap (implemented as an array) which contains the N smallest elements seen so far.
When an element comes in you check against the max (O(1) time), and reject if it is greater.
If the value coming in is lower, you modify the root to be the new value and sift-down this changed value - worst case O(log N) time.
The sift-down process is simple: starting at the root, at each step you exchange this value with its larger child until the max-heap property is restored.
So, you will not have to do any deletes which you probably will have to, if you use std::priority_queue. Depending on the implementation of std::priority_queue, this could cause memory allocation/deallocation.
So you can have the code as follows:
Allocated Array of size N.
Fill it up with the first N elements you see.
heapify (you should find this in standard text books, it uses sift-down). This is O(N).
Now any new element you get, you either reject it in O(1) time or insert by sifting-down in worst case O(logN) time.
On average, though, you probably will not have to sift the new value all the way down and might get better than O(log n) average insert time (though I haven't tried proving it).
You only allocate size N array once and any insertion is done by exchanging elements of the array, so there is no dynamic memory allocation after that.
Check out the wiki page which has pseudo code for heapify and sift-down: http://en.wikipedia.org/wiki/Heapsort
Use std::priority_queue with the largest item at the head. For each new item, discard it if it is >= the head item, otherwise pop the head item and insert the new item.
Side note: Standard containers will only grow if you make them grow. As long as you remove one item before inserting a new item (after it reaches its maximum size, of course), this won't happen.
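A small sketch of that loop, keeping the n smallest values seen so far (names are illustrative):
#include <cstddef>
#include <initializer_list>
#include <iostream>
#include <queue>

// Offer x to a max-heap that never holds more than n values.
void keep_n_smallest(std::priority_queue<int>& pq, std::size_t n, int x)
{
    if (pq.size() < n) {
        pq.push(x);
    } else if (x < pq.top()) { // new item beats the current largest kept value
        pq.pop();              // drop the largest of the kept set
        pq.push(x);
    }                          // otherwise discard x
}

int main()
{
    std::priority_queue<int> pq;
    for (int x : {9, 4, 7, 1, 8, 3, 6}) keep_n_smallest(pq, 3, x);
    while (!pq.empty()) { std::cout << pq.top() << ' '; pq.pop(); } // 4 3 1
}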
Most priority queues I work with are based on linked lists. If you have a pre-determined number of priority levels, you can easily create a priority queue with O(1) insertion by having an array of linked lists--one linked list per priority level. Items of the same priority will of course degenerate into a FIFO, but that can be considered acceptable.
Adding and removal then becomes something like (your API may vary) ...
listItemAdd (&list[priLevel], &item); /* Add to tail */
pItem = listItemRemove (&list[priLevel]); /* Remove from head */
Getting the first item in the queue then becomes a problem of finding the non-empty linked-list with the highest priority. That may be O(N), but there are several tricks you can use to speed it up.
In your priority queue structure, keep a pointer or index or something to the linked list with the current highest priority. This would need to be updated each time an item is added or removed from the priority queue.
Use a bitmap to indicate which linked lists are not empty. Combined with a find most significant bit, or find least significant bit algorithm you can usually test up to 32 lists at once. Again, this would need to be updated on each add / remove.
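A rough sketch combining both tricks (all names are hypothetical; __builtin_ctz is the GCC/Clang find-first-set intrinsic, and level 0 is taken to be the highest priority):
#include <array>
#include <cstdint>
#include <deque>

template <typename T, unsigned Levels = 32>
struct leveled_queue {
    std::array<std::deque<T>, Levels> lists; // one FIFO per priority level
    std::uint32_t nonEmpty = 0;              // bit i set => lists[i] is not empty

    void push(unsigned level, const T& item) {    // O(1)
        lists[level].push_back(item);
        nonEmpty |= (1u << level);
    }
    T pop() {                                     // O(1); assumes the queue is not empty
        unsigned level = __builtin_ctz(nonEmpty); // lowest set bit = highest priority
        T item = lists[level].front();
        lists[level].pop_front();
        if (lists[level].empty()) nonEmpty &= ~(1u << level);
        return item;
    }
};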
Hope this helps.
If the number of priorities is small and fixed, then you can use a ring buffer for each priority. That will waste space if the objects are big, but if their size is comparable to a pointer/index, then variants that store additional pointers in the objects may increase the size of the array in the same way.
Or you can use a simple singly-linked list inside the array and store 2*M+1 pointers/indexes: one points to the first free node and the other pairs point to the head and tail of each priority. In that case you'll have to do on average O(M) comparisons before taking out the next node in O(1). Insertion will take O(1).
If you construct an STL priority queue at the maximum size (perhaps from a vector initialized with placeholders), and then check the size before inserting (removing an item if necessary beforehand) you'll never have dynamic allocation during insert operations. The STL implementation is quite efficient.
Matters Computational, see page 158. The implementation itself is quite good, and you can even tweak it a little without making it less readable. For example, with 1-based indexing, when you compute the left child like:
int left = 2 * i;
you can compute the right child like so:
int right = left + 1;
Found a solution ("difference" means "priority" in the code, and maxRememberedResults is 255, but it could be any 2^n - 1):
template <typename T> inline void swap(T& a, T& b){
    T c = a;
    a = b;
    b = c;
}

struct MinDifferenceArray{
    enum{ maxSize = maxRememberedResults };
    int size;
    DifferenceData data[maxSize];

    void add(const DifferenceData& val){
        if (size >= maxSize){
            if (data[0].difference <= val.difference)
                return;
            data[0] = val;
            for (int i = 0; (2*i + 1) < maxSize; ){
                int next = 2*i + 1;
                if (data[next].difference < data[next+1].difference)
                    next++;
                if (data[i].difference < data[next].difference)
                    swap(data[i], data[next]);
                else
                    break;
                i = next;
            }
        }
        else{
            data[size++] = val;
            for (int i = size - 1; i > 0;){
                int parent = (i-1)/2;
                if (data[parent].difference < data[i].difference){
                    swap(data[parent], data[i]);
                    i = parent;
                }
                else
                    break;
            }
        }
    }

    void clear(){
        size = 0;
    }

    MinDifferenceArray()
        : size(0){
    }
};
Build a max-based queue (root is largest).
Until it is full, fill it up normally.
When it is full, for every new element:
Check if the new element is smaller than the root.
If it is larger than or equal to the root, reject it.
Otherwise, replace the root with the new element and perform a normal heap "sift-down".
And we get O(log(N)) insert as a worst case scenario.
It is the same solution as the one provided by the user with the nickname "Moron".
Thanks to everyone for the replies.
P.S. Apparently programming without sleeping enough wasn't a good idea.
It's better to implement your own class using std::array and heap algorithms.
template<class T, int fixed_size = 5>
class fixed_size_arr_pqueue_v2
{
std::array<T, fixed_size> _data;
int _size = 0;
int parent(int i)
{
return (i - 1)/2;
}
void heapify(int i, bool downward = false)
{
int l = 2*i + 1;
int r = 2*i + 2;
int largest = 0;
if (l < size() && _data[l] > _data[i])
largest = l;
else
largest = i;
if (r < size() && _data[r] > _data[largest])
largest = r;
if (largest != i)
{
std::swap(_data[largest], _data[i]);
if (!downward)
heapify(parent(i));
else
heapify(largest, true);
}
}
public:
void push(T &d)
{
if (_size == fixed_size)
{
//min elements in a max heap lies at leaves only.
auto minItr = std::min_element(begin(_data) + _size/2, end(_data));
auto minPos {minItr - _data.begin()};
auto min { *minItr};
if (d > min)
{
_data.at(minPos) = d;
if (_data[parent(minPos)] > d)
{
//this is unlikely to happen in our case? as this position is a leaf?
heapify(minPos, true);
}
else
heapify(parent(minPos));
}
return ;
}
_data.at(_size++) = d;
std::push_heap(_data.begin(), _data.begin() + _size);
}
T pop()
{
T d = _data.front();
std::pop_heap(_data.begin(), _data.begin() + _size);
_size--;
return d;
}
T top()
{
return _data.front();
}
int size() const
{
return _size;
}
};