Merging Ranges In C++

I have a list of randomly ordered, unique closed ranges R0...Rn-1 where
Ri = [r1i, r2i] (r1i <= r2i)
Some of the ranges overlap (partially or completely) and hence require merging.
My question is: what are the best-of-breed algorithms or techniques for merging such ranges? Examples of such algorithms, or links to libraries that perform such a merging operation, would be great.

What you need to do is:
Sort the items lexicographically, where the range key is [r_start, r_end].
Iterate over the sorted list and check whether the current item overlaps with the next one. If it does, extend the current item to [r[i].start, max(r[i].end, r[i+1].end)] and go to the next item. If it doesn't overlap, add the current item to the result list and move on to the next item.
Here is sample code:
vector<pair<int, int> > ranges; // input, assumed to be non-empty
vector<pair<int, int> > result;
sort(ranges.begin(), ranges.end()); // lexicographic: by start, then by end
vector<pair<int, int> >::iterator it = ranges.begin();
pair<int,int> current = *(it)++;
while (it != ranges.end()){
    if (current.second > it->first){ // you might want to change it to >=
        current.second = std::max(current.second, it->second);
    } else {
        result.push_back(current); // current cannot grow any further
        current = *(it);
    }
    ++it;
}
result.push_back(current); // do not forget the last range

Boost.Icl might be of use for you.
The library offers a few templates that you may use in your situation:
interval_set — Implements a set as a set of intervals - merging adjoining intervals.
separate_interval_set — Implements a set as a set of intervals - leaving adjoining intervals separate
split_interval_set — implements a set as a set of intervals - on insertion overlapping intervals are split
There is an example for merging intervals with the library :
interval<Time>::type night_and_day(Time(monday, 20,00), Time(tuesday, 20,00));
interval<Time>::type day_and_night(Time(tuesday, 7,00), Time(wednesday, 7,00));
interval<Time>::type next_morning(Time(wednesday, 7,00), Time(wednesday,10,00));
interval<Time>::type next_evening(Time(wednesday,18,00), Time(wednesday,21,00));
// An interval set of type interval_set joins intervals that overlap or touch each other.
interval_set<Time> joinedTimes;
joinedTimes.insert(night_and_day);
joinedTimes.insert(day_and_night); //overlapping in 'day' [07:00, 20.00)
joinedTimes.insert(next_morning); //touching
joinedTimes.insert(next_evening); //disjoint
cout << "Joined times :" << joinedTimes << endl;
and the output of this algorithm:
Joined times :[mon:20:00,wed:10:00)[wed:18:00,wed:21:00)
And here about complexity of their algorithms:
Time Complexity of Addition

A simple algorithm would be:
Sort the ranges by starting values
Iterate over the ranges from beginning to end, and whenever you find a range that overlaps with the next one, merge them
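A minimal sketch of those two steps, assuming the ranges are closed and stored as std::pair<int, int>. The names Range and mergeRanges are made up for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Range = std::pair<int, int>;  // closed range [first, second]

// Returns the merged ranges in ascending order. The input is taken by value
// because it has to be sorted first.
std::vector<Range> mergeRanges(std::vector<Range> ranges) {
    std::vector<Range> merged;
    if (ranges.empty()) return merged;

    std::sort(ranges.begin(), ranges.end());  // by start, then by end

    merged.push_back(ranges.front());
    for (std::size_t i = 1; i < ranges.size(); ++i) {
        Range& last = merged.back();
        if (ranges[i].first <= last.second) {
            // Overlaps (or shares an endpoint with) the range being built.
            last.second = std::max(last.second, ranges[i].second);
        } else {
            merged.push_back(ranges[i]);
        }
    }
    return merged;
}
```

Sorting dominates the cost, so this runs in O(n log n). Whether ranges that only share an endpoint should be merged is a design choice; change <= to < if they should stay separate.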

Make a mapping of r1_i -> r2_i,
QuickSort upon the r1_i's,
go through the list to select for each r1_i-value the largest r2_i-value,
with that r2_i-value you can skip over all subsequent r1_i's that are smaller than r2_i

jethro's answer contains an error.
It should be
if (current.second > it->first){
current.second = std::max(current.second, it->second);
} else {

My algorithm does not use extra space and is lightweight as well. I have used 2-pointer approach. 'i' keeps increasing while 'j' keeps track of the current element being updated.
Here is my code:
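// strict '<': std::sort requires a strict weak ordering, so '<=' is not allowed
bool cmp(const Interval& a, const Interval& b) { return a.start < b.start; }
vector<Interval> Solution::insert(vector<Interval> &intervals, Interval newInterval) {
    intervals.push_back(newInterval);
    sort(intervals.begin(), intervals.end(), cmp);
    size_t j = 0; // index of the interval currently being extended
    for (size_t i = 1; i < intervals.size(); ++i) {
        if(intervals[j].end>=intervals[i].start) //if overlaps
            intervals[j].end=max(intervals[i].end,intervals[j].end); //change
        else
            intervals[++j]=intervals[i]; //update it on the same list
    }
    intervals.erase(intervals.begin() + j + 1, intervals.end()); // drop the leftovers
    return intervals;
}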
Interval can be a public class or structure with data members 'start' and 'end'.
Happy coding :)

I know that this is a long time after the original accepted answer, but in
C++11 we can now construct a priority_queue in the following manner:
priority_queue( const Compare& compare, const Container& cont )
in O(n) comparisons.
So we can create a priority_queue (min-heap) of pairs in O(n) time, get the lowest interval in O(1), and pop it in O(log(n)) time.
So the overall time complexity is close to O(n log(n) + 2n) = O(n log n).
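As a rough sketch of how that could look, assuming closed ranges stored as std::pair<int, int>. The names Range and mergeWithHeap are hypothetical and not part of any library:

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Range = std::pair<int, int>;  // closed range [first, second]

std::vector<Range> mergeWithHeap(const std::vector<Range>& ranges) {
    // Constructing from a comparator and a container heapifies a copy of the
    // container in O(n); std::greater turns the queue into a min-heap.
    std::greater<Range> smallestFirst;
    std::priority_queue<Range, std::vector<Range>, std::greater<Range>>
        heap(smallestFirst, ranges);

    std::vector<Range> merged;
    while (!heap.empty()) {
        Range next = heap.top();  // O(1): range with the smallest start
        heap.pop();               // O(log n)
        if (!merged.empty() && next.first <= merged.back().second)
            merged.back().second = std::max(merged.back().second, next.second);
        else
            merged.push_back(next);
    }
    return merged;
}
```

The asymptotic cost is the same as sorting first. The heap mainly pays off if you only need the first few merged ranges and can stop early.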


QHashIterator in c++

I developed a game in C++, and want to make sure everything is properly done.
Is it a good solution to use a QHashIterator to check which item in the list has the lowest value (F-cost for pathfinding)?
Snippet from my code:
while(!pathFound){ //do while path is found
QHashIterator<int, PathFinding*> iterator(openList);
PathFinding* parent;;
parent = iterator.value();
while(iterator.hasNext()){ //we take the next tile, and we take the one with the lowest value;
//checking lowest f value
if((iterator.value()->getGcost() + iterator.value()->getHcost()) < (parent->getGcost() + parent->getHcost())){
parent = iterator.value();
if(!atDestionation(parent,endPoint)){ //here we check if we are at the destionation. if we are we return our pathcost.
pathFound = true;
parent = parent->getParent();
pathcost = calculatePathCost(mylist); //we calculate what the pathcost is and return it
If not, are there better approaches?
I also found something about std::priority_queue. Is it much better than a QHashIterator?
It may not be a problem for game worlds that are not big, but I'm looking for a suitable solution when the game worlds are big (say, 10,000+ calculations). Any remarks?
Here you basically scan the whole map to find the element that is the minimum one according to some values:
while(iterator.hasNext()){ //we take the next tile, and we take the one with the lowest value;
//checking lowest f value
if((iterator.value()->getGcost() + iterator.value()->getHcost()) < (parent->getGcost() + parent->getHcost())){
parent = iterator.value();
All this code, if you had an STL container, for instance a std::map, could be reduced to:
// parent is an iterator to the entry with the lowest F-cost
auto parent = std::min_element(openList.begin(), openList.end(), [](const auto& lhs, const auto& rhs)
{ return (lhs.second->getGcost() + lhs.second->getHcost()) < (rhs.second->getGcost() + rhs.second->getHcost()); });
Once you have something easier to understand you can play around with different containers, for instance it might be faster to hold a sorted vector in this case.
Your code does not present any obvious problems per se. Performance gains often come not from optimizing little loops but from how your code is organized. For instance, I see that you have a lot of indirections, and those cost a lot in cache misses. Or, if you always have to find the minimum element, you could cache it in another structure and have it available in constant time.
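Regarding std::priority_queue from the question: here is a small sketch of an open list ordered by F-cost. Node is a hypothetical stand-in for the PathFinding tiles, and HigherF, OpenList and popCheapest are names invented for this example:

```cpp
#include <queue>
#include <vector>

// Hypothetical stand-in for the question's PathFinding tiles.
struct Node {
    int id;
    int g;  // cost from the start tile
    int h;  // heuristic estimate to the goal
    int f() const { return g + h; }
};

// priority_queue keeps the "largest" element on top, so the comparison is
// inverted to get the node with the lowest F-cost first.
struct HigherF {
    bool operator()(const Node& a, const Node& b) const { return a.f() > b.f(); }
};

using OpenList = std::priority_queue<Node, std::vector<Node>, HigherF>;

// Removes and returns the cheapest node; the list must not be empty.
Node popCheapest(OpenList& open) {
    Node best = open.top();  // O(1)
    open.pop();              // O(log n)
    return best;
}
```

Compared with scanning the whole hash on every step (O(n) per lookup), each push and pop costs O(log n). One caveat: std::priority_queue cannot update the cost of a node that is already queued, so a common workaround is to push the node again and skip stale entries when they are popped.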

Need suggestion to improve speed for word break (dynamic programming)

The problem is: Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.
For example, given
s = "hithere",
dict = ["hi", "there"].
Return true because "hithere" can be segmented as "hi there".
My implementation is as below. This code is ok for normal cases. However, it suffers a lot for input like:
s = "aaaaaaaaaaaaaaaaaaaaaaab", dict = {"aa", "aaaaaa", "aaaaaaaa"}.
I want to memoize the processed substrings, but I cannot get it right. Any suggestions on how to improve it? Thanks a lot!
class Solution {
public:
    bool wordBreak(string s, unordered_set<string>& wordDict) {
        int len = s.size();
        if(len<1) return true;
        for(int i(0); i<len; i++) {
            string tmp = s.substr(0, i+1);
            // the prefix must be a word and the rest must be breakable
            if( (wordDict.find(tmp)!=wordDict.end())
                && (wordBreak(s.substr(i+1), wordDict)) )
                return true;
        }
        return false;
    }
};
It's logically a two-step process. Find all dictionary words within the input, consider the found positions (begin/end pairs), and then see if those words cover the whole input.
So you'd get for your example
aa: {0,2}, {1,3}, {2,4}, ... {20,22}
aaaaaa: {0,6}, {1,7}, ... {16,22}
aaaaaaaa: {0,8}, {1,9} ... {14,22}
This is a graph, with nodes 0-23 and a bunch of edges. But node 23 (b) is entirely unreachable: it has no incoming edge. This is now a simple graph theory problem.
Finding all places where dictionary words occur is pretty easy if your dictionary is organized as a trie. But even a std::map is usable, thanks to its equal_range method. You have what appears to be an O(N*N) nested loop for begin and end positions, with O(log N) lookup of each word. But you can quickly determine if s.substr(begin,end) is still a viable prefix, and what dictionary words remain with that prefix.
Also note that you can build the graph lazily. Starting at begin=0 you find edges {0,2}, {0,6} and {0,8} (and no others). You can now search nodes 2, 6 and 8. You even have a good algorithm, A*, that suggests you try node 8 first (reachable in just 1 edge). Thus, you'll find nodes {8,10}, {8,14} and {8,16} etc. As you see, you'll never need to build the part of the graph that contains {1,3}, as it's simply unreachable.
Using graph theory, it's easy to see why your brute-force method breaks down. You arrive at node 8 (aaaaaaaa.aaaaaaaaaaaaaab) repeatedly, and each time search the subgraph from there on.
A further optimization is to run bidirectional A*. This would give you a very fast solution. At the second half of the first step, you look for edges leading to 23, b. As none exist, you immediately know that node {23} is isolated.
In your code, you are not using dynamic programming because you are not remembering the subproblems that you have already solved.
You can enable this remembering, for example, by storing the results based on the starting position of the string s within the original string, or even based on its length (because anyway the strings you are working with are suffixes of the original string, and therefore its length uniquely identifies it). Then, in the beginning of your wordBreak function, just check whether such length has already been processed and, if it has, do not rerun the computations, just return the stored value. Otherwise, run computations and store the result.
Note also that your approach with unordered_set will not allow you to obtain the fastest solution. The fastest solution that I can think of is O(N^2) by storing all the words in a trie (not in a map!) and following this trie as you walk along the given string. This achieves O(1) per loop iteration not counting the recursion call.
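A minimal sketch of the memoization described above, keyed by the start position of the suffix. The names canBreakFrom and wordBreakMemo are invented for this example, and it keeps the unordered_set lookup rather than a trie:

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// memo[start]: -1 = not computed yet, 0 = the suffix starting at `start`
// cannot be segmented, 1 = it can.
bool canBreakFrom(const std::string& s, std::size_t start,
                  const std::unordered_set<std::string>& dict,
                  std::vector<int>& memo) {
    if (start == s.size()) return true;              // empty suffix
    if (memo[start] != -1) return memo[start] == 1;  // already solved

    bool ok = false;
    for (std::size_t len = 1; start + len <= s.size() && !ok; ++len) {
        if (dict.count(s.substr(start, len)) != 0)
            ok = canBreakFrom(s, start + len, dict, memo);
    }
    memo[start] = ok ? 1 : 0;
    return ok;
}

bool wordBreakMemo(const std::string& s,
                   const std::unordered_set<std::string>& dict) {
    std::vector<int> memo(s.size(), -1);
    return canBreakFrom(s, 0, dict, memo);
}
```

Each start position is solved at most once, so the "aaaa...ab" input from the question no longer triggers exponential re-exploration. There are O(N) subproblems with O(N) substring lookups each.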
Thanks for all the comments. I changed my previous solution to the implementation below. At this point, I didn't explore to optimize on the dictionary, but those insights are very valuable and are very much appreciated.
For the current implementation, do you think it can be further improved? Thanks!
class Solution {
public:
    bool wordBreak(string s, unordered_set<string>& wordDict) {
        int len = s.size();
        if(len<1) return true;
        if(wordDict.size()==0) return false;
        vector<bool> dq (len+1,false); // dq[k]: s[0..k) can be segmented
        dq[0] = true;
        for(int i(0); i<len; i++) {// start point
            if(dq[i]) {
                for(int j(1); j<=len-i; j++) {// length of substring, 1:len
                    if(!dq[i+j]) {
                        auto pos = wordDict.find(s.substr(i, j));
                        dq[i+j] = dq[i+j] || (pos!=wordDict.end());
                    }
                }
            }
            if(dq[len]) return true;
        }
        return false;
    }
};
Try the following:
class Solution {
public:
    bool wordBreak(string s, unordered_set<string>& wordDict)
    {
        if (s.empty()) return true; // base case: nothing left to match
        for (const auto& w : wordDict)
        {
            auto pos = s.find(w);
            if (pos != string::npos)
            {
                if (wordBreak(s.substr(0, pos), wordDict) &&
                    wordBreak(s.substr(pos + w.size()), wordDict))
                    return true;
            }
        }
        return false;
    }
};
Essentially, once you find a match, remove the matching part from the input string and continue testing on the smaller inputs.

Is there any way of optimising this function?

This piece of code seems to be the worst offender in terms of time in my program. What my program is trying to do is find the minimum number of individual "nodes" required to satisfy a network with two constraints:
Each node must connect to x number of other nodes
Each node must have y degrees of separation between it and each of the nodes it's connected to.
However, for values of x greater than 600 this task takes a very long time. The task is on the order of exponential anyway, so I expect it to take forever at some point, but that also means that any small change made here could speed up the entire program by a lot.
uniint = unsigned long long int (64-bit)
network is a vector of the form vector<vector<uniint>>
The piece of code:
/* Checks if id2 is in id1's list of connections */
inline bool CheckIfInList (uniint id1, uniint id2)
{
    uniint id1size = network[id1].size();
    for (uniint itr = 0; itr < id1size; ++itr)
    {
        if (network[id1][itr] == id2)
            return true;
    }
    return false;
}
The only way is to sort the network[id1] array when you build it.
If you arrive here with a sorted array, you can easily find what you are looking for, if it exists, using a binary (dichotomic) search.
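A small sketch of that idea, assuming the network is passed in explicitly and every inner vector is kept sorted. The helper names checkIfInSortedList and addConnectionSorted are hypothetical:

```cpp
#include <algorithm>
#include <vector>

using uniint = unsigned long long;
using Network = std::vector<std::vector<uniint>>;

// O(log n) lookup; only valid if network[id1] is sorted in ascending order.
bool checkIfInSortedList(const Network& network, uniint id1, uniint id2) {
    const std::vector<uniint>& connections = network[id1];
    return std::binary_search(connections.begin(), connections.end(), id2);
}

// Inserts id2 into id1's list while keeping it sorted and free of duplicates.
void addConnectionSorted(Network& network, uniint id1, uniint id2) {
    std::vector<uniint>& connections = network[id1];
    auto pos = std::lower_bound(connections.begin(), connections.end(), id2);
    if (pos == connections.end() || *pos != id2)
        connections.insert(pos, id2);
}
```

The lookup drops from O(n) to O(log n), while each insertion still shifts elements (O(n)). That trade-off suits a workload with many more lookups than insertions.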
Use std::map or std::unordered_map for fast search. I guess it's impossible to micro-optimize this code. std::vector is cool, but not for searching 600 elements.
I'm guessing CheckIfInList() is called in a loop? Perhaps a vector is not the best choice, you could try vector<set<uniint>>. This will give you O(log n) for a look up of the inner collection instead of O(n)
For a quick micro-optimization, check whether your compiler optimizes the multiple calls to network[id1] away. If not, that is where you lose a lot of time, so remember the address:
vector<uniint>& connectedNodes = network[id1];
uniint id1size = connectedNodes.size();
for (uniint itr = 0; itr < id1size; ++itr)
if (connectedNodes[itr] == id2)
return true;
return false;
If your compiler already took care of that, I'm afraid that there's not much you can micro optimize about this method. The only real optimization can be achieved on the algorithmic level, starting with sorting the neighbour lists, moving on to using unordered_map<> instead of vector<>, and ending with asking yourself whether you can't somehow reduce the number of calls to CheckIfInList().
This is not as effective as HAL9000's suggestion, but it works when you have an unsorted list/array. You can ask fewer questions in each iteration if you put the value you are looking for at the end of the vector as a sentinel.
vector<uniint>& list = network[id1];
uniint id1size = list.size();
list.push_back(id2); // sentinel: guarantees that the loop terminates
uniint itr = 0;
while (list[itr] != id2) ++itr;
list.pop_back(); // restore the original list
return itr != id1size; // true if id2 was found before the sentinel
This way you don't need to check on every iteration whether you have reached the end of the list.

Time complexity issues with multimap

I created a program that finds the median of a list of numbers. The list of numbers is dynamic in that numbers can be removed and inserted (duplicate numbers can be entered) and during this time, the new median is re-evaluated and printed out.
I created this program using a multimap because
1) the benefit of it already being sorted,
2) easy insertion, deletion, searching (since multimap implements binary search)
3) duplicate entries are allowed.
The constraints for the number of entries + deletions (represented as N) are: 0 < N <= 100,000.
The program I wrote works and prints out the correct median, but it isn't fast enough. I know that unordered_multimap is faster than multimap, but the problem with unordered_multimap is that I would have to sort it, because to find the median you need a sorted list. So my question is: would it be practical to use an unordered_multimap and then quicksort the entries, or would that just be ridiculous? Would it be faster to just use a vector, quicksort the vector, and use a binary search? Or maybe I am forgetting some fabulous solution out there that I haven't even thought of.
Though I'm not new to C++, I will admit that my skills with time complexity are somewhat mediocre.
The more I look at my own question, the more I'm beginning to think that just using a vector with quicksort and binary search would be better since the data structures basically already implement vectors.
If you have only few updates - use unsorted std::vector + std::nth_element algorithm which is O(N). You don't need full sorting which is O(N*ln(N)).
live demo of nth_element:
#include <algorithm>
#include <iterator>
#include <iostream>
#include <ostream>
#include <vector>
using namespace std;
template<typename RandomAccessIterator>
RandomAccessIterator median(RandomAccessIterator first,RandomAccessIterator last)
{
    RandomAccessIterator m = first + distance(first,last)/2; // handle even middle if needed
    nth_element(first,m,last); // puts at m the element a full sort would put there
    return m;
}
int main()
{
    vector<int> values = {5,1,2,4,3};
    cout << *median(begin(values),end(values)) << endl;
}
Output is:
3
If you have many updates and only remove from the middle, use two heaps as comocomocomocomo suggests. If you use a fibonacci_heap, removing from an arbitrary position is also O(N) (if you don't have a handle to it).
If you have many updates and need O(ln(N)) removal from arbitrary places, use two multisets as ipc suggests.
If your purpose is to keep track of the median on the fly, as elements are inserted/removed, you should use a min-heap and a max-heap. Each one would contain one half of the elements... There was a related question a couple of days ago: How to implement a Median-heap
Though, if you need to search for specific values in order to remove elements, you still need some kind of map.
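A rough sketch of the two-heap idea for the insert-only case. The class name RunningMedian is hypothetical, and for an even number of elements it reports the lower of the two middle values, which is an assumption you may want to change:

```cpp
#include <functional>
#include <queue>
#include <vector>

class RunningMedian {
    std::priority_queue<int> lower;  // max-heap holding the smaller half
    std::priority_queue<int, std::vector<int>, std::greater<int>> upper;  // min-heap, larger half
public:
    // O(log N)
    void insert(int value) {
        if (lower.empty() || value <= lower.top())
            lower.push(value);
        else
            upper.push(value);
        // Rebalance: lower holds as many elements as upper, or exactly one more.
        if (lower.size() > upper.size() + 1) {
            upper.push(lower.top());
            lower.pop();
        } else if (upper.size() > lower.size()) {
            lower.push(upper.top());
            upper.pop();
        }
    }
    // O(1); requires at least one inserted element.
    int median() const { return lower.top(); }
};
```

As noted above, std::priority_queue offers no way to erase an arbitrary value, so this only covers workloads without removals.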
You said that it is slow. Are you iterating from the beginning of the map to the (N/2)'th element every time you need the median? You don't need to. You can keep track of the median by maintaining an iterator pointing to it at all times and a counter of the number of elements less than that one. Every time you insert/remove, compare the new/old element with the median and update both iterator and counter.
Another way of seeing it is as two multimaps containing half the elements each. One holds the elements less than the median (or equal) and the other holds those greater. The heaps do this more efficiently, but they don't support searches.
If you only need the median a few times you can use the "select" algorithm. It is described in Sedgewick's book. It takes O(n) time on average. It is similar to quick sort but it does not sort completely. It just partitions the array with random pivots until, eventually, it gets to "select" on one side the smaller m elements (m=(n+1)/2). Then you search for the greatest of those m elements, and this is the median.
Here is how you could implement that in O(log N) per update:
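template <typename T>
class median_set {
  std::multiset<T> below, above;
  // O(log N)
  void rebalance() {
    int diff = static_cast<int>(above.size()) - static_cast<int>(below.size());
    if (diff > 0) {
      below.insert(*above.begin());   // move the smallest of 'above' down
      above.erase(above.begin());
    } else if (diff < -1) {
      above.insert(*below.rbegin());  // move the largest of 'below' up
      below.erase(--below.end());
    }
  }
public:
  // O(1)
  bool empty() const { return below.empty() && above.empty(); }
  // O(1)
  T const& median() const { return *below.rbegin(); } // requires !empty()
  // O(log N)
  void insert(T const& value) {
    if (!empty() && value > median())
      above.insert(value);
    else
      below.insert(value);
    rebalance();
  }
  // O(log N); value must be present in the set
  void erase(T const& value) {
    if (value > median())
      above.erase(above.find(value));
    else
      below.erase(below.find(value));
    rebalance();
  }
};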
(Work in action with tests)
The idea is the following:
Keep track of the values above and below the median in two sets
If a new value is added, add it to the corresponding set. Always ensure that the set below has exactly 0 or 1 more elements than the other.
If a value is removed, remove it from the set and make sure that the condition still holds.
You can't use priority_queues because they won't let you remove one item.
Can anyone help me work out the space and time complexity of the following C# program, with details?
//Passing Integer array to Find Extreme from that Integer Array
public int extreme(int[] A)
{
    int N = A.Length;
    if (N == 0)
        return -1;
    int average = CalculateAverage(A);
    return FindExtremes(A, average);
}
// Calculate Average of integerArray
private int CalculateAverage(int[] integerArray)
{
    int sum = 0;
    foreach (int value in integerArray)
        sum += value;
    return Convert.ToInt32(sum / integerArray.Length);
}
//Find Extreme from that Integer Array
private int FindExtremes(int[] integerArray, int average) {
    int Index = -1; int ExtremeElement = integerArray[0];
    for (int i = 0; i < integerArray.Length; i++)
    {
        int absolute = Math.Abs(integerArray[i] - average);
        if (absolute > ExtremeElement)
        {
            ExtremeElement = integerArray[i];
            Index = i;
        }
    }
    return Index;
}
You are almost certainly better off using a vector. Possibly maintaining an auxiliary vector of indexes to be removed between median calculations so you can delete them in batches. New additions can also be put into an auxiliary vector, sorted, then merged in.

priority queue with limited space: looking for a good algorithm

This is not homework.
I'm using a small "priority queue" (implemented as an array at the moment) for storing the last N items with the smallest value. This is a bit slow: O(N) item insertion time. The current implementation keeps track of the largest item in the array and discards any items that wouldn't fit into the array, but I would still like to reduce the number of operations further.
I'm looking for a priority queue algorithm that matches the following requirements:
queue can be implemented as array, which has fixed size and _cannot_ grow. Dynamic memory allocation during any queue operation is strictly forbidden.
Anything that doesn't fit into array is discarded, but queue keeps all smallest elements ever encountered.
O(log(N)) insertion time (i.e. adding element into queue should take up to O(log(N))).
(optional) O(1) access for *largest* item in queue (queue stores *smallest* items, so the largest item will be discarded first and I'll need them to reduce number of operations)
Easy to implement/understand. Ideally - something similar to binary search - once you understand it, you remember it forever.
Elements need not be sorted in any way. I just need to keep the N smallest values ever encountered. When I need them, I'll access all of them at once. So technically it doesn't have to be a queue; I just need the N smallest values to be stored.
I initially thought about using binary heaps (they can be easily implemented via arrays), but apparently they don't behave well when array can't grow anymore. Linked lists and arrays will require extra time for moving things around. stl priority queue grows and uses dynamic allocation (I may be wrong about it, though).
So, any other ideas?
I'm not interested in STL implementation. STL implementation (suggested by a few people) works a bit slower than currently used linear array due to high number of function calls.
I'm interested in priority queue algorithms, not implementations.
Array based heaps seem ideal for your purpose. I am not sure why you rejected them.
You use a max-heap.
Say you have an N element heap (implemented as an array) which contains the N smallest elements seen so far.
When an element comes in you check against the max (O(1) time), and reject if it is greater.
If the value coming in is lower, you modify the root to be the new value and sift-down this changed value - worst case O(log N) time.
The sift-down process is simple: starting at the root, at each step you exchange this value with its larger child until the max-heap property is restored.
So, you will not have to do any deletes which you probably will have to, if you use std::priority_queue. Depending on the implementation of std::priority_queue, this could cause memory allocation/deallocation.
So you can have the code as follows:
Allocate an array of size N.
Fill it up with the first N elements you see.
heapify (you should find this in standard text books, it uses sift-down). This is O(N).
Now any new element you get, you either reject it in O(1) time or insert by sifting-down in worst case O(logN) time.
On an average, though, you probably will not have to sift-down the new value all the way down and might get better than O(logn) average insert time (though I haven't tried proving it).
You only allocate size N array once and any insertion is done by exchanging elements of the array, so there is no dynamic memory allocation after that.
Check out the wiki page which has pseudo code for heapify and sift-down:
Use std::priority_queue with the largest item at the head. For each new item, discard it if it is >= the head item, otherwise pop the head item and insert the new item.
Side note: Standard containers will only grow if you make them grow. As long as you remove one item before inserting a new item (after it reaches its maximum size, of course), this won't happen.
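A minimal sketch of that approach for int values. The class name SmallestN and its member names are made up for this example:

```cpp
#include <cstddef>
#include <queue>

// Keeps the `capacity` smallest values offered so far.
class SmallestN {
    std::size_t capacity_;
    std::priority_queue<int> heap_;  // max-heap: top() is the largest value kept
public:
    explicit SmallestN(std::size_t capacity) : capacity_(capacity) {}

    void offer(int value) {
        if (capacity_ == 0) return;
        if (heap_.size() < capacity_) {  // still filling up
            heap_.push(value);
            return;
        }
        if (value >= heap_.top()) return;  // not among the N smallest: discard
        heap_.pop();        // drop the current largest first...
        heap_.push(value);  // ...so the size never exceeds capacity_
    }

    int largestKept() const { return heap_.top(); }  // requires size() > 0
    std::size_t size() const { return heap_.size(); }
};
```

Note that the underlying vector may still allocate while the first N items are being pushed. Once the queue is full, every accepted item is a pop followed by a push, so the container does not need to grow.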
Most priority queues I work with are based on linked lists. If you have a pre-determined number of priority levels, you can easily create a priority queue with O(1) insertion by having an array of linked lists, one linked list per priority level. Items of the same priority will of course degenerate into a FIFO, but that can be considered acceptable.
Adding and removal then becomes something like (your API may vary) ...
listItemAdd (&list[priLevel], &item); /* Add to tail */
pItem = listItemRemove (&list[priLevel]); /* Remove from head */
Getting the first item in the queue then becomes a problem of finding the non-empty linked-list with the highest priority. That may be O(N), but there are several tricks you can use to speed it up.
In your priority queue structure, keep a pointer or index or something to the linked list with the current highest priority. This would need to be updated each time an item is added or removed from the priority queue.
Use a bitmap to indicate which linked lists are not empty. Combined with a find most significant bit, or find least significant bit algorithm you can usually test up to 32 lists at once. Again, this would need to be updated on each add / remove.
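The bitmap trick could be sketched like this, assuming at most 32 priority levels where a higher index means a higher priority. LevelBitmap and its members are hypothetical names, and the portable loop stands in for a hardware "find most significant bit" instruction:

```cpp
#include <cstdint>

// Bit i of `nonEmpty` is set while the list for priority level i holds items.
struct LevelBitmap {
    std::uint32_t nonEmpty = 0;

    // Call when the first item is added to a level (level must be < 32).
    void markNonEmpty(unsigned level) { nonEmpty |= (std::uint32_t{1} << level); }

    // Call when the last item is removed from a level (level must be < 32).
    void markEmpty(unsigned level) { nonEmpty &= ~(std::uint32_t{1} << level); }

    // Highest non-empty level, or -1 when every list is empty.
    int highest() const {
        int level = -1;
        for (std::uint32_t bits = nonEmpty; bits != 0; bits >>= 1) ++level;
        return level;
    }
};
```

With a compiler intrinsic or, in C++20, std::countl_zero from <bit>, highest() typically becomes a single instruction instead of a loop.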
Hope this helps.
If the number of priorities is small and fixed, then you can use a ring buffer for each priority. That will waste space if the objects are big, but if their size is comparable to a pointer/index, then variants that store additional pointers in the objects may increase the size of the array in the same way.
Or you can use a simple singly linked list inside the array and store 2*M+1 pointers/indexes: one points to the first free node, and the other pairs point to the head and tail of each priority. In that case you'll have to do O(M) comparisons on average before taking out the next node in O(1). Insertion will take O(1).
If you construct an STL priority queue at the maximum size (perhaps from a vector initialized with placeholders), and then check the size before inserting (removing an item if necessary beforehand) you'll never have dynamic allocation during insert operations. The STL implementation is quite efficient.
See Matters Computational, page 158. The implementation itself is quite good, and you can even tweak it a little without making it less readable. For example, when you compute the left child like:
int left = 2 * i; // 1-based indexing: the children of i are 2*i and 2*i+1
You can compute the right child like so:
int right = left + 1;
Found a solution ("difference" means "priority" in the code, and maxRememberedResults is 255; it could be any 2^n - 1):
template <typename T> inline void swap(T& a, T& b){
    T c = a;
    a = b;
    b = c;
}
struct MinDifferenceArray{
    enum{maxSize = maxRememberedResults};
    int size;
    DifferenceData data[maxSize];
    void add(const DifferenceData& val){
        if (size >= maxSize){
            if(data[0].difference <= val.difference)
                return; // not smaller than the largest stored item: reject
            data[0] = val;
            // sift down; maxSize == 2^n - 1 guarantees that next+1 is in range
            for (int i = 0; (2*i+1) < maxSize; ){
                int next = 2*i + 1;
                if (data[next].difference < data[next+1].difference)
                    next++;
                if (data[i].difference < data[next].difference)
                    swap(data[i], data[next]);
                else
                    break;
                i = next;
            }
        }
        else{
            data[size++] = val;
            // sift up
            for (int i = size - 1; i > 0;){
                int parent = (i-1)/2;
                if (data[parent].difference < data[i].difference){
                    swap(data[parent], data[i]);
                    i = parent;
                }
                else
                    break;
            }
        }
    }
    void clear(){
        size = 0;
    }
    MinDifferenceArray() : size(0) {}
};
build max-based queue (root is largest)
until it is full, fill up normally
when it is full, for every new element
Check if new element is smaller than root.
if it is larger or equal than root, reject.
otherwise, replace root with new element and perform normal heap "sift-down".
And we get O(log(N)) insert as a worst case scenario.
It is the same solution as the one provided by user with nickname "Moron".
Thanks to everyone for replies.
P.S. Apparently programming without sleeping enough wasn't a good idea.
It's better to implement your own class using std::array and heap algorithms.
#include <algorithm>
#include <array>
template<class T, int fixed_size = 5>
class fixed_size_arr_pqueue_v2
{
    std::array<T, fixed_size> _data;
    int _size = 0;
    int parent(int i)
    {
        return (i - 1)/2;
    }
    void heapify(int i, bool downward = false)
    {
        int l = 2*i + 1;
        int r = 2*i + 2;
        int largest = 0;
        if (l < size() && _data[l] > _data[i])
            largest = l;
        else
            largest = i;
        if (r < size() && _data[r] > _data[largest])
            largest = r;
        if (largest != i)
        {
            std::swap(_data[largest], _data[i]);
            if (!downward)
                heapify(parent(i));      // keep fixing towards the root
            else
                heapify(largest, true);  // keep fixing towards the leaves
        }
    }
public:
    void push(const T &d)
    {
        if (_size == fixed_size)
        {
            //min elements in a max heap lies at leaves only.
            auto minItr = std::min_element(begin(_data) + _size/2, end(_data));
            int minPos = static_cast<int>(minItr - _data.begin());
            auto min = *minItr;
            if (d > min)
            {
                _data[minPos] = d; // replace the smallest element with the new one
                if (_data[parent(minPos)] > d)
                {
                    //this is unlikely to happen in our case? as this position is a leaf?
                    heapify(minPos, true);
                }
                else
                    heapify(parent(minPos));
            }
            return ;
        }
        _data[_size++] = d;
        std::push_heap(_data.begin(), _data.begin() + _size);
    }
    T pop()
    {
        T d = _data.front();
        std::pop_heap(_data.begin(), _data.begin() + _size);
        --_size;
        return d;
    }
    T top()
    {
        return _data.front();
    }
    int size() const
    {
        return _size;
    }
};