Search in sorted array with few comparisons - c++

You are given a std::vector<T> of distinct items. which is already sorted.
type T only supports less-than < operator for comparisons. and it is a heavy function. so you have to use it as few times as possible.
Is there any better solution than a binary search?
If not, is there any better solution than this, that uses less-than operator fewer times?
template<typename T>
int FindKey(const std::vector<T>& list, const T& key)
{
if( list.empty() )
return -1;
int left = 0;
int right = list.size() - 1;
int mid;
while( left < right )
{
mid = (right + left) / 2;
if( list[mid] < key )
left = mid + 1;
else
right = mid;
}
if( !(key < list[left]) && !(list[left] < key) )
return left;
return -1;
}
It's not a real world situation, just a coding test.

You could trade off additional O(n) preprocessing time to get amortized O(1) query time, using a hash table (e.g. an unordered_map) to create a lookup table.
Hash tables compute hash functions of the keys and do not compare the keys themselves.
Two keys could have the same hash, resulting in a collision, explaining why it's not guaranteed that every separate operation is constant time. Amortized constant time means that if you carry out k operations that took time t in total, then the quotient t/k = O(1), for a sufficiently large k.
Live example:
#include <vector>
#include <unordered_map>
template<typename T>
class lookup {
std::unordered_map<T, int> position;
public:
lookup(const std::vector<T>& a) {
for(int i = 0; i < a.size(); ++i) position.emplace(a[i], i);
}
int operator()(const T& key) const {
auto pos = position.find(key);
return pos == position.end() ? -1 : pos->second;
}
};
This requires additional memory also.
If the values can be mapped to integers and are within a reasonable range (i.e. max-min = O(n)), you could simply use a vector as a lookup table instead of unordered_map. With the benefit of guaranteed constant query time.
See also this answer to "C++ get index of element of array by value", for a more detailed discussion, including an empirical comparison of linear, binary and hash index lookup.
Update
If the interface of type T supports no other operations than bool operator<(L, R), then using the decision tree model you can prove a lower bound for comparison-based search algorithms to be Ω(log n).

You can use std::lower_bound. It does it with log(n)+1 comparisons, which is the best possible complexity for your problem.
template<typename T>
int FindKey(const std::vector<T>& list, const T& key)
{
if(list.empty())
return -1;
typename std::vector<T>::const_iterator lb =
std::lower_bound(list.begin(), list.end(), key);
// now lb is an iterator to the first element
// which is greater or equal to key
if(key < *lb)
return -1;
else
return std::distance(list.begin(), lb);
}
With the additionnal check for equality, you do it with log(n)+2 comparisons.

You can use interpolation search in log log n time if your numbers are normally distributed. If they have some other distribution, you can modify this to take your distribution into account, though I don't know which distributions yield log log time.
https://en.wikipedia.org/wiki/Interpolation_search

Related

Union-Find leetcode question exceeding time limit

I am solving this problem on leetcode https://leetcode.com/problems/sentence-similarity-ii/description/ that involves implementing the union-find algorithm to find out if two sentences are similar or not given a list of pairs representing similar words. I implemented ranked union-find where I keep track of the size of each subset and join the smaller subtree to the bigger one but for some reason the code is still exceeding the time limit. Can someone point me to what I am doing wrong? How can it be optimized further. I saw other accepted solutions were using the same ranked union find algorithm.
Here is the code:
string root(map<string, string> dict, string element) {
if(dict[element] == element)
return element;
return root(dict, dict[element]);
}
bool areSentencesSimilarTwo(vector<string>& words1, vector<string>& words2, vector<pair<string, string>> pairs) {
if(words1.size() != words2.size()) return false;
std::map<string, string> dict;
std::map<string, int> sizes;
for(auto pair: pairs) {
if(dict.find(pair.first) == dict.end()) {
dict[pair.first] = pair.first;
sizes[pair.first] = 1;
}
if(dict.find(pair.second) == dict.end()) {
dict[pair.second] = pair.second;
sizes[pair.second] = 1;
}
auto firstRoot = root(dict, pair.first);
auto secondRoot = root(dict, pair.second);
if(sizes[firstRoot] < sizes[secondRoot]) {
dict[firstRoot] = secondRoot;
sizes[firstRoot] += sizes[secondRoot];
}
else {
dict[secondRoot] = firstRoot;
sizes[secondRoot] += sizes[firstRoot];
}
}
for(int i = 0; i < words1.size(); i++) {
if(words1[i] == words2[i]) {
continue;
}
else if(root(dict, words1[i]) != root(dict, words2[i])) {
return false;
}
}
return true;
}
Thanks!
Your union-find is broken with respect to complexity. Please read Wikipedia: Disjoint-set data structure.
For union-find to have its near O(1) complexity, it has to employ path-compaction. For that, your root method has to:
Get dict by reference, so that it can modify it.
Make path compaction to all elements on the path, so that they point to the root.
Without compaction you will have O(log N) complexity for root(), which could be OK. But for that, you'd have to fix it so that root() gets dict by reference and not by value. Passing dict by value costs O(N).
The fact that dict is an std::map makes any query cost O(log N), instead of O(1). std::unordered_map costs O(1), but in practice for N < 1000, std::map is faster. Also, even if std::unordered_map is used, hashing a string costs O(len(str)).
If the data is big, and performance is still slow, you may gain from working with indexes into pairs instead of strings, and run union-find with indexes into a vector<int>. This is error prone, since you have to correctly deal with duplicate strings.

Heapify in logarithmic time using the C++ standard library

I have a heap using std::make_heap:
std::vector<int> v{1,2,3,5,9,20,3};
std::make_heap(v.begin(), v.end());
now I update the heap by changing one random element:
v[3] = 35;
Is there a way in standard library in to adjust heap again in O(log n) time where n is size of container. Basically I am looking for heapify function. I know what element has been changed.
I understand that std::make_heap is O(n log n) time. I have also gone through duplicate question but that is different in sense that it is changing max element. For that solution is already given of O(log n) complexity in that question.
I am trying to change any random element within heap.
You can just do it yourself:
void modify_heap_element(std::vector<int> &heap, size_t index, int value)
{
//while value is too large for its position, bubble up
while(index > 0 && heap[(index-1)>>1] < value)
{
size_t parent = (index-1)>>1;
heap[index]=heap[parent];
index = parent;
}
//while value is too large for its position sift down
for (;;)
{
size_t left=index*2+1;
size_t right=left+1;
if (left >= heap.size())
break;
size_t bigchild = (right >= heap.size() || heap[right] < heap[left] ?
left : right );
if (!(value < heap[bigchild]))
break;
heap[index]=heap[bigchild];
index = bigchild;
}
heap[index] = value;
}
If we look closer at your statement:
now I disturb heap by changing one random element of heap.
For heapifying in O(log n) you can only directly "disturb" the back or the front of the vector (which corresponds somehow to inserting or deleting an element). In these cases, (re)heapification can be then achieved by means of the std::push_heap and std::pop_heap algorithms, which take logarithmic running time.
That is, the back:
v.back() = 35;
std::push_heap(v.begin(), v.end()); // heapify in O(log n)
or the front:
v.front() = 35;
// places the front at the back
std::pop_heap(v.begin(), v.end()); // O(log n)
// v.back() is now 35, but it does not belong to the heap anymore
// make the back belong to the heap again
std::push_heap(v.begin(), v.end()); // O(log n)
Otherwise you need to reheapify the whole vector with std::make_heap, which takes linear running time.
Summary
It's not possible to modify an arbitrary element of the heap and achieve the heapification in logarithmic running time with the standard library (i.e., the function templates std::push_heap and std::pop_heap). However, you can always implement the heap's swim and sink operations by yourself in order to heapify in logarithmic running time.
I have been facing this problem of wanting an "updateable heap" as well. However, in the end, instead of coding a custom updateable heap or anything like that, I solved it a bit differently.
To maintain access to the best element without needing to explicitly go through the heap, you can use versioned wrappers of the elements that you want to order. Each unique, true element has a version counter, which is increased every time the element gets changed. Each wrapper inside the heap then carries a version of the element, being the version at the time the wrapper was created:
struct HeapElemWrapper
{
HeapElem * e;
size_t version;
double priority;
HeapElemWrapper(HeapElem * elem)
: e(elem), version(elem->currentVersion), priority(0.0)
{}
bool upToDate() const
{
return version == e->currentVersion;
}
// operator for ordering with heap / priority queue:
// smaller error -> higher priority
bool operator<(const HeapElemWrapper & other) const
{
return this->priority> other.priority;
}
};
When popping the topmost element from the heap, you can then simply check this wrapper element to see if it's up-to-date with the original. If not, simply dispose it and pop the next one. This method is quite efficient, and I have it seen in other applications as well. The only thing you need to take care of is that you do a pass over the heap to clean it up from outdated elements, from time to time (say, every 1000 insertions or so).
It's not possible to modify an arbitrary element of the heap in logarithmic running time without violating the heap property by just using the function templates std::pop_heap() and std::push_heap() that the standard library provides.
However, you can define your own STL-like function template, set_heap_element(), for that purpose:
template<typename RandomIt, typename T, typename Cmp>
void set_heap_element(RandomIt first, RandomIt last, RandomIt pos, T value, Cmp cmp)
{
const auto n = last - first;
*pos = std::move(value); // replace previous value
auto i = pos - first;
using std::swap;
// percolate up
while (i > 0) { // non-root node
auto parent_it = first + (i-1)/2;
if (cmp(*pos, *parent_it))
break; // parent node satisfies the heap-property
swap(*pos, *parent_it); // swap with parent
pos = parent_it;
i = pos - first;
}
// percolate down
while (2*i + 1 < n) { // non-leaf node, since it has a left child
const auto lidx = 2*i + 1, ridx = 2*i + 2;
auto lchild_it = first + lidx;
auto rchild_it = ridx < n? first + ridx: last;
auto it = pos;
if (cmp(*it, *lchild_it))
it = lchild_it;
if (rchild_it != last && cmp(*it, *rchild_it))
it = rchild_it;
if (pos == it)
break; // node satisfies the heap-property
swap(*pos, *it); // swap with child
pos = it;
i = pos - first;
}
}
Then, you can provide the following simplified overload of set_heap_element() for a max heap:
#include <functional> // std::less
template<typename RandomIt, typename T>
void set_heap_element(RandomIt first, RandomIt last, RandomIt pos, T value) {
return set_heap_element(first, last, pos, value, std::less<T>{});
}
This overload uses a std::less<T> object as the comparison function object for the original function template.
Example
In your max-heap example, set_heap_element() could be used as follows:
std::vector<int> v{1,2,3,5,9,20,3};
std::make_heap(v.begin(), v.end());
// set 4th element to 35 in O(log n)
set_heap_element(v.begin(), v.end(), v.begin() + 3, 35);
You could use std::is_heap(), which takes linear time, whenever you want to check whether the max-heap property is still satisfied by v after setting an element with the set_heap_element() function template above:
assert(std::is_heap(v.begin(), v.end()));
What about min heaps?
You can achieve the same for a min heap by passing a std::greater<int> object as the last argument of the function calls to std::make_heap(), set_heap_element() and std::is_heap():
std::vector<int> v{1,2,3,5,9,20,3};
// create a min heap
std::make_heap(v.begin(), v.end(), std::greater<int>{});
// set 4th element to 35 in O(log n)
set_heap_element(v.begin(), v.end(), v.begin() + 3, 35, std::greater<int>{});
// is the min-heap property satisfied?
assert(std::is_heap(v.begin(), v.end(), std::greater<int>{}));

Check whether two elements have a common element in C++

I want the function to return true when there is any element matching between two vectors,
Note : My vectors are not sorted
Following is my source code,
bool CheckCommon( std::vector< long > &inVectorA, std::vector< long > &inVectorB )
{
std::vector< long > *lower, *higher;
size_t sizeL = 0, sizeH = 0;
if( inVectorA.size() > inVectorB.size() )
{
lower = &inVectorA;
sizeL = inVectorA.size();
higher = &inVectorB;
sizeH = inVectorB.size();
}
else
{
lower = &inVectorB;
sizeL = inVectorB.size();
higher = &inVectorA;
sizeH = inVectorA.size();
}
size_t indexL = 0, indexH = 0;
for( ; indexH < sizeH; indexH++ )
{
bool exists = std::binary_search( lower->begin(), lower->end(), higher->at(indexH) );
if( exists == true )
return true;
else
continue;
}
return false;
}
This is working fine when the size of vector B is less than the size of vector A , but returning false even there is match when size of vector B is greater than size of vector A .
The problem with posted code is that you should not use std::binary_search when the vector is not sorted. The behaviour is defined only for sorted range.
If the input vectors are not sorted then you can use find_first_of to check for existence of first common element found.
bool CheckCommon(std::vector<long> const& inVectorA, std::vector<long> const& nVectorB)
{
return std::find_first_of (inVectorA.begin(), inVectorA.end(),
nVectorB.begin(), nVectorB.end()) != inVectorA.end();
}
Complexity of find_first_of is up to linear in inVectorA.size()*inVectorB.size(); it compares elements until a match is found.
If you want to fix your original algorithm then you can make a copy of one of vectors and std::sort it, then std::binary_search works with it.
In actual programs that do lot of such matching between containers the containers are usually kept sorted. On such case std::set_intersection can be used. Then the complexity of search is up to linear in inVectorA.size()+inVectorB.size().
std::find_first_of is more efficient than to sort both ranges and then to search for matches with std::set_intersection when both ranges are rather short or second range is shorter than binary logarithm of length of first range.
You can use a well-defined algorithm called as std::set_intersection to check if there is any common element between these vectors.
Pre-condition :- Both vectors be sorted.
You could do something like the following. Iterate over the first vector. For each element, use std::find to see if it exists in the other vector. If you find it, they have at least one common element so return true. Otherwise, move to the next element of the first vector and repeat this process. If you make it all the way through the first vector without finding a common element, there is no intersection so return false.
bool CheckCommon(std::vector<long> const& inVectorA, std::vector<long> const& nVectorB)
{
for (auto const& num : inVectorA)
{
auto it = std::find(begin(nVectorB), end(nVectorB), num);
if (it != end(nVectorB))
{
return true;
}
}
return false;
}
Usage of std::set_intersection is one option. Since the vector's elements are sorted, the code can be simplified to this:
#include <algorithm>
#include <iterator>
bool CheckCommon( const std::vector< long > &inVectorA, const std::vector< long > &inVectorB )
{
std::vector< long > temp;
std::set_intersection(inVectorA.begin(), inVectorA.end(),
inVectorB.begin(), inVectorB.end(),
std::back_inserter(temp));
return !temp.empty()
}
The drawback is that a temporary vector is being created while the set_intersection is being executed (but maybe in the future, this can be considered a "feature" if you want to know what elements are common).
Here is an implementation which uses sorted vectors, doesn't construct a new container, and which has only linear complexity (more detailed: O(container1.size()+ container2.size()):
template< class ForwardIt1, class ForwardIt2 >
bool has_common_elements( ForwardIt1 first, ForwardIt1 last, ForwardIt2 s_first, ForwardIt2 s_last )
{
auto it=first;
auto s_it=s_first;
while(it<last && s_it<s_last)
{
if(*it==*s_it)
{
return true;
}
*it<*s_it ? ++it : ++s_it; //increase the smaller of both
}
return false;
}
DEMO
Your code uses std::binary_search, whose pre-condition is that (From http://en.cppreference.com/w/cpp/algorithm/binary_search):
For std::binary_search to succeed, the range [first, last) must be at least partially ordered, i.e. it must satisfy all of the following requirements:
partitioned with respect to element < value or comp(element, value)
partitioned with respect to !(value < element) or !comp(value, element)
for all elements, if element < value or comp(element, value) is true then !(value < element) or !comp(value, element) is also true
A fully-sorted range meets these criteria, as does a range resulting from a call to std::partition.
The sample data you used for testing (as posted at http://ideone.com/XCYdM8) do not meet that requirement. Instead of using:
vectorB.push_back(11116);
vectorB.push_back(11118);
vectorB.push_back(11112);
vectorB.push_back(11120);
vectorB.push_back(11190);
vectorB.push_back(11640);
vectorB.push_back(11740);
if you use a sorted vector like below
vectorB.push_back(11112);
vectorB.push_back(11116);
vectorB.push_back(11118);
vectorB.push_back(11120);
vectorB.push_back(11190);
vectorB.push_back(11640);
vectorB.push_back(11740);
your function will work just fine.
PS The you have designed your code, if the longer std::vector is sorted, the function will work fine.
PS2 Another option is to sort the longer std::vector before calling the function.
std::sort(B.begin(), B.end());

Time complexity issues with multimap

I created a program that finds the median of a list of numbers. The list of numbers is dynamic in that numbers can be removed and inserted (duplicate numbers can be entered) and during this time, the new median is re-evaluated and printed out.
I created this program using a multimap because
1) the benefit of it being already being sorted,
2) easy insertion, deletion, searching (since multimap implements binary search)
3) duplicate entries are allowed.
The constraints for the number of entries + deletions (represented as N) are: 0 < N <= 100,000.
The program I wrote works and prints out the correct median, but it isn't fast enough. I know that the unsorted_multimap is faster than multimap, but then the problem with unsorted_multimap is that I would have to sort it. I have to sort it because to find the median you need to have a sorted list. So my question is, would it be practical to use an unsorted_multimap and then quick sort the entries, or would that just be ridiculous? Would it be faster to just use a vector, quicksort the vector, and use a binary search? Or maybe I am forgetting some fabulous solution out there that I haven't even thought of.
Though I'm not new to C++, I will admit, that my skills with time-complexity are somewhat medicore.
The more I look at my own question, the more I'm beginning to think that just using a vector with quicksort and binary search would be better since the data structures basically already implement vectors.
the more I look at my own question, the more I'm beginning to think that just using vector with quicksort and binary search would be better since the data structures basically already implement vectors.
If you have only few updates - use unsorted std::vector + std::nth_element algorithm which is O(N). You don't need full sorting which is O(N*ln(N)).
live demo of nth_element:
#include <algorithm>
#include <iterator>
#include <iostream>
#include <ostream>
#include <vector>
using namespace std;
template<typename RandomAccessIterator>
RandomAccessIterator median(RandomAccessIterator first,RandomAccessIterator last)
{
RandomAccessIterator m = first + distance(first,last)/2; // handle even middle if needed
nth_element(first,m,last);
return m;
}
int main()
{
vector<int> values = {5,1,2,4,3};
cout << *median(begin(values),end(values)) << endl;
}
Output is:
3
If you have many updates and only removing from middle - use two heaps as comocomocomocomo suggests. If you would use fibonacci_heap - then you would also get O(N) removing from arbitary position (if don't have handle to it).
If you have many updates and need O(ln(N)) removing from arbitary places - then use two multisets as ipc suggests.
If your purpose is to keep track of the median on the fly, as elements are inserted/removed, you should use a min-heap and a max-heap. Each one would contain one half of the elements... There was a related question a couple of days ago: How to implement a Median-heap
Though, if you need to search for specific values in order to remove elements, you still need some kind of map.
You said that it is slow. Are you iterating from the beginning of the map to the (N/2)'th element every time you need the median? You don't need to. You can keep track of the median by maintaining an iterator pointing to it at all times and a counter of the number of elements less than that one. Every time you insert/remove, compare the new/old element with the median and update both iterator and counter.
Another way of seeing it is as two multimaps containing half the elements each. One holds the elements less than the median (or equal) and the other holds those greater. The heaps do this more efficiently, but they don't support searches.
If you only need the median a few times you can use the "select" algorithm. It is described in Sedgewick's book. It takes O(n) time on average. It is similar to quick sort but it does not sort completely. It just partitions the array with random pivots until, eventually, it gets to "select" on one side the smaller m elements (m=(n+1)/2). Then you search for the greatest of those m elements, and this is the median.
Here is how you could implement that in O(log N) per update:
template <typename T>
class median_set {
public:
std::multiset<T> below, above;
// O(log N)
void rebalance()
{
int diff = above.size() - below.size();
if (diff > 0) {
below.insert(*above.begin());
above.erase(above.begin());
} else if (diff < -1) {
above.insert(*below.rbegin());
below.erase(below.find(*below.rbegin()));
}
}
public:
// O(1)
bool empty() const { return below.empty() && above.empty(); }
// O(1)
T const& median() const
{
assert(!empty());
return *below.rbegin();
}
// O(log N)
void insert(T const& value)
{
if (!empty() && value > median())
above.insert(value);
else
below.insert(value);
rebalance();
}
// O(log N)
void erase(T const& value)
{
if (value > median())
above.erase(above.find(value));
else
below.erase(below.find(value));
rebalance();
}
};
(Work in action with tests)
The idea is the following:
Keep track of the values above and below the median in two sets
If a new value is added, add it to the corresponding set. Always ensure that the set below has exactly 0 or 1 more then the other
If a value is removed, remove it from the set and make sure that the condition still holds.
You can't use priority_queues because they won't let you remove one item.
Can any one help me what is Space and Time complexity of my following C# program with details.
//Passing Integer array to Find Extreme from that Integer Array
public int extreme(int[] A)
{
int N = A.Length;
if (N == 0)
{
return -1;
}
else
{
int average = CalculateAverage(A);
return FindExtremes(A, average);
}
}
// Calaculate Average of integerArray
private int CalculateAverage(int[] integerArray)
{
int sum = 0;
foreach (int value in integerArray)
{
sum += value;
}
return Convert.ToInt32(sum / integerArray.Length);
}
//Find Extreme from that Integer Array
private int FindExtremes(int[] integerArray, int average) {
int Index = -1; int ExtremeElement = integerArray[0];
for (int i = 0; i < integerArray.Length; i++)
{
int absolute = Math.Abs(integerArray[i] - average);
if (absolute > ExtremeElement)
{
ExtremeElement = integerArray[i];
Index = i;
}
}
return Index;
}
You are almost certainly better off using a vector. Possibly maintaining an auxiliary vector of indexes to be removed between median calculations so you can delete them in batches. New additions can also be put into an auxiliary vector, sorted, then merged in.

Reorder vector using a vector of indices [duplicate]

This question already has answers here:
How do I sort a std::vector by the values of a different std::vector? [duplicate]
(13 answers)
Closed 12 months ago.
I'd like to reorder the items in a vector, using another vector to specify the order:
char A[] = { 'a', 'b', 'c' };
size_t ORDER[] = { 1, 0, 2 };
vector<char> vA(A, A + sizeof(A) / sizeof(*A));
vector<size_t> vOrder(ORDER, ORDER + sizeof(ORDER) / sizeof(*ORDER));
reorder_naive(vA, vOrder);
// A is now { 'b', 'a', 'c' }
The following is an inefficient implementation that requires copying the vector:
void reorder_naive(vector<char>& vA, const vector<size_t>& vOrder)
{
assert(vA.size() == vOrder.size());
vector vCopy = vA; // Can we avoid this?
for(int i = 0; i < vOrder.size(); ++i)
vA[i] = vCopy[ vOrder[i] ];
}
Is there a more efficient way, for example, that uses swap()?
This algorithm is based on chmike's, but the vector of reorder indices is const. This function agrees with his for all 11! permutations of [0..10]. The complexity is O(N^2), taking N as the size of the input, or more precisely, the size of the largest orbit.
See below for an optimized O(N) solution which modifies the input.
template< class T >
void reorder(vector<T> &v, vector<size_t> const &order ) {
for ( int s = 1, d; s < order.size(); ++ s ) {
for ( d = order[s]; d < s; d = order[d] ) ;
if ( d == s ) while ( d = order[d], d != s ) swap( v[s], v[d] );
}
}
Here's an STL style version which I put a bit more effort into. It's about 47% faster (that is, almost twice as fast over [0..10]!) because it does all the swaps as early as possible and then returns. The reorder vector consists of a number of orbits, and each orbit is reordered upon reaching its first member. It's faster when the last few elements do not contain an orbit.
template< typename order_iterator, typename value_iterator >
void reorder( order_iterator order_begin, order_iterator order_end, value_iterator v ) {
typedef typename std::iterator_traits< value_iterator >::value_type value_t;
typedef typename std::iterator_traits< order_iterator >::value_type index_t;
typedef typename std::iterator_traits< order_iterator >::difference_type diff_t;
diff_t remaining = order_end - 1 - order_begin;
for ( index_t s = index_t(), d; remaining > 0; ++ s ) {
for ( d = order_begin[s]; d > s; d = order_begin[d] ) ;
if ( d == s ) {
-- remaining;
value_t temp = v[s];
while ( d = order_begin[d], d != s ) {
swap( temp, v[d] );
-- remaining;
}
v[s] = temp;
}
}
}
And finally, just to answer the question once and for all, a variant which does destroy the reorder vector (filling it with -1's). For permutations of [0..10], It's about 16% faster than the preceding version. Because overwriting the input enables dynamic programming, it is O(N), asymptotically faster for some cases with longer sequences.
template< typename order_iterator, typename value_iterator >
void reorder_destructive( order_iterator order_begin, order_iterator order_end, value_iterator v ) {
typedef typename std::iterator_traits< value_iterator >::value_type value_t;
typedef typename std::iterator_traits< order_iterator >::value_type index_t;
typedef typename std::iterator_traits< order_iterator >::difference_type diff_t;
diff_t remaining = order_end - 1 - order_begin;
for ( index_t s = index_t(); remaining > 0; ++ s ) {
index_t d = order_begin[s];
if ( d == (diff_t) -1 ) continue;
-- remaining;
value_t temp = v[s];
for ( index_t d2; d != s; d = d2 ) {
swap( temp, v[d] );
swap( order_begin[d], d2 = (diff_t) -1 );
-- remaining;
}
v[s] = temp;
}
}
In-place reordering of vector
Warning: there is an ambiguity about the semantic what the ordering-indices mean. Both are answered here
move elements of vector to the position of the indices
Interactive version here.
#include <iostream>
#include <vector>
#include <assert.h>
using namespace std;
void REORDER(vector<double>& vA, vector<size_t>& vOrder)
{
assert(vA.size() == vOrder.size());
// for all elements to put in place
for( int i = 0; i < vA.size() - 1; ++i )
{
// while the element i is not yet in place
while( i != vOrder[i] )
{
// swap it with the element at its final place
int alt = vOrder[i];
swap( vA[i], vA[alt] );
swap( vOrder[i], vOrder[alt] );
}
}
}
int main()
{
std::vector<double> vec {7, 5, 9, 6};
std::vector<size_t> inds {1, 3, 0, 2};
REORDER(vec, inds);
for (size_t vv = 0; vv < vec.size(); ++vv)
{
std::cout << vec[vv] << std::endl;
}
return 0;
}
output
9
7
6
5
note that you can save one test because if n-1 elements are in place the last nth element is certainly in place.
On exit vA and vOrder are properly ordered.
This algorithm performs at most n-1 swapping because each swap moves the element to its final position. And we'll have to do at most 2N tests on vOrder.
draw the elements of vector from the position of the indices
Try it interactively here.
#include <iostream>
#include <vector>
#include <assert.h>
template<typename T>
void reorder(std::vector<T>& vec, std::vector<size_t> vOrder)
{
assert(vec.size() == vOrder.size());
for( size_t vv = 0; vv < vec.size() - 1; ++vv )
{
if (vOrder[vv] == vv)
{
continue;
}
size_t oo;
for(oo = vv + 1; oo < vOrder.size(); ++oo)
{
if (vOrder[oo] == vv)
{
break;
}
}
std::swap( vec[vv], vec[vOrder[vv]] );
std::swap( vOrder[vv], vOrder[oo] );
}
}
int main()
{
std::vector<double> vec {7, 5, 9, 6};
std::vector<size_t> inds {1, 3, 0, 2};
reorder(vec, inds);
for (size_t vv = 0; vv < vec.size(); ++vv)
{
std::cout << vec[vv] << std::endl;
}
return 0;
}
Output
5
6
7
9
It appears to me that vOrder contains a set of indexes in the desired order (for example the output of sorting by index). The code example here follows the "cycles" in vOrder, where following a sub-set (could be all of vOrder) of indexes will cycle through the sub-set, ending back at the first index of the sub-set.
Wiki article on "cycles"
https://en.wikipedia.org/wiki/Cyclic_permutation
In the following example, every swap places at least one element in it's proper place. This code example effectively reorders vA according to vOrder, while "unordering" or "unpermuting" vOrder back to its original state (0 :: n-1). If vA contained the values 0 through n-1 in order, then after reorder, vA would end up where vOrder started.
template <class T>
void reorder(vector<T>& vA, vector<size_t>& vOrder)
{
assert(vA.size() == vOrder.size());
// for all elements to put in place
for( size_t i = 0; i < vA.size(); ++i )
{
// while vOrder[i] is not yet in place
// every swap places at least one element in it's proper place
while( vOrder[i] != vOrder[vOrder[i]] )
{
swap( vA[vOrder[i]], vA[vOrder[vOrder[i]]] );
swap( vOrder[i], vOrder[vOrder[i]] );
}
}
}
This can also be implemented a bit more efficiently using moves instead swaps. A temp object is needed to hold an element during the moves. Example C code, reorders A[] according to indexes in I[], also sorts I[] :
void reorder(int *A, int *I, int n)
{
int i, j, k;
int tA;
/* reorder A according to I */
/* every move puts an element into place */
/* time complexity is O(n) */
for(i = 0; i < n; i++){
if(i != I[i]){
tA = A[i];
j = i;
while(i != (k = I[j])){
A[j] = A[k];
I[j] = j;
j = k;
}
A[j] = tA;
I[j] = j;
}
}
}
If it is ok to modify the ORDER array then an implementation that sorts the ORDER vector and at each sorting operation also swaps the corresponding values vector elements could do the trick, I think.
A survey of existing answers
You ask if there is "a more efficient way". But what do you mean by efficient and what are your requirements?
Potatoswatter's answer works in O(N²) time with O(1) additional space and doesn't mutate the reordering vector.
chmike and rcgldr give answers which use O(N) time with O(1) additional space, but they achieve this by mutating the reordering vector.
Your original answer allocates new space and then copies data into it while Tim MB suggests using move semantics. However, moving still requires a place to move things to and an object like an std::string has both a length variable and a pointer. In other words, a move-based solution requires O(N) allocations for any objects and O(1) allocations for the new vector itself. I explain why this is important below.
Preserving the reordering vector
We might want that reordering vector! Sorting costs O(N log N). But, if you know you'll be sorting several vectors in the same way, such as in a Structure of Arrays (SoA) context, you can sort once and then reuse the results. This can save a lot of time.
You might also want to sort and then unsort data. Having the reordering vector allows you to do this. A use case here is for performing genomic sequencing on GPUs where maximal speed efficiency is obtained by having sequences of similar lengths processed in batches. We cannot rely on the user providing sequences in this order so we sort and then unsort.
So, what if we want the best of all worlds: O(N) processing without the costs of additional allocation but also without mutating our ordering vector (which we might, after all, want to reuse)? To find that world, we need to ask:
Why is extra space bad?
There are two reasons you might not want to allocate additional space.
The first is that you don't have much space to work with. This can occur in two situations: you're on an embedded device with limited memory. Usually this means you're working with small datasets, so the O(N²) solution is probably fine here. But it can also happen when you are working with really large datasets. In this case O(N²) is unacceptable and you have to use one of the O(N) mutating solutions.
The other reason extra space is bad is because allocation is expensive. For smaller datasets it can cost more than the actual computation. Thus, one way to achieve efficiency is to eliminate allocation.
Outline
When we mutate the ordering vector we are doing so as a way to indicate whether elements are in their permuted positions. Rather than doing this, we could use a bit-vector to indicate that same information. However, if we allocate the bit vector each time that would be expensive.
Instead, we could clear the bit vector each time by resetting it to zero. However, that incurs an additional O(N) cost per function use.
Rather, we can store a "version" value in a vector and increment this on each function use. This gives us O(1) access, O(1) clear, and an amoritzed allocation cost. This works similarly to a persistent data structure. The downside is that if we use an ordering function too often the version counter needs to be reset, though the O(N) cost of doing so is amortized.
This raises the question: what is the optimal data type for the version vector? A bit-vector maximizes cache utilization but requires a full O(N) reset after each use. A 64-bit data type probably never needs to be reset, but has poor cache utilization. Experimenting is the best way to figure this out.
Two types of permutations
We can view an ordering vector as having two senses: forward and backward. In the forward sense, the vector tell us where elements go to. In the backward sense, the vector tells us where elements are coming from. Since the ordering vector is implicitly a linked list, the backward sense requires O(N) additional space, but, again, we can amortize the allocation cost. Applying the two senses sequentially brings us back to our original ordering.
Performance
Running single-threaded on my "Intel(R) Xeon(R) E-2176M CPU # 2.70GHz", the following code takes about 0.81ms per reordering for sequences 32,767 elements long.
Code
Fully commented code for both senses with tests:
#include <algorithm>
#include <cassert>
#include <random>
#include <stack>
#include <stdexcept>
#include <vector>
///#brief Reorder a vector by moving its elements to indices indicted by another
/// vector. Takes O(N) time and O(N) space. Allocations are amoritzed.
///
///#param[in,out] values Vector to be reordered
///#param[in] ordering A permutation of the vector
///#param[in,out] visited A black-box vector to be reused between calls and
/// shared with with `backward_reorder()`
template<class ValueType, class OrderingType, class ProgressType>
void forward_reorder(
std::vector<ValueType> &values,
const std::vector<OrderingType> &ordering,
std::vector<ProgressType> &visited
){
if(ordering.size()!=values.size()){
throw std::runtime_error("ordering and values must be the same size!");
}
//Size the visited vector appropriately. Since vectors don't shrink, this will
//shortly become large enough to handle most of the inputs. The vector is 1
//larger than necessary because the first element is special.
if(visited.empty() || visited.size()-1<values.size());
visited.resize(values.size()+1);
//If the visitation indicator becomes too large, we reset everything. This is
//O(N) expensive, but unlikely to occur in most use cases if an appropriate
//data type is chosen for the visited vector. For instance, an unsigned 32-bit
//integer provides ~4B uses before it needs to be reset. We subtract one below
//to avoid having to think too much about off-by-one errors. Note that
//choosing the biggest data type possible is not necessarily a good idea!
//Smaller data types will have better cache utilization.
if(visited.at(0)==std::numeric_limits<ProgressType>::max()-1)
std::fill(visited.begin(), visited.end(), 0);
//We increment the stored visited indicator and make a note of the result. Any
//value in the visited vector less than `visited_indicator` has not been
//visited.
const auto visited_indicator = ++visited.at(0);
//For doing an early exit if we get everything in place
auto remaining = values.size();
//For all elements that need to be placed
for(size_t s=0;s<ordering.size() && remaining>0;s++){
assert(visited[s+1]<=visited_indicator);
//Ignore already-visited elements
if(visited[s+1]==visited_indicator)
continue;
//Don't rearrange if we don't have to
if(s==visited[s])
continue;
//Follow this cycle, putting elements in their places until we get back
//around. Use move semantics for speed.
auto temp = std::move(values[s]);
auto i = s;
for(;s!=(size_t)ordering[i];i=ordering[i],--remaining){
std::swap(temp, values[ordering[i]]);
visited[i+1] = visited_indicator;
}
std::swap(temp, values[s]);
visited[i+1] = visited_indicator;
}
}
///#brief Reorder a vector by moving its elements to indices indicted by another
/// vector. Takes O(2N) time and O(2N) space. Allocations are amoritzed.
///
///#param[in,out] values Vector to be reordered
///#param[in] ordering A permutation of the vector
///#param[in,out] visited A black-box vector to be reused between calls and
/// shared with with `forward_reorder()`
template<class ValueType, class OrderingType, class ProgressType>
void backward_reorder(
std::vector<ValueType> &values,
const std::vector<OrderingType> &ordering,
std::vector<ProgressType> &visited
){
//The orderings form a linked list. We need O(N) memory to reverse a linked
//list. We use `thread_local` so that the function is reentrant.
thread_local std::stack<OrderingType> stack;
if(ordering.size()!=values.size()){
throw std::runtime_error("ordering and values must be the same size!");
}
//Size the visited vector appropriately. Since vectors don't shrink, this will
//shortly become large enough to handle most of the inputs. The vector is 1
//larger than necessary because the first element is special.
if(visited.empty() || visited.size()-1<values.size());
visited.resize(values.size()+1);
//If the visitation indicator becomes too large, we reset everything. This is
//O(N) expensive, but unlikely to occur in most use cases if an appropriate
//data type is chosen for the visited vector. For instance, an unsigned 32-bit
//integer provides ~4B uses before it needs to be reset. We subtract one below
//to avoid having to think too much about off-by-one errors. Note that
//choosing the biggest data type possible is not necessarily a good idea!
//Smaller data types will have better cache utilization.
if(visited.at(0)==std::numeric_limits<ProgressType>::max()-1)
std::fill(visited.begin(), visited.end(), 0);
//We increment the stored visited indicator and make a note of the result. Any
//value in the visited vector less than `visited_indicator` has not been
//visited.
const auto visited_indicator = ++visited.at(0);
//For doing an early exit if we get everything in place
auto remaining = values.size();
//For all elements that need to be placed
for(size_t s=0;s<ordering.size() && remaining>0;s++){
assert(visited[s+1]<=visited_indicator);
//Ignore already-visited elements
if(visited[s+1]==visited_indicator)
continue;
//Don't rearrange if we don't have to
if(s==visited[s])
continue;
//The orderings form a linked list. We need to follow that list to its end
//in order to reverse it.
stack.emplace(s);
for(auto i=s;s!=(size_t)ordering[i];i=ordering[i]){
stack.emplace(ordering[i]);
}
//Now we follow the linked list in reverse to its beginning, putting
//elements in their places. Use move semantics for speed.
auto temp = std::move(values[s]);
while(!stack.empty()){
std::swap(temp, values[stack.top()]);
visited[stack.top()+1] = visited_indicator;
stack.pop();
--remaining;
}
visited[s+1] = visited_indicator;
}
}
int main(){
std::mt19937 gen;
std::uniform_int_distribution<short> value_dist(0,std::numeric_limits<short>::max());
std::uniform_int_distribution<short> len_dist (0,std::numeric_limits<short>::max());
std::vector<short> data;
std::vector<short> ordering;
std::vector<short> original;
std::vector<size_t> progress;
for(int i=0;i<1000;i++){
const int len = len_dist(gen);
data.clear();
ordering.clear();
for(int i=0;i<len;i++){
data.push_back(value_dist(gen));
ordering.push_back(i);
}
original = data;
std::shuffle(ordering.begin(), ordering.end(), gen);
forward_reorder(data, ordering, progress);
assert(original!=data);
backward_reorder(data, ordering, progress);
assert(original==data);
}
}
Never prematurely optimize. Meassure and then determine where you need to optimize and what. You can end with complex code that is hard to maintain and bug-prone in many places where performance is not an issue.
With that being said, do not early pessimize. Without changing the code you can remove half of your copies:
template <typename T>
void reorder( std::vector<T> & data, std::vector<std::size_t> const & order )
{
std::vector<T> tmp; // create an empty vector
tmp.reserve( data.size() ); // ensure memory and avoid moves in the vector
for ( std::size_t i = 0; i < order.size(); ++i ) {
tmp.push_back( data[order[i]] );
}
data.swap( tmp ); // swap vector contents
}
This code creates and empty (big enough) vector in which a single copy is performed in-order. At the end, the ordered and original vectors are swapped. This will reduce the copies, but still requires extra memory.
If you want to perform the moves in-place, a simple algorithm could be:
template <typename T>
void reorder( std::vector<T> & data, std::vector<std::size_t> const & order )
{
for ( std::size_t i = 0; i < order.size(); ++i ) {
std::size_t original = order[i];
while ( i < original ) {
original = order[original];
}
std::swap( data[i], data[original] );
}
}
This code should be checked and debugged. In plain words the algorithm in each step positions the element at the i-th position. First we determine where the original element for that position is now placed in the data vector. If the original position has already been touched by the algorithm (it is before the i-th position) then the original element was swapped to order[original] position. Then again, that element can already have been moved...
This algorithm is roughly O(N^2) in the number of integer operations and thus is theoretically worse in performance time as compare to the initial O(N) algorithm. But it can compensate if the N^2 swap operations (worst case) cost less than the N copy operations or if you are really constrained by memory footprint.
It's an interesting intellectual exercise to do the reorder with O(1) space requirement but in 99.9% of the cases the simpler answer will perform to your needs:
void permute(vector<T>& values, const vector<size_t>& indices)
{
vector<T> out;
out.reserve(indices.size());
for(size_t index: indices)
{
assert(0 <= index && index < values.size());
out.push_back(std::move(values[index]));
}
values = std::move(out);
}
Beyond memory requirements, the only way I can think of this being slower would be due to the memory of out being in a different cache page than that of values and indices.
You could do it recursively, I guess - something like this (unchecked, but it gives the idea):
// Recursive function
template<typename T>
void REORDER(int oldPosition, vector<T>& vA,
const vector<int>& vecNewOrder, vector<bool>& vecVisited)
{
// Keep a record of the value currently in that position,
// as well as the position we're moving it to.
// But don't move it yet, or we'll overwrite whatever's at the next
// position. Instead, we first move what's at the next position.
// To guard against loops, we look at vecVisited, and set it to true
// once we've visited a position.
T oldVal = vA[oldPosition];
int newPos = vecNewOrder[oldPosition];
if (vecVisited[oldPosition])
{
// We've hit a loop. Set it and return.
vA[newPosition] = oldVal;
return;
}
// Guard against loops:
vecVisited[oldPosition] = true;
// Recursively re-order the next item in the sequence.
REORDER(newPos, vA, vecNewOrder, vecVisited);
// And, after we've set this new value,
vA[newPosition] = oldVal;
}
// The "main" function
template<typename T>
void REORDER(vector<T>& vA, const vector<int>& newOrder)
{
// Initialise vecVisited with false values
vector<bool> vecVisited(vA.size(), false);
for (int x = 0; x < vA.size(); x++)
{
REORDER(x, vA, newOrder, vecVisited);
}
}
Of course, you do have the overhead of vecVisited. Thoughts on this approach, anyone?
To iterate through the vector is O(n) operation. Its sorta hard to beat that.
Your code is broken. You cannot assign to vA and you need to use template parameters.
vector<char> REORDER(const vector<char>& vA, const vector<size_t>& vOrder)
{
assert(vA.size() == vOrder.size());
vector<char> vCopy(vA.size());
for(int i = 0; i < vOrder.size(); ++i)
vCopy[i] = vA[ vOrder[i] ];
return vA;
}
The above is slightly more efficient.
It is not clear by the title and the question if the vector should be ordered with the same steps it takes to order vOrder or if vOrder already contains the indexes of the desired order.
The first interpretation has already a satisfying answer (see chmike and Potatoswatter), I add some thoughts about the latter.
If the creation and/or copy cost of object T is relevant
template <typename T>
void reorder( std::vector<T> & data, std::vector<std::size_t> & order )
{
std::size_t i,j,k;
for(i = 0; i < order.size() - 1; ++i) {
j = order[i];
if(j != i) {
for(k = i + 1; order[k] != i; ++k);
std::swap(order[i],order[k]);
std::swap(data[i],data[j]);
}
}
}
If the creation cost of your object is small and memory is not a concern (see dribeas):
template <typename T>
void reorder( std::vector<T> & data, std::vector<std::size_t> const & order )
{
std::vector<T> tmp; // create an empty vector
tmp.reserve( data.size() ); // ensure memory and avoid moves in the vector
for ( std::size_t i = 0; i < order.size(); ++i ) {
tmp.push_back( data[order[i]] );
}
data.swap( tmp ); // swap vector contents
}
Note that the two pieces of code in dribeas answer do different things.
I was trying to use #Potatoswatter's solution to sort multiple vectors by a third one and got really confused by output from using the above functions on a vector of indices output from Armadillo's sort_index. To switch from a vector output from sort_index (the arma_inds vector below) to one that can be used with #Potatoswatter's solution (new_inds below), you can do the following:
vector<int> new_inds(arma_inds.size());
for (int i = 0; i < new_inds.size(); i++) new_inds[arma_inds[i]] = i;
I came up with this solution which has the space complexity of O(max_val - min_val + 1), but it can be integrated with std::sort and benefits from std::sort's O(n log n) decent time complexity.
std::vector<int32_t> dense_vec = {1, 2, 3};
std::vector<int32_t> order = {1, 0, 2};
int32_t max_val = *std::max_element(dense_vec.begin(), dense_vec.end());
std::vector<int32_t> sparse_vec(max_val + 1);
int32_t i = 0;
for(int32_t j: dense_vec)
{
sparse_vec[j] = order[i];
i++;
}
std::sort(dense_vec.begin(), dense_vec.end(),
[&sparse_vec](int32_t i1, int32_t i2) {return sparse_vec[i1] < sparse_vec[i2];});
The following assumptions made while writing this code:
Vector values start from zero.
Vector does not contain repeated values.
We have enough memory to sacrifice in order to use std::sort
This should avoid copying the vector:
void REORDER(vector<char>& vA, const vector<size_t>& vOrder)
{
assert(vA.size() == vOrder.size());
for(int i = 0; i < vOrder.size(); ++i)
if (i < vOrder[i])
swap(vA[i], vA[vOrder[i]]);
}