Finding the nth largest element in place - C++

I need to find the nth largest element in an array, and currently I'm doing it the following way:
std::vector<double> buffer(sequence); // sequence is const std::vector<double>
std::nth_element(buffer.begin(), buffer.begin() + idx, buffer.end(), std::greater<double>());
nth_element = buffer[idx];
But is there any way to find the n-th largest element in an array without using an external buffer?

You can avoid copying the entire buffer without modifying the original range by using std::partial_sort_copy. Simply copy the partially sorted range into a smaller buffer of size n, and take the last element.
If you may modify the original buffer, then you can simply use std::nth_element in place.
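For instance, a minimal sketch of the std::partial_sort_copy approach (the function name here is mine):

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Copies only the idx+1 largest elements into a small buffer, leaving
// `sequence` untouched; the buffer's last element is then the idx-th
// largest (0-based), matching the question's indexing.
double nth_largest(const std::vector<double>& sequence, std::size_t idx)
{
    std::vector<double> top(idx + 1);
    std::partial_sort_copy(sequence.begin(), sequence.end(),
                           top.begin(), top.end(),
                           std::greater<double>());
    return top.back();
}
```

This only beats the full copy when idx is small relative to the array size.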

You can use the partition function used in Quick Sort to find the nth largest elements.
The partition function divides the array into two parts around a pivot A[i]: one part holds the elements on one side of the pivot and the other part the rest (for the nth largest, it is easiest to partition in descending order, so the larger elements come first), and partition returns the pivot's final position i. If i equals n, return A[i]. If i is smaller than n, partition the part after i; if i is larger than n, partition the part before i.
It won't cost you an extra buffer, and the average time cost is O(n).
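A minimal quickselect sketch along these lines, using a Lomuto-style partition ordered by `>` so that position n holds the n-th largest element (0-based):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Iterative quickselect: partitions in descending order, so after each
// round the pivot sits at its final position i; we then narrow to the
// half that contains position n. Average O(n) time, O(1) extra space,
// modifies the array in place. Precondition: n < a.size().
double quickselect_nth_largest(std::vector<double>& a, std::size_t n)
{
    std::size_t lo = 0, hi = a.size() - 1;
    for (;;) {
        double pivot = a[hi];
        std::size_t i = lo;
        for (std::size_t j = lo; j < hi; ++j)
            if (a[j] > pivot)                  // larger elements go first
                std::swap(a[i++], a[j]);
        std::swap(a[i], a[hi]);                // pivot to final position i
        if (i == n) return a[i];
        if (i < n) lo = i + 1; else hi = i - 1;
    }
}
```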

Related

Can I use the normal min-heap method for solving "Merge k sorted arrays"

We have been given k sorted arrays. Let's say k = 3:
a1={1,4,7} a2={3,5} a3={2,6,7}. Now we are supposed to merge these 3 arrays in sorted order, so the output will be {1,2,3,4,5,6,7,7}.
In the tutorial that I am following, they maintain an index and use pairs to solve this question with min-heaps.
But my question is: since min-heaps store the elements in sorted order, can we just use the heap's push function on all the elements from the k arrays and then print the min-heap at the end, instead of keeping an index and making pairs, in C++?
Sure, but that's slow. You are throwing away the work that has already gone into the input arrays (that they are already sorted) and basically making the sorted array from the unsorted collection of all the elements. Concretely, if all the input arrays have average length n, then you perform k*n inserts into the heap, and then you extract the minimum k*n times. The heap operations have complexity O(log(k*n)). So the entire algorithm takes O(k*n*log(k*n)) time, which you may recognize as the time it takes to sort an unsorted array of size k*n. Surely there's a better way, because you know the input arrays are sorted.
I presume the given solution is to construct k "iterators" into the arrays, place them into a heap sorted by the value at each iterator, and then repeatedly remove the least iterator, consume its value, increment it, and place it back in the heap. The key is that the heap (which is where all the work is happening) is smaller: it contains only k elements instead of k*n. This makes every operation on the heap faster: now the heap operations in this algorithm are O(log k). The overall algorithm is now O(k*n*log k), an improvement.
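A sketch of that heap-of-iterators merge; here the heap stores (value, array, position) triples rather than real iterators, which comes to the same thing:

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// k-way merge: the heap never holds more than k entries, one per input
// array, so each push/pop costs O(log k) and the whole merge O(k*n*log k).
std::vector<int> merge_k(const std::vector<std::vector<int>>& arrays)
{
    using Entry = std::tuple<int, std::size_t, std::size_t>; // value, array, pos
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (std::size_t k = 0; k < arrays.size(); ++k)
        if (!arrays[k].empty())
            heap.emplace(arrays[k][0], k, 0);
    std::vector<int> out;
    while (!heap.empty()) {
        auto [v, k, i] = heap.top();
        heap.pop();
        out.push_back(v);
        if (i + 1 < arrays[k].size())              // advance within array k
            heap.emplace(arrays[k][i + 1], k, i + 1);
    }
    return out;
}
```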
I think this algorithm is what you are looking for.
Algorithm:
Create a min-heap and insert the first element of each of the k arrays. Loop while the min-heap is non-empty: remove the top element of the min-heap and print it, then insert the next element from the array the removed element came from. If that array doesn't have any more elements, replace the root with infinity instead. After replacing the root, heapify the tree.
And about the time needed: insertion and deletion in a min-heap of size k take O(log k) time, so the overall time complexity is O(n * k * log k).
But my question is that since min heaps stores the elements in sorted order so can we just simply use the push function of min heap for all the elements from k arrays and then at the end printing the min heap??
No: the invariant of a min-heap is only that each parent is less than or equal to its children. It imposes no ordering between the left and right subtrees, so the heap's underlying array is generally not a sorted array.

Is it possible to iterate over an array in random order?

So, suppose I have an array:
int arr[5];
Instead of iterating through the array from 0 to 4, is it possible to do that in random order, and in a way that the iterator wouldn't go through the same position twice?
All the below assumes you want to implement your iteration in O(1) space - not using e.g. an auxiliary array of indices, which you could shuffle. If you can afford O(n) space, the solution is relatively easy, so I assume you can only afford O(1).
To iterate over an array in a random manner, you can make a random number generator, and use its output as an index into the array. You cannot use the standard RNGs this way because they can output the same index twice too soon. However, the theory of RNGs is pretty extensive, and you have at least two types of RNG which have guarantees on their period.
Linear-feedback shift register
Lehmer random number generator
To use these theoretical ideas:
Choose a number N ≥ n (where n is the size of your array), but not much greater than n, such that it's possible to construct an RNG with period N
Call your RNG N times, so it generates a permutation of numbers from 0 to N-1
For each number, if it's smaller than the size of your array, output the element of your array with that index
The techniques for making a RNG with a given period may be too annoying to implement in code. So, if you know the size of your array in advance, you can choose N = n, and create your RNG manually (factoring n or n+1 will be the first step). If you don't know the size of your array, choose some easy-to-use number for N, like some suitable power of 2.
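As a concrete sketch of the power-of-two option: instead of an LFSR, a full-period LCG modulo N = 2^k works (by the Hull-Dobell theorem, c odd and a ≡ 1 mod 4 give period exactly N), visiting every value in 0..N-1 once; outputs ≥ n are simply skipped. The constants below are my own toy choices, not tuned for randomness quality:

```cpp
#include <cstddef>
#include <vector>

// Emits the indices 0..n-1 exactly once each, in a scrambled order.
// N is the smallest power of two >= n; the LCG x -> (a*x + c) mod N has
// full period N because c is odd and a % 4 == 1. The iteration itself
// needs only O(1) state; the result vector is just for demonstration.
std::vector<std::size_t> scrambled_index_order(std::size_t n)
{
    std::size_t N = 1;
    while (N < n) N <<= 1;
    const std::size_t a = 5, c = 3;       // toy full-period constants
    std::vector<std::size_t> order;
    std::size_t x = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (x < n) order.push_back(x);    // skip the N - n out-of-range values
        x = (a * x + c) & (N - 1);        // mod N is a cheap mask
    }
    return order;
}
```

Note the scrambling is weak and fully deterministic; it trades randomness quality for the O(1)-space, no-repeats guarantee.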
As mentioned in the comments, there are several ways you can do this. The most obvious way would be to just shuffle the range and then iterate over it:
std::shuffle(arr, arr + 5, std::mt19937{std::random_device{}()}); // std::shuffle needs a generator
for (int a : arr)
// a is a new random element from arr in each iteration
This will modify the range of course, which may not be what you want.
For an option that doesn't modify the range: you can generate a range of indices, shuffle those indices and use those indices to index into the range:
int ind[5];
std::iota(ind, ind + 5, 0);
std::shuffle(ind, ind + 5, std::mt19937{std::random_device{}()});
for (int i : ind)
// arr[i] is a new random element from arr in each iteration
You could also implement a custom iterator that iterates over a range in random order without repeats, though this might be overkill for the functionality you're trying to achieve. It's still a good thing to try out, to learn how that works.

Eigen Sparse Vector : Find max coefficient

I am working with sparse vectors in Eigen, and I need an efficient way to compute the index of the max coefficient (or the nth max coefficient).
My initial method uses Eigen::SparseVector::InnerIterator, but it does not compute the right value for a vector containing only zeros and negative values, because InnerIterator iterates only over the non-zero values.
How can I implement it so that zero values are taken into account?
To get the index of the largest non-zero element, you can use this function:
Eigen::Index maxRow(Eigen::SparseVector<double> const & v)
{
    Eigen::Index nnz = v.nonZeros();
    Eigen::Index rowIdx;
    double value = Eigen::VectorXd::Map(v.valuePtr(), nnz).maxCoeff(&rowIdx);
    // requires special handling if value <= 0.0
    return v.innerIndexPtr()[rowIdx];
}
In case value <= 0 (and v.nonZeros() < v.size()), the maximum is one of the implicit zeros: iterate through innerIndexPtr() until you find a gap between consecutive entries (or write something more sophisticated using std::lower_bound) and return the first missing index.
For getting the nth largest element it depends on how large your n is relative to the vector size, how many non-zeros you have, if you can modify your SparseVector, etc.
In particular, if n is relatively large, consider partitioning your elements into positive and negative ones, then using std::nth_element on the correct half.
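A plain-STL sketch of that partitioning idea, with the sparse vector represented as its non-zero values plus a logical size (so it can run without Eigen; with Eigen you would map valuePtr()/nonZeros() the same way). The function name and 1-based n are my own conventions:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// nth largest (1-based) of a conceptual vector of length `size` whose
// non-zero entries are `vals`; the remaining size - vals.size() entries
// are implicit zeros, which rank between the positives and negatives.
double nth_largest_sparse(std::vector<double> vals, std::size_t size,
                          std::size_t n)
{
    std::size_t zeros = size - vals.size();
    auto mid = std::partition(vals.begin(), vals.end(),
                              [](double x) { return x > 0.0; });
    std::size_t pos = static_cast<std::size_t>(mid - vals.begin());
    if (n <= pos) {                        // answer is among the positives
        std::nth_element(vals.begin(), vals.begin() + (n - 1), mid,
                         std::greater<double>());
        return vals[n - 1];
    }
    if (n <= pos + zeros)                  // answer is an implicit zero
        return 0.0;
    std::size_t k = n - pos - zeros;       // k-th largest negative
    std::nth_element(mid, mid + (k - 1), vals.end(), std::greater<double>());
    return *(mid + (k - 1));
}
```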
Iterate over the index array (innerIndexPtr(), I think) at the same time as the inner iterator, so you can tell which indices are missing and therefore zero.

How to shuffle an array so that all elements change their place

I need to shuffle an array so that all array elements should change their location.
Given an array [0,1,2,3] it would be ok to get [1,0,3,2] or [3,2,0,1] but not [3,1,2,0] (because 2 left unchanged).
I suppose the algorithm would not be language-specific, but just in case: I need it in a C++ program (and I cannot use std::random_shuffle due to the additional requirement).
What about this?
Allocate an array which contains numbers from 0 to arrayLength-1
Shuffle the array
If there is no element in array whose index equals its value, continue to step 4; otherwise repeat from step 2.
Use shuffled array values as indexes for your array.
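The four steps above can be sketched like this (note the retry loop never terminates for a 1-element array, which has no derangement):

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Steps 1-3: shuffle an index array until no index maps to itself,
// i.e. until it is a derangement. Requires n >= 2 (or n == 0).
// On average about one in e shuffles succeeds, so the expected number
// of retries is a small constant.
std::vector<std::size_t> derangement(std::size_t n, std::mt19937& rng)
{
    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), 0);
    for (;;) {
        std::shuffle(idx.begin(), idx.end(), rng);
        bool fixedPoint = false;
        for (std::size_t i = 0; i < n; ++i)
            if (idx[i] == i) { fixedPoint = true; break; }
        if (!fixedPoint) return idx;       // step 4: use idx[] as indexes
    }
}
```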
For each element e
If there is an element to the left of e
Select a random element r to the left of e
swap r and e
This guarantees that each value isn't in the position where it started, but doesn't guarantee that each value changes if there are duplicates.
BeeOnRope notes that though simple, this is flawed. Given the list [0,1,2,3], this algorithm cannot produce the output [1,0,3,2].
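The swap-with-an-element-to-the-left procedure above is Sattolo's algorithm; it always produces a single n-cycle, which is exactly why, as noted, it can never output two disjoint swaps like [1,0,3,2]. A sketch:

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Sattolo's algorithm: every element is swapped with a strictly earlier
// one, so the resulting permutation is one big cycle and no element
// (for n >= 2) keeps its original position.
void sattolo(std::vector<int>& v, std::mt19937& rng)
{
    for (std::size_t i = v.size(); i-- > 1; ) {
        std::uniform_int_distribution<std::size_t> left(0, i - 1);
        std::swap(v[i], v[left(rng)]);     // pick strictly to the left
    }
}
```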
It's not going to be very random, but you can rotate all the elements at least one position:
std::rotate(v.begin(), v.begin() + 1 + std::rand() % (v.size() - 1), v.end()); // rotate by 1 .. size-1 positions
If v was {1,2,3,4,5,6,7,8,9} at the beginning, then after rotation it will be, for example: {2,3,4,5,6,7,8,9,1}, or {3,4,5,6,7,8,9,1,2}, etc.
All elements of the array will change position.
I kind of have an idea in mind; I hope it fits your application. Keep one more container, a std::map<int, std::vector<int>>: the key is the index, and the vector holds the values already used at that index.
For example, for the first element you use rand() to pick which element of the array to use, then check the map to see whether that element has already been used for this index.

How does one remove duplicate elements in place in an array in O(n) in C or C++?

Is there any method to remove the duplicate elements in an array in place in C/C++ in O(n)?
Suppose elements are a[5]={1,2,2,3,4}
then resulting array should contain {1,2,3,4}
The solution can be achieved using two for loops but that would be O(n^2) I believe.
If, and only if, the source array is sorted, this can be done in linear time:
std::unique(a, a + 5); //Returns a pointer to the new logical end of a.
Otherwise you'll have to sort first, which is (99.999% of the time) n lg n.
Best case is O(n log n). Perform a heap sort on the original array: O(n log n) in time, O(1)/in-place in space. Then run through the array sequentially with 2 indices (source & dest) to collapse out repetitions. This has the side effect of not preserving the original order, but since "remove duplicates" doesn't specify which duplicates to remove (first? second? last?), I'm hoping that you don't care that the order is lost.
If you do want to preserve the original order, there's no way to do things in-place. But it's trivial if you make an array of pointers to elements in the original array, do all your work on the pointers, and use them to collapse the original array at the end.
Anyone claiming it can be done in O(n) time and in-place is simply wrong, modulo some arguments about what O(n) and in-place mean. One obvious pseudo-solution, if your elements are 32-bit integers, is to use a 4-gigabit bit-array (512 megabytes in size) initialized to all zeros, flipping a bit on when you see that number and skipping over it if the bit was already on. Of course then you're taking advantage of the fact that n is bounded by a constant, so technically everything is O(1) but with a horrible constant factor. However, I do mention this approach since, if n is bounded by a small constant - for instance if you have 16-bit integers - it's a very practical solution.
Yes. Because access (insertion or lookup) on a hashtable is O(1), you can remove duplicates in O(N).
Pseudocode:
hashtable h = {}
numdups = 0
for (i = 0; i < input.length; i++) {
    if (!h.contains(input[i])) {
        input[i - numdups] = input[i]
        h.add(input[i])
    } else {
        numdups = numdups + 1
    }
}
This is O(N).
Some commenters have pointed out that whether a hashtable is O(1) depends on a number of things. But in the real world, with a good hash, you can expect constant-time performance. And it is possible to engineer a hash that is O(1) to satisfy the theoreticians.
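The pseudocode above translated to C++, using std::unordered_set as the hash table (expected O(n), order-preserving):

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Compacts the first occurrence of each value to the front, preserving
// order, then shrinks the vector; expected O(n) with a decent hash.
std::size_t dedup(std::vector<int>& a)
{
    std::unordered_set<int> seen;
    std::size_t dst = 0;
    for (int x : a)
        if (seen.insert(x).second)   // .second is true on first occurrence
            a[dst++] = x;
    a.resize(dst);
    return dst;
}
```

Note this is in-place with respect to the array itself, but the set uses O(n) auxiliary memory, which is the point of contention in the surrounding discussion.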
I'm going to suggest a variation on Borealid's answer, but I'll point out up front that it's cheating. Basically, it only works assuming some severe constraints on the values in the array - e.g. that all keys are 32-bit integers.
Instead of a hash table, the idea is to use a bitvector. This is an O(1) memory requirement, which should in theory keep Rahul happy (but won't). With 32-bit integers, the bitvector will require 512 MB (i.e. 2^32 bits) - assuming 8-bit bytes, as some pedant may point out.
As Borealid would point out, this is a hashtable - just using a trivial hash function. This does guarantee that there won't be any collisions: the only way there could be a collision is by having the same value in the input array twice, but since the whole point is to ignore the second and later occurrences, this doesn't matter.
Pseudocode for completeness...
src = dest = input.begin();
while (src != input.end())
{
    if (!bitvector[*src])
    {
        bitvector[*src] = true;
        *dest = *src;
        dest++;
    }
    src++;
}
// at this point, dest gives the new end of the array
Just to be really silly (but theoretically correct), I'll also point out that the space requirement is still O(1) even if the array holds 64-bit integers. The constant term is a bit big, I agree, and you may have issues with 64-bit CPUs that can't actually use the full 64 bits of an address, but...
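To make the bitvector idea concrete at a practical scale: for 16-bit values the table is only 2^16 bits (8 KiB), and the same src/dest compaction applies. A sketch:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitvector dedup for 16-bit values: 2^16 bits (8 KiB) of constant
// extra space, one pass, keeps the first occurrence of each value.
std::size_t dedup16(std::vector<std::uint16_t>& a)
{
    std::vector<bool> seen(1u << 16, false);   // the "trivial hash" table
    std::size_t dst = 0;
    for (std::uint16_t x : a)
        if (!seen[x]) {
            seen[x] = true;
            a[dst++] = x;
        }
    a.resize(dst);
    return dst;
}
```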
Take your example. If the array elements are bounded integers, you can create a lookup bit array.
If you find the integer 3, turn the 3rd bit on.
If you find the integer 5, turn the 5th bit on.
If the array contains elements other than integers, or the elements are not bounded, a hashtable is a good choice, since hashtable lookup costs constant time on average.
The canonical implementation of the unique() algorithm looks like something similar to the following:
template<typename Fwd>
Fwd unique(Fwd first, Fwd last)
{
    if (first == last) return first;
    Fwd result = first;
    while (++first != last) {
        if (!(*result == *first))
            *(++result) = *first;
    }
    return ++result;
}
This algorithm takes a range of sorted elements. If the range is not sorted, sort it before invoking the algorithm. The algorithm will run in-place, and return an iterator pointing to one-past-the-last-element of the unique'd sequence.
If you can't sort the elements first, then you've cornered yourself: you have no choice but to use an algorithm with runtime worse than O(n) for the task.
On a sorted range, this algorithm runs in O(n) - big-oh of n in the worst case, not amortized - and it uses O(1) space.
The example you have given is a sorted array; it is possible only in that case (given your constant-space constraint).