Sorting using a three number comparator - c++

I have an array of size n waiting to be sorted. Unlike the ordinary sorting problem, I'm constrained to use a specific comparator, which receives three numbers and tells me the maximum and minimum of the three. My goal is to call the comparator as few times as possible before the array is completely sorted. What strategy can I use?
Thanks for any help!

Since your three-way comparator can be implemented by three calls to a normal comparator, that means we can't improve on any normal sorting algorithm by a factor of more than 3. A more careful argument shows that, because each three-way comparison gives us log₂ 6 ≈ 2.585 bits of information, we can't improve by a factor of more than that. Intuitively, when sorting with a normal comparator you might compare a <= b and b <= c, and therefore not need to compare a and c anyway; so the possible speedup factor could be as small as 2.
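The information-theoretic bound can be spelled out as a short counting argument. This is only a sketch: it assumes, as the paragraph above does, that one call reveals the complete order of its three arguments (one of 3! = 6 outcomes), and C_3(n) is a name introduced here for the worst-case number of comparator calls.

```latex
% n! possible input orderings; each call distinguishes at most 6 outcomes
6^{\,C_3(n)} \ge n!
\quad\Longrightarrow\quad
C_3(n) \;\ge\; \log_6 n! \;=\; \frac{\log_2 n!}{\log_2 6}
\;=\; \frac{n \log_2 n - O(n)}{\log_2 6}
\;\approx\; 0.387\, n \log_2 n
```

So no strategy can go below roughly 0.387 n log₂ n calls in the worst case, compared with roughly n log₂ n for an ordinary two-way comparator.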
So asymptotically, we're still looking for an O(n log n) algorithm, and the question is how to exploit the comparator to do fewer comparisons by at least a factor 2. The "obvious" thing to try first is modifying an existing comparison-based sorting algorithm; a good candidate is bottom-up heapsort, which does about n log₂ n comparisons in the average case, and 1.5 n log₂ n in the worst case (Wikipedia). This beats the standard quicksort algorithm, which does about 1.39 n log₂ n comparisons in the average case (Wikipedia).
The algorithm works using two basic operations on a heap, "sift down" and "sift up".
The "sift down" operation requires comparing a parent element with its two children, to see if the parent element is greater than or equal to both its children, or if not, which child the parent should be swapped with. We can use the three-way comparator to compare the parent with both children at once.
The "sift up" operation compares a child with its parent, and swaps them if they are out of order; this is then repeated all the way up to the root node. We can use the three-way comparator to compare the child node with its parent and its grandparent at once.
The heapsort algorithm only calls the comparator within those two operations, and for both operations the three-way comparator can be called fewer times by a factor of 2. This isn't necessarily the best you can do, but it starts from a very efficient algorithm, and matches the worst-case speedup factor given by intuition.
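As a rough illustration of the "sift down" half of this idea, here is a minimal sketch. It is plain heapsort, not the bottom-up variant, and it does not implement the "sift up" trick. The names `compare3`, `MinMax`, `comparator_calls`, and `heapsort3` are made up for this example; in particular, `compare3` is only a stand-in for the question's comparator, assumed here to report the positions of the minimum and maximum.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's comparator: given three positions,
// report which one holds the minimum and which one holds the maximum.
struct MinMax {
    std::size_t min_idx;
    std::size_t max_idx;
};

std::size_t comparator_calls = 0;  // counts calls, for illustration only

MinMax compare3(const std::vector<int>& a, std::size_t i, std::size_t j, std::size_t k) {
    ++comparator_calls;
    MinMax r{i, i};
    const std::size_t others[2] = {j, k};
    for (std::size_t x : others) {
        if (a[x] < a[r.min_idx]) r.min_idx = x;
        if (a[x] > a[r.max_idx]) r.max_idx = x;
    }
    return r;
}

// Sift a[root] down inside the max-heap a[0..n): one comparator call per level.
void sift_down(std::vector<int>& a, std::size_t root, std::size_t n) {
    assert(n <= a.size());
    while (2 * root + 1 < n) {
        const std::size_t left = 2 * root + 1;
        // With a single child, pass it twice so the call is still well-formed.
        const std::size_t right = (left + 1 < n) ? left + 1 : left;
        const std::size_t largest = compare3(a, root, left, right).max_idx;
        if (largest == root) return;  // parent already dominates both children
        std::swap(a[root], a[largest]);
        root = largest;
    }
}

void heapsort3(std::vector<int>& a) {
    const std::size_t n = a.size();
    for (std::size_t i = n / 2; i-- > 0;) sift_down(a, i, n);  // build the heap
    for (std::size_t end = n; end > 1; --end) {
        std::swap(a[0], a[end - 1]);  // move the current maximum into place
        sift_down(a, 0, end - 1);
    }
}
```

Each level of a sift-down costs one call here, where an ordinary comparator would need two comparisons.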

Well, I came up with an idea.
Let's remember how quicksort works:
First, we pick a pivot that approximates the median value (for example (a[0] + a[N-1])/2 if you're feeling lazy).
Then, we divide the array in two, according to whether each element is less than or greater than the pivot.
Finally, we run the algorithm recursively on each of the two subarrays.
Using your comparator, you can halve the number of comparator calls in the second phase by processing two values at once:
compare(median, a[2 * i], a[2 * i + 1])
if min is median, both are greater and go to the right subarray
if max is median, both are less and go to the left subarray
if neither is median, min goes left, and max goes right
After that, run recursive part of the algorithm as usual.
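The partitioning step described above might look like the following sketch. The comparator `compare3` is a hypothetical stand-in that returns the minimum and maximum values of its three arguments, and `partition_pairs`, `MinMaxValue`, and `partition_calls` are names invented for this example. Elements equal to the pivot may land on either side, and an odd leftover element is handled by passing it twice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct MinMaxValue {
    double min;
    double max;
};

std::size_t partition_calls = 0;  // counts comparator calls, for illustration only

// Hypothetical stand-in for the question's comparator.
MinMaxValue compare3(double a, double b, double c) {
    ++partition_calls;
    return {std::min({a, b, c}), std::max({a, b, c})};
}

// Appends every element of a to left (values <= pivot) or right (values >= pivot),
// spending one comparator call per pair of elements.
void partition_pairs(const std::vector<double>& a, double pivot,
                     std::vector<double>& left, std::vector<double>& right) {
    const std::size_t before = left.size() + right.size();
    std::size_t i = 0;
    for (; i + 1 < a.size(); i += 2) {
        const MinMaxValue r = compare3(pivot, a[i], a[i + 1]);
        if (r.min == pivot) {         // pivot <= both values
            right.push_back(a[i]);
            right.push_back(a[i + 1]);
        } else if (r.max == pivot) {  // both values <= pivot
            left.push_back(a[i]);
            left.push_back(a[i + 1]);
        } else {                      // pivot lies strictly between the two values
            left.push_back(r.min);
            right.push_back(r.max);
        }
    }
    if (i < a.size()) {  // odd element left over: pass it twice
        const MinMaxValue r = compare3(pivot, a[i], a[i]);
        (r.min == pivot ? right : left).push_back(a[i]);
    }
    assert(left.size() + right.size() == before + a.size());
}
```

For n elements this makes about n/2 comparator calls per partitioning pass, instead of n comparisons with an ordinary comparator.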

Well, I have an idea. Using a 4-way mergesort optimized with a loser tree, the number of comparator calls can be reduced to less than 0.5 n log₂ n, by my rough estimate.

Related

performance: find the index of max value in an arr(tie allowed)

Just as the title says. BTW, this is just out of curiosity; it's not a homework question. It might seem trivial to CS majors. I would like to find the indices of the max value in an array. Basically I have two approaches.
scan over and find the maximum, then scan twice to get the vector of indices
scan over and find the maximum, along this scan construct indices array and abandon if a better one is there.
May I know how I should weigh these two approaches in terms of performance (mainly time complexity, I suppose)? It is hard for me because I have no idea what the worst case would even be for the second approach! It's not a hard problem per se, but I want to know how to approach this kind of problem, or what I should search for to find the answer.
In term of complexity:
scan over and find the maximum,
then scan twice to get the vector of indices
First scan is O(n).
Second scan is O(n) + k insertions (with k, the number of max value)
vector::push_back has amortized complexity of O(1).
so a total O(2 * n + k) which might be simplified to O(n) as k <= n
scan over and find the maximum,
along this scan construct indices array and abandon if a better one is there.
Scan is O(n).
Number of insertions is more complicated to compute.
Number of clears (and number of elements cleared) is more complicated to compute too. (clear's complexity is at most linear in the number of elements removed.)
But both are bounded above by n, so the total work is at most 3 * n operations, which is O(n); it is also at least n (the scan), so it is O(n) too.
So for both methods, complexity is the same: O(n).
For performance timing, as always, you have to measure.
For the single-pass method, you add an index to the array whenever the current element equals the running max, and you clear the array whenever the max changes. You don't need to iterate twice.
The two-pass method is easier to implement. You just find the max on the first pass, then collect the indices that match on the second pass.
As stated in a previous answer, complexity is O(n) in both cases, and measures are needed to compare performances.
However, I would like to add two points:
The first one is that the performance comparison may depend on the compiler, how optimisation is performed.
The second point is more critical: performance may depend on the input array.
For example, let us consider the corner case 1, 1, 1, ..., 1, 2, i.e. a huge number of 1s followed by a single 2. With your second approach, you will create a huge temporary array of indices, only to end up with an array of one element. It is possible to shrink the memory allocated to this array at the end. However, I don't like the idea of creating an unnecessarily huge temporary vector, independently of the time performance concern. Note that such an array could undergo several reallocations, which would impact time performance.
This is why in the general case, without any knowledge on the input, I would prefer your first approach, two scans. The situation could be different if you want to implement a function dedicated to a specific type of data.
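For reference, the two approaches from the question could be sketched as follows. The function names are made up for this example, and `int` elements are assumed; both variants are O(n), so only measurement on realistic input can tell which is faster.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Approach 1: find the maximum, then scan again to collect the matching indices.
std::vector<std::size_t> max_indices_two_pass(const std::vector<int>& v) {
    std::vector<std::size_t> idx;
    if (v.empty()) return idx;
    int best = v[0];
    for (int x : v)
        if (x > best) best = x;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] == best) idx.push_back(i);
    return idx;
}

// Approach 2: single scan; restart the index list whenever a larger value appears.
std::vector<std::size_t> max_indices_one_pass(const std::vector<int>& v) {
    std::vector<std::size_t> idx;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (idx.empty() || v[i] > v[idx[0]]) {
            idx.clear();  // capacity is kept, so clearing does not free memory
            idx.push_back(i);
        } else if (v[i] == v[idx[0]]) {
            idx.push_back(i);
        }
    }
    return idx;
}
```

On the corner case 1, 1, ..., 1, 2 the single-pass version first grows the index vector to nearly n entries and then discards them, which is the memory concern raised above.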

How good is this optimization to heapsort by dividing into 2 parts

I was thinking about quicksort not finding the exact midpoint for pivot.
Any effort to find exact midpoint as pivot slows down quicksort & is not worth it.
So is it possible to accomplish that using heapsort & is it any worthwhile?
I selected heapsort because it can find next max/min in logarithmic time.
If we divide heapsort array into 2 parts.
1) In the left half, we find max heap. (n/2-1 comparisons)
2) In the right half, we find min heap. (n/2-1 comparisons)
3) While
(max in left half is > min in right half){
-- swap max in left half with min in right half
-- heapify the swapped elements in respective halves
(i.e. find next max in left half
& find next min in right half).
}
end while loop.
When this loop ends, we have two completely disjoint halves.
So far there is no improvement over regular heapsort.
1) We can complete the remaining heapification in each half (log n/2 for remaining elements at most).
So any element that was in the correct half would heapify log n/2 at most instead of log n at most.
This is one optimization.
The other optimizations can be:
2) We may be able to recursively apply this in each disjoint half (divide & conquer).
3) Also we can exclude the central 2 elements from subsequent disjoint partitions, because they are already in their invariant location
e.g. 1-16 (n-1 comparisons to find max/min)
we have 1-7 & 8-16 partition in the first step
second step may have 4 partitions
(7 & 8 are in invariant location) (so n-3 comparisons to find max/min)
3rd step may have 8 partitions
with 4 more elements in invariant location.
So n-7 comparisons to find max/min in each partition.
I am trying to implement this,
but I would like to know if anybody sees any theoretical advantage in this approach, or whether it is no good.
For already sorted input, there will be no swapping & we just go on finding max/min in subsequent halves.
For input sorted in descending order, all elements get swapped & heapified with no chance to divide & conquer. So it will be as good or as bad as normal heapsort; this may be the worst case.
For all other inputs, we will see an improvement after max/min swapping stops.
You have an O(n) pass that creates two heaps. Then in the worst case you have (n/2)*(log n/2) replacements in each of the two heaps. At this point you've already done n*log(n/2) operations, and you haven't even started sorting. You will require another n*log(n/2) operations to finish the sort.
Contrast that to heapsort, which has the O(n) pass that creates a single heap, and then n*log(n) operations to complete sorting the array.
I see no particular advantage to building two heaps of size n/2 rather than a single heap of size n. In the best case you have more complicated code that has the same or worse asymptotic complexity, and is unlikely to give you a real-world increase in performance.
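To make the cost discussion concrete, the partitioning loop proposed in the question (swap while the left maximum exceeds the right minimum) can be sketched with the standard heap algorithms. This is one possible reading of the question, not a tuned implementation; `split_with_two_heaps` is a name invented here. Each loop iteration performs two `pop_heap` and two `push_heap` calls, i.e. O(log n) work, which is where the n*log(n/2) worst case above comes from.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Rearranges a so that every element of the left half is <= every element of
// the right half, using a max-heap on the left and a min-heap on the right.
void split_with_two_heaps(std::vector<int>& a) {
    const auto first = a.begin();
    const auto mid = first + a.size() / 2;
    const auto last = a.end();
    if (first == mid) return;  // fewer than two elements: nothing to do
    std::make_heap(first, mid);                      // max-heap on the left half
    std::make_heap(mid, last, std::greater<int>());  // min-heap on the right half
    while (*first > *mid) {  // left maximum is larger than right minimum
        std::pop_heap(first, mid);                      // left maximum -> *(mid - 1)
        std::pop_heap(mid, last, std::greater<int>());  // right minimum -> *(last - 1)
        std::iter_swap(mid - 1, last - 1);              // exchange the two values
        std::push_heap(first, mid);                     // re-insert into the left heap
        std::push_heap(mid, last, std::greater<int>()); // re-insert into the right heap
    }
    assert(*first <= *mid);
}
```

After the call the two halves are disjoint in value but not yet sorted, so a full sort still has to follow.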

Is the complexity of Dijkstra's correct?

I have a question regarding the runtime complexity of Dijkstra's algorithm (see the pseudocode in CLRS version 3):
DIJKSTRA(G, w, s)
1  INITIALIZE-SINGLE-SOURCE(G, s)
2  S ← ∅
3  Q ← V[G]
4  while Q != ∅
5      do u ← EXTRACT-MIN(Q)
6         S ← S ∪ {u}
7         for each vertex v ∈ Adj[u]
8             do RELAX(u, v, w)
I understand that line 3 is O(V), line 5 is O(V log V) in total, and line 7 is O(E) in total; line 8 implies decrease_key(), so log V for each Relax() operation. But in Relax(), once d[v] > d[u] + weight holds and we decide to relax, shouldn't we look up the position of v in the queue Q before we call decrease_key(Q, pos, d[v]) to replace the key at pos with d[v]? Note that this lookup itself costs O(V), so each Relax() should cost O(V), not O(log V), right?
A question regarding space complexity: to compare the vertices in the queue Q, I designed a struct/class vertex with the distance as one member, and implemented operator< to order vertices by their distance. But it seems I have to define a duplicate array dist[] in order to do dist[v] = dist[u] + weight in Relax(). If I do not define the duplicate array, I have to look up the positions of v and u in the queue Q and then obtain and check their distances. Is it supposed to work this way, or is my implementation just not good?
Dijkstra's algorithm (as you wrote it) does not have a runtime complexity unless you specify the data structures. You are right in a sense when you say that "line 7" accounts for O(E) operations, but let's go through the lines (fortunately, Dijkstra is "easy" to analyze).
1. Initializing means "giving all vertices an infinite distance, except for the source, which has distance 0". Pretty easy, this can be done in O(V).
2. What is the set S good for? You use it "write only".
3. You put all elements into a queue. Here be dragons. What is a (priority!) queue? A data structure with the operations add, optionally decreaseKey (needed for Dijkstra), remove (not needed in Dijkstra), and extractMin. Depending on the implementation, these operations have certain runtimes. For example, you can build a dumb PQ that is just a (marking) set: then adding and decreasing a key is constant time, but for extracting the minimum, you have to search. The canonical solution in Dijkstra is to use a queue (like a heap) that implements all relevant operations in O(log n). Let's analyze for this case, although technically speaking a Fibonacci heap would be better. Don't implement the queue on your own. It's amazing how much you can save by using a real PQ implementation.
4. You go through the loop V times (once per vertex).
5. Every time, you extract the minimum, which is in O(V log V) total (over all iterations).
6. What is the set S good for?
7. You go through the edges of each vertex at most once, i.e. you touch each edge at most twice, so in total you do whatever happens inside the loop O(E) times.
8. Relaxing means checking whether you have to decrease a key, and doing so if needed. We already know that each such operation can cost O(log V) in the queue (if it's a heap), and we have to do it O(E) times, so it's O(E log V), which dominates the total runtime.
If you take a Fibonacci heap, you can go down to O(V log V + E), but that's academic. Real implementations tune heaps. If you want to know your implementation's performance, analyze the PQ operations. But as I said, it's better to use existing implementations if you don't know exactly what you're doing. Your idea of "looking up a position before calling decreaseKey" tells me you should dig deeper into that topic before you come up with an implementation which effectively takes O(V) per insert (by sorting every time decreaseKey is called) or O(V) per extractMin (by finding the minimum on demand).
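One common way to get O(E log V) from an off-the-shelf priority queue, without ever looking up a vertex's position, is to skip decreaseKey entirely: push a fresh (distance, vertex) entry on every successful relaxation and ignore stale entries when they are popped. Note that this swaps the textbook decrease-key formulation for "lazy deletion". The sketch below assumes a graph stored as adjacency lists with non-negative weights; the `Edge` alias and the function name are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, long long>;  // (target vertex, non-negative weight)

// Returns the shortest distance from source to every vertex
// (numeric_limits<long long>::max() for unreachable vertices).
std::vector<long long> dijkstra(const std::vector<std::vector<Edge>>& adj, std::size_t source) {
    assert(source < adj.size());
    const long long inf = std::numeric_limits<long long>::max();
    std::vector<long long> dist(adj.size(), inf);    // the separate dist[] array
    using Item = std::pair<long long, std::size_t>;  // (tentative distance, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> queue;
    dist[source] = 0;
    queue.push(Item(0, source));
    while (!queue.empty()) {
        const Item top = queue.top();
        queue.pop();
        const long long d = top.first;
        const std::size_t u = top.second;
        if (d > dist[u]) continue;  // stale entry: u was already settled with a smaller key
        for (const Edge& e : adj[u]) {
            const long long candidate = d + e.second;
            if (candidate < dist[e.first]) {
                dist[e.first] = candidate;             // RELAX
                queue.push(Item(candidate, e.first));  // instead of decrease_key
            }
        }
    }
    return dist;
}
```

The queue can hold up to O(E) entries, so the bound is O(E log E), which is the same as O(E log V) because E is at most V². Keeping `dist[]` outside the queue, as the question suspects, is the usual design.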

What would be the most efficient way to find a[i] = i in a sorted array?

Given an array a[], what would be the most efficient way to determine whether or not at least one element i satisfies the condition a[i] == i?
All the elements in the array are sorted and distinct, but they aren't necessarily integer types (i.e. they might be floating point types).
Several people have made claims about the relevance of “sorted”, “distinct” and “aren't necessarily integers”. In fact, proper selection of an efficient algorithm to solve this problem hinges on these characteristics. A more efficient algorithm would be possible if we could know that the values in the array were both distinct and integral, while a less efficient algorithm would be required if the values might be non-distinct, whether or not they were integral. And of course, if the array was not already sorted, you could sort it first (at average complexity O(n log n)) and then use the more efficient pre-sorted algorithm (i.e. for a sorted array), but in the unsorted case it would be more efficient to simply leave the array unsorted and run through it directly comparing the values in linear time (O(n)). Note that regardless of the algorithm chosen, best-case performance is O(1) (when the first element examined contains its index value); at any point during execution of any algorithm we might come across an element where a[i] == i at which point we return true; what actually matters in terms of algorithm performance in this problem is how quickly we can exclude all elements and declare that there is no such element a[i] where a[i] == i.
The problem does not state the sort order of a[], which is a pretty critical piece of missing information. If it’s ascending, the worst-case complexity will always be O(n), there’s nothing we can do to make the worst-case complexity better. But if the sort order is descending, even the worst-case complexity is O(log n): since values in the array are distinct and descending, there is only one possible index where a[i] could equal i, and basically all you have to do is a binary search to find the crossover point (where the ascending index values cross over the descending element values, if there even is such a crossover), and determine if a[c] == c at the crossover point index value c. Since that’s pretty trivial, I’ll proceed assuming that the sort order is ascending. Interestingly if the elements were integers, even in the ascending case there is a similar “crossover-like” situation (though in the ascending case there could be more than one a[i] == i match), so if the elements were integers, a binary search would also be applicable in the ascending case, in which case even the worst-case performance would be O(log n) (see Interview question - Search in sorted array X for index i such that X[i] = i). But we aren’t given that luxury in this version of the problem.
Here is how we might solve this problem:
Begin with the first element, a[0]. If its value is == 0, you’ve found an element which satisfies a[i] == i so return true. If its value is < 1, the next element (a[1]) could possibly contain the value 1, so you proceed to the next index. If, however, a[0] >= 1, you know (because the values are distinct) that the condition a[1] == 1 cannot possibly be true, so you can safely skip index 1. But you can even do better than that: For example, if a[0] == 12, you know (because the values are sorted in ascending order) that there cannot possibly be any elements that satisfy a[i] == i prior to element a[13]. Because the values in the array can be non-integral, we cannot make any further assumptions at this point, so the next element we can safely skip to directly is a[13] (e.g. a[1] through a[12] may all contain values between 12.000... and 13.000... such that a[13] could still equal exactly 13, so we have to check it).
Continuing that process yields an algorithm as follows:
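// Algorithm 1
#include <cmath>
#include <cstddef>

bool algorithm1(const double* a, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i) // worst case is O(n)
    {
        if (a[i] == i)
            return true; // of course we could also return i here (as an int)...
        if (a[i] >= len)
            return false; // every later value is larger still, so nothing can match
        if (a[i] > i)
            i = static_cast<std::size_t>(std::floor(a[i])); // ++i then lands on floor(a[i]) + 1
    }
    return false; // ......in which case we’d want to return -1 here (an int)
}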
This has pretty good performance if many of the values in a[] are greater than their index value, and has excellent performance if all values in a[] are greater than n (it returns false after only one iteration!), but it has dismal performance if all values are less than their index value (it will return false after n iterations). So we return to the drawing board... but all we need is a slight tweak. Consider that the algorithm could have been written to scan backwards from n down to 0 just as easily as it can scan forward from 0 to n. If we combine the logic of iterating from both ends toward the middle, we get an algorithm as follows:
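// Algorithm 2
#include <cmath>
#include <cstddef>

bool algorithm2(const double* a, std::size_t len)
{
    // Signed indices, so that j can step below 0 without wrapping around.
    std::ptrdiff_t i = 0, j = static_cast<std::ptrdiff_t>(len) - 1;
    while (i <= j) // worst case is still O(n); i <= j so the middle element is checked too
    {
        if (a[i] == i || a[j] == j)
            return true;
        if (a[i] > j || a[j] < i)
            return false; // no index left in [i, j] can match
        if (a[i] > i)
            i = static_cast<std::ptrdiff_t>(std::floor(a[i]));
        if (a[j] < j)
            j = static_cast<std::ptrdiff_t>(std::ceil(a[j]));
        ++i;
        --j;
    }
    return false;
}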
This has excellent performance in both of the extreme cases (all values are less than 0 or greater than n), and has pretty good performance with pretty much any other distribution of values. The worst case is if all of the values in the lower half of the array are less than their index and all of the values in the upper half are greater than their index, in which case the performance degrades to the worst-case of O(n). Best case (either extreme case) is O(1), while average case is probably O(log n) but I’m deferring to someone with a math major to determine that with certainty.
Several people have suggested a “divide and conquer” approach to the problem, without specifying how the problem could be divided and what one would do with the recursively divided sub-problems. Of course such an incomplete answer would probably not satisfy the interviewer. The naïve linear algorithm and worst-case performance of algorithm 2 above are both O(n), while algorithm 2 improves the average-case performance to (probably) O(log n) by skipping (not examining) elements whenever it can. The divide-and-conquer approach can only outperform algorithm 2 if, in the average case, it is somehow able to skip more elements than algorithm 2 can skip.
Let’s assume we divide the problem by splitting the array into two (nearly) equal contiguous halves, recursively, and decide if, with the resulting sub-problems, we are likely to be able to skip more elements than algorithm 2 could skip, especially in algorithm 2’s worst case. For the remainder of this discussion, let’s assume an input that would be worst-case for algorithm 2.
After the first split, we can check both halves’ top & bottom elements for the same extreme case that results in O(1) performance for algorithm 2, yet results in O(n) performance with both halves combined. This would be the case if all elements in the bottom half are less than 0 and all elements in the upper half are greater than n-1. In these cases, we can immediately exclude the bottom and/or top half with O(1) performance for any half we can exclude. Of course the performance of any half that cannot be excluded by that test remains to be determined after recursing further, dividing that half by half again until we find any segment whose top or bottom element contains its index value. That’s a reasonably nice performance improvement over algorithm 2, but it occurs in only certain special cases of algorithm 2’s worst case.
All we’ve done with divide-and-conquer is decrease (slightly) the proportion of the problem space that evokes worst-case behavior. There are still worst-case scenarios for divide-and-conquer, and they exactly match most of the problem space that evokes worst-case behavior for algorithm 2.
So, given that the divide-and-conquer algorithm has fewer worst-case scenarios, doesn’t it make sense to go ahead and use a divide-and-conquer approach?
In a word, no. Well, maybe. If you know up front that about half of your data is less than 0 and half is greater than n, this special case would generally fare better with the divide-and-conquer approach. Or, if your system is multicore and your ‘n’ is large, it might be helpful to split the problem evenly between all of your cores, but once it’s split between them, I maintain that the sub-problems on each core are probably best solved with algorithm 2 above, avoiding further division of the problem and certainly avoiding recursion, as I argue below....
At each recursion level of a recursive divide-and-conquer approach, the algorithm needs some way to remember the as-yet-unsolved 2nd half of the problem while it recurses into the 1st half. Often this is done by having the algorithm recursively call itself first for one half and then for the other, a design which maintains this information implicitly on the runtime stack. Another implementation might avoid recursive function calls by maintaining essentially this same information on an explicit stack. In terms of space growth, algorithm 2 is O(1), but any recursive implementation is unavoidably O(log n) due to having to maintain this information on some sort of stack. But aside from the space issue, a recursive implementation has extra runtime overhead of remembering the state of as-yet-unrecursed-into subproblem halves until such time as they can be recursed into. This runtime overhead is not free, and given the simplicity of algorithm 2’s implementation above, I posit that such overhead is proportionally significant. Therefore I suggest that algorithm 2 above will roundly spank any recursive implementation for the vast majority of cases.
In the worst case, you can't do any better than checking every element. (Imagine something like a[i] = i + uniform_random(-.25, .25).) You'll need some information on what your input looks like.
Actually I would start from the last element and do a basic check (for example, if you have 1000 elements but the highest value is 100, you know you only need to check indices 0..100). In the worst case you still need to check every element, but this should find the areas where a match is possible faster. If the input is as stated above (a[i] = i + [-0.25..0.25]), you are out of luck and need to check every single element.
For a sorted array, you can perform an interpolation search. It is similar to a binary search but, assuming an even distribution of values, can be faster.
I think the main problem here is your conflicting statements:
a[i] == i
All the elements in the array are sorted and distinct, they need not be integer always.
If the array's value is equal to the subscript used to access it, that means it's an integer. If it's not an integer, and the elements are, say, chars, what is considered "sorted"? ASCII value (A < B < C)?
If it were an array of chars would we consider:
a[i] == i
to be true if
i == 65 (decimal) && a[i] == 'A'
If I were in this interview I would be grilling the interviewer with follow up questions before answering. That said...
If all we know is what you stated, we can safely say that we can find the value in O(n) because that is the time to make one full pass of the array. With more details we can probably limit this to O(log(n)) with a binary search of the array.
Notice that all the elements in the array are sorted and distinct, so if the elements are integers and we construct a new array b with b[i]=a[i]-i, the elements of b are also sorted (non-decreasing), and what we need to find is the zeros in b. I think binary search can solve the problem! This is the same task as counting the number of occurrences of a value in a sorted array. You can also apply a similar divide & conquer technique to the original array without constructing an auxiliary array! The time complexity is O(log n)! (For non-integer elements b need not be sorted, so this does not apply.)
Take this as an example:
a=[0,1,2,4,8]
b=[0,0,0,1,4]
What we need to find is exactly index 0,1,2
Hope it helps!
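A minimal sketch of that binary search, working on the original array without building b. It assumes the integer variant of the problem (sorted ascending, distinct integers), which is what makes a[i] - i non-decreasing; `first_fixed_point` is a name invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the smallest i with a[i] == i, or -1 if there is none.
// Assumes a is sorted ascending and holds distinct integers.
long long first_fixed_point(const std::vector<long long>& a) {
    std::size_t lo = 0, hi = a.size();
    while (lo < hi) {  // find the first index where a[i] - i >= 0
        const std::size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < static_cast<long long>(mid))
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < a.size() && a[lo] == static_cast<long long>(lo))
        return static_cast<long long>(lo);
    return -1;
}
```

For the example above, a=[0,1,2,4,8], this returns index 0, the first of the three matches.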

How to efficiently *nearly* sort a list?

I have a list of items; I want to sort them, but I want a small element of randomness so they are not strictly in order, only on average ordered.
How can I do this most efficiently?
I don't mind if the quality of the randomness is not especially good, e.g. if it is simply based on the chance ordering of the input, as with an early-terminated incomplete sort.
The context is implementing a nearly-greedy search by introducing a very slight element of inexactness; this is in a tight loop, so the cost of sorting and of calling random() both matter.
My current code is to do a std::sort (this being C++) and then do a very short shuffle just in the early part of the array:
for(int i=0; i<3; i++) // I know I have more than 6 elements
std::swap(order[i],order[i+rand()%3]);
Use first two passes of JSort. Build heap twice, but do not perform insertion sort. If element of randomness is not small enough, repeat.
There is an approach that (unlike incomplete JSort) allows finer control over the resulting randomness and has time complexity dependent on randomness (the more random result is needed, the less time complexity). Use heapsort with Soft heap. For detailed description of the soft heap, see pdf 1 or pdf 2.
You could use a standard sort algorithm (is a standard library available?) and pass a predicate that "knows", given two elements, which is less than the other, or whether they are equal (returning -1, 0 or 1). In the predicate, then introduce a rare (configurable) case where the answer is random, by using a random number:
pseudocode:
if random(1000) == 0 then
return random(2)-1 <-- -1, 0 or 1, randomly chosen
Here we have a 1/1000 chance to "scramble" two elements, but that number strictly depends on the size of the container you are sorting.
Another thing to add in the 1-in-1000 case could be to exclude the "right" answer, because returning it would not scramble the result!
Edit:
if random(100 * container_size) == 0 then <-- here I consider the container size
{
if element_1 < element_2
return random(1); <-- do not return the "correct" value of -1
else if element_1 > element_2
return random(1)-1; <-- do not return the "correct" value of 1
else
return random(1)==0 ? -1 : 1; <-- do not return 0
}
in my pseudocode:
random(x) = y where 0 <= y <=x
One possibility that requires a bit more space but would guarantee that existing sort algorithms could be used without modification would be to create a copy of the sort value(s) and then modify those in some fashion prior to sorting (and then use the modified value(s) for the sort).
For example, if the data to be sorted is a simple character field Name[N] then add a field (assuming data is in a structure or class) called NameMod[N]. Fill in the NameMod with a copy of Name but add some randomization. Then 3% of the time (or some appropriate amount) change the first character of the name (e.g., change it by +/- one or two characters). And then 10% of the time change the second character +/- a few characters.
Then run it through whatever sort algorithm you prefer. The benefit is that you could easily change those percentages and randomness. And the sort algorithm will still work (e.g., it would not have problems with the compare function returning inconsistent results).
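For numeric keys, the same "perturb a copy of the key, then sort normally" idea might be sketched as below. This matters in C++ because `std::sort` requires a consistent comparison (a strict weak ordering), which a randomly answering predicate would violate. The function name, the uniform noise model, and the `jitter` and `seed` parameters are assumptions made for this illustration, not part of the answer above.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Sorts by (value + random noise in [-jitter, jitter)), so elements closer
// together than 2 * jitter may come out in either order.
std::vector<double> nearly_sorted(const std::vector<double>& values, double jitter, unsigned seed) {
    assert(jitter > 0.0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> noise(-jitter, jitter);
    std::vector<std::pair<double, double>> keyed;  // (perturbed key, original value)
    keyed.reserve(values.size());
    for (double v : values) keyed.emplace_back(v + noise(rng), v);
    std::sort(keyed.begin(), keyed.end());  // the comparison itself stays consistent
    std::vector<double> out;
    out.reserve(values.size());
    for (const auto& kv : keyed) out.push_back(kv.second);
    return out;
}
```

The amount of disorder is controlled by `jitter` alone: elements whose values differ by more than twice the jitter always keep their relative order.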
If you are sure that each element is at most k positions away from where it should be, you can reduce the N log(N) sorting time of quicksort down to N log(k)....
edit
More specifically, you would create k buckets, each containing N/k elements.
You can do quick sort for each bucket, which takes k * log(k) times, and then sort N/k buckets, which takes N/k log(N/k) time. Multiplying these two, you can do sorting in N log(max(N/k,k))
This can be useful because you can run sorting for each bucket in parallel, reducing total running time.
This works if you are sure that any element in the list is at most k indices away from their correct position after the sorting.
but I do not think you meant any restriction.
Split the list into two equally-sized parts. Sort each part separately, using any usual algorithm. Then merge these parts. Perform some merge iterations as usual, comparing merged elements. For other merge iterations, do not compare the elements, but instead select element from the same part, as in the previous step. It is not necessary to use RNG to decide, how to treat each element. Just ignore sorting order for every N-th element.
Other variant of this approach nearly sorts an array nearly in-place. Split the array into two parts with odd/even indexes. Sort them. (It is even possible to use standard C++ algorithm with appropriately modified iterator, like boost::permutation_iterator). Reserve some limited space at the end of the array. Merge parts, starting from the end. If merged part is going to overwrite one of the non-merged elements, just select this element. Otherwise select element in sorted order. Level of randomness is determined by the amount of reserved space.
Assuming you want the array sorted in ascending order, I would do the following:
for M iterations
pick a random index i
pick a random index k
if (i<k)!=(array[i]<array[k]) then swap(array[i],array[k])
M controls the "sortedness" of the array - as M increases the array becomes more and more sorted. I would say a reasonable value for M is n^2 where n is the length of the array. If it is too slow to pick random elements then you can precompute their indices beforehand. If the method is still too slow then you can always decrease M at the cost of getting a poorer sort.
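The loop above translates almost directly into C++. In this sketch the function name and the choice of `std::mt19937` as the random source are assumptions; `m` plays the role of M.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Performs m random compare-and-swap steps; larger m gives a more sorted array.
void random_compare_swap(std::vector<int>& a, std::size_t m, std::mt19937& rng) {
    if (a.size() < 2) return;
    std::uniform_int_distribution<std::size_t> pick(0, a.size() - 1);
    for (std::size_t step = 0; step < m; ++step) {
        const std::size_t i = pick(rng);
        const std::size_t k = pick(rng);
        // Swap when the order of the positions disagrees with the order of the values.
        if ((i < k) != (a[i] < a[k])) std::swap(a[i], a[k]);
    }
}
```

Every swap either fixes an out-of-order pair or exchanges two equal values, so the number of inversions never increases; how sorted the result ends up for a given `m` is random.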
Take a small random subset of the data and sort it. You can use this as a map to provide an estimate of where every element should appear in the final nearly-sorted list. You can scan through the full list now and move/swap elements that are not in a good position.
This is basically O(n), assuming the small initial sorting of the subset doesn't take a long time. Hopefully you can build the map such that the estimate can be extracted quickly.
Bubblesort to the rescue!
For an unsorted array, you could pick a few random elements and bubble them up or down (maybe by rotation, which is a bit more efficient). It will be hard to control the amount of (dis)order: even if you pick all N elements, you cannot be sure that the whole array will be sorted, because elements are moved and you cannot ensure that you touched every element exactly once.
BTW: this kind of problem tends to occur in game playing engines, where the list with candidate moves is kept more-or-less sorted (because of weighted sampling), and sorting after each iteration is too expensive, and only one or a few elements are expected to move.