Maintain a sorted array in O(1)? - c++

We have a sorted array and we would like to increase the value of one index by only 1 unit (array[i]++), such that the resulting array is still sorted. Is this possible in O(1)?
It is fine to use any data structure possible in STL and C++.
In a more specific case, if the array is initialised by all 0 values, and it is always incrementally constructed only by increasing a value of an index by one, is there an O(1) solution?

I haven't worked this out completely, but I think the general idea might help for integers at least. At the cost of more memory, you can maintain a separate data-structure that maintains the ending index of a run of repeated values (since you want to swap your incremented value with the ending index of the repeated value). This is because it's with repeated values that you run into the worst case O(n) runtime: let's say you have [0, 0, 0, 0] and you increment the value at location 0. Then it is O(n) to find out the last location (3).
But let's say that you maintain the data-structure I mentioned (a hash map would work because it has O(1) average lookup). In that case you would have something like this:
0 -> 3
So you have a run of 0 values that end at location 3. When you increment a value, let's say at location i, you check to see if the new value is greater than the value at i + 1. If it is not, you are fine. But if it is, you look to see if there is an entry for this value in the secondary data-structure. If there isn't, you can simply swap. If there is an entry, you look up the ending-index and then swap with the value at that location. You then make any changes you need to the secondary data-structure to reflect the new state of the array.
A more thorough example:
[0, 2, 3, 3, 3, 4, 4, 5, 5, 5, 7]
The secondary data-structure is:
3 -> 4
4 -> 6
5 -> 9
Let's say you increment the value at location 2. So you have incremented 3, to 4. The array now looks like this:
[0, 2, 4, 3, 3, 4, 4, 5, 5, 5, 7]
You look at the next element, which is 3. You then look up the entry for that element in the secondary data-structure. The entry is 4, which means that there is a run of 3's that end at 4. This means that you can swap the value from the current location with the value at index 4:
[0, 2, 3, 3, 4, 4, 4, 5, 5, 5, 7]
Now you will also need to update the secondary data-structure. Specifically, the run of 3's now ends one index earlier, so you need to decrement that value:
3 -> 3
4 -> 6
5 -> 9
Another check you will need to do is to see if the value is repeated anymore. You can check that by looking at the i - 1th and the i + 1th locations to see if they are the same as the value in question. If neither are equal, then you can remove the entry for this value from the map.
Again, this is just a general idea. I will have to code it out to see if it works out the way I thought about it.
Please feel free to poke holes.
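For C++ readers, here is a minimal sketch of the same idea. It is a simplification rather than a port of the algorithm described above: it assumes integer values and keeps the last index of every value in the map (not only of values that form a run), which makes the book-keeping shorter. The class name SortedCounter and its members are made up for illustration. Since all elements of a run are equal, incrementing the last element of the run gives the same array as incrementing element i and swapping.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: a sorted vector plus a map from each value
// to the index of its last occurrence.
class SortedCounter {
public:
    explicit SortedCounter(std::vector<int> sorted) : a_(std::move(sorted)) {
        // Later indices overwrite earlier ones, so each value maps to its last index.
        for (std::size_t i = 0; i < a_.size(); ++i) last_[a_[i]] = i;
    }

    // Increments one element equal to a_[i]; average O(1) with a hash map.
    void increment(std::size_t i) {
        const int v = a_[i];
        const std::size_t j = last_[v];  // end of the run of v
        ++a_[j];                         // same result as a_[i]++ followed by a swap
        if (j > 0 && a_[j - 1] == v)
            last_[v] = j - 1;            // the run of v got shorter by one
        else
            last_.erase(v);              // v no longer occurs in the array
        last_.emplace(v + 1, j);         // no-op if v + 1 already has a later last index
    }

    const std::vector<int>& values() const { return a_; }

private:
    std::vector<int> a_;
    std::unordered_map<int, std::size_t> last_;
};
```

On the example array above, increment(2) turns [0, 2, 3, 3, 3, 4, 4, 5, 5, 5, 7] into [0, 2, 3, 3, 4, 4, 4, 5, 5, 5, 7].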
UPDATE
I have an implementation of this algorithm here in JavaScript. I used JavaScript just so I could do it quickly. Also, because I coded it up pretty quickly it can probably be cleaned up. I do have comments though. I'm not doing anything esoteric either, so this should be easily portable to C++.
There are essentially two parts to the algorithm: the incrementing and swapping (if necessary), and book-keeping done on the map that keeps track of our ending indices for runs of repeated values.
The code contains a testing harness that starts with an array of zeroes and increments random locations. At the end of every iteration, there is a test to ensure that the array is sorted.
var array = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
var endingIndices = {0: 9};
var increments = 10000;
for(var i = 0; i < increments; i++) {
var index = Math.floor(Math.random() * array.length);
var oldValue = array[index];
var newValue = ++array[index];
if(index == (array.length - 1)) {
//Incremented element is the last element.
//We don't need to swap, but we need to see if we modified a run (if one exists)
if(endingIndices[oldValue]) {
endingIndices[oldValue]--;
}
} else if(index >= 0) {
//Incremented element is not the last element; it is in the middle of
//the array, possibly even the first element
var nextIndexValue = array[index + 1];
if(newValue === nextIndexValue) {
//If the new value is the same as the next value, we don't need to swap anything. But
//we are doing some book-keeping later with the endingIndices map. That code requires
//the ending index (i.e., where we moved the incremented value to). Since we didn't
//move it anywhere, the endingIndex is simply the index of the incremented element.
endingIndex = index;
} else if(newValue > nextIndexValue) {
//If the new value is greater than the next value, we will have to swap it
var swapIndex = -1;
if(!endingIndices[nextIndexValue]) {
//If the next value doesn't have a run, then location we have to swap with
//is just the next index
swapIndex = index + 1;
} else {
//If the next value has a run, we get the swap index from the map
swapIndex = endingIndices[nextIndexValue];
}
array[index] = nextIndexValue;
array[swapIndex] = newValue;
endingIndex = swapIndex;
} else {
//If the next value is already greater, there is nothing we need to swap but we do
//need to do some book-keeping with the endingIndices map later, because it is
//possible that we modified a run (the value might be the same as the value that
//came before it). Since we don't have anything to swap, the endingIndex is
//effectively the index that we are incrementing.
endingIndex = index;
}
//Moving the new value to its new position may have created a new run, so we need to
//check for that. This will only happen if the new position is not at the end of
//the array, and the new value does not have an entry in the map, and the value
//at the position after the new position is the same as the new value
if(endingIndex < (array.length - 1) &&
!endingIndices[newValue] &&
array[endingIndex + 1] == newValue) {
endingIndices[newValue] = endingIndex + 1;
}
//We also need to check to see if the old value had an entry in the
//map because now that run has been shortened by one.
if(endingIndices[oldValue]) {
var newEndingIndex = --endingIndices[oldValue];
if(newEndingIndex == 0 ||
(newEndingIndex > 0 && array[newEndingIndex - 1] != oldValue)) {
//In this case we check to see if the old value only has one entry, in
//which case there is no run of values and so we will need to remove
//its entry from the map. This happens when the new ending-index for this
//value is the first location (0) or if the location before the new
//ending-index doesn't contain the old value.
delete endingIndices[oldValue];
}
}
}
//Make sure that the array is sorted
for(var j = 0; j < array.length - 1; j++) {
if(array[j] > array[j + 1]) {
throw "Array not sorted; Value at location " + j + "(" + array[j] + ") is greater than value at location " + (j + 1) + "(" + array[j + 1] + ")";
}
}
}

In a more specific case, if the array is initialised by all 0 values, and it is always incrementally constructed only by increasing a value of an index by one, is there an O(1) solution?
No. Given an array of all 0's: [0, 0, 0, 0, 0]. If you increment the first value, giving [1, 0, 0, 0, 0], then you will have to make 4 swaps to ensure that it remains sorted.
Given a sorted array with no duplicates, then the answer is yes. But after the first operation (i.e. the first time you increment), then you could potentially have duplicates. The more increments you do, the higher the likelihood is that you'll have duplicates, and the more likely it'll take O(n) to keep that array sorted.
If all you have is the array, it's impossible to guarantee less than O(n) time per increment. If what you're looking for is a data structure that supports sorted order and lookup by index, then you probably want an order statistic tree.

If the values are small, counting sort will work. Represent the array [0,0,0,0] as {4}. Incrementing any zero gives {3,1} : 3 zeroes and a one. In general, to increment any value x, deduct one from the count of x and increment the count of {x+1}. The space efficiency is O(N), though, where N is the highest value.
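A minimal sketch of that counting representation, assuming non-negative values that all start at 0. The class name ValueCounts and its member names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical counting representation: count_[x] is how many elements equal x.
class ValueCounts {
public:
    // n elements, all with value 0, i.e. {n}.
    explicit ValueCounts(std::size_t n) : count_(1, n) {}

    // Turns one element with value x into x + 1 (amortized O(1)).
    void incrementOne(std::size_t x) {
        assert(x < count_.size() && count_[x] > 0);
        if (x + 1 == count_.size()) count_.push_back(0);  // first time x + 1 appears
        --count_[x];
        ++count_[x + 1];
    }

    // Expands the counts back into the sorted array they represent.
    std::vector<int> sorted() const {
        std::vector<int> out;
        for (std::size_t x = 0; x < count_.size(); ++x)
            out.insert(out.end(), count_[x], static_cast<int>(x));
        return out;
    }

private:
    std::vector<std::size_t> count_;
};
```

Note that this only tracks how many elements have each value; it does not remember which original index holds which value.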

It depends on how many items can have the same value. If multiple items can have the same value, then it is not possible to have O(1) with ordinary arrays.
Let's do an example: suppose array[5] = 21, and you want to do array[5]++:
Increment the item:
array[5]++
(which is O(1) because it is an array).
So, now array[5] = 22.
Check the next item (i.e., array[6]):
If array[6] == 21, then you have to keep checking new items (i.e., array[7] and so on) until you find a value higher than 21. At that point you can swap the values. This search is not O(1) because potentially you have to scan the whole array.
Instead, if items cannot have the same value, then you have:
Increment the item:
array[5]++
(which is O(1) because it is an array).
So, now array[5] = 22.
The next item cannot be 21 (because two items cannot have the same value), so it must have a value > 21 and the array is already sorted.

So you take the sorted array and a hashtable. You go over the array to find the 'flat' areas, where elements have the same value. For every flat area you have to figure out three things: 1) where it starts (the index of its first element), 2) what its value is, and 3) what the value of the next element is (the next bigger one). Then put this tuple into the hashtable, keyed by the element value. This is a prerequisite, and its complexity doesn't really matter.
Then when you increase some element (index i), you look up in the table the index of the next bigger element (call it j), and swap i with j - 1. Then 1) add a new entry to the hashtable and 2) update the existing entry for its previous value.
With a perfect hashtable (or a limited range of possible values) it will be almost O(1). The downside: it will not be stable.
Here is some code:
#include <iostream>
#include <unordered_map>
#include <vector>
struct Range {
int start, value, next;
};
void print_ht(std::unordered_map<int, Range>& ht)
{
for (auto i = ht.begin(); i != ht.end(); i++) {
Range& r = (*i).second;
std::cout << '(' << r.start << ", "<< r.value << ", "<< r.next << ") ";
}
std::cout << std::endl;
}
void increment_el(int i, std::vector<int>& array, std::unordered_map<int, Range>& ht)
{
int val = array[i];
array[i]++;
//Pick next bigger element
Range& r = ht[val];
//Do the swapping, so last element of that range will be first
std::swap(array[i], array[ht[r.next].start - 1]);
//Update hashtable
ht[r.next].start--;
}
int main(int argc, const char * argv[])
{
std::vector<int> array = {1, 1, 1, 2, 2, 3};
std::unordered_map<int, Range> ht;
int start = 0;
int value = array[0];
//Build indexing hashtable
for (int i = 0; i <= array.size(); i++) {
int cur_value = i < array.size() ? array[i] : -1;
if (cur_value > value || i == array.size()) {
ht[value] = {start, value, cur_value};
start = i;
value = cur_value;
}
}
print_ht(ht);
//Now let's increment first element
increment_el(0, array, ht);
print_ht(ht);
increment_el(3, array, ht);
print_ht(ht);
for (auto i = array.begin(); i != array.end(); i++)
std::cout << *i << " ";
return 0;
}

Yes and no.
Yes if the list contains only unique integers, as that means you only need to check the next value. No in any other situation. If the values are not unique, incrementing the first of N duplicate values means that it must move N positions. If the values are floating-point, you may have thousands of values between x and x+1.

It's important to be very clear about the requirements; the simplest way is to express the problem as an ADT (Abstract Datatype), listing the required operations and complexities.
Here's what I think you are looking for: a datatype which provides the following operations:
Construct(n): Create a new object of size n all of whose values are 0.
Value(i): Return the value at index i.
Increment(i): Increment the value at index i.
Least(): Return the index of the element with least value (or one such element if there are several).
Next(i): Return the index of the next element after element i in a sorted traversal starting at Least(), such that the traversal will return every element.
Aside from the Constructor, we want every one of the above operations to have complexity O(1). We also want the object to occupy O(n) space.
The implementation uses a list of buckets; each bucket has a value and a list of elements. Each element has an index and a pointer to the bucket it is part of. Finally, we have an array of pointers to elements. (In C++, I'd probably use iterators rather than pointers; in another language, I'd probably use intrusive lists.) The invariants are that no bucket is ever empty, and the values of the buckets are strictly monotonically increasing.
We start with a single bucket with value 0 which has a list of n elements.
Value(i) is implemented by returning the value of the bucket of the element referenced by the iterator at element i of the array. Least() is the index of the first element in the first bucket. Next(i) is the index of the next element after the one referenced by the iterator at element i, unless that iterator is already pointing at the end of the list, in which case it is the first element in the next bucket, unless the element's bucket is the last bucket, in which case we're at the end of the element list.
The only interface of interest is Increment(i), which is as follows:
If element i is the only element in its bucket (i.e. there is no next element in the bucket list, and element i is the first element in the bucket list):
Increment the value of the associated bucket.
If the next bucket has the same value, append the next bucket's element list to this bucket's element list (this is O(1), regardless of the list's size, because it is just a pointer swap), and then delete the next bucket.
If element i is not the only element in its bucket, then:
Remove it from its bucket list.
If the next bucket has the next sequential value, then push element i onto the next bucket's list.
Otherwise (the next bucket's value is larger), create a new bucket with the next sequential value containing only element i, and insert it between this bucket and the next one.
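The bucket structure can be sketched in C++ with std::list, using iterators as suggested above. This is only an illustration under some assumptions: the names (BucketArray, bucketOf_, nodeOf_, sortedValues) are invented, Least() and Next() are left out, and the sketch deviates from the description in one place. When a lone element's bucket would collide with the next bucket, it moves that single element into the next bucket and deletes the emptied one, instead of appending whole lists, so that no per-element bucket reference has to be updated in bulk.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical sketch of the bucket-list ADT; all names are illustrative.
class BucketArray {
    struct Bucket {
        int value;             // value shared by every element in this bucket
        std::list<int> elems;  // indices of the elements holding that value
    };
    using BucketIt = std::list<Bucket>::iterator;
    using ElemIt = std::list<int>::iterator;

    std::list<Bucket> buckets_;       // strictly increasing values
    std::vector<BucketIt> bucketOf_;  // bucketOf_[i]: bucket containing element i
    std::vector<ElemIt> nodeOf_;      // nodeOf_[i]: list node of element i

public:
    explicit BucketArray(std::size_t n) : bucketOf_(n), nodeOf_(n) {
        if (n == 0) return;
        buckets_.push_back(Bucket{0, {}});
        BucketIt b = buckets_.begin();
        for (std::size_t i = 0; i < n; ++i) {
            nodeOf_[i] = b->elems.insert(b->elems.end(), static_cast<int>(i));
            bucketOf_[i] = b;
        }
    }

    int value(std::size_t i) const { return bucketOf_[i]->value; }

    void increment(std::size_t i) {
        BucketIt cur = bucketOf_[i];
        BucketIt next = std::next(cur);
        const bool hasSuccessor =
            next != buckets_.end() && next->value == cur->value + 1;
        if (!hasSuccessor) {
            if (cur->elems.size() == 1) {  // alone and no collision: bump in place
                ++cur->value;
                return;
            }
            // Open a fresh bucket for value + 1 right after the current one.
            next = buckets_.insert(next, Bucket{cur->value + 1, {}});
        }
        // Move element i into the successor bucket; splice keeps nodeOf_[i] valid.
        next->elems.splice(next->elems.begin(), cur->elems, nodeOf_[i]);
        bucketOf_[i] = next;
        if (cur->elems.empty()) buckets_.erase(cur);  // restore "no empty bucket"
    }

    // Values in sorted order, obtained by walking the buckets (O(n), for checking only).
    std::vector<int> sortedValues() const {
        std::vector<int> out;
        for (const Bucket& b : buckets_)
            out.insert(out.end(), b.elems.size(), b.value);
        return out;
    }
};
```

Every step of increment() is a constant number of list operations, which is what makes the O(1) bound plausible for this layout.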

Just iterate along the array from the modified element until you find the correct place, then swap. Average case complexity is O(N) where N is the average number of duplicates. Worst case is O(n) where n is the length of the array. As long as N isn't large and doesn't scale badly with n, you're fine and can probably pretend it's O(1) for practical purposes.
If duplicates are the norm and/or scale strongly with n, then there are better solutions, see other responses.

I think that it is possible without using a hashtable. I have an implementation here:
#include <cstdio>
#include <vector>
#include <cassert>
#include <cstdlib> // srand, rand
// This code is a solution for http://stackoverflow.com/questions/19957753/maintain-a-sorted-array-in-o1
//
// """We have a sorted array and we would like to increase the value of one index by only 1 unit
// (array[i]++), such that the resulting array is still sorted. Is this possible in O(1)?"""
// The obvious implementation, which has O(n) worst case increment.
class LinearIncrementor
{
public:
LinearIncrementor(int numElems);
int valueAt(int index) const;
void incrementAt(int index);
private:
std::vector<int> m_values;
};
// Free list to store runs of same values
class RunList
{
public:
struct Run
{
int m_end; // end index of run, inclusive, or next object in free list
int m_value; // value at this run
};
RunList();
int allocateRun(int endIndex, int value);
void freeRun(int index);
Run& runAt(int index);
const Run& runAt(int index) const;
private:
std::vector<Run> m_runs;
int m_firstFree;
};
// More optimal implementation, which increments in O(1) time
class ConstantIncrementor
{
public:
ConstantIncrementor(int numElems);
int valueAt(int index) const;
void incrementAt(int index);
private:
std::vector<int> m_runIndices;
RunList m_runs;
};
LinearIncrementor::LinearIncrementor(int numElems)
: m_values(numElems, 0)
{
}
int LinearIncrementor::valueAt(int index) const
{
return m_values[index];
}
void LinearIncrementor::incrementAt(int index)
{
const int n = static_cast<int>(m_values.size());
const int value = m_values[index];
while (index+1 < n && value == m_values[index+1])
++index;
++m_values[index];
}
RunList::RunList() : m_firstFree(-1)
{
}
int RunList::allocateRun(int endIndex, int value)
{
int runIndex = -1;
if (m_firstFree == -1)
{
runIndex = static_cast<int>(m_runs.size());
m_runs.resize(runIndex + 1);
}
else
{
runIndex = m_firstFree;
m_firstFree = m_runs[runIndex].m_end;
}
Run& run = m_runs[runIndex];
run.m_end = endIndex;
run.m_value = value;
return runIndex;
}
void RunList::freeRun(int index)
{
m_runs[index].m_end = m_firstFree;
m_firstFree = index;
}
RunList::Run& RunList::runAt(int index)
{
return m_runs[index];
}
const RunList::Run& RunList::runAt(int index) const
{
return m_runs[index];
}
ConstantIncrementor::ConstantIncrementor(int numElems) : m_runIndices(numElems, 0)
{
const int runIndex = m_runs.allocateRun(numElems-1, 0);
assert(runIndex == 0);
}
int ConstantIncrementor::valueAt(int index) const
{
return m_runs.runAt(m_runIndices[index]).m_value;
}
void ConstantIncrementor::incrementAt(int index)
{
const int numElems = static_cast<int>(m_runIndices.size());
const int curRunIndex = m_runIndices[index];
RunList::Run& curRun = m_runs.runAt(curRunIndex);
index = curRun.m_end;
const bool freeCurRun = index == 0 || m_runIndices[index-1] != curRunIndex;
RunList::Run* runToMerge = NULL;
int runToMergeIndex = -1;
if (curRun.m_end+1 < numElems)
{
const int nextRunIndex = m_runIndices[curRun.m_end+1];
RunList::Run& nextRun = m_runs.runAt(nextRunIndex);
if (curRun.m_value+1 == nextRun.m_value)
{
runToMerge = &nextRun;
runToMergeIndex = nextRunIndex;
}
}
if (freeCurRun && !runToMerge) // then free and allocate at the same time
{
++curRun.m_value;
}
else
{
if (freeCurRun)
{
m_runs.freeRun(curRunIndex);
}
else
{
--curRun.m_end;
}
if (runToMerge)
{
m_runIndices[index] = runToMergeIndex;
}
else
{
m_runIndices[index] = m_runs.allocateRun(index, curRun.m_value+1);
}
}
}
int main(int argc, char* argv[])
{
const int numElems = 100;
const int numInc = 1000000;
LinearIncrementor linearInc(numElems);
ConstantIncrementor constInc(numElems);
srand(1);
for (int i = 0; i < numInc; ++i)
{
const int index = rand() % numElems;
linearInc.incrementAt(index);
constInc.incrementAt(index);
for (int j = 0; j < numElems; ++j)
{
if (linearInc.valueAt(j) != constInc.valueAt(j))
{
printf("Error: differing values at increment step %d, value at index %d\n", i, j);
}
}
}
return 0;
}

As a complement to the other answers: if you can only have the array, then you indeed cannot guarantee that the operation will be constant-time; but because the array is sorted, you can find the end of a run of identical numbers in log n operations, not in n operations. This is simply a binary search.
If we expect most runs of numbers to be short, we should use galloping search, which is a variant where we first find the bounds by looking at positions +1, +2, +4, +8, +16, etc. and then doing binary search inside. You would get a time that is often constant (and extremely fast if the item is unique) but can grow up to log n. Unless for some reason long runs of identical numbers remain common even after many updates, this might outperform any solution that requires keeping additional data.
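A rough sketch of the galloping variant, assuming a sorted std::vector<int>. The function names runEnd and incrementSorted are invented for this example, and the probe offsets accumulate (+1, +3, +7, ...) rather than following the exact +1, +2, +4 sequence, which keeps the same logarithmic bound.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the last element equal to a[i] in the sorted vector a.
// Precondition: i < a.size().
std::size_t runEnd(const std::vector<int>& a, std::size_t i) {
    const int v = a[i];
    std::size_t lo = i;    // a[lo] == v is always known to hold
    std::size_t step = 1;
    // Gallop: probe at doubling distances while the probe still equals v.
    while (lo + step < a.size() && a[lo + step] == v) {
        lo += step;
        step *= 2;
    }
    // The run ends somewhere in [lo, hi): a[hi] > v, or hi is the end of the array.
    const std::size_t hi = std::min(lo + step, a.size());
    const auto first = a.begin() + static_cast<std::ptrdiff_t>(lo + 1);
    const auto last = a.begin() + static_cast<std::ptrdiff_t>(hi);
    // Binary search for the first element greater than v inside the bracket.
    const auto it = std::upper_bound(first, last, v);
    return static_cast<std::size_t>(it - a.begin()) - 1;
}

// Increments one element equal to a[i] while keeping a sorted.
// Returns the index that was actually incremented.
std::size_t incrementSorted(std::vector<int>& a, std::size_t i) {
    const std::size_t j = runEnd(a, i);
    ++a[j];
    return j;
}
```

For a unique element the loop exits after a single probe, so the common case stays cheap.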

Related

Find out in linear time whether there is a pair in sorted vector that adds up to certain value

Given an std::vector of distinct elements sorted in ascending order, I want to develop an algorithm that determines whether there are two elements in the collection whose sum is a certain value, sum.
I've tried two different approaches with their respective trade-offs:
I can scan the whole vector and, for each element in the vector, apply binary search (std::lower_bound) on the vector for searching an element corresponding to the difference between sum and the current element. This is an O(n log n) time solution that requires no additional space.
I can traverse the whole vector and populate an std::unordered_set. Then, I scan the vector and, for each element, I look up in the std::unordered_set for the difference between sum and the current element. Since searching on a hash table runs in constant time on average, this solution runs in linear time. However, this solution requires additional linear space because of the std::unordered_set data structure.
Nevertheless, I'm looking for a solution that runs in linear time and requires no additional linear space. Any ideas? It seems that I'm forced to trade speed for space.
As the std::vector is already sorted and you can calculate the sum of a pair on the fly, you can achieve a linear time solution in the size of the vector with O(1) space.
The following is an STL-like implementation that requires no additional space and runs in linear time:
template<typename BidirIt, typename T>
bool has_pair_sum(BidirIt first, BidirIt last, T sum) {
if (first == last)
return false; // empty range
for (--last; first != last;) {
if ((*first + *last) == sum)
return true; // pair found
if ((*first + *last) > sum)
--last; // decrease pair sum
else // (*first + *last) < sum (trichotomy)
++first; // increase pair sum
}
return false;
}
The idea is to traverse the vector from both ends – front and back – in opposite directions at the same time and calculate the sum of the pair of elements while doing so.
At the very beginning, the pair consists of the elements with the lowest and the highest values, respectively. If the resulting sum is lower than sum, then advance first – the iterator pointing at the left end. Otherwise, move last – the iterator pointing at the right end – backward. This way, the resulting sum progressively approaches sum. If both iterators end up pointing at the same element and no pair whose sum is equal to sum has been found, then there is no such pair.
auto main() -> int {
std::vector<int> vec{1, 3, 4, 7, 11, 13, 17};
std::cout << has_pair_sum(vec.begin(), vec.end(), 2) << ' ';
std::cout << has_pair_sum(vec.begin(), vec.end(), 7) << ' ';
std::cout << has_pair_sum(vec.begin(), vec.end(), 19) << ' ';
std::cout << has_pair_sum(vec.begin(), vec.end(), 30) << '\n';
}
The output is:
0 1 0 1
Thanks to the generic nature of the function template has_pair_sum() and since it just requires bidirectional iterators, this solution works with std::list as well:
std::list<int> lst{1, 3, 4, 7, 11, 13, 17};
has_pair_sum(lst.begin(), lst.end(), 2);
I had the same idea as the one in the answer of 眠りネロク, but with a little bit more comprehensible implementation.
bool has_pair_sum(std::vector<int> v, int sum){
if(v.empty())
return false;
std::vector<int>::iterator p1 = v.begin();
std::vector<int>::iterator p2 = v.end(); // points one past the last element (not dereferenceable)
p2--; // Now it points to the last element.
while(p1 != p2){
if(*p1 + *p2 == sum)
return true;
else if(*p1 + *p2 < sum){
p1++;
}else{
p2--;
}
}
return false;
}
Since the array is already sorted, we can use the two-pointer approach. Keep a left pointer at the start of the array and a right pointer at the end. In each iteration, check whether the sum of the values at the two pointers equals the given sum; if so, return. Otherwise we have to shrink the window by either advancing the left pointer or retreating the right pointer, so we compare this temporary sum with the given sum. If the temporary sum is greater than the given sum, we move the right pointer down: advancing the left pointer would keep the temporary sum the same or increase it, never decrease it, whereas retreating the right pointer decreases it and brings us closer to the given sum. Similarly, if the temporary sum is less than the given sum, there is no point in retreating the right pointer, because the temporary sum would stay the same or decrease; instead we advance the left pointer so that the temporary sum increases toward the given sum. We repeat this until we find an equal sum or the two pointers meet.
Below is the code for demonstration; let me know if something is not clear.
bool pairSumExists(vector<int> &a, int &sum){
if(a.empty())
return false;
int len = a.size();
int left_pointer = 0 , right_pointer = len - 1;
while(left_pointer < right_pointer){
if(a[left_pointer] + a[right_pointer] == sum){
return true;
}
if(a[left_pointer] + a[right_pointer] > sum){
--right_pointer;
}
else
if(a[left_pointer] + a[right_pointer] < sum){
++left_pointer;
}
}
return false;
}

std::vector size keeps growing although I insert at the same indexes

Something weird I see here with std::vector.
I have a variable whose value changes dynamically but is always under 20 (dynamicSizeToInsert in the example).
Why does the vector size keep growing?
std::vector<int> v;
//sometimes its 5 sometimes it is 10 sometimes it is N < 20
int dynamicSizeToInsert = 5;
int c = 0;
for(std::vector<int>::size_type i = 0; i != 100; i++) {
if(c == dynamicSizeToInsert )
{
c = 0;
}
v.insert(v.begin() + c, c);
c++;
printf("%zu",v.size()); //THIS THING KEEPS growing although I am only using vector indexes 0 to 4 always
}
I want to keep my vector 5 elements big,
with each new value overwriting the existing value at the same index.
std::vector::insert, as the name suggests, inserts elements at the specified position.
When c == dynamicSizeToInsert, c is set to 0. So now, v.size() == 5. Now this line executes:
v.insert(v.begin() + c, c);
This will insert 0 at position v.begin() + 0, which is position 0, and it will shift every other element one position to the right (it will not replace the element at position 0), and so the vector keeps growing.
Instead of using insert, use operator[]:
//So that 'v' is the right size
v.resize(dynamicSizeToInsert);
for(std::vector<int>::size_type i = 0; i != 100; i++) {
if(c == dynamicSizeToInsert )
{
c = 0;
}
v[c] = c; //Sets the element at index 'c' to 'c' (v[i] would run out of bounds once i >= v.size())
c++;
}
insert doesn't replace an element; rather, it inserts a new element at the given location and shifts all the elements to its right by one position. That's why your vector size is growing.
If you want to replace the element at an existing index then you can use operator[]. However, keep in mind that the index must be between 0 and size() - 1 in order to use operator[].
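To make the difference concrete, here is a small sketch that runs the same cyclic write pattern both ways. The function names sizeAfterInserts and overwriteCyclic are made up for this example, and slots is assumed to be greater than 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Uses insert at a cycling position: every call adds one element.
std::size_t sizeAfterInserts(std::size_t slots, std::size_t writes) {
    std::vector<int> v;
    for (std::size_t i = 0; i < writes; ++i) {
        const std::size_t c = i % slots;  // c <= v.size() always holds here
        v.insert(v.begin() + static_cast<std::ptrdiff_t>(c), static_cast<int>(c));
    }
    return v.size();  // equals writes, not slots
}

// Uses operator[] on a pre-sized vector: elements are overwritten in place.
std::vector<int> overwriteCyclic(std::size_t slots, std::size_t writes) {
    std::vector<int> v(slots);  // slots value-initialized elements
    for (std::size_t i = 0; i < writes; ++i)
        v[i % slots] = static_cast<int>(i);  // index stays within [0, slots)
    return v;
}
```

With slots = 5 and writes = 100, the insert version ends up with 100 elements while the operator[] version keeps exactly 5.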
std::vector::insert inserts a new member into the array at the index you specify, moving the other elements forward and even reallocating the array once it reaches capacity (a relatively expensive operation).
The vector is extended by inserting new elements before the element at
the specified position, effectively increasing the container size by
the number of elements inserted.
This causes an automatic reallocation of the allocated storage space
if -and only if- the new vector size surpasses the current vector
capacity.
(http://www.cplusplus.com/reference/vector/vector/insert/)
As quoted above, the vector is extended with every insert operation.
To get the behaviour you want you need to use the [] operator like so:
v[i] = some_new_value;
This way a new element is never added; it's only the value of the ith element that is changed.
const int dynamicSizeToInsert = 5;
std::vector<int> v(dynamicSizeToInsert);
int c = 0;
for(std::vector<int>::size_type i = 0; i !=100; i++)
{
v.at(i%dynamicSizeToInsert) = (dynamicSizeToInsert == c?c = 0,c ++: c ++);
printf("%zu",v.size());
}

Using an array and moving duplicates to end

I got this question at an interview and at the end was told there was a more efficient way to do this but have still not been able to figure it out. You are passing into a function an array of integers and an integer for size of array. In the array you have a lot of numbers, some that repeat for example 1,7,4,8,2,6,8,3,7,9,10. You want to take that array and return an array where all the repeated numbers are put at the end of the array so the above array would turn into 1,7,4,8,2,6,3,9,10,8,7. The numbers I used are not important and I could not use a buffer array. I was going to use a BST, but the order of the numbers must be maintained(except for the duplicate numbers). I could not figure out how to use a hash table so I ended up using a double for loop(n^2 horrible I know). How would I do this more efficiently using c++. Not looking for code, just an idea of how to do it better.
In what follows:
arr is the input array;
seen is a hash set of numbers already encountered;
l is the index where the next unique element will be placed;
r is the index of the next element to be considered.
Since you're not looking for code, here is a pseudo-code solution (which happens to be valid Python):
arr = [1,7,4,8,2,6,8,3,7,9,10]
seen = set()
l = 0
r = 0
while True:
    # advance `r` to the next not-yet-seen number
    while r < len(arr) and arr[r] in seen:
        r += 1
    if r == len(arr): break
    # add the number to the set
    seen.add(arr[r])
    # swap arr[l] with arr[r]
    arr[l], arr[r] = arr[r], arr[l]
    # advance `l`
    l += 1
print(arr)
On your test case, this produces
[1, 7, 4, 8, 2, 6, 3, 9, 10, 8, 7]
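Since the question is about C++, the same in-place pass could be sketched as follows, assuming std::unordered_set is an acceptable stand-in for the hash set. The function name moveDuplicatesToEnd is invented here. As with the loop above, the first occurrences keep their relative order, but the order of the repeats at the end is not guaranteed in general.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Moves the first occurrence of every value to the front, in original order.
// Returns the number of distinct values, i.e. where the repeats begin.
std::size_t moveDuplicatesToEnd(std::vector<int>& arr) {
    std::unordered_set<int> seen;
    std::size_t l = 0;  // index where the next unique element will be placed
    for (std::size_t r = 0; r < arr.size(); ++r) {
        // insert().second is true only the first time a value is encountered
        if (seen.insert(arr[r]).second) {
            std::swap(arr[l], arr[r]);
            ++l;
        }
    }
    return l;
}
```

This runs in O(n) expected time, with the hash set as the only extra storage.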
I would use an additional map, where the key is the integer value from the array and the value is an integer set to 0 in the beginning. Now I would go through the array and increase the values in the map if the key is already in the map.
In the end I would go again through the array. When the integer from the array has a value of one in the map, I would not change anything. When it has a value of 2 or more in the map I would swap the integer from the array with the last one.
This should result in a runtime of O(n*log(n))
The way I would do this would be to create an array twice the size of the original and create a set of integers.
Then Loop through the original array, add each element to the set, if it already exists add it to the 2nd half of the new array, else add it to the first half of the new array.
In the end you would get an array that looks like: (using your example)
1,7,4,8,2,6,3,9,10,-,-,8,7,-,-,-,-,-,-,-,-,-
Then I would loop through the original array again and make each spot equal to the next non-null position (or 0'd or whatever you decided)
That would make the original array turn into your solution...
This ends up being O(n) which is about as efficient as I can think of
Edit: since you can not use another array, when you find a value that is already in the
set you can move every value after it forward one and set the last value equal to the
number you just checked, this would in effect do the same thing but with a lot more operations.
I have been out of touch for a while, but I'd probably start out with something like this and see how it scales with larger input. I know you didn't ask for code but in some cases it's easier to understand than an explanation.
Edit: Sorry I missed the requirement that you cannot use a buffer array.
// returns new vector with dupes at the end
std::vector<int> move_dupes_to_end(std::vector<int> input)
{
std::set<int> counter;
std::vector<int> result;
std::vector<int> repeats;
for (std::vector<int>::iterator i = input.begin(); i < input.end(); i++)
{
if (counter.find(*i) == counter.end())
result.push_back(*i);
else
repeats.push_back(*i);
counter.insert(*i);
}
result.insert(result.end(), repeats.begin(), repeats.end());
return result;
}
#include <algorithm>
T * array = [your array];
size_t size = [array size];
// Complexity:
sort( array, array + size ); // n * log(n) and could be threaded
// (if merge sort)
T * last = unique( array, array + size ); // n, but the elements after the last
// unique element are not defined
Check sort and unique.
void remove_dup(int* data, int count) {
    int* R = data + count;                        //start of the repeats region (empty so far)
    std::unordered_set<int> found(count);         //keep track of what's been seen
    int* cur = data;
    while (cur < R) {                             //until we reach the repeats
        if (found.insert(*cur).second == false) { //if we've seen it
            std::swap(*cur, *--R);                //put it at the beginning of the repeats; do not advance, the swapped-in value is still unexamined
        } else {                                  //or else
            ++cur;                                //it is unique, so it stays where it is
        }
    }
    std::reverse(R, data + count);                //reverse the repeats so they appear in the order they were found
}
http://ideone.com/3choA
Not that I would turn in code this poorly commented. Also note that unordered_set probably uses its own array internally, bigger than data. (This has been rewritten based on aix's answer, to be much faster.)
If you know the bounds on what the integer values are, B, and the size of the integer array, SZ, then you can do something like the following:
Create an array of booleans seen_before with B elements, initialized to 0.
Create a result array result of integers with SZ elements.
Create two integers, one for front_pos = 0, one for back_pos = SZ - 1.
Iterate across the original list:
Set an integer variable val to the value of the current element
If seen_before[val] is set to 1, put the number at result[back_pos] then decrement back_pos
If seen_before[val] is not set to 1, put the number at result[front_pos] then increment front_pos and set seen_before[val] to 1.
Once you finish iterating across the main list, all the unique numbers will be at the front of the list while the duplicate numbers will be at the back. Fun part is that the entire process is done in one pass. Note that this only works if you know the bounds of the values appearing in the original array.
Edit: It was pointed out that there's no bounds on the integers used, so instead of initializing seen_before as an array with B elements, initialize it as a map<int, bool>, then continue as usual. That should get you n*log(n) performance.
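As a rough sketch of the steps above in C++: the function name partition_duplicates is made up for illustration, and std::unordered_set is swapped in for the map<int, bool>, which should give expected O(n) instead of n*log(n). Like the description, it writes into a separate result array, so it does not meet the "no buffer" requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns a copy of `input` with first occurrences at
// the front (in original order) and repeats filled in from the back.
std::vector<int> partition_duplicates(const std::vector<int>& input) {
    std::vector<int> result(input.size());
    std::unordered_set<int> seen_before;   // assumption: replaces the bounded boolean array
    std::size_t front_pos = 0;             // next free slot at the front
    std::size_t back_pos = input.size();   // one past the next free slot at the back

    for (int val : input) {
        if (seen_before.insert(val).second) {
            result[front_pos++] = val;     // first time we see val
        } else {
            result[--back_pos] = val;      // repeat: fill from the back
        }
    }
    assert(front_pos == back_pos);         // every slot was written exactly once
    return result;
}
```

Because repeats are written from the back, they end up in the reverse of the order in which they were encountered.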
This can be done by iterating over the array and marking the index of the first duplicate.
Later on, swap the value at the marked index with the next unique value,
and then increment the marked index for the next swap.
Java Implementation:
public static void solve() {
    Integer[] arr = new Integer[] { 1, 7, 4, 8, 2, 6, 8, 3, 7, 9, 10 };
    final HashSet<Integer> seen = new HashSet<Integer>();
    int l = -1; // index of the first duplicate not yet moved, or -1 if none so far
    for (int i = 0; i < arr.length; i++) {
        if (seen.contains(arr[i])) {
            if (l == -1) {
                l = i;
            }
            continue;
        }
        seen.add(arr[i]); // record the value before it is swapped away from arr[i]
        if (l > -1) {
            final int temp = arr[i];
            arr[i] = arr[l];
            arr[l] = temp;
            l++;
        }
    }
}
output is 1 7 4 8 2 6 3 9 10 8 7
It's ugly, but it meets the requirements of moving the duplicates to the end in place (no buffer array)
// warning, some light C++11
void dup2end(int* arr, size_t cnt)
{
std::set<int> k;
auto end = arr + cnt-1;
auto max = arr + cnt;
auto curr = arr;
while(curr < max)
{
auto res = k.insert(*curr);
// first time encountered
if(res.second)
{
++curr;
}
else
{
// duplicate:
std::swap(*curr, *end);
--end;
--max;
}
}
}
void move_duplicates_to_end(vector<int> &A) {
if(A.empty()) return;
int i = 0, tail = A.size()-1;
while(i <= tail) {
bool is_first = true; // whether the current number is being seen for the first time
for(int k=0; k<i; k++) { // always compare with numbers before A[i]
if(A[k] == A[i]) {
is_first = false;
break;
}
}
if(is_first == true) i++;
else {
int tmp = A[i]; // swap with tail
A[i] = A[tail];
A[tail] = tmp;
tail--;
}
}
}
If the input array is {1,7,4,8,2,6,8,3,7,9,10}, then the output is {1,7,4,8,2,6,10,3,9,7,8}. Comparing with your answer {1,7,4,8,2,6,3,9,10,8,7}, the first half is the same, while the right half is different, because I swap all duplicates with the tail of the array. As you mentioned, the order of the duplicates can be arbitrary.

Traversing an array to find the second largest element in linear time

Is there a way to find the second largest element of an array in linear time?
Array elements can be positive, negative or zero.
Elements can be repetitive.
No STLs allowed.
Python can be used.
Solution: sort the array and take the second element, but sorting is not allowed.
Modification: by definition, the second largest element is the largest value that is strictly smaller than the maximum. For example, if we have
Arr = {5,5,4,3,1}
then the second largest is 4.
Addition
Let's say I want to generalize the question to the kth largest element, with complexity better than O(n log n), for example linear. What would the solution be?
Go through the array, keeping 2 memory slots to record the 2 largest elements seen so far. Return the smaller of the two.
.... is there anything tricky about this question that I can't see?
You can. This is the pseudo algorithm:
max = 2max = SMALLEST_INT_VALUE;
for element in v:
    if element > max:
        2max = max;
        max = element;
    else if element > 2max:
        2max = element;
2max is the value you are looking for.
The algorithm won't return a correct value in particular cases, such as an array whose elements are all equal.
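One possible way to handle the equal-elements case under the question's updated definition (second largest distinct value) is to track whether each slot has been filled at all, instead of relying on a sentinel. This is only a sketch: the name second_largest and the bool/out-parameter interface are made up here, and it sticks to a raw array because the question rules out the STL.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: stores the second largest *distinct* value in `out`
// and returns true, or returns false if fewer than two distinct values exist.
bool second_largest(const int* v, std::size_t n, int& out) {
    assert(v != nullptr || n == 0);
    bool have_max = false, have_second = false;
    int max1 = 0, max2 = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const int element = v[i];
        if (!have_max) {
            max1 = element;                 // first element seen
            have_max = true;
        } else if (element > max1) {
            max2 = max1;                    // old maximum becomes the runner-up
            have_second = true;
            max1 = element;
        } else if (element < max1 && (!have_second || element > max2)) {
            max2 = element;                 // strictly between max2 and max1
            have_second = true;
        }
        // element == max1 is ignored, so repeated maxima do not count
    }
    if (have_second) out = max2;
    return have_second;
}
```

For {5,5,4,3,1} this reports 4, and for an array whose elements are all equal it returns false rather than a wrong value.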
If you want an O(n) algorithm (expected time; the worst case is O(n^2)) that finds the nth largest element in an array, you should use quickselect (it's basically quicksort where you throw out the partition that you're not interested in). The write-up below is a good one and includes the runtime analysis:
http://pine.cs.yale.edu/pinewiki/QuickSelect
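For illustration, a minimal quickselect sketch in C++ could look like the following. The name kth_largest, the random pivot choice via std::rand, and the Lomuto-style partition are choices made here, not taken from the linked write-up. It counts duplicates by position (so for {5,5,4,3,1} the 2nd largest is 5), reorders the array in place, and runs in expected linear time with a quadratic worst case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>

// Hypothetical helper: returns the k-th largest element (k = 1 is the maximum).
// Precondition: 1 <= k <= n. The array is reordered as a side effect.
int kth_largest(int* a, std::size_t n, std::size_t k) {
    assert(k >= 1 && k <= n);
    std::size_t lo = 0, hi = n - 1;
    const std::size_t target = k - 1;          // index of the answer in descending order
    while (lo < hi) {
        // pick a random pivot and park it at the end of the current range
        const std::size_t p = lo + std::rand() % (hi - lo + 1);
        std::swap(a[p], a[hi]);
        const int pivot = a[hi];
        std::size_t store = lo;
        for (std::size_t i = lo; i < hi; ++i) {
            if (a[i] > pivot) std::swap(a[i], a[store++]);  // larger values go first
        }
        std::swap(a[store], a[hi]);            // pivot is now at its final index
        if (store == target) return a[store];
        if (target < store) hi = store - 1;    // discard the right partition
        else lo = store + 1;                   // discard the left partition
    }
    return a[lo];
}
```

Unlike a full sort, only the partition that contains the target index is processed further.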
Pseudo code:
int max[2] = { array[0], array[1] }
if (max[0] < max[1]) swap them   // keep max[0] >= max[1]
for (int i = 2; i < array.length(); i++) {
    if (array[i] >= max[0]) { max[1] = max[0]; max[0] = array[i]; }
    else if (array[i] >= max[1]) max[1] = array[i];
}
Now the max array contains the two largest elements.
create a temporary array of size 3,
copy first 3 elements there,
sort the temporary array in descending order (so the last slot holds the smallest of the three),
replace the last one in the temporary array with the 4th element from the source array,
sort the temporary array,
replace the last one in the temporary array with the 5th element from the source array,
sort the temporary array,
etc.
Sorting array of size 3 is constant time and you do that once for each element of the source array, hence linear overall time.
Yep. You tagged this as C/C++ but you mentioned you could do it in Python. Anyway, here is the algorithm:
Create the array (obviously).
If the first item is greater than the second item, set the first variable to the first item and the second variable to the second item. Otherwise, do vice versa.
Loop through all the items (except the first two).
If the item from the array is greater than first variable, set second variable to first variable and first variable to the item. Else if the item is greater than second variable set second variable to the item.
The second variable is your answer.
list = [-1,6,9,2,0,2,8,10,8,-10]
if list[0] > list[1]:
    first = list[0]
    second = list[1]
else:
    first = list[1]
    second = list[0]
for i in range(2, len(list)):
    if list[i] > first:
        first, second = list[i], first
    elif list[i] > second:
        second = list[i]
print("First:", first)
print("Second:", second)
// assuming that v is the array and length is its length
int m1 = max(v[0], v[1]), m2 = min(v[0], v[1]);
for (int i=2; i<length; i++) {
if (likely(m2 >= v[i]))
continue;
if (unlikely(m1 < v[i]))
m2 = m1, m1 = v[i];
else
m2 = v[i];
}
The result you need is in m2 (likely and unlikely are branch-prediction hint macros used for performance; you can simply remove them if you don't need them).
I think the other answers have not accounted for the fact that in an array like [0, 1, 1], the second largest is 0 (according to the updated problem definition). Furthermore, quickselect is O(n) only on average, with an O(n^2) worst case, and it does much more work than necessary (on top of which it is essentially a partial quicksort, and the problem statement disallowed sorting). Here is a very similar algorithm to Simone's, but updated to return the second largest unique element:
def find_second(numbers):
    top = numbers[0]
    second = None
    for num in numbers[1:]:
        if second is None:
            if num < top: second = num
            elif num > top:
                second = top
                top = num
        else:
            if num > second:
                if num > top:
                    second = top
                    top = num
                elif num < top: second = num
    if second is not None: return second
    return top

if __name__ == '__main__':
    print("The second largest is %d" % find_second([1,2,3,4,4,5,5]))
// Second larger element and its position(s)
int[] tab = { 12, 1, 21, 12, 8, 8, 1 };
int[] tmp = Arrays.copyOf(tab, tab.length);
int secMax = 0;
Arrays.sort(tmp);
secMax = tmp[tmp.length - 2];
System.out.println(secMax);
List<Integer> positions = new ArrayList<>();
for (int i = 0; i < tab.length; i++) {
if (tab[i] == secMax) {
positions.add(i);
}
}
System.out.println(positions);

Randomly permute N first elements of a singly linked list

I have to permute N first elements of a singly linked list of length n, randomly. Each element is defined as:
typedef struct E_s
{
struct E_s *next;
}E_t;
I have a root element and I can traverse the whole linked list of size n. What is the most efficient technique to permute only N first elements (starting from root) randomly?
So, given a->b->c->d->e->f->...x->y->z I need to make something like f->a->e->c->b->...x->y->z
My specific case:
n-N is about 20% relative to n
I have limited RAM resources, the best algorithm should make it in place
I have to do it in a loop, in many iterations, so the speed does matter
The ideal randomness (uniform distribution) is not required, it's Ok if it's "almost" random
Before making permutations, I traverse the N elements already (for other needs), so maybe I could use this for permutations as well
UPDATE: I found this paper. It states it presents an algorithm of O(log n) stack space and expected O(n log n) time.
I've not tried it, but you could use a "randomized merge-sort".
To be more precise, you randomize the merge routine. You do not merge the two sublists systematically, but based on a coin toss (i.e. with probability 0.5 you select the first element of the left sublist, and with probability 0.5 you select the first element of the right sublist).
This should run in O(n log n) and use O(1) space (if properly implemented).
Below you find a sample implementation in C that you might adapt to your needs. Note that this implementation uses randomisation in two places: in splitList and in merge. However, you might choose just one of these two places. I'm not sure if the resulting distribution is uniform (I'm almost sure it is not), but some test cases yielded decent results.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 40
typedef struct _node{
int value;
struct _node *next;
} node;
void splitList(node *x, node **leftList, node **rightList){
int lr=0; // left-right-list-indicator
*leftList = 0;
*rightList = 0;
while (x){
node *xx = x->next;
lr=rand()%2;
if (lr==0){
x->next = *leftList;
*leftList = x;
}
else {
x->next = *rightList;
*rightList = x;
}
x=xx;
lr=(lr+1)%2;
}
}
void merge(node *left, node *right, node **result){
*result = 0;
while (left || right){
if (!left){
node *xx = right;
while (right->next){
right = right->next;
}
right->next = *result;
*result = xx;
return;
}
if (!right){
node *xx = left;
while (left->next){
left = left->next;
}
left->next = *result;
*result = xx;
return;
}
if (rand()%2==0){
node *xx = right->next;
right->next = *result;
*result = right;
right = xx;
}
else {
node *xx = left->next;
left->next = *result;
*result = left;
left = xx;
}
}
}
void mergeRandomize(node **x){
if ((!*x) || !(*x)->next){
return;
}
node *left;
node *right;
splitList(*x, &left, &right);
mergeRandomize(&left);
mergeRandomize(&right);
merge(left, right, &*x);
}
int main(int argc, char *argv[]) {
srand(time(NULL));
printf("Original Linked List\n");
int i;
node *x = (node*)malloc(sizeof(node));
node *root=x;
x->value=0;
for(i=1; i<N; ++i){
node *xx;
xx = (node*)malloc(sizeof(node));
xx->value=i;
xx->next=0;
x->next = xx;
x = xx;
}
x=root;
do {
printf ("%d, ", x->value);
x=x->next;
} while (x);
x = root;
node *left, *right;
mergeRandomize(&x);
if (!x){
printf ("Error.\n");
return -1;
}
printf ("\nNow randomized:\n");
do {
printf ("%d, ", x->value);
x=x->next;
} while (x);
printf ("\n");
return 0;
}
Convert to an array, use a Fisher-Yates shuffle, and convert back to a list.
I don't believe there's any efficient way to randomly shuffle singly-linked lists without an intermediate data structure. I'd just read the first N elements into an array, perform a Fisher-Yates shuffle, then reconstruct those first N elements into the singly-linked list.
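A minimal sketch of that approach, assuming C++11 and a hypothetical Node type with a value field (the question's E_t only has next): the first N node pointers are copied into a temporary vector, shuffled with Fisher-Yates, and relinked. Note that it needs O(N) extra pointers, which may not suit the limited-RAM constraint in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

struct Node {          // hypothetical node type for illustration
    int value;
    Node* next;
};

// Shuffles the first `count` nodes of the list and returns the new head.
// Nodes after the prefix keep their order. Precondition: the list has at
// least `count` nodes.
Node* shuffle_prefix(Node* head, std::size_t count, std::mt19937& rng) {
    std::vector<Node*> nodes;
    nodes.reserve(count);
    Node* cur = head;
    for (std::size_t i = 0; i < count; ++i) {
        assert(cur != nullptr && "list is shorter than count");
        nodes.push_back(cur);
        cur = cur->next;
    }
    if (nodes.size() < 2) return head;   // nothing to permute
    Node* rest = cur;                    // first node after the prefix (may be null)

    // Fisher-Yates: swap slot i with a uniformly chosen slot in [0, i]
    for (std::size_t i = nodes.size() - 1; i > 0; --i) {
        std::uniform_int_distribution<std::size_t> pick(0, i);
        std::swap(nodes[i], nodes[pick(rng)]);
    }

    // rebuild the links in shuffled order and reattach the untouched tail
    for (std::size_t i = 0; i + 1 < nodes.size(); ++i) {
        nodes[i]->next = nodes[i + 1];
    }
    nodes.back()->next = rest;
    return nodes.front();
}
```

Since the question mentions that the first N elements are already traversed for other reasons, the pointer collection could presumably be folded into that existing pass.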
First, get the length of the list and the last element. You say you already do a traversal before randomization, that would be a good time.
Then, turn it into a circular list by linking the first element to the last element. Get four pointers into the list by dividing the size by four and iterating through it for a second pass. (These pointers could also be obtained from the previous pass by incrementing once, twice, and three times per four iterations in the previous traversal.)
For the randomization pass, traverse again and swap pointers 0 and 2 and pointers 1 and 3 with 50% probability. (Do either both swap operations or neither; just one swap will split the list in two.)
Here is some example code. It looks like it could be a little more random, but I suppose a few more passes could do the trick. Anyway, analyzing the algorithm is more difficult than writing it :vP . Apologies for the lack of indentation; I just punched it into ideone in the browser.
http://ideone.com/9I7mx
#include <algorithm>
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;
struct list_node {
int v;
list_node *n;
list_node( int inv, list_node *inn )
: v( inv ), n( inn) {}
};
int main() {
srand( time(0) );
// initialize the list and 4 pointers at even intervals
list_node *n_first = new list_node( 0, 0 ), *n = n_first;
list_node *p[4];
p[0] = n_first;
for ( int i = 1; i < 20; ++ i ) {
n = new list_node( i, n );
if ( i % (20/4) == 0 ) p[ i / (20/4) ] = n;
}
// intervals must be coprime to list length!
p[2] = p[2]->n;
p[3] = p[3]->n;
// turn it into a circular list
n_first->n = n;
// swap the pointers around to reshape the circular list
// one swap cuts a circular list in two, or joins two circular lists
// so perform one cut and one join, effectively reordering elements.
for ( int i = 0; i < 20; ++ i ) {
list_node *p_old[4];
copy( p, p + 4, p_old );
p[0] = p[0]->n;
p[1] = p[1]->n;
p[2] = p[2]->n;
p[3] = p[3]->n;
if ( rand() % 2 ) {
swap( p_old[0]->n, p_old[2]->n );
swap( p_old[1]->n, p_old[3]->n );
}
}
// you might want to turn it back into a NULL-terminated list
// print results
for ( int i = 0; i < 20; ++ i ) {
cout << n->v << ", ";
n = n->n;
}
cout << '\n';
}
For the case when N is really big (so it doesn't fit your memory), you can do the following (a sort of Knuth's 3.4.2P):
1. j = N
2. k = random between 1 and j
3. Traverse the input list, find the k-th item and output it; remove that item from the sequence (or mark it somehow so that you won't consider it at the next traversal)
4. Decrease j and return to step 2 unless j == 0
5. Output the rest of the list
Beware that this is O(N^2), unless you can ensure random access in the step 3.
If N is relatively small, so that N items fit into memory, just load them into an array and shuffle, as @Mitch proposes.
If you know both N and n, I think you can do it simply. It's fully random, too. You only iterate through the whole list once, and through the randomized part each time you add a node. I think that's O(n+NlogN) or O(n+N^2). I'm not sure. It's based upon updating the conditional probability that a node is selected for the random portion given what happened to previous nodes.
1. Determine the probability that a certain node will be selected for the random portion given what happened to previous nodes (p=(N-size)/(n-position), where size is the number of nodes previously chosen and position is the number of nodes previously considered).
2. If the node is not selected for the random part, move to step 4. If the node is selected for the random part, randomly choose its place in the random part based upon the size so far (place=(random between 0 and 1) * size, where size is again the number of previous nodes).
3. Place the node where it needs to go and update the pointers. Increment size. Change to looking at the node that previously pointed at what you were just looking at and moved.
4. Increment position and look at the next node.
I don't know C, but I can give you the pseudocode. In this, I refer to the permutation as the first elements that are randomized.
integer size=0                 //size of permutation
integer position=0             //number of nodes you've traversed so far
Node head=head of linked list  //this holds the node at the head of your linked list
Node current_node=head         //starting at head, you'll move this down the list to check each node, whether you put it in the list
Node previous=head             //stores the previous node for changing pointers. starts at head to avoid asking for the next field on a null node
While ((size not equal to N) and (current_node is not null)){ //iterate through the list until the permutation is full. We should never pass the end of the list, but just in case, I include that condition
    pperm=(N-size)/(n-position) //probability that the current node will be in the permutation
    if ([generate a random decimal between 0 and 1] < pperm){ //this decides whether or not the current node will go in the permutation
        if (position is not equal to 0){ //in case we are at the start of the list, there's no need to change the list
            pfirst=1/(size+1) //probability that, if you select a node to be in the permutation, it will be first. Since the permutation has
                              //zero elements at the start, adding an element will make it the initial node of the permutation with probability 1.
            integer place_in_permutation = round down([generate a random decimal between 0 and 1]/pfirst) //place in the permutation. note that the head = 0.
            previous.next=current_node.next
            if(place_in_permutation==0){ //if placing the current node first, must change the head
                current_node.next=head //set the current node to point to the previous head
                head=current_node //set the variable head to point to the current node
            }
            else{
                Node temp=head
                for (counter starts at zero. counter is less than place_in_permutation-1. Each iteration, increment counter){
                    temp=temp.next
                } //at this time, temp should point to the node right before the insertion spot
                current_node.next=temp.next
                temp.next=current_node
            }
            current_node=previous
        }
        size++ //since we add one to the permutation, increase the size of the permutation
    }
    position++
    previous=current_node
    current_node=current_node.next
}
You could probably increase the efficiency if you held on to the most recently added node in case you had to add one to the right of it.
Similar to Vlad's answer, here is a slight improvement (statistically):
Indices in the algorithm are 1-based.
1. Initialize lastR = -1.
2. If N <= 1, go to step 6.
3. Pick a random number r between 1 and N.
4. If r != N:
   4.1 Traverse the list to item r and its predecessor.
       If lastR != -1:
       - If r == lastR, your pointer to the predecessor of the r'th item is still there.
       - If r < lastR, traverse to it from the beginning of the list.
       - If r > lastR, traverse to it from the predecessor of the lastR'th item.
   4.2 Remove the r'th item from the list and append it to a result list as the tail.
   4.3 lastR = r
5. Decrease N by one and go to step 2.
6. Link the tail of the result list to the head of the remaining input list. You now have the original list with the first N items permuted.
Since you do not have random access, this will reduce the traversing time you will need within the list (I assume that by half, so asymptotically, you won't gain anything).
An O(N log N), easy-to-implement solution that does not require extra storage:
Say you want to randomize L:
1. If L has 1 or 0 elements, you are done.
2. Create two empty lists L1 and L2.
3. Loop over L, destructively moving its elements to L1 or L2, choosing between the two at random.
4. Repeat the process for L1 and L2 (recurse!).
5. Join L1 and L2 into L3.
6. Return L3.
Update
At step 3, L should be divided into equal-sized (+-1) lists L1 and L2 in order to guarantee the best-case complexity (N*log N). That can be done by dynamically adjusting the probability of an element going into L1 or L2:
p(insert element into L1) = (1/2 * len0(L) - len(L1)) / len(L)
where
len(M) is the current number of elements in list M
len0(L) is the number of elements there was in L at the beginning of step 3
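The balanced variant can be sketched in C++ roughly as follows. The Node type and the function name shuffle_list are assumptions made for illustration, integer arithmetic with floor(len/2) stands in for the 1/2 * len0(L) term, and the function takes an explicit length so that it can be applied to just the first N nodes of a longer list. The recursion uses O(log N) stack.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

struct Node {          // hypothetical node type for illustration
    int value;
    Node* next;
};

// Randomly permutes the first `len` nodes and returns the new head.
// Any nodes after the first `len` stay attached, in their original order.
Node* shuffle_list(Node* head, std::size_t len, std::mt19937& rng) {
    if (len < 2) return head;
    const std::size_t half = len / 2;    // target size of L1
    Node* l1 = nullptr;
    Node* l2 = nullptr;
    std::size_t len1 = 0;
    for (std::size_t remaining = len; remaining > 0; --remaining) {
        assert(head != nullptr && "list is shorter than len");
        Node* next = head->next;
        // p(insert into L1) = (half - len1) / remaining
        std::uniform_int_distribution<std::size_t> pick(0, remaining - 1);
        if (pick(rng) < half - len1) {
            head->next = l1; l1 = head; ++len1;
        } else {
            head->next = l2; l2 = head;
        }
        head = next;
    }
    Node* rest = head;                   // untouched remainder (may be null)

    l1 = shuffle_list(l1, len1, rng);    // here len1 == half >= 1
    l2 = shuffle_list(l2, len - len1, rng);

    // join L1, L2 and the remainder
    Node* tail = l1;
    while (tail->next != nullptr) tail = tail->next;
    tail->next = l2;
    while (tail->next != nullptr) tail = tail->next;
    tail->next = rest;
    return l1;
}
```

The probability reaches 0 once L1 is full and 1 once every remaining element is needed to fill it, so L1 always ends up with exactly floor(len/2) elements.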
There is an algorithm that takes O(sqrt(N)) space and O(N) time for a singly linked list.
It does not generate a uniform distribution over all permutations, but it can give a good permutation that is not easily distinguishable from a random one. The basic idea is similar to permuting a matrix by rows and columns, as described below.
Algorithm
Let the number of elements be N, and m = floor(sqrt(N)). Assuming a "square matrix" N = m*m makes this method much clearer.
1. In the first pass, store pointers to the elements that are m elements apart as p_0, p_1, p_2, ..., p_m. That is, p_0->next->...->next (m times) == p_1 should be true.
2. Permute each row.
   For i = 0 to m do:
   - Index all elements between p_i->next and p_(i+1)->next in the linked list with an array of size O(m)
   - Shuffle this array using a standard method
   - Relink the elements using this shuffled array
3. Permute each column.
   Initialize an array A to store the pointers p_0, ..., p_m. It is used to traverse the columns.
   For i = 0 to m do:
   - Index all elements pointed to by A[0], A[1], ..., A[m-1] in the linked list with an array of size m
   - Shuffle this array
   - Relink the elements using this shuffled array
   - Advance the pointer to the next column: A[i] := A[i]->next
Note that p_0 points to the first element and p_m points to the last element. Also, if N != m*m, you may use a separation of m+1 for some p_i instead. You now have a "matrix" in which each p_i points to the start of a row.
Analysis and randomness
Space complexity: This algorithm needs O(m) space to store the start of each row, O(m) space to store the array, and O(m) space to store the extra pointers during column permutation. Hence, the space complexity is ~ O(3*sqrt(N)). For N = 1000000, that is around 3000 entries and 12 kB of memory.
Time complexity: It is O(N), since it walks through the "matrix" either row by row or column by column.
Randomness: The first thing to note is that each element can go anywhere in the matrix through row and column permutations. It is very important that elements can go anywhere in the linked list. Second, though it does not generate all permutations, it does generate a subset of them. To count the permutations, assume N = m*m. Each row permutation has m! possibilities and there are m rows, so we have (m!)^m. If the column permutations are also included, the count is exactly (m!)^(2*m), so it is almost impossible to get the same sequence twice.
It is highly recommended to repeat the second and third steps at least one more time to get a more random sequence, because this suppresses almost all correlation between an element's row and column and its original location. It is also important when your list is not "square". Depending on your needs, you may want to use even more repetitions. The more repetitions you use, the more permutations are reachable and the more random the result is. I remember that it is possible to generate a uniform distribution for N=9, and I guess that it is possible to prove that as the number of repetitions tends to infinity, the result approaches the true uniform distribution.
Edit: The time and space complexity are tight bounds and are almost the same in any situation. I think this space consumption can satisfy your needs. If you have any doubt, you may try it on a small list, and I think you will find it useful.
The list randomizer below has complexity O(N*log N) and O(1) memory usage.
It is based on the recursive algorithm described on my other post modified to be iterative instead of recursive in order to eliminate the O(logN) memory usage.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
typedef struct node {
struct node *next;
char *str;
} node;
unsigned int
next_power_of_two(unsigned int v) {
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
return v + 1;
}
void
dump_list(node *l) {
printf("list:");
for (; l; l = l->next) printf(" %s", l->str);
printf("\n");
}
node *
array_to_list(unsigned int len, char *str[]) {
unsigned int i;
node *list;
node **last = &list;
for (i = 0; i < len; i++) {
node *n = malloc(sizeof(node));
n->str = str[i];
*last = n;
last = &n->next;
}
*last = NULL;
return list;
}
node **
reorder_list(node **last, unsigned int po2, unsigned int len) {
node *l = *last;
node **last_a = last;
node *b = NULL;
node **last_b = &b;
unsigned int len_a = 0;
unsigned int i;
for (i = len; i; i--) {
double pa = (1.0 + RAND_MAX) * (po2 - len_a) / i;
unsigned int r = rand();
if (r < pa) {
*last_a = l;
last_a = &l->next;
len_a++;
}
else {
*last_b = l;
last_b = &l->next;
}
l = l->next;
}
*last_b = l;
*last_a = b;
return last_b;
}
unsigned int
min(unsigned int a, unsigned int b) {
return (a > b ? b : a);
}
void
randomize_list(node **l, unsigned int len) {
unsigned int po2 = next_power_of_two(len);
for (; po2 > 1; po2 >>= 1) {
unsigned int j;
node **last = l;
for (j = 0; j < len; j += po2)
last = reorder_list(last, po2 >> 1, min(po2, len - j));
}
}
int
main(int len, char *str[]) {
if (len > 1) {
node *l;
len--; str++; /* skip program name */
l = array_to_list(len, str);
randomize_list(&l, len);
dump_list(l);
}
return 0;
}
/* try as: a.out list of words foo bar doz li 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
*/
Note that this version of the algorithm is completely cache unfriendly, the recursive version would probably perform much better!
If both of the following conditions are true:
you have plenty of program memory (many embedded devices execute directly from flash);
your solution does not suffer if its "randomness" repeats often,
then you can choose a sufficiently large set of specific permutations, defined at programming time, write code that generates the code implementing each one, and then iterate over them at runtime.