Convert linked list into binary search tree, do stuff and return tree as list - C++

I have the following problem:
I have a line of numbers to read. The first number on the line is the number of operations I have to perform on the rest of the sequence.
There are two types of operations:
Remove - we remove the number after the current one, then we move forward X steps in the sequence, where X = the value of the removed element.
Insert - we insert a new number after the current one with a value of (current element's value - 1), then we move forward X steps in the sequence, where X = the value of the current element (i.e., not the new one).
We do "Remove" if the current number's value is even, and "Insert" if it is odd.
After that many operations we have to print the whole sequence, starting from the number we ended the operations on.
Properly working example:
Input: 3 1 2 3
Output: 0 0 3 1
3 is the first number and it becomes the OperCount value.
First operation:
Sequence: 1 2 3, first element: 1
1 is odd, so we insert 0 (currNum's value - 1)
We move forward by 1 (currNum's value)
Output sequence: 1 0 2 3, current position: 0
Second operation:
0 is even, so we remove the next value (2)
Move forward by the removed element's value (2):
From 0 to 3
From 3 to 1
Output sequence: 1 0 3, current position: 1
Third operation:
1 is odd, so once again we insert a new element with a value of 0
Move by the current element's value (1), onto the created 0.
Output sequence: 1 0 0 3, current position: first 0
Now here is the deal: we have reached the final condition, and we have to print the whole sequence, starting from the current position.
Final Output:
0 0 3 1
I have a working version, but it uses a linked list, and because of that it doesn't pass all the tests. Linked list traversal is too slow, which is why I need to use a binary tree, but I don't really know how to start with it. I would appreciate any help.

First, redefine the operations to put most (but not all) of the work into a container object. We want 4 operations supported by the container:
1) Construct from a [first,limit) pair of random access input iterators
2) insert(K) finds the value X at position K, inserts an X-1 after it and returns X
3) remove(K) finds the value X at position K, deletes it and returns X
4) size() reports the size of the contents
The work outside the container would just keep track of incremental changes to K:
K += insert(K); K %= size();
or
K += remove(K); K %= size();
Notice the importance of a sequence point before reading size(): the insert or remove must have fully completed before size() is evaluated.
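To make the sequence-point issue concrete, compare the fused and split forms (illustrative fragment only; t stands for the container object):
    K = (K + t.insert(K)) % t.size();  // WRONG: the operand order of % is unspecified,
                                       // so size() may be read before insert() runs
    K += t.insert(K);                  // the statement ends here: insert has completed
    K %= t.size();                     // size() now reflects the insertion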
The container data is just a root pointing to a node.
struct node {
    unsigned weight;
    unsigned value;
    node* child[2];
    unsigned cweight(unsigned s)
    { return child[s] ? child[s]->weight : 0; }
};
The container member functions insert and remove would be wrappers around recursive static insert and remove functions that each take a node*& in addition to K.
The first thing each recursive insert or remove must do is:
if (K<cweight(0)) recurse passing (child[0], K);
else if ((K-=cweight(0))>0) recurse passing (child[1], K-1);
else do the basic operation (read the result, create or destroy a node)
After doing that, you fix the weight at each level up the recursive call stack (starting where you did the work for insert or the level above that for remove).
After incrementing or decrementing the weight at the current level, you may need to re-balance, remembering which side you recursively changed. Insert is simpler: if child[s]->weight*4 >= This->weight*3, you need to re-balance. The re-balance is one of the two basic tree rotations, and you select which one based on whether child[s]->cweight(s) < child[s]->cweight(1-s). Re-balance for remove is the same idea but with different details.
This system does a lot more worst-case re-balancing than a red-black or AVL tree, but it is still entirely O(log N). Maybe there is a better algorithm for a weight-semi-balanced tree, but I couldn't find one with a few google searches, nor even the real name of (or other details about) what I just arbitrarily called a "weight-semi-balanced tree".
Getting the nearly 2X speed-up of mixing the read operation into the insert and remove operations means you will need yet another recursive version of insert that doesn't mix in the read; it is used for the portion of the path below the point you read from (it does the same recursive weight changes and re-balancing, but with different input and output).
Given random access input iterators, the construction is a simpler recursive function. Grab the middle item from the range of iterators and make a node of it with the total weight of the whole range, then recursively pass the sub-ranges before and after the middle one to the same recursive function to create the child subtrees.
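A hedged sketch of that construction, using the node struct above (build is a name invented here, and like everything else in this answer it is untested):
template <class It> // It: a random access iterator over the initial values
node* build(It first, It limit)
{
    if (first == limit) return nullptr;
    It mid = first + (limit - first) / 2;  // the middle item becomes the subtree root
    node* n = new node;
    n->weight = static_cast<unsigned>(limit - first); // total weight of the whole range
    n->value = *mid;
    n->child[0] = build(first, mid);       // sub-range before the middle
    n->child[1] = build(mid + 1, limit);   // sub-range after the middle
    return n;
}
This runs in O(n) and produces a perfectly weight-balanced starting tree.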
I haven't tested any of this, but I think the following is all the code you need for remove, as well as the rebalance needed for both insert and remove. Functions taking node*& are static member functions of tree; those not taking node*& are non-static.
unsigned tree::remove(unsigned K)
{
    node* removed = remove(root, K);
    unsigned result = removed->value;
    delete removed;
    return result;
}

// static
node* tree::remove( node*& There, unsigned K) // Find, unlink and return the K'th node
{
    node* result;
    node* This = There;
    unsigned s = 0; // Guess at child NOT removed from
    This->weight -= 1;
    if ( K < This->cweight(0) )
    {
        s = 1;
        result = remove( This->child[0], K );
    }
    else
    {
        K -= This->cweight(0);
        if ( K > 0 )
        {
            result = remove( This->child[1], K-1 );
        }
        else if ( ! This->child[1] )
        {
            // remove This replacing it with child[0]
            There = This->child[0];
            return This; // Nothing here/below needs a re-balance check
        }
        else
        {
            // remove This replacing it with the leftmost descendant of child[1]
            result = This;
            There = This = remove( This->child[1], 0 );
            This->child[0] = result->child[0];
            This->child[1] = result->child[1];
            This->weight = result->weight;
        }
    }
    rebalance( There, s );
    return result;
}
// static
void tree::rebalance( node*& There, unsigned s)
{
    node* This = There;
    node* c = This->child[s];
    if ( c && c->weight*4 >= This->weight*3 )
    {
        node* b = c->child[s];
        node* d = c->child[1-s];
        unsigned bweight = b ? b->weight : 0;
        if ( d && bweight < d->weight )
        {
            // inner rotate: d becomes top of subtree
            This->child[s] = d->child[1-s];
            c->child[1-s] = d->child[s];
            There = d;
            d->child[s] = c;
            d->child[1-s] = This;
            d->weight = This->weight;
            c->weight = bweight + c->cweight(1-s) + 1;
            This->weight -= c->weight + 1;
        }
        else
        {
            // outer rotate: c becomes top of subtree
            There = c;
            c->child[1-s] = This;
            c->weight = This->weight;
            This->child[s] = d;
            This->weight -= bweight + 1;
        }
    }
}
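The answer leaves insert as an exercise. Here is a hedged, untested sketch in the same style, including the separate read-free descent mentioned above (insert_leftmost is a name invented for it):
unsigned tree::insert(unsigned K)
{
    return insert(root, K); // assumes a non-empty tree, as in this problem
}

// static
unsigned tree::insert( node*& There, unsigned K) // Read the K'th value X, insert an X-1 after it
{
    node* This = There;
    This->weight += 1;
    unsigned result;
    unsigned s; // side that grew, to be checked by rebalance
    if ( K < This->cweight(0) )
    {
        s = 0;
        result = insert( This->child[0], K );
    }
    else
    {
        K -= This->cweight(0);
        if ( K > 0 )
        {
            s = 1;
            result = insert( This->child[1], K-1 );
        }
        else
        {
            // This is the K'th node: the new X-1 node becomes the leftmost
            // descendant of child[1], i.e. the next position in the sequence
            result = This->value;
            s = 1;
            insert_leftmost( This->child[1], result - 1 );
        }
    }
    rebalance( There, s );
    return result;
}

// static: the read-free descent; same weight changes and re-balancing
void tree::insert_leftmost( node*& There, unsigned value)
{
    node* This = There;
    if ( ! This )
    {
        There = new node{ 1, value, { nullptr, nullptr } };
        return;
    }
    This->weight += 1;
    insert_leftmost( This->child[0], value );
    rebalance( There, 0 );
}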

You can use std::set, which is implemented as a binary tree. Its constructor allows construction from an iterator range, so you shouldn't have a problem transforming the list into a set.
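For instance (illustrative only - note that std::set orders its elements by value and discards duplicates, so it only fits if that matches your problem):
#include <iostream>
#include <list>
#include <set>

int main() {
    std::list<int> values = {5, 1, 3, 2, 4};
    // The range constructor walks the list once; the set keeps the elements
    // in a balanced binary tree, sorted and deduplicated.
    std::set<int> tree(values.begin(), values.end());
    for (int v : tree) std::cout << v << ' '; // prints: 1 2 3 4 5
}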

Related

Binary tree interview: implement follow operation

I was asked to implement a binary search tree with a follow operation for each node v - the complexity should be O(1). The follow operation should return a node w (w > v).
I proposed to do it in O(log(n)), but they wanted O(1).
Update: it should be the next greater node.
Just keep the maximum element for the tree and always return it for nodes v < maximum.
You can get O(1) if you store pointers to the "next node" (using your O(log n) algorithm), given you are allowed to do that.
How about:
int tree[N];

size_t follow(size_t v) {
    // First try the right child
    size_t w = v * 2 + 1;
    if (w >= N) {
        // Otherwise right sibling
        w = v + 1;
        if (w >= N) {
            // Finally right parent
            w = (v - 1) / 2 + 1;
        }
    }
    return w;
}
Where tree is a complete binary tree in array form and v/w are represented as zero-based indices.
One idea is to literally just have a next pointer on each node.
You can update these pointers in O(height) after an insert or remove (O(height) is O(log n) for a self-balancing BST), which is as long as an insert or remove takes, so it doesn't add to the time complexity.
Alternatively, you can also have a previous pointer in addition to the next pointer. If you do this, you can update these pointers in O(1).
Obviously, in either case, if you have a node, you also have its next pointer, and you can simply get this value in O(1).
Pseudo-code
For only a next pointer, after the insert you'd do:
    if inserted as a right child:
        newNode.next = parent.next
        parent.next = newNode
    else: // left child
        newNode.next = parent
        predecessor(newNode).next = newNode // the predecessor is found during the O(log n) insert
For both next and previous pointers:
    if inserted as a right child:
        newNode.previous = parent
        newNode.next = parent.next
        parent.next.previous = newNode
        parent.next = newNode
    else: // left child
        newNode.next = parent
        newNode.previous = parent.previous
        parent.previous.next = newNode
        parent.previous = newNode
(some null checks are also required).
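A hedged C++ rendering of the next+previous variant, null checks included (the Node type and link_neighbours are invented here for illustration):
struct Node {
    int key;
    Node *left = nullptr, *right = nullptr;
    Node *prev = nullptr, *next = nullptr; // in-order neighbours
    explicit Node(int k) : key(k) {}
};

// Call right after 'child' has been attached to 'parent' in the BST;
// fixes all four neighbour pointers in O(1).
void link_neighbours(Node* parent, Node* child, bool isRightChild) {
    if (isRightChild) {
        // the new node comes directly after its parent in sorted order
        child->prev = parent;
        child->next = parent->next;
        if (parent->next) parent->next->prev = child;
        parent->next = child;
    } else {
        // the new node comes directly before its parent
        child->next = parent;
        child->prev = parent->prev;
        if (parent->prev) parent->prev->next = child;
        parent->prev = child;
    }
}
With this in place, the follow operation is just v->next, an O(1) read.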

Maintain a sorted array in O(1)?

We have a sorted array and we would like to increase the value of one index by only 1 unit (array[i]++), such that the resulting array is still sorted. Is this possible in O(1)?
It is fine to use any data structure possible in STL and C++.
In a more specific case, if the array is initialised by all 0 values, and it is always incrementally constructed only by increasing a value of an index by one, is there an O(1) solution?
I haven't worked this out completely, but I think the general idea might help for integers at least. At the cost of more memory, you can maintain a separate data-structure that maintains the ending index of a run of repeated values (since you want to swap your incremented value with the ending index of the repeated value). This is because it's with repeated values that you run into the worst case O(n) runtime: let's say you have [0, 0, 0, 0] and you increment the value at location 0. Then it is O(n) to find out the last location (3).
But let's say that you maintain the data-structure I mentioned (a map would work because it has O(1) lookup). In that case you would have something like this:
0 -> 3
So you have a run of 0 values that end at location 3. When you increment a value, let's say at location i, you check to see if the new value is greater than the value at i + 1. If it is not, you are fine. But if it is, you look to see if there is an entry for this value in the secondary data-structure. If there isn't, you can simply swap. If there is an entry, you look up the ending-index and then swap with the value at that location. You then make any changes you need to the secondary data-structure to reflect the new state of the array.
A more thorough example:
[0, 2, 3, 3, 3, 4, 4, 5, 5, 5, 7]
The secondary data-structure is:
3 -> 4
4 -> 6
5 -> 9
Let's say you increment the value at location 2. So you have incremented 3, to 4. The array now looks like this:
[0, 2, 4, 3, 3, 4, 4, 5, 5, 5, 7]
You look at the next element, which is 3. You then look up the entry for that element in the secondary data-structure. The entry is 4, which means that there is a run of 3's that ends at index 4. This means that you can swap the value from the current location with the value at index 4:
[0, 2, 3, 3, 4, 4, 4, 5, 5, 5, 7]
Now you will also need to update the secondary data-structure. Specifically, the run of 3's now ends one index earlier, so you need to decrement that value:
3 -> 3
4 -> 6
5 -> 9
Another check you will need to do is to see if the value is repeated anymore. You can check that by looking at the (i-1)th and the (i+1)th locations to see if they are the same as the value in question. If neither is equal, then you can remove the entry for this value from the map.
Again, this is just a general idea. I will have to code it out to see if it works out the way I thought about it.
Please feel free to poke holes.
UPDATE
I have an implementation of this algorithm here in JavaScript. I used JavaScript just so I could do it quickly. Also, because I coded it up pretty quickly it can probably be cleaned up. I do have comments though. I'm not doing anything esoteric either, so this should be easily portable to C++.
There are essentially two parts to the algorithm: the incrementing and swapping (if necessary), and book-keeping done on the map that keeps track of our ending indices for runs of repeated values.
The code contains a testing harness that starts with an array of zeroes and increments random locations. At the end of every iteration, there is a test to ensure that the array is sorted.
var array = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
var endingIndices = {0: 9};
var increments = 10000;
var endingIndex; // where the incremented value ended up on this iteration

for(var i = 0; i < increments; i++) {
    var index = Math.floor(Math.random() * array.length);
    var oldValue = array[index];
    var newValue = ++array[index];

    if(index == (array.length - 1)) {
        //Incremented element is the last element.
        //We don't need to swap, but we need to see if we modified a run (if one exists)
        if(endingIndices[oldValue]) {
            endingIndices[oldValue]--;
        }
    } else if(index >= 0) {
        //Incremented element is not the last element; it is in the middle of
        //the array, possibly even the first element
        var nextIndexValue = array[index + 1];
        if(newValue === nextIndexValue) {
            //If the new value is the same as the next value, we don't need to swap anything. But
            //we are doing some book-keeping later with the endingIndices map. That code requires
            //the ending index (i.e., where we moved the incremented value to). Since we didn't
            //move it anywhere, the endingIndex is simply the index of the incremented element.
            endingIndex = index;
        } else if(newValue > nextIndexValue) {
            //If the new value is greater than the next value, we will have to swap it
            var swapIndex = -1;
            if(!endingIndices[nextIndexValue]) {
                //If the next value doesn't have a run, then the location we have to swap with
                //is just the next index
                swapIndex = index + 1;
            } else {
                //If the next value has a run, we get the swap index from the map
                swapIndex = endingIndices[nextIndexValue];
            }
            array[index] = nextIndexValue;
            array[swapIndex] = newValue;
            endingIndex = swapIndex;
        } else {
            //If the next value is already greater, there is nothing we need to swap but we do
            //need to do some book-keeping with the endingIndices map later, because it is
            //possible that we modified a run (the value might be the same as the value that
            //came before it). Since we don't have anything to swap, the endingIndex is
            //effectively the index that we are incrementing.
            endingIndex = index;
        }
        //Moving the new value to its new position may have created a new run, so we need to
        //check for that. This will only happen if the new position is not at the end of
        //the array, and the new value does not have an entry in the map, and the value
        //at the position after the new position is the same as the new value
        if(endingIndex < (array.length - 1) &&
           !endingIndices[newValue] &&
           array[endingIndex + 1] == newValue) {
            endingIndices[newValue] = endingIndex + 1;
        }
        //We also need to check to see if the old value had an entry in the
        //map because now that run has been shortened by one.
        if(endingIndices[oldValue]) {
            var newEndingIndex = --endingIndices[oldValue];
            if(newEndingIndex == 0 ||
               (newEndingIndex > 0 && array[newEndingIndex - 1] != oldValue)) {
                //In this case we check to see if the old value only has one entry, in
                //which case there is no run of values and so we will need to remove
                //its entry from the map. This happens when the new ending-index for this
                //value is the first location (0) or if the location before the new
                //ending-index doesn't contain the old value.
                delete endingIndices[oldValue];
            }
        }
    }

    //Make sure that the array is sorted
    for(var j = 0; j < array.length - 1; j++) {
        if(array[j] > array[j + 1]) {
            throw "Array not sorted; Value at location " + j + " (" + array[j] +
                  ") is greater than value at location " + (j + 1) + " (" + array[j + 1] + ")";
        }
    }
}
In a more specific case, if the array is initialised by all 0 values, and it is always incrementally constructed only by increasing a value of an index by one, is there an O(1) solution?
No. Given an array of all 0's: [0, 0, 0, 0, 0]. If you increment the first value, giving [1, 0, 0, 0, 0], then the 1 must end up after the other four zeroes to keep the array sorted, and with nothing but the array you have no O(1) way to find the end of that run.
Given a sorted array with no duplicates, then the answer is yes. But after the first operation (i.e. the first time you increment), then you could potentially have duplicates. The more increments you do, the higher the likelihood is that you'll have duplicates, and the more likely it'll take O(n) to keep that array sorted.
If all you have is the array, it's impossible to guarantee less than O(n) time per increment. If what you're looking for is a data structure that supports sorted order and lookup by index, then you probably want an order statistic tree.
If the values are small, counting sort will work. Represent the array [0,0,0,0] as {4}. Incrementing any zero gives {3,1} : 3 zeroes and a one. In general, to increment any value x, deduct one from the count of x and increment the count of {x+1}. The space efficiency is O(N), though, where N is the highest value.
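A hedged sketch of that representation (names invented here; assumes all values stay within a known bound):
#include <cstddef>
#include <vector>

// count[v] = how many slots of the conceptual sorted array hold value v,
// so [0,0,0,0] is represented as count[0] == 4.
struct CountedArray {
    std::vector<std::size_t> count;

    CountedArray(std::size_t n, std::size_t maxValue)
        : count(maxValue + 2, 0) { count[0] = n; }

    // Increment one element whose current value is x: two O(1) updates.
    // Requires count[x] > 0 and x+1 within the bound.
    void increment(std::size_t x) {
        --count[x];
        ++count[x + 1];
    }
};
Reading the sorted array back out is a single scan over the counts.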
It depends on how many items can have the same value. If more items can have the same value, then it is not possible to have O(1) with ordinary arrays.
Let's do an example: suppose array[5] = 21, and you want to do array[5]++:
Increment the item:
array[5]++
(which is O(1) because it is an array).
So, now array[5] = 22.
Check the next item (i.e., array[6]):
If array[6] == 21, then you have to keep checking new items (i.e., array[7] and so on) until you find a value higher than 21. At that point you can swap the values. This search is not O(1) because potentially you have to scan the whole array.
Instead, if items cannot have the same value, then you have:
Increment the item:
array[5]++
(which is O(1) because it is an array).
So, now array[5] = 22.
The next item cannot be 21 (because two items cannot have the same value), so it must have a value > 21 and the array is already sorted.
So you take a sorted array and a hashtable. You go over the array to figure out 'flat' areas - runs where elements have the same value. For every flat area you have to figure out three things: 1) where it starts (the index of the first element), 2) its value, and 3) the value of the next element (the next bigger one). Then put this tuple into the hashtable, keyed by the element value. This is a prerequisite and its complexity doesn't really matter.
Then when you increase some element (index i), you look up the table for the start index of the next bigger element's run (call it j), and swap element i with the element at j - 1 (the last element of its own run). Then 1) add a new entry to the hashtable and 2) update the existing entry for its previous value.
With a perfect hashtable (or a limited range of possible values) it will be almost O(1). The downside: it will not be stable.
Here is some code:
#include <iostream>
#include <unordered_map>
#include <utility>
#include <vector>

struct Range {
    int start, value, next;
};

void print_ht(std::unordered_map<int, Range>& ht)
{
    for (auto i = ht.begin(); i != ht.end(); i++) {
        Range& r = (*i).second;
        std::cout << '(' << r.start << ", " << r.value << ", " << r.next << ") ";
    }
    std::cout << std::endl;
}

void increment_el(int i, std::vector<int>& array, std::unordered_map<int, Range>& ht)
{
    int val = array[i];
    array[i]++;
    // Pick the next bigger element
    Range& r = ht[val];
    // Do the swapping, so the last element of that range will be first
    std::swap(array[i], array[ht[r.next].start - 1]);
    // Update the hashtable
    ht[r.next].start--;
}

int main(int argc, const char * argv[])
{
    std::vector<int> array = {1, 1, 1, 2, 2, 3};
    std::unordered_map<int, Range> ht;
    int start = 0;
    int value = array[0];
    // Build the indexing hashtable
    for (int i = 0; i <= array.size(); i++) {
        int cur_value = i < array.size() ? array[i] : -1;
        if (cur_value > value || i == array.size()) {
            ht[value] = {start, value, cur_value};
            start = i;
            value = cur_value;
        }
    }
    print_ht(ht);
    // Now let's increment the first element
    increment_el(0, array, ht);
    print_ht(ht);
    increment_el(3, array, ht);
    print_ht(ht);
    for (auto i = array.begin(); i != array.end(); i++)
        std::cout << *i << " ";
    return 0;
}
Yes and no.
Yes if the list contains only unique integers, as that means you only need to check the next value. No in any other situation. If the values are not unique, incrementing the first of N duplicate values means that it must move N positions to stay sorted. If the values are floating-point, you may have thousands of values between x and x+1.
It's important to be very clear about the requirements; the simplest way is to express the problem as an ADT (Abstract Datatype), listing the required operations and complexities.
Here's what I think you are looking for: a datatype which provides the following operations:
Construct(n): Create a new object of size n all of whose values are 0.
Value(i): Return the value at index i.
Increment(i): Increment the value at index i.
Least(): Return the index of the element with least value (or one such element if there are several).
Next(i): Return the index of the next element after element i in a sorted traversal starting at Least(), such that the traversal will return every element.
Aside from the Constructor, we want every one of the above operations to have complexity O(1). We also want the object to occupy O(n) space.
The implementation uses a list of buckets; each bucket has a value and a list of elements. Each element has an index and a pointer to the bucket it is part of. Finally, we have an array of pointers to elements. (In C++, I'd probably use iterators rather than pointers; in another language, I'd probably use intrusive lists.) The invariants are that no bucket is ever empty, and the values of the buckets are strictly monotonically increasing.
We start with a single bucket with value 0 which has a list of n elements.
Value(i) is implemented by returning the value of the bucket of the element referenced by the iterator at element i of the array. Least() is the index of the first element in the first bucket. Next(i) is the index of the next element after the one referenced by the iterator at element i, unless that iterator is already pointing at the end of the list, in which case it is the first element in the next bucket, unless the element's bucket is the last bucket, in which case we're at the end of the element list.
The only interface of interest is Increment(i), which is as follows (a sketch in code appears after this list):
If element i is the only element in its bucket (i.e. there is no next element in the bucket list, and element i is the first element in the bucket list):
Increment the value of the associated bucket.
If the next bucket has the same value, append the next bucket's element list to this bucket's element list (this is O(1), regardless of the list's size, because it is just a pointer swap), and then delete the next bucket.
If element i is not the only element in its bucket, then:
Remove it from its bucket list.
If the next bucket has the next sequential value, then push element i onto the next bucket's list.
Otherwise, the next bucket's value is larger (or there is no next bucket), so create a new bucket with the next sequential value and only element i, and insert it between this bucket and the next one.
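A hedged sketch of Increment(i) under this design (all names are invented here; Least() and Next() are omitted). One deliberate deviation: to keep the merge O(1) with per-element bucket pointers, the lone element is moved into the next bucket rather than splicing the next bucket's whole list over, so only one back-pointer needs fixing:
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

struct BucketArray {
    struct Bucket {
        long long value;
        std::list<std::size_t> elems; // indices of elements with this value
    };
    std::list<Bucket> buckets; // values strictly increasing, no bucket empty
    struct Ref {
        std::list<Bucket>::iterator bucket;
        std::list<std::size_t>::iterator pos; // position inside bucket->elems
    };
    std::vector<Ref> ref; // per-element back-pointers

    explicit BucketArray(std::size_t n) : ref(n) {
        buckets.push_back({0, {}}); // single bucket of value 0 with n elements
        for (std::size_t i = 0; i < n; ++i) {
            buckets.front().elems.push_back(i);
            ref[i] = {buckets.begin(), std::prev(buckets.front().elems.end())};
        }
    }

    long long value(std::size_t i) const { return ref[i].bucket->value; }

    void increment(std::size_t i) {
        auto b = ref[i].bucket;
        auto next = std::next(b);
        if (b->elems.size() == 1) {
            ++b->value; // i is alone in its bucket: just bump the value
            if (next != buckets.end() && next->value == b->value) {
                // merge: move the lone element into the next bucket, O(1)
                next->elems.splice(next->elems.begin(), b->elems, ref[i].pos);
                ref[i].bucket = next; // 'pos' stays valid across the splice
                buckets.erase(b);
            }
        } else {
            b->elems.erase(ref[i].pos); // remove i from its bucket list
            long long newv = b->value + 1;
            if (next != buckets.end() && next->value == newv) {
                next->elems.push_front(i); // push onto the existing bucket
                ref[i] = {next, next->elems.begin()};
            } else {
                auto nb = buckets.insert(next, {newv, {}}); // new bucket in between
                nb->elems.push_back(i);
                ref[i] = {nb, nb->elems.begin()};
            }
        }
    }
};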
Just iterate along the array from the modified element until you find the correct place, then swap. Average case complexity is O(N), where N is the average number of duplicates. Worst case is O(n), where n is the length of the array. As long as N isn't large and doesn't scale badly with n, you're fine and can probably pretend it's O(1) for practical purposes.
If duplicates are the norm and/or scale strongly with n, then there are better solutions, see other responses.
I think that it is possible without using a hashtable. I have an implementation here:
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cassert>

// This code is a solution for http://stackoverflow.com/questions/19957753/maintain-a-sorted-array-in-o1
//
// """We have a sorted array and we would like to increase the value of one index by only 1 unit
// (array[i]++), such that the resulting array is still sorted. Is this possible in O(1)?"""

// The obvious implementation, which has O(n) worst case increment.
class LinearIncrementor
{
public:
    LinearIncrementor(int numElems);
    int valueAt(int index) const;
    void incrementAt(int index);
private:
    std::vector<int> m_values;
};

// Free list to store runs of same values
class RunList
{
public:
    struct Run
    {
        int m_end;   // end index of run, inclusive, or next object in free list
        int m_value; // value at this run
    };
    RunList();
    int allocateRun(int endIndex, int value);
    void freeRun(int index);
    Run& runAt(int index);
    const Run& runAt(int index) const;
private:
    std::vector<Run> m_runs;
    int m_firstFree;
};

// More optimal implementation, which increments in O(1) time
class ConstantIncrementor
{
public:
    ConstantIncrementor(int numElems);
    int valueAt(int index) const;
    void incrementAt(int index);
private:
    std::vector<int> m_runIndices;
    RunList m_runs;
};

LinearIncrementor::LinearIncrementor(int numElems)
    : m_values(numElems, 0)
{
}

int LinearIncrementor::valueAt(int index) const
{
    return m_values[index];
}

void LinearIncrementor::incrementAt(int index)
{
    const int n = static_cast<int>(m_values.size());
    const int value = m_values[index];
    while (index+1 < n && value == m_values[index+1])
        ++index;
    ++m_values[index];
}

RunList::RunList() : m_firstFree(-1)
{
}

int RunList::allocateRun(int endIndex, int value)
{
    int runIndex = -1;
    if (m_firstFree == -1)
    {
        runIndex = static_cast<int>(m_runs.size());
        m_runs.resize(runIndex + 1);
    }
    else
    {
        runIndex = m_firstFree;
        m_firstFree = m_runs[runIndex].m_end;
    }
    Run& run = m_runs[runIndex];
    run.m_end = endIndex;
    run.m_value = value;
    return runIndex;
}

void RunList::freeRun(int index)
{
    m_runs[index].m_end = m_firstFree;
    m_firstFree = index;
}

RunList::Run& RunList::runAt(int index)
{
    return m_runs[index];
}

const RunList::Run& RunList::runAt(int index) const
{
    return m_runs[index];
}

ConstantIncrementor::ConstantIncrementor(int numElems) : m_runIndices(numElems, 0)
{
    const int runIndex = m_runs.allocateRun(numElems-1, 0);
    assert(runIndex == 0);
}

int ConstantIncrementor::valueAt(int index) const
{
    return m_runs.runAt(m_runIndices[index]).m_value;
}

void ConstantIncrementor::incrementAt(int index)
{
    const int numElems = static_cast<int>(m_runIndices.size());
    const int curRunIndex = m_runIndices[index];
    RunList::Run& curRun = m_runs.runAt(curRunIndex);
    index = curRun.m_end;
    const bool freeCurRun = index == 0 || m_runIndices[index-1] != curRunIndex;
    RunList::Run* runToMerge = NULL;
    int runToMergeIndex = -1;
    if (curRun.m_end+1 < numElems)
    {
        const int nextRunIndex = m_runIndices[curRun.m_end+1];
        RunList::Run& nextRun = m_runs.runAt(nextRunIndex);
        if (curRun.m_value+1 == nextRun.m_value)
        {
            runToMerge = &nextRun;
            runToMergeIndex = nextRunIndex;
        }
    }
    if (freeCurRun && !runToMerge) // then free and allocate at the same time
    {
        ++curRun.m_value;
    }
    else
    {
        if (freeCurRun)
        {
            m_runs.freeRun(curRunIndex);
        }
        else
        {
            --curRun.m_end;
        }
        if (runToMerge)
        {
            m_runIndices[index] = runToMergeIndex;
        }
        else
        {
            m_runIndices[index] = m_runs.allocateRun(index, curRun.m_value+1);
        }
    }
}

int main(int argc, char* argv[])
{
    const int numElems = 100;
    const int numInc = 1000000;
    LinearIncrementor linearInc(numElems);
    ConstantIncrementor constInc(numElems);
    srand(1);
    for (int i = 0; i < numInc; ++i)
    {
        const int index = rand() % numElems;
        linearInc.incrementAt(index);
        constInc.incrementAt(index);
        for (int j = 0; j < numElems; ++j)
        {
            if (linearInc.valueAt(j) != constInc.valueAt(j))
            {
                printf("Error: differing values at increment step %d, value at index %d\n", i, j);
            }
        }
    }
    return 0;
}
As a complement to the other answers: if you can only have the array, then you cannot indeed guarantee the operation will be constant-time; but because the array is sorted, you can find the end of a run of identical numbers in log n operations, not in n operations. This is simply a binary search.
If we expect most runs of numbers to be short, we should use galloping search, which is a variant where we first find the bounds by looking at positions +1, +2, +4, +8, +16, etc. and then doing binary search inside. You would get a time that is often constant (and extremely fast if the item is unique) but can grow up to log n. Unless for some reason long runs of identical numbers remain common even after many updates, this might outperform any solution that requires keeping additional data.
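A hedged sketch of that search (names invented here): find the last index of the run containing position i by galloping, then binary-searching the final interval:
#include <cstddef>
#include <vector>

// Returns the last index j >= i with a[j] == a[i]. Nearly constant for
// short runs, O(log n) in the worst case.
std::size_t end_of_run(const std::vector<int>& a, std::size_t i) {
    const int v = a[i];
    std::size_t lo = i, step = 1;
    // Gallop: double the step until it jumps past the run (or the array end).
    while (lo + step < a.size() && a[lo + step] == v) {
        lo += step;
        step *= 2;
    }
    // Binary search between the last known match and the first known miss.
    std::size_t hi = (lo + step < a.size()) ? lo + step : a.size() - 1;
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo + 1) / 2;
        if (a[mid] == v) lo = mid;
        else hi = mid - 1;
    }
    return lo;
}
The increment then becomes: find j = end_of_run(a, i) and apply the ++ at j instead of at i.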

Popping/deleting nodes from a Huffman Tree Minheap

I'm having trouble popping correctly from a Huffman Tree. Right now I am creating a Huffman Tree based off a minheap and I want to do the following:
If we assume A and B to be two different subtrees, I would say that A would be popped off first if A's frequency is less than B's frequency. If they have the same frequency, then I would find the smallest character in ASCII value in any of A's leaf nodes. Then I would see if that smallest character leaf node in A is smaller than that in any of B's leaf nodes. If so I would pop off A before B. If not I would pop off B. <- this is what I'm having trouble with.
For example:
Let's assume I input:
eeffgghh\n (every letter has frequency 2, except \n, whose frequency is 1)
into my Huffman Tree. Then my tree would look like this:
         9
       /   \
      5     4
     / \   / \
    3   h g   f
   / \
  e   \n
Below is my attempt at popping from my Huffman minheap. I am having trouble with the part where the frequencies of two letters are the same. If anyone could help, that would be great. Thanks!
void minHeap::heapDown(int index)
{
    HuffmanNode *t = new HuffmanNode();
    if(arr[index]->getFreq() == arr[left]->getFreq() || arr[index]->getFreq() == arr[right]->getFreq()) //arr is an array of HeapNodes
    {
        if(arr[left]->getLetter() < arr[right]->getLetter())
        {
            t = arr[index]; //equals operator is overloaded for swapping
            arr[index] = arr[left];
            arr[left] = t;
            heapDown(left);
        }
        else
        {
            t = arr[index];
            arr[index] = arr[right];
            arr[right] = t;
            heapDown(right);
        }
    }
    if(arr[index]->getFreq() > arr[left]->getFreq() || arr[index]->getFreq() > arr[right]->getFreq())
    {
        if(arr[left]->getFreq() < arr[right]->getFreq())
        {
            t = arr[index];
            arr[index] = arr[left];
            arr[left] = t;
            heapDown(left);
        }
        else
        {
            t = arr[index];
            arr[index] = arr[right];
            arr[right] = t;
            heapDown(right);
        }//else
    }//if
}
The standard C++ library contains heap algorithms. Unless you're not allowed to use it, you might well find it easier.
The standard C++ library also contains swap(a, b), which would be a lot more readable than the swap you're doing. However, swapping in heapDown is inefficient: what you should do is hold onto the element to be placed in a temporary, then sift children down until you find a place to put the element, and then put it there.
Your code would also be a lot more readable if you implemented operator< for HuffmanNode. In any event, you're doing one more comparison than is necessary; what you really want to do is (leaving out lots of details):
void heapDown(int index, Node* value) {
    int left = 2 * index + 1; // (where do you do this in your code???)
    // It's not this simple because you have to check if left and right both exist
    int min = *array[left] < *array[left + 1] ? left : left + 1; // 1 comparison
    if (*array[min] < *value) {
        array[index] = array[min];
        heapDown(min, value);
    } else {
        array[index] = value;
    }
}
Your first comparison (the first if) is completely wrong: a == b || a == c does not imply that b == c, nor does it give you any information about which of b and c is less. Doing only the second comparison on b and c will usually give you the wrong answer.
Finally, you're doing a new unnecessarily on every invocation, but never doing a delete. So you are slowly but inexorably leaking memory.

How Recursion Works Inside a For Loop

I am new to recursion and trying to understand this code snippet. I'm studying for an exam, and this is a "reviewer" I found from Stanford's CIS Education Library (from Binary Trees by Nick Parlante).
I understand the concept, but when we're recursing INSIDE THE LOOP, it all falls apart! Please help me. Thank you.
countTrees() Solution (C/C++)
/*
 For the key values 1...numKeys, how many structurally unique
 binary search trees are possible that store those keys.
 Strategy: consider that each value could be the root.
 Recursively find the size of the left and right subtrees.
*/
int countTrees(int numKeys) {
    if (numKeys <= 1) {
        return(1);
    }
    // there will be one value at the root, with whatever remains
    // on the left and right each forming their own subtrees.
    // Iterate through all the values that could be the root...
    int sum = 0;
    int left, right, root;
    for (root=1; root<=numKeys; root++) {
        left = countTrees(root - 1);
        right = countTrees(numKeys - root);
        // number of possible trees with this root == left*right
        sum += left*right;
    }
    return(sum);
}
Imagine the loop being put "on pause" while you go in to the function call.
Just because the function happens to be a recursive call, it works the same as any function you call within a loop.
The new recursive call starts its for loop and again, pauses while calling the functions again, and so on.
For recursion, it's helpful to picture the call stack structure in your mind.
If a recursion sits inside a loop, the structure resembles (almost) an N-ary tree.
The loop controls horizontally how many branches are generated, while the recursion decides the height of the tree.
The tree is generated along one specific branch until it reaches a leaf (the base condition), then expands horizontally to obtain other leaves, returns to the previous height, and repeats.
I find this perspective generally a good way of thinking.
Look at it this way: there are 3 possible cases for the initial call:
numKeys = 0
numKeys = 1
numKeys > 1
The 0 and 1 cases are simple - the function simply returns 1 and you're done. For numKeys = 2, you end up with:
sum = 0
loop(root = 1 -> 2)
    root = 1:
        left  = countTrees(1 - 1) -> countTrees(0) -> 1
        right = countTrees(2 - 1) -> countTrees(1) -> 1
        sum = sum + 1*1 = 0 + 1 = 1
    root = 2:
        left  = countTrees(2 - 1) -> countTrees(1) -> 1
        right = countTrees(2 - 2) -> countTrees(0) -> 1
        sum = sum + 1*1 = 1 + 1 = 2
output: 2
for numKeys = 3:
sum = 0
loop(root = 1 -> 3):
    root = 1:
        left  = countTrees(1 - 1) -> countTrees(0) -> 1
        right = countTrees(3 - 1) -> countTrees(2) -> 2
        sum = sum + 1*2 = 0 + 2 = 2
    root = 2:
        left  = countTrees(2 - 1) -> countTrees(1) -> 1
        right = countTrees(3 - 2) -> countTrees(1) -> 1
        sum = sum + 1*1 = 2 + 1 = 3
    root = 3:
        left  = countTrees(3 - 1) -> countTrees(2) -> 2
        right = countTrees(3 - 3) -> countTrees(0) -> 1
        sum = sum + 2*1 = 3 + 2 = 5
output: 5
and so on. This function's runtime grows very quickly: each call with n keys spawns 2*n recursive calls, and the results are the Catalan numbers, whose growth is exponential rather than polynomial.
Just remember that all the local variables, such as numKeys, sum, left, right, root, live in stack memory. When you go to the n-th depth of the recursive function, there will be n copies of these local variables. When one depth finishes executing, its copy of the variables is popped off the stack.
In this way, you will see that the next-level depth does NOT affect the current-level depth's local variables (UNLESS you are using references, but we are NOT in this particular problem).
For this particular problem, time complexity deserves careful attention. Here are my solutions:
/* Q: For the key values 1...n, how many structurally unique binary search
      trees (BST) are possible that store those keys.
      Strategy: consider that each value could be the root. Recursively
      find the size of the left and right subtrees.
      http://stackoverflow.com/questions/4795527/
      how-recursion-works-inside-a-for-loop */
/* A: It seems that it's the Catalan numbers:
      http://en.wikipedia.org/wiki/Catalan_number */

#include <iostream>
#include <vector>
using namespace std;

// Time Complexity: ~O(2^n)
int CountBST(int n)
{
    if (n <= 1)
        return 1;
    int c = 0;
    for (int i = 0; i < n; ++i)
    {
        int lc = CountBST(i);
        int rc = CountBST(n-1-i);
        c += lc*rc;
    }
    return c;
}

// Time Complexity: O(n^2)
int CountBST_DP(int n)
{
    vector<int> v(n+1, 0);
    v[0] = 1;
    for (int k = 1; k <= n; ++k)
    {
        for (int i = 0; i < k; ++i)
            v[k] += v[i]*v[k-1-i];
    }
    return v[n];
}

/* Catalan numbers:
          C(2n, n)
   f(n) = --------
            (n+1)

            2*(2n+1)
   f(n+1) = -------- * f(n)
             (n+2)

   Time Complexity: O(n)
   Space Complexity: O(n) - but can be easily reduced to O(1). */
int CountBST_Math(int n)
{
    vector<int> v(n+1, 0);
    v[0] = 1;
    for (int k = 0; k < n; ++k)
        v[k+1] = v[k]*2*(2*k+1)/(k+2);
    return v[n];
}

int main()
{
    for (int n = 1; n <= 10; ++n)
        cout << CountBST(n) << '\t' << CountBST_DP(n) <<
                '\t' << CountBST_Math(n) << endl;
    return 0;
}

/* Output:
   1       1       1
   2       2       2
   5       5       5
   14      14      14
   42      42      42
   132     132     132
   429     429     429
   1430    1430    1430
   4862    4862    4862
   16796   16796   16796
*/
You can think of it from the base case, working upward.
So, for base case you have 1 (or less) nodes. There is only 1 structurally unique tree that is possible with 1 node -- that is the node itself. So, if numKeys is less than or equals to 1, just return 1.
Now suppose you have more than 1 key. Well, then one of those keys is the root, some items are in the left branch and some items are in the right branch.
How big are those left and right branches? Well it depends on what is the root element. Since you need to consider the total amount of possible trees, we have to consider all configurations (all possible root values) -- so we iterate over all possible values.
For each iteration i, we know that i is at the root, i - 1 nodes are on the left branch and numKeys - i nodes are on the right branch. But, of course, we already have a function that counts the total number of tree configurations given the number of nodes! It's the function we're writing. So, recursive call the function to get the number of possible tree configurations of the left and right subtrees. The total number of trees possible with i at the root is then the product of those two numbers (for each configuration of the left subtree, all possible right subtrees can happen).
After you sum it all up, you're done.
So, if you kind of lay it out there's nothing special with calling the function recursively from within a loop -- it's just a tool that we need for our algorithm. I would also recommend (as Grammin did) to run this through a debugger and see what is going on at each step.
Each call has its own variable space, as one would expect. The complexity comes from the fact that the execution of the function is "interrupted" in order to execute -again- the same function.
This code:
for (root=1; root<=numKeys; root++) {
    left = countTrees(root - 1);
    right = countTrees(numKeys - root);
    // number of possible trees with this root == left*right
    sum += left*right;
}
Could be rewritten this way in Plain C:
root = 1;
Loop:
    if ( !( root <= numkeys ) ) {
        goto EndLoop;
    }
    left = countTrees( root - 1 );
    right = countTrees( numkeys - root );
    sum += left * right;
    ++root;
    goto Loop;
EndLoop:
    // more things...
It is actually translated by the compiler to something like that, but in assembler. As you can see, the loop is controlled by a pair of variables, numkeys and root, and their values are not modified by the execution of another instance of the same procedure. When the callee returns, the caller resumes execution with the same values for all the variables it had before the recursive call.
IMO, the key here is to understand function call frames, the call stack, and how they work together.
In your example, you have a bunch of local variables which are initialised but not finalised in the first call. It's important to observe those local variables to understand the whole idea. At each call, the local variables are updated and finally returned in a backwards manner (most likely stored in a register before each function call frame is popped off the stack), until the result is added to the initial call's sum variable.
The important distinction here is where to return. If you need an accumulated sum value, as in your example, you cannot return from inside the loop, as that would cause an early exit. However, if you depend on a value being in a certain state, then you can check whether that state is hit inside the for loop and return immediately without going all the way up.

Randomly permute N first elements of a singly linked list

I have to randomly permute the first N elements of a singly linked list of length n. Each element is defined as:
typedef struct E_s
{
    struct E_s *next;
} E_t;
I have a root element and I can traverse the whole linked list of size n. What is the most efficient technique to randomly permute only the first N elements (starting from the root)?
So, given a->b->c->d->e->f->...x->y->z I need to make something like f->a->e->c->b->...x->y->z
My specific case:
n-N is about 20% relative to n
I have limited RAM resources, the best algorithm should make it in place
I have to do it in a loop, in many iterations, so the speed does matter
The ideal randomness (uniform distribution) is not required, it's Ok if it's "almost" random
Before making permutations, I traverse the N elements already (for other needs), so maybe I could use this for permutations as well
UPDATE: I found this paper. It presents an algorithm with O(log n) stack space and expected O(n log n) time.
I've not tried it, but you could use a "randomized merge-sort".
To be more precise, you randomize the merge routine. You do not merge the two sub-lists systematically; instead you do it based on a coin toss (i.e. with probability 0.5 you select the first element of the first sublist, with probability 0.5 you select the first element of the second sublist).
This should run in O(n log n) and use O(1) space (if properly implemented).
Below you find a sample implementation in C you might adapt to your needs. Note that this implementation uses randomisation at two places: In splitList and in merge. However, you might choose just one of these two places. I'm not sure if the distribution is random (I'm almost sure it is not), but some test cases yielded decent results.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 40

typedef struct _node {
    int value;
    struct _node *next;
} node;

void splitList(node *x, node **leftList, node **rightList) {
    int lr = 0; // left-right-list-indicator
    *leftList = 0;
    *rightList = 0;
    while (x) {
        node *xx = x->next;
        lr = rand() % 2;
        if (lr == 0) {
            x->next = *leftList;
            *leftList = x;
        }
        else {
            x->next = *rightList;
            *rightList = x;
        }
        x = xx;
        lr = (lr + 1) % 2;
    }
}

void merge(node *left, node *right, node **result) {
    *result = 0;
    while (left || right) {
        if (!left) {
            node *xx = right;
            while (right->next) {
                right = right->next;
            }
            right->next = *result;
            *result = xx;
            return;
        }
        if (!right) {
            node *xx = left;
            while (left->next) {
                left = left->next;
            }
            left->next = *result;
            *result = xx;
            return;
        }
        if (rand() % 2 == 0) {
            node *xx = right->next;
            right->next = *result;
            *result = right;
            right = xx;
        }
        else {
            node *xx = left->next;
            left->next = *result;
            *result = left;
            left = xx;
        }
    }
}

void mergeRandomize(node **x) {
    if ((!*x) || !(*x)->next) {
        return;
    }
    node *left;
    node *right;
    splitList(*x, &left, &right);
    mergeRandomize(&left);
    mergeRandomize(&right);
    merge(left, right, &*x);
}

int main(int argc, char *argv[]) {
    srand(time(NULL));
    printf("Original Linked List\n");
    int i;
    node *x = (node*)malloc(sizeof(node));
    node *root = x;
    x->value = 0;
    for (i = 1; i < N; ++i) {
        node *xx;
        xx = (node*)malloc(sizeof(node));
        xx->value = i;
        xx->next = 0;
        x->next = xx;
        x = xx;
    }
    x = root;
    do {
        printf("%d, ", x->value);
        x = x->next;
    } while (x);
    x = root;
    mergeRandomize(&x);
    if (!x) {
        printf("Error.\n");
        return -1;
    }
    printf("\nNow randomized:\n");
    do {
        printf("%d, ", x->value);
        x = x->next;
    } while (x);
    printf("\n");
    return 0;
}
Convert to an array, use a Fisher-Yates shuffle, and convert back to a list.
I don't believe there's any efficient way to randomly shuffle singly-linked lists without an intermediate data structure. I'd just read the first N elements into an array, perform a Fisher-Yates shuffle, then reconstruct those first N elements into the singly-linked list.
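A hedged sketch of that approach (the Node type and function name are invented here): shuffle the first N node pointers, then relink them in front of the untouched tail:
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

struct Node {
    int value;
    Node* next;
};

// Returns the new head; O(N) extra pointers and O(N) time.
Node* shuffle_first_n(Node* head, std::size_t N) {
    std::vector<Node*> nodes;
    Node* cur = head;
    while (cur && nodes.size() < N) { // collect the first N nodes
        nodes.push_back(cur);
        cur = cur->next;
    }
    if (nodes.empty()) return head;
    // 'cur' is now the untouched tail: the (N+1)-th node, or null.

    // Fisher-Yates shuffle of the pointer array.
    for (std::size_t i = nodes.size(); i > 1; --i)
        std::swap(nodes[i - 1], nodes[std::rand() % i]); // "almost random" is fine here

    // Relink the shuffled nodes and reattach the tail.
    for (std::size_t i = 0; i + 1 < nodes.size(); ++i)
        nodes[i]->next = nodes[i + 1];
    nodes.back()->next = cur;
    return nodes.front();
}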
First, get the length of the list and the last element. You say you already do a traversal before randomization, that would be a good time.
Then, turn it into a circular list by linking the first element to the last element. Get four pointers into the list by dividing the size by four and iterating through it for a second pass. (These pointers could also be obtained from the previous pass by incrementing once, twice, and three times per four iterations in the previous traversal.)
For the randomization pass, traverse again and swap pointers 0 and 2 and pointers 1 and 3 with 50% probability. (Do either both swap operations or neither; just one swap will split the list in two.)
Here is some example code. It looks like it could be a little more random, but I suppose a few more passes could do the trick. Anyway, analyzing the algorithm is more difficult than writing it :vP
http://ideone.com/9I7mx
#include <algorithm>
#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;

struct list_node {
    int v;
    list_node *n;
    list_node( int inv, list_node *inn )
        : v( inv ), n( inn ) {}
};

int main() {
    srand( time(0) );

    // initialize the list and 4 pointers at even intervals
    list_node *n_first = new list_node( 0, 0 ), *n = n_first;
    list_node *p[4];
    p[0] = n_first;
    for ( int i = 1; i < 20; ++ i ) {
        n = new list_node( i, n );
        if ( i % (20/4) == 0 ) p[ i / (20/4) ] = n;
    }
    // intervals must be coprime to list length!
    p[2] = p[2]->n;
    p[3] = p[3]->n;
    // turn it into a circular list
    n_first->n = n;

    // swap the pointers around to reshape the circular list
    // one swap cuts a circular list in two, or joins two circular lists
    // so perform one cut and one join, effectively reordering elements.
    for ( int i = 0; i < 20; ++ i ) {
        list_node *p_old[4];
        copy( p, p + 4, p_old );
        p[0] = p[0]->n;
        p[1] = p[1]->n;
        p[2] = p[2]->n;
        p[3] = p[3]->n;
        if ( rand() % 2 ) {
            swap( p_old[0]->n, p_old[2]->n );
            swap( p_old[1]->n, p_old[3]->n );
        }
    }

    // you might want to turn it back into a NULL-terminated list

    // print results
    for ( int i = 0; i < 20; ++ i ) {
        cout << n->v << ", ";
        n = n->n;
    }
    cout << '\n';
}
For the case when N is really big (so it doesn't fit your memory), you can do the following (a sort of Knuth's 3.4.2P):
1. j = N
2. k = random between 1 and j
3. traverse the input list, find the k-th item and output it; remove said item from the sequence (or mark it somehow so that you won't consider it in the next traversal)
4. decrease j and return to step 2 unless j == 0
5. output the rest of the list
Beware that this is O(N^2), unless you can ensure random access in step 3.
In case N is relatively small, so that the N items fit into memory, just load them into an array and shuffle, like @Mitch proposes.
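A hedged sketch of those steps on a singly linked list (the Node type and names are invented here; this is the O(N^2) variant, since step 3 traverses):
#include <cstdlib>

struct Node { int value; Node* next; };

// Permute the first N nodes in place and return the new head. Repeatedly
// unlinks a random one of the j remaining candidates (always a prefix of
// the current list) and appends it to a result list.
Node* permute_first_n(Node* head, int N) {
    Node *resultHead = nullptr, *resultTail = nullptr;
    for (int j = N; j > 0; --j) {
        int k = std::rand() % j; // 0-based rather than 1-based
        Node** link = &head;
        for (int i = 0; i < k; ++i) // step 3: walk to the k-th remaining item
            link = &(*link)->next;
        Node* picked = *link;
        *link = picked->next; // unlink it
        picked->next = nullptr;
        if (resultTail) resultTail->next = picked;
        else resultHead = picked;
        resultTail = picked;
    }
    if (!resultTail) return head;
    resultTail->next = head; // step 5: splice the untouched rest back on
    return resultHead;
}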
If you know both N and n, I think you can do it simply. It's fully random, too. You only iterate through the whole list once, and through the randomized part each time you add a node. I think it's O(n + N log N) or O(n + N^2); I'm not sure. It's based upon updating the conditional probability that a node is selected for the random portion, given what happened to previous nodes.
1. Determine the probability that the current node will be selected for the random portion, given what happened to previous nodes (p = (N - size) / (n - position), where size is the number of nodes previously chosen and position is the number of nodes previously considered).
2. If the node is not selected for the random part, move to step 4. If it is selected, randomly choose its place in the random part based upon the size so far (place = (random between 0 and 1) * size, size again being the number of previously chosen nodes).
3. Place the node where it needs to go and update the pointers. Increment size. Move back to the node that previously pointed at the one you were just looking at and moved.
4. Increment position and look at the next node.
I don't know C, but I can give you the pseudocode. In this, I refer to the permutation as the first elements that are randomized.
integer size = 0     //size of permutation
integer position = 0 //number of nodes you've traversed so far
Node head = head of linked list //this holds the node at the head of your linked list.
Node current_node = head //Starting at head, you'll move this down the list to check each node, whether you put it in the permutation.
Node previous = head //stores the previous node for changing pointers. starts at head to avoid asking for the next field on a null node

While ((size not equal to N) and (current_node is not null)) { //iterating through the list until the permutation is full. We should never pass the end of the list, but just in case, I include that condition
    pperm = (N - size) / (n - position) //probability that a selected node will be in the permutation.
    if ([generate a random decimal between 0 and 1] < pperm) { //this decides whether or not the current node will go in the permutation
        if (position is not equal to 0) { //in case we are at the start of the list, there's no need to change the list
            pfirst = 1 / (size + 1) //probability that, if you select a node to be in the permutation, it will be first. Since the permutation has
                                    //zero elements at the start, adding an element will make it the initial node of the permutation and percent chance = 1.
            integer place_in_permutation = round down([generate a random decimal between 0 and 1] / pfirst) //place in the permutation. note that the head = 0.
            previous.next = current_node.next
            if (place_in_permutation == 0) { //if placing the current node first, must change the head
                current_node.next = head //set the current node to point to the previous head
                head = current_node //set the variable head to point to the current node
            }
            else {
                Node temp = head
                for (counter starts at zero. counter is less than place_in_permutation - 1. Each iteration, increment counter) {
                    temp = temp.next
                } //at this time, temp should point to the node right before the insertion spot
                current_node.next = temp.next
                temp.next = current_node
            }
            current_node = previous
        }
        size++ //since we added one to the permutation, increase the size of the permutation
    }
    position++
    previous = current_node
    current_node = current_node.next
}
You could probably increase the efficiency if you held on to the most recently added node in case you had to add one to the right of it.
Similar to Vlad's answer, here is a slight improvement (statistically):
Indices in the algorithm are 1-based.
1. Initialize lastR = -1.
2. If N <= 1, go to step 6.
3. Randomize a number r between 1 and N.
4. If r != N:
   4.1 Traverse the list to item r and its predecessor. If lastR != -1:
       If r == lastR, your pointer to the r'th item's predecessor is still there.
       If r < lastR, traverse to it from the beginning of the list.
       If r > lastR, traverse to it from the predecessor of the lastR'th item.
   4.2 Remove the r'th item from the list and append it to the result list as the tail.
   4.3 lastR = r.
5. Decrease N by one and go to step 2.
6. Link the tail of the result list to the head of the remaining input list. You now have the original list with the first N items permuted.
Since you do not have random access, this will reduce the traversal time you will need within the list (I assume by about half, so asymptotically you won't gain anything).
An O(N log N), easy to implement solution that does not require extra storage:
Say you want to randomize L:
1. If L has 1 or 0 elements you are done.
2. Create two empty lists L1 and L2.
3. Loop over L, destructively moving its elements to L1 or L2, choosing between the two at random.
4. Repeat the process for L1 and L2 (recurse!).
5. Join L1 and L2 into L3.
6. Return L3.
Update
At step 3, L should be divided into equal sized (±1) lists L1 and L2 in order to guarantee best case complexity (N log N). That can be done by adjusting the probability of an element going into L1 or L2 dynamically:
p(insert element into L1) = (1/2 * len0(L) - len(L1)) / len(L)
where
len(M) is the current number of elements in list M
len0(L) is the number of elements there were in L at the beginning of step 3
There is an algorithm that takes O(sqrt(N)) space and O(N) time, for a singly linked list.
It does not generate a uniform distribution over all permutation sequences, but it gives good permutations that are not easily distinguishable. The basic idea is similar to permuting a matrix by rows and columns, as described below.
Algorithm
Let the number of elements be N, and m = floor(sqrt(N)). Assuming a "square matrix" N = m*m will make this method much clearer.
In the first pass, you should store pointers to elements separated by every m elements, as p_0, p_1, p_2, ..., p_m. That is, p_0->next->...->next (m times) == p_1 should be true.
Permute each row:
For i = 0 to m do:
- Index all elements between p_i->next and p_(i+1)->next in the linked list with an array of size O(m)
- Shuffle this array using a standard method
- Relink the elements using this shuffled array
Permute each column:
- Initialize an array A to store the pointers p_0, ..., p_m; it is used to traverse the columns
- For i = 0 to m do:
  - Index all elements pointed to by A[0], A[1], ..., A[m-1] in the linked list with an array of size m
  - Shuffle this array
  - Relink the elements using this shuffled array
  - Advance each pointer to the next column: A[j] := A[j]->next
Note that p_0 points to the first element and p_m to the last element. Also, if N != m*m, you may use m+1 separations for some p_i instead. Now you have a "matrix" such that p_i points to the start of each row.
Analysis and randomness
Space complexity: This algorithm needs O(m) space to store the start of each row, O(m) space to store the array and O(m) space to store the extra pointers during column permutation. Hence, space complexity is ~O(3*sqrt(N)). For N = 1000000, it is around 3000 entries and 12 kB of memory.
Time complexity: It is obviously O(N); it either walks through the "matrix" row by row or column by column.
Randomness: The first thing to note is that each element can go anywhere in the matrix by row and column permutation. It is very important that elements can go anywhere in the linked list. Second, though it does not generate all permutation sequences, it does generate part of them. To find the number of permutations, assume N = m*m; each row permutation has m! possibilities and there are m rows, so we have (m!)^m. If column permutations are also included, it is exactly equal to (m!)^(2*m), so it is almost impossible to get the same sequence twice.
It is highly recommended to repeat the second and third steps at least one more time to get a more random sequence, because that suppresses almost all of the correlation between elements and their original rows and columns. It also matters when your list is not "square". Depending on your needs, you may want to use even more repetitions; the more repetitions, the more permutations become reachable and the more random the result. I remember that it is possible to generate a uniform distribution for N=9, and I guess that it is possible to prove that, as the number of repetitions tends to infinity, the result approaches the true uniform distribution.
Edit: The time and space complexities are tight bounds and are almost the same in any situation. I think this space consumption can satisfy your need. If you have any doubt, you may try it on a small list, and I think you will find it useful.
The list randomizer below has complexity O(N log N) and O(1) memory usage.
It is based on the recursive algorithm described in my other post, modified to be iterative instead of recursive in order to eliminate the O(log N) memory usage.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

typedef struct node {
    struct node *next;
    char *str;
} node;

unsigned int
next_power_of_two(unsigned int v) {
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}

void
dump_list(node *l) {
    printf("list:");
    for (; l; l = l->next) printf(" %s", l->str);
    printf("\n");
}

node *
array_to_list(unsigned int len, char *str[]) {
    unsigned int i;
    node *list;
    node **last = &list;
    for (i = 0; i < len; i++) {
        node *n = malloc(sizeof(node));
        n->str = str[i];
        *last = n;
        last = &n->next;
    }
    *last = NULL;
    return list;
}

node **
reorder_list(node **last, unsigned int po2, unsigned int len) {
    node *l = *last;
    node **last_a = last;
    node *b = NULL;
    node **last_b = &b;
    unsigned int len_a = 0;
    unsigned int i;
    for (i = len; i; i--) {
        double pa = (1.0 + RAND_MAX) * (po2 - len_a) / i;
        unsigned int r = rand();
        if (r < pa) {
            *last_a = l;
            last_a = &l->next;
            len_a++;
        }
        else {
            *last_b = l;
            last_b = &l->next;
        }
        l = l->next;
    }
    *last_b = l;
    *last_a = b;
    return last_b;
}

unsigned int
min(unsigned int a, unsigned int b) {
    return (a > b ? b : a);
}

void
randomize_list(node **l, unsigned int len) {
    unsigned int po2 = next_power_of_two(len);
    for (; po2 > 1; po2 >>= 1) {
        unsigned int j;
        node **last = l;
        for (j = 0; j < len; j += po2)
            last = reorder_list(last, po2 >> 1, min(po2, len - j));
    }
}

int
main(int len, char *str[]) {
    if (len > 1) {
        node *l;
        len--; str++; /* skip program name */
        l = array_to_list(len, str);
        randomize_list(&l, len);
        dump_list(l);
    }
    return 0;
}
/* try as: a.out list of words foo bar doz li 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
*/
Note that this version of the algorithm is completely cache unfriendly, the recursive version would probably perform much better!
If both of the following conditions are true:
you have plenty of program memory (much embedded hardware executes directly from flash);
your solution does not suffer from the fact that your "randomness" repeats often,
then you can choose a sufficiently large set of specific permutations, defined at programming time, write code to generate the code that implements each one, and then iterate over them at runtime.
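A hedged sketch of a table-driven variant of this idea (everything here is invented for illustration; in practice the permutation tables would be much larger and generated by a build-time tool, as the answer suggests):
#include <cstddef>

struct Node { int value; Node* next; };

// Precomputed permutations of the first 4 positions.
const unsigned char kPerms[][4] = {
    {2, 0, 3, 1},
    {1, 3, 0, 2},
    {3, 1, 2, 0},
};
const std::size_t kNumPerms = sizeof(kPerms) / sizeof(kPerms[0]);

// Applies the next canned permutation to the first 4 nodes and returns the
// new head; assumes the list has at least 4 nodes.
Node* apply_next_perm(Node* head) {
    static std::size_t which = 0;
    const unsigned char* p = kPerms[which++ % kNumPerms];
    Node* orig[4];
    Node* cur = head;
    for (int i = 0; i < 4; ++i) { orig[i] = cur; cur = cur->next; }
    for (int i = 0; i + 1 < 4; ++i) orig[p[i]]->next = orig[p[i + 1]];
    orig[p[3]]->next = cur; // reattach the untouched tail
    return orig[p[0]];
}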