Can someone explain how recursion works when finding all subsets?

I cannot, for the life of me, picture recursion and what it's doing. I struggle with this a lot. From the Competitive Programmer's Handbook, I uncovered the following snippet of code in C++ as a solution to the following problem:
Consider the problem of generating all subsets of a set of n elements.
For example, the subsets of {0,1,2} are ∅, {0}, {1}, {2}, {0,1},
{0,2}, {1,2} and {0,1,2}.
An elegant way to go through all subsets of a set is to use recursion.
The following function search generates the subsets of the set
{0,1,...,n − 1}. The function maintains a vector subset that will
contain the elements of each subset. The search begins when the
function is called with parameter 0.
When the function search is called with parameter k, it decides
whether to include the element k in the subset or not, and in both
cases, then calls itself with parameter k + 1. However, if k = n, the
function notices that all elements have been processed and a subset
has been generated.
void search(int k) {
    if (k == n) {
        // process subset
    } else {
        search(k+1);
        subset.push_back(k);
        search(k+1);
        subset.pop_back();
    }
}
So sure, this function works and I have done it about 3 times by hand to see that it does work flawlessly. But why?
Short of memorizing all recursive solutions for all problems I will never be able to come up with this kind of solution. What kind of abstraction is being made here? What is the more general concept that is being used here?
I've always struggled with recursion so any help is appreciated. Thank you.

For each k < n we simply call search(k+1) recursively twice: once without the value k inside your set and once with it.
search(k+1); // call search (k+1) with k NOT inside the set
subset.push_back(k); // puts the value k inside the set
search(k+1); // call search (k+1) with k inside the set
subset.pop_back(); // removes the value k from the set
Once we reach n==k the recursion is terminated.
Imagine a binary tree of depth n, where each level represents the current value and the two branches, the decision if the value goes into your final set or not. The leaves represent all final sets.
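So given n=3 and starting with k=0 you get the following tree. (It is drawn with the "in" branch first for readability; the code above actually visits the "not in" branch first, so it reaches the leaves in the reverse order.)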
search(0);
-> search(1); // with 0 in
->-> search(2); // with 0 in AND 1 in
->->-> search (3); // with 0 in AND 1 in AND 2 in. terminates with (0,1,2)
->->-> search (3); // with 0 in AND 1 in AND 2 not in. terminates with (0,1)
->-> search(2); // with 0 in AND 1 not in
->->-> search (3); // with 0 in AND 1 not in AND 2 in. terminates with (0,2)
->->-> search (3); // with 0 in AND 1 not in AND 2 not in. terminates with (0)
-> search(1); // with 0 not in
->-> search(2); // with 0 not in AND 1 in
->->-> search (3); // with 0 not in AND 1 in AND 2 in. terminates with (1,2)
->->-> search (3); // with 0 not in AND 1 in AND 2 not in. terminates with (1)
->-> search(2); // with 0 not in AND 1 not in
->->-> search (3); // with 0 not in AND 1 not in AND 2 in. terminates with (2)
->->-> search (3); // with 0 not in AND 1 not in AND 2 not in. terminates with ()
As john smartly pointed out in his comment, the recursion uses the fact that:
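all_subsets(a1,a2,...,an) == all_subsets(a2,...,an) U { {a1} U s : s in all_subsets(a2,...,an) } where U is the set union operator. In words: every subset of the tail, plus every subset of the tail with a1 added to it.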
Many other mathematical definitions will translate into recursive calls naturally.
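To make the decision tree concrete, here is a minimal sketch of the handbook's search function in a form that can be compiled and run. It is an assumption-laden rewrite, not the handbook's code: the globals n and subset are turned into parameters, the "process subset" step is taken to mean "store a copy", and the wrapper name allSubsets is hypothetical.

```cpp
#include <cassert>
#include <vector>

// The handbook's search(), with its globals passed as parameters (an assumption
// made here so the sketch is self-contained).
void search(int k, int n, std::vector<int>& subset,
            std::vector<std::vector<int>>& out) {
    if (k == n) {
        out.push_back(subset);       // a leaf of the tree: one finished subset
        return;
    }
    search(k + 1, n, subset, out);   // branch 1: k is NOT in the subset
    subset.push_back(k);
    search(k + 1, n, subset, out);   // branch 2: k IS in the subset
    subset.pop_back();               // undo, so the caller sees subset unchanged
}

// Hypothetical wrapper that collects every subset of {0,1,...,n-1}.
std::vector<std::vector<int>> allSubsets(int n) {
    assert(n >= 0);
    std::vector<int> subset;
    std::vector<std::vector<int>> out;
    search(0, n, subset, out);
    return out;
}
```

Calling allSubsets(3) visits the leaves in the order {}, {2}, {1}, {1,2}, {0}, {0,2}, {0,1}, {0,1,2}, which is the tree above read from the bottom up.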

I think what you are lacking is visualization, so I suggest you visit sites like algorithm-visualizer.org or pythontutor.com.
You can paste this code snippet there and run it line by line to understand how the code flows.
#include <bits/stdc++.h>
using namespace std;

// Records the current subset, then extends it with each remaining element in turn.
void subsetsUtil(vector<int>& A, vector<vector<int> >& res, vector<int>& subset, int index) {
    res.push_back(subset);
    for (int i = index; i < (int)A.size(); i++) {
        subset.push_back(A[i]);              // include A[i]
        subsetsUtil(A, res, subset, i + 1);  // generate every subset that contains it
        subset.pop_back();                   // exclude A[i] again before trying the next element
    }
}

vector<vector<int> > subsets(vector<int>& A) {
    vector<int> subset;
    vector<vector<int> > res;
    int index = 0;
    subsetsUtil(A, res, subset, index);
    return res;
}

int main() {
    vector<int> array = { 1, 2, 3 };
    vector<vector<int> > res = subsets(array);
    for (size_t i = 0; i < res.size(); i++) {
        for (size_t j = 0; j < res[i].size(); j++)
            cout << res[i][j] << " ";
        cout << endl;
    }
    return 0;
}
It's good that you are really trying to learn. This will help you a lot in competitive programming. I hope this helps.

This is not only your problem. Everyone who starts learning recursion faces it. The main thing is visualization, and admittedly that is hard.
If you trace any recursive code by hand (with pen and paper), you will just see that "Oh, it's working". But you should know that most recursions have a recurrence relation, and the function recurs based on it. Similarly, for finding all the subsets of a particular set there is a recurrence relation, namely:
(subsets that take a particular item) + (subsets that do not take that item)
In your code, "taking a particular item" corresponds to the push_back followed by search(k+1), and "not taking it" corresponds to the first search(k+1), made before the push_back. The pop_back only undoes the push_back so that the caller sees subset unchanged. That's it.
One possibility is taking no item at all. We call it the null set.
Another possibility is taking all the items. Here that is {0,1,2}.
From permutation and combination theory we can calculate the number of subsets: 2^n, where n is the number of items. Here n=3, so the number of subsets is 2^3 = 8.
For 0, take it or throw it, possibilities = 2
For 1, take it or throw it, possibilities = 2
For 2, take it or throw it, possibilities = 2
So the total number of subsets is 2*2*2 = 8 (including the null set).
If you discard the null set, the total number of subsets is 8-1 = 7.
That's the theory behind your recursion code.
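The "take it or throw it" count can also be seen without recursion. The sketch below is a swapped-in technique (bitmask enumeration), not the handbook's recursive method: each of the 2^n integers from 0 to 2^n - 1 is read as n take/throw decisions, one per bit. The function name subsetsByMask is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Enumerates every subset of {0,1,...,n-1}; bit `item` of `mask` set means "take item".
std::vector<std::vector<int>> subsetsByMask(int n) {
    assert(n >= 0 && n < 31);                  // keep 1u << n within range
    std::vector<std::vector<int>> out;
    for (unsigned mask = 0; mask < (1u << n); ++mask) {
        std::vector<int> subset;
        for (int item = 0; item < n; ++item)
            if (mask & (1u << item)) subset.push_back(item);
        out.push_back(subset);
    }
    return out;
}
```

For n = 3 this produces 8 subsets; mask 0 is the null set and mask 7 (binary 111) is {0,1,2}.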

Related

How to erase elements more efficiently from a vector or set?

Problem statement:
Input:
First two inputs are integers n and m. n is the number of knights fighting in the tournament (2 <= n <= 100000, 1 <= m <= n-1). m is the number of battles that will take place.
The next line contains n power levels.
The next m lines contain two integers l and r, indicating the range of knight positions to compete in the ith battle.
After each battle, all knights apart from the one with the highest power level will be eliminated.
The range for each battle is given in terms of the new positions of the knights, not the original positions.
Output:
Output m lines, the ith line containing the original positions (indices) of the knights eliminated in that battle. Each line is in ascending order.
Sample Input:
8 4
1 0 5 6 2 3 7 4
1 3
2 4
1 3
0 1
Sample Output:
1 2
4 5
3 7
0
Here is a visualisation of this process.
1 2
[(1,0),(0,1),(5,2),(6,3),(2,4),(3,5),(7,6),(4,7)]
-----------------
4 5
[(1,0),(6,3),(2,4),(3,5),(7,6),(4,7)]
-----------------
3 7
[(1,0),(6,3),(7,6),(4,7)]
-----------------
0
[(1,0),(7,6)]
-----------
[(7,6)]
I have solved this problem. My program produces the correct output; however, it is O(n*m) = O(n^2). I believe that efficiency could be increased if I erased knights from the vector more efficiently. Would it be more efficient to erase elements using a set, i.e. to erase contiguous segments rather than individual knights? Is there an alternative way to do this that is more efficient?
#include <cstdio>
#include <utility>
#include <vector>
using namespace std;

#define INPUT1(x) scanf("%d", &x)
#define INPUT2(x, y) scanf("%d%d", &x, &y)
#define OUTPUT1(x) printf("%d ", x)

int main(int argc, char const *argv[]) {
    int n, m;
    INPUT2(n, m);
    vector< pair<int,int> > knights(n);   // (power, original index)
    for (int i = 0; i < n; i++) {
        int power;
        INPUT1(power);
        knights[i] = make_pair(power, i);
    }
    while (m--) {
        int l, r;
        INPUT2(l, r);
        // find the highest power level in the range [l, r]
        int max_in_range = knights[l].first;
        for (int i = l+1; i <= r; i++) if (knights[i].first > max_in_range) {
            max_in_range = knights[i].first;
        }
        // erase everyone in the range except the winner
        int offset = l;
        int range = r-l+1;
        while (range--) {
            if (knights[offset].first != max_in_range) {
                OUTPUT1(knights[offset].second);
                knights.erase(knights.begin()+offset);   // O(n) shift per erase
            }
            else offset++;
        }
        printf("\n");
    }
}

Well, removing from a vector certainly isn't efficient. Removing from a set or unordered set would be more efficient (use iterators instead of indexes).
Yet the problem will still remain O(n^2), because you have two nested while loops running n*m times.
--EDIT--
I believe I understand the question now :)
First let's calculate the complexity of your code above. Your worst case is when every battle has a range of 1 (two knights per battle) and the battles are not ordered by position. That means you have m battles (in this case m = n-1, which is O(n)).
The first while loop runs n times.
The for loop runs once each time, which makes n*1 = n iterations in total.
The second while loop runs once each time, which makes n again.
Deleting from a vector means up to n-1 shifts, which makes each deletion O(n).
Thus, with the cost of vector deletion, the total complexity is O(n^2).
First of all, you don't really need the inner for loop. Take the first knight as the max in range, compare the rest in the range one by one and remove the defeated ones.
Now, I believe it can be done in O(n log n) using std::map. The key of the map is the position and the value is the level of the knight.
Before proceeding: finding and removing an element in a map is logarithmic, and advancing an iterator is amortized constant.
Finally, your code should look like:
while(m--) // n times
    strongest = map.find(first_position); // find is log(n) --> n*log(n)
    for (opponent = next of strongest; // this will run 1 time, since every range is 1
         opponent in range;
         opponent = next opponent) // iterating is constant
        // removing from map is log(n) --> n * 1 * log(n)
        if strongest < opponent
            remove strongest, opponent is the new strongest
        else
            remove opponent (be careful to remove it after iterating to next)
Ok, now the upper bound would be O(2*n log n) = O(n log n). If the ranges increase, the upper loop runs fewer times but the number of remove operations goes up. I'm sure the upper bound won't change; let's make it homework for you to calculate :)

A solution with a treap is pretty straightforward.
For each query, you need to split the treap by implicit key to obtain the subtree that corresponds to the [l, r] range (it takes O(log n) time).
After that, you can iterate over the subtree and find the knight with the maximum strength. After that, you just need to merge the [0, l) and [r + 1, end) parts of the treap with the node that corresponds to this knight.
It's clear that all parts of the solution except for the subtree traversal and printing work in O(log n) time per query. However, each operation reinserts only one knight and erases the rest of the range, so the size of the output (and the sum of the sizes of the traversed subtrees) is linear in n. So the total time complexity is O(n log n).
I don't think you can solve it with standard STL containers, because there is no standard container that supports both getting an iterator by index quickly and removing arbitrary elements.
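The split/traverse/merge steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's code: it uses an implicit-key treap with a fixed random seed, all names (Node, split, merge, simulate) are hypothetical, ties in power are broken by keeping the first maximum, and it returns the eliminated indices per battle instead of printing them.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <utility>
#include <vector>

struct Node {
    int power = 0, index = 0, size = 1;
    unsigned priority = 0;
    Node* left = nullptr;
    Node* right = nullptr;
};

int sizeOf(Node* t) { return t ? t->size : 0; }
void update(Node* t) { t->size = 1 + sizeOf(t->left) + sizeOf(t->right); }

// Split t so that `a` receives its first k knights and `b` the rest.
void split(Node* t, int k, Node*& a, Node*& b) {
    if (!t) { a = b = nullptr; return; }
    if (sizeOf(t->left) < k) {
        a = t;
        split(t->right, k - sizeOf(t->left) - 1, t->right, b);
    } else {
        b = t;
        split(t->left, k, a, t->left);
    }
    update(t);
}

// Concatenate two treaps; every knight in `a` stays before every knight in `b`.
Node* merge(Node* a, Node* b) {
    if (!a) return b;
    if (!b) return a;
    if (a->priority > b->priority) {
        a->right = merge(a->right, b);
        update(a);
        return a;
    }
    b->left = merge(a, b->left);
    update(b);
    return b;
}

// In-order traversal: knights in their current left-to-right order.
void collect(Node* t, std::vector<Node*>& out) {
    if (!t) return;
    collect(t->left, out);
    out.push_back(t);
    collect(t->right, out);
}

// Returns, for each battle (l, r), the original indices of the eliminated knights.
std::vector<std::vector<int>> simulate(const std::vector<int>& power,
                                       const std::vector<std::pair<int, int>>& battles) {
    std::mt19937 rng(12345);                    // fixed seed: an arbitrary choice
    std::vector<Node> pool(power.size());       // sized once, so pointers stay valid
    Node* root = nullptr;
    for (std::size_t i = 0; i < power.size(); ++i) {
        pool[i].power = power[i];
        pool[i].index = static_cast<int>(i);
        pool[i].priority = static_cast<unsigned>(rng());
        root = merge(root, &pool[i]);
    }
    std::vector<std::vector<int>> result;
    for (const auto& battle : battles) {
        const int l = battle.first, r = battle.second;
        assert(0 <= l && l <= r && r < sizeOf(root));
        Node *left, *rest, *mid, *right;
        split(root, l, left, rest);             // [0, l) and [l, end)
        split(rest, r - l + 1, mid, right);     // [l, r] and [r + 1, end)
        std::vector<Node*> fighters;
        collect(mid, fighters);
        Node* winner = *std::max_element(
            fighters.begin(), fighters.end(),
            [](Node* x, Node* y) { return x->power < y->power; });
        std::vector<int> eliminated;
        for (Node* f : fighters)
            if (f != winner) eliminated.push_back(f->index);
        winner->left = winner->right = nullptr; // winner becomes a single-node treap
        update(winner);
        root = merge(merge(left, winner), right);
        result.push_back(eliminated);
    }
    return result;
}
```

On the sample input from the question this returns {1,2}, {4,5}, {3,7}, {0}. Each knight is visited by collect at most once after it has been eliminated, which is where the linear total traversal cost comes from.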

Varying initializer in a 'for loop' in C++

int i = 0;
for(; i<size-1; i++) {
int temp = arr[i];
arr[i] = arr[i+1];
arr[i+1] = temp;
}
Here I started with the first position of the array. What if, after the loop, I need to execute the for loop again, this time starting from the next position of the array?
That is, the first for loop starts from Array[0],
the second iteration from Array[1],
the third iteration from Array[2].
Example:
For array: 1 2 3 4 5
for i=0: 2 1 3 4 5, 2 3 1 4 5, 2 3 4 1 5, 2 3 4 5 1
for i=1: 1 3 2 4 5, 1 3 4 2 5, 1 3 4 5 2 and so on.

You can nest loops inside each other, including the ability for the inner loop to access the iterator value of the outer loop. Thus:
for(int start = 0; start < size-1; start++) {
for(int i = start; i < size-1; i++) {
// Inner code on 'i'
}
}
This would repeat your loop with an increasing start value, so i begins at a higher initial value on each pass until you've gone through your list.
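As a rough illustration, the nested loops can be combined with the swap from the question. The sketch assumes, as the question's example suggests, that every pass starts again from the original array; the function name shiftFromEachStart is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// For each start position, swaps neighbours from `start` to the end of a fresh
// copy of the array and records the final state of that pass.
std::vector<std::vector<int>> shiftFromEachStart(const std::vector<int>& original) {
    std::vector<std::vector<int>> results;
    const int size = static_cast<int>(original.size());
    for (int start = 0; start < size - 1; start++) {
        std::vector<int> arr = original;          // each pass restarts from the original order
        for (int i = start; i < size - 1; i++)
            std::swap(arr[i], arr[i + 1]);
        assert(arr.back() == original[start]);    // the start element has travelled to the end
        results.push_back(arr);
    }
    return results;
}
```

For the array 1 2 3 4 5 the first pass ends with 2 3 4 5 1 and the second with 1 3 4 5 2, matching the example in the question.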

Suppose you have a routine to generate all possible permutations of the array elements for a given length n. Suppose the routine, after processing all n! permutations, leaves the n items of the array in their initial order.
Question: how can we build a routine to make all possible permutations of an array with (n+1) elements?
Answer:
Generate all permutations of the initial n elements, each time process the whole array; this way we have processed all n! permutations with the same last item.
Now, swap the (n+1)-st item with one of those n and repeat permuting n elements – we get another n! permutations with a new last item.
The n elements are left in their previous order, so put that last item back into its initial place and choose another one to put at the end of an array. Reiterate permuting n items.
And so on.
Remember, after each call the routine leaves the n-items array in its initial order. To retain this property at n+1 we need to make sure the same element gets finally placed at the end of an array after the (n+1)-st iteration of n! permutations.
This is how you can do that:
void ProcessAllPermutations(int arr[], int arrLen, int permLen)
{
    if(permLen == 1)
        ProcessThePermutation(arr, arrLen); // print the permutation
    else
    {
        int lastpos = permLen - 1; // last item position for swaps
        for(int pos = lastpos; pos >= 0; pos--) // pos of item to swap with the last
        {
            swap(arr[pos], arr[lastpos]); // put the chosen item at the end
            ProcessAllPermutations(arr, arrLen, permLen - 1);
            swap(arr[pos], arr[lastpos]); // put the chosen item back at pos
        }
    }
}
and here is an example of the routine running: https://ideone.com/sXp35O
Note, however, that this approach is highly inefficient:
It may work in a reasonable time for very small input sizes only. The number of permutations is a factorial function of the array length, and it grows faster than exponentially, which makes for a really BIG number of tests.
The routine has no early return. Even if the first or second permutation is the correct result, the routine will perform all the remaining unnecessary tests, too. Of course one can add a return path to break the iteration, but that would make the code somewhat ugly. And it would bring no significant gain, because the routine will have to make n!/2 tests on average.
Each generated permutation appears deep in the last level of the recursion. Testing for a correct result requires making a call to ProcessThePermutation from within ProcessAllPermutations, so it is difficult to replace the callee with some other function. The caller function must be modified each time you need another method of testing / processing / whatever. Or one would have to provide a pointer to a processing function (a 'callback') and push it down through all the recursion, down to the place where the call will happen. This might be done indirectly by a virtual function in some context object, so it would look quite nice, but the overhead of passing additional data down the recursive calls cannot be avoided.
The routine has yet another interesting property: it does not rely on the data values. Elements of the array are never compared. This may sometimes be an advantage: the routine can permute any kind of objects, even if they are not comparable. On the other hand it can not detect duplicates, so in case of equal items it will make repeated results. In a degenerate case of all n equal items the result will be n! equal sequences.
So if you ask how to generate all permutations to detect a sorted one, I must answer: DON'T.
Do learn effective sorting algorithms instead.
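The 'callback' variant mentioned above might look something like the following sketch. It is not the code behind the ideone link; it assumes a std::vector<int> instead of a raw array, and the name forEachPermutation and the template parameter Callback are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Calls process(arr) once for every permutation of the first permLen items,
// and leaves arr in its initial order when it returns.
template <class Callback>
void forEachPermutation(std::vector<int>& arr, std::size_t permLen, Callback&& process) {
    assert(permLen <= arr.size());
    if (permLen <= 1) {
        process(arr);                             // one complete permutation
        return;
    }
    const std::size_t last = permLen - 1;         // last item position for swaps
    for (std::size_t pos = permLen; pos-- > 0; ) {
        std::swap(arr[pos], arr[last]);           // put the chosen item at the end
        forEachPermutation(arr, last, process);   // permute the items before it
        std::swap(arr[pos], arr[last]);           // put the chosen item back at pos
    }
}
```

A caller can then pass any callable, for example a lambda that counts or prints permutations, without modifying the recursive routine itself. The callback is still handed down through every level of the recursion, which is the overhead the answer refers to.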

Creating a subset of a set iteratively using vector

So I want to create a subset, run that subset through code, then create a new subset. I'm using a vector for the set and subset. So far I have 3 nested for loops but I'm having trouble figuring out the variables I need.
Here's what I want to do. set = {0, 1, 2, 3, 4, 5} the value matches the index just to simplify this example. I now want subset = {} -> {0} -> {1} -> ... -> {0,1} -> {0,2} -> ... -> {0,5} -> {0,1,2} -> ... -> {0,4,5}. I'm having trouble representing the conditions in terms of variables.
Basically I want the first for loop to increase the subset size. from 0 to set.size() (this is easy). Within that loop, I want to have an iterator corresponding to the index in the element of the subset. I have this iterator initialized to subset.size(), so that we work with the last element first, then work our way to the first element in the subset. then the 3rd for loop, I want to iterate between possible values from the set. Let's say our current subset = {0,1,2} how do I let my program know to put the value '2' inside the last element of the subset, then 1 then 0?
I'm thinking it would involve something with taking the difference from set.size()-1 and subset.size()-1? But I'm not quite sure how. so then I want to iterate through until {0,1,5} and then {0,4,5} but again I'm not sure how to tell the program to stop at 4, as opposed to 5. Again I think this is something with difference but I can't quite figure it out.
to recap:
for loop to iterate through subset size
for loop to iterate through subset "working" element, starting from back
for loop to iterate through that index of subset,
starting from the correct corresponding set value to ending
at the correct corresponding set value
such that the subset goes from {} -> {0} -> {1} ->...-> {0,1} -> {4,5} -> {1,2,3} -> ... -> {1,4,5}. I don't actually need subset = {1,2,3,4,5}, but it doesn't hurt my code if I can't stop before that. Again, I'm looking to represent the start and end points as variables to make the inner loops work, but I can't figure it out. Huge thanks to anyone who can help me out.

this is approximately how I would go about it.
//handle null subset
for ( int size = 1; size < n; size++ ) {
    std::vector<int> indices(size);
    for ( int i = 0; i < size; i++ ) indices[i] = i;
    while ( indices[0] <= n - size ) {
        //print out elems using the indices in `indices`
        int i = size - 1;
        // find the last index that can still be incremented
        while ( i > 0 && indices[i] == n - size + i ) i--;
        indices[i]++;
        for ( i = i + 1; i < size; i++ ) indices[i] = indices[i-1] + 1;
    }
    //done with all subsets of size `size`
}
The outer loop should be pretty self explanatory. Including 0 seemed like it was going to make some of the inner logic annoying so I started at subsets of size 1.
indices holds the indices of the elements that should be included in the current subset. It starts with the indices 0 through size-1.
The condition for the while isn't exactly obvious. The last valid subset this generates contains the last size elements, so if the first index is past n - size we've gone too far.
The inside of the while loop is just incrementing the subset. It looks for the last element that can be incremented and still give a valid subset, increments it, and then resets all of the subsequent elements to be as small as possible. Then you print it out somehow.
And that should be close to something that will do what you want. Let me know if it needs clarifications or corrections.
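For reference, the incrementing scheme described above can be packaged as a small function and checked on a tiny input. This is a sketch with assumed names (combinations, n, size); it handles one subset size at a time and collects the index sets instead of printing them.

```cpp
#include <cassert>
#include <vector>

// All index sets of the given size drawn from {0,...,n-1}, in lexicographic order.
std::vector<std::vector<int>> combinations(int n, int size) {
    assert(size >= 1 && size <= n);
    std::vector<std::vector<int>> out;
    std::vector<int> indices(size);
    for (int i = 0; i < size; i++) indices[i] = i;       // smallest subset: 0..size-1
    while (indices[0] <= n - size) {                     // first index past n-size: done
        out.push_back(indices);
        int i = size - 1;
        // find the last index that can still be incremented
        while (i > 0 && indices[i] == n - size + i) i--;
        indices[i]++;
        // reset everything after it to be as small as possible
        for (i = i + 1; i < size; i++) indices[i] = indices[i - 1] + 1;
    }
    return out;
}
```

For n = 4 and size = 2 this yields {0,1}, {0,2}, {0,3}, {1,2}, {1,3}, {2,3}. Looping size from 1 to n-1 (and handling the null subset separately) gives the order asked for in the question.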

A trick to enumerate all subsets is to permute a "selection flag" array, each element of which indicates whether the corresponding element of the original array is selected.
The following is sample code (it needs <algorithm>, <cstdio> and <vector>):
void foo(const vector<int>& a)
{
    size_t size = a.size();
    // selection flag array
    // '1' indicates selected, '0' indicates unselected
    vector<int> f(size, 0);
    for (size_t i = 1; i <= size; i++)
    {
        // increase the count of selected elements
        f[i - 1] = 1;
        do
        {
            for (size_t j = 0; j < size; j++)
            {
                if (f[j])
                {
                    printf("%d\t", a[j]);
                }
            }
            printf("\n");
        } while (next_permutation(f.begin(), f.end(), [](int a, int b){ return a > b; }));
        // next_permutation tries to permute the array
        // i.e. '1 1 0 0' -> '1 0 1 0' -> '0 1 1 0' -> ... -> '0 0 1 1'(end)
    }
}

Finding a all the combination of a number for a specific sum

Array
A ={1,2,3}
For Sum value = 5
Possible Combination
{3,2}, {1,1,1,1,1}, {2,2,1} and all other possible ones
Here is my approach:
int count( int S[], int m, int n )
{
// m is the size of the array and n is required sum
// If n is 0 then there is 1 solution (do not include any coin)
if (n == 0)
return 1;
// If n is less than 0 then no solution exists
if (n < 0)
return 0;
// If there are no coins and n is greater than 0, then no solution exist
if (m <=0 && n >= 1)
return 0;
// count is sum of solutions (i) including S[m-1] (ii) excluding S[m-1]
return count( S, m - 1, n ) + count( S, m, n-S[m-1] );
}
Disadvantage of my approach: it has to recalculate many combinations again and again, so when the value of the sum is very high it is very slow. I want to implement this using dynamic programming. Please explain how I can store the calculated values so that I can reuse them and reduce the running time of my program.

A very simple change to your solution would be to just add "memoization".
Considering the array S fixed the result of your function just depends on m and n. Therefore you can do the following small change:
int count( int S[], int m, int n ) {
...
if (cache[m][n] == -1) {
cache[m][n] = count( S, m - 1, n ) + count( S, m, n-S[m-1] );
}
return cache[m][n];
}
This way you only compute the result once for each distinct pair of values m and n.
The idea is to keep a 2d array indexed by (m,n), with every entry initialized to -1 (meaning "not yet computed"). When you're about to compute a value in count, you first check whether it has already been computed; if it has not, you compute it and store the result in the 2d array so you will not recompute the same number again in the future.
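Filled in completely, the memoized version might look like the sketch below. It is one possible way to flesh out the snippet above, under assumptions of my own: the cache is passed by reference rather than kept global, the counts are long long, all values in S are positive, and the function is named countWays to avoid clashing with the question's count.

```cpp
#include <cassert>
#include <vector>

// Number of ways to form sum n using the first m values of S (each value reusable).
long long countWays(const std::vector<int>& S, int m, int n,
                    std::vector<std::vector<long long>>& cache) {
    if (n == 0) return 1;               // one way: take nothing more
    if (n < 0 || m <= 0) return 0;      // overshot the sum, or no values left
    long long& slot = cache[m][n];      // -1 means "not yet computed"
    if (slot == -1)
        slot = countWays(S, m - 1, n, cache)              // exclude S[m-1]
             + countWays(S, m, n - S[m - 1], cache);      // include S[m-1]
    return slot;
}

// Convenience wrapper that allocates the (m+1) x (n+1) cache.
long long countWays(const std::vector<int>& S, int n) {
    assert(n >= 0);
    std::vector<std::vector<long long>> cache(
        S.size() + 1, std::vector<long long>(n + 1, -1));
    return countWays(S, static_cast<int>(S.size()), n, cache);
}
```

With S = {1,2,3} and a sum of 5 this counts 5 combinations: 3+2, 3+1+1, 2+2+1, 2+1+1+1 and 1+1+1+1+1.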

I would do it differently:
1. generate coin array to match sum
   1. generate one coin type
      - start with the biggest one
      - add as many of them as you can
      - but the sum must be <= the target sum
      - if the target sum is achieved, store the result
   2. recursively call step 1 for the next lower coin
      - but remember the last coin array state
      - if there is no coin=1 then sometimes the result will be invalid
   3. move to the next combination
      - restore the last coin array state
      - remove the last coin from it
      - if there is none to remove then stop
      - else repeat step 2
2. count permutations/combinations if the order also matters
   - so take a valid result and permute it by the rules of your problem to get more solutions from it
   - it is faster than trying every possibility in 1.
example (for 1.):
coins = { 5,2,1 }
sum=7
5 | 2
5 | - | 1 1
- | 2 2 2 | 1
- | 2 2 | 1 1 1
- | 2 | 1 1 1 1 1
| separates recursion layers
there is one recursion level per coin type
so you need memory for 3 array states in this case (the lengths depend on the target sum)
this is acceptable (I have seen solutions with much worse space complexity for this problem)
for very big sums I would use RLE to conserve memory and speed up the process
[edit1] C++ source code
//---------------------------------------------------------------------------
void prn_coins(int *count,int *value,int coins) // just to output solution somewhere
    {
    int i;
    AnsiString s="";
    for (i=0;i<coins;i++)
        s+=AnsiString().sprintf("%ix%i ",count[i],value[i]);
    Form1->mm_log->Lines->Add(s);
    }
//---------------------------------------------------------------------------
void get_coins(int *count,int *value,int coins,int sum,int ix=0,int si=0)
    {
    if (ix>=coins) return; // exit
    if (ix==0) // init:
        {
        ix=0; // first layer
        si=0; // no sum in solution for now
        for (int i=0;i<coins;i++) count[i]=0; // no coins in solution for now
        }
    //1. generate actual coin type value[]
    count[ix]=(sum-si)/value[ix]; // as close to sum as possible
    si+=value[ix]*count[ix]; // update actual sum
    for(;;)
        {
        //2. recursion call
        if (si==sum) prn_coins(count,value,coins);
        else get_coins(count,value,coins,sum,ix+1,si);
        //3. next combination
        if (count[ix]) { count[ix]--; si-=value[ix]; }
        else break;
        }
    }
//---------------------------------------------------------------------------
void main()
    {
    const int _coins=3; // coin types
    int value[_coins]={5,2,1}; // coin values (must be in descending order !!!)
    int count[_coins]={0,0,0}; // coin count in actual solution (RLE)
    get_coins(count,value,_coins,7);
    }
//-------------------------------------------------------------------------
this code took ~3ms on my HW setup
just change the prn_coins function to your form of print (I used VCL memo and AnsiString)
in this code the solution state is automatically rewritten back to the previous state
so no need to further memoize (else it would be necessary to copy the count array before and after recursion)
Now the permutation step would be necessary if:
each coin is unique? (1 1 2 5) != (1 1 2 5)
or just coin type? (1 1 2 5) != (1 2 1 5)
in that case just add the permutation code to the prn_coins function ...
but that is a different question ...

For dynamic programming you need to generalise your problem. Let S(a, x) be all possible sums of value x, only using values from A starting at index a. Your original problem is S(0, X).
Since you have a discrete function with two parameters you can store its outcomes in a 2d array.
There are some simple cases: there is no solution for a = A.length and X > 0.
There is the set only containing an empty sum for X = 0.
Now, you should find a recursive formula for the other cases and fill the table in such a way that the indices you depend on have already been calculated (hint: consider looping backwards).

n-th or Arbitrary Combination of a Large Set

Say I have a set of numbers from [0, ....., 499]. Combinations are currently being generated sequentially using the C++ std::next_permutation. For reference, the size of each tuple I am pulling out is 3, so I am returning sequential results such as [0,1,2], [0,1,3], [0,1,4], ... [497,498,499].
Now, I want to parallelize the code that this is sitting in, so a sequential generation of these combinations will no longer work. Are there any existing algorithms for computing the ith combination of 3 from 500 numbers?
I want to make sure that each thread, regardless of the iterations of the loop it gets, can compute a standalone combination based on the i it is iterating with. So if I want the combination for i=38 in thread 1, I can compute [1,2,5] while simultaneously computing i=0 in thread 2 as [0,1,2].
EDIT Below statement is irrelevant, I mixed myself up
I've looked at algorithms that utilize factorials to narrow down each individual element from left to right, but I can't use these as 500! sure won't fit into memory. Any suggestions?

Here is my shot:
int k = 527; //The kth combination is calculated
int N=500; //Number of Elements you have
int a=0,b=1,c=2; //a,b,c are the numbers you get out
while(k >= (N-a-1)*(N-a-2)/2){
    k -= (N-a-1)*(N-a-2)/2;
    a++;
}
b= a+1;
while(k >= N-1-b){
    k -= N-1-b;
    b++;
}
c = b+1+k;
cout << "["<<a<<","<<b<<","<<c<<"]"<<endl; //The result
Got this thinking about how many combinations there are until the next number is increased. However it only works for three elements. I can't guarantee that it is correct. Would be cool if you compare it to your results and give some feedback.
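One way to check it is to wrap the loops in a function and compare against a brute-force enumeration for a small N. The sketch below does the wrapping; the name kthTriple, the long long index and the range assertion are additions of mine, not part of the answer.

```cpp
#include <array>
#include <cassert>

// Returns the k-th (0-based) triple a < b < c from {0,...,N-1} in lexicographic order.
std::array<int, 3> kthTriple(long long k, int N) {
    assert(N >= 3 && k >= 0);
    assert(k < 1LL * N * (N - 1) * (N - 2) / 6);       // k must be below C(N, 3)
    int a = 0;
    // C(N-a-1, 2) triples start with a; skip whole groups while k allows it.
    while (k >= 1LL * (N - a - 1) * (N - a - 2) / 2) {
        k -= 1LL * (N - a - 1) * (N - a - 2) / 2;
        a++;
    }
    int b = a + 1;
    // N-1-b triples start with (a, b).
    while (k >= N - 1 - b) {
        k -= N - 1 - b;
        b++;
    }
    int c = b + 1 + static_cast<int>(k);
    return {a, b, c};
}
```

Each thread can call kthTriple with its own loop index, since the result depends only on k and N. For N = 5, index 6 gives [1,2,3] and index 9 gives [2,3,4].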

If you are looking for a way to obtain the lexicographic index or rank of a unique combination instead of a permutation, then your problem falls under the binomial coefficient. The binomial coefficient handles problems of choosing unique combinations in groups of K with a total of N items.
I have written a class in C# to handle common functions for working with the binomial coefficient. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters.
Converts the K-indexes to the proper lexicographic index or rank of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle and is very efficient compared to iterating over the set.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes. I believe it is also faster than older iterative solutions.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to use the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.
The following tested code will iterate through each unique combination:
public void Test10Choose5()
{
    String S;
    int Loop;
    int N = 500; // Total number of elements in the set.
    int K = 3; // Total number of elements in each group.
    // Create the bin coeff object required to get all
    // the combos for this N choose K combination.
    BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
    int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
    // The Kindexes array specifies the indexes for a lexigraphic element.
    int[] KIndexes = new int[K];
    StringBuilder SB = new StringBuilder();
    // Loop thru all the combinations for this N choose K case.
    for (int Combo = 0; Combo < NumCombos; Combo++)
    {
        // Get the k-indexes for this combination.
        BC.GetKIndexes(Combo, KIndexes);
        // Verify that the Kindexes returned can be used to retrieve the
        // rank or lexigraphic order of the KIndexes in the table.
        int Val = BC.GetIndex(true, KIndexes);
        if (Val != Combo)
        {
            S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
            Console.WriteLine(S);
        }
        SB.Remove(0, SB.Length);
        for (Loop = 0; Loop < K; Loop++)
        {
            SB.Append(KIndexes[Loop].ToString());
            if (Loop < K - 1)
                SB.Append(" ");
        }
        S = "KIndexes = " + SB.ToString();
        Console.WriteLine(S);
    }
}
You should be able to port this class over fairly easily to C++. You probably will not have to port over the generic part of the class to accomplish your goals. Your test case of 500 choose 3 yields 20,708,500 unique combinations, which will fit in a 4 byte int. If 500 choose 3 is simply an example case and you need to choose combinations greater than 3, then you will have to use longs or perhaps fixed point int.

You can describe a particular selection of 3 out of 500 objects as a triple (i, j, k), where i is a number from 0 to 499 (the index of the first number), j ranges from 0 to 498 (the index of the second, skipping over whichever number was first), and k ranges from 0 to 497 (index of the last, skipping both previously-selected numbers). Given that, it's actually pretty easy to enumerate all the possible selections: starting with (0,0,0), increment k until it gets to its maximum value, then increment j and reset k to 0, and so on, until j gets to its own maximum value; then increment i and reset both j and k and continue.
If this description sounds familiar, it's because it's exactly the same way that incrementing a base-10 number works, except that the base is much funkier, and in fact the base varies from digit to digit. You can use this insight to implement a very compact version of the idea: for any integer n from 0 up to (but not including) 500*499*498, you can get:
#include <algorithm>
#include <iostream>

struct triple {
    int i, j, k;
};

triple AsTriple(int n) {
    triple result;
    result.k = n % 498;
    n = n / 498;
    result.j = n % 499;
    n = n / 499;
    result.i = n % 500; // unnecessary, any legal n will already be between 0 and 499
    return result;
}

void PrintSelections(triple t) {
    int i, j, k;
    i = t.i;
    j = t.j + (i <= t.j ? 1 : 0);     // skip over i
    k = t.k;
    if (k >= std::min(i, j)) k++;      // skip over the smaller chosen number
    if (k >= std::max(i, j)) k++;      // then over the larger one
    std::cout << "[" << i << "," << j << "," << k << "]" << std::endl;
}

void PrintRange(int start, int end) {
    for (int i = start; i < end; ++i) {
        PrintSelections(AsTriple(i));
    }
}
Now to shard, you can just take the numbers from 0 up to (but not including) 500*499*498, divide them into subranges in any way you'd like, and have each shard compute the permutation for each value in its subrange.
This trick is very handy for any problem in which you need to enumerate subsets.
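The mixed-radix decoding can be sanity-checked on a small set. The sketch below generalizes the idea to N objects; the names nthOrderedTriple and countDistinctTriples and the parameter N are hypothetical, and it assumes N*(N-1)*(N-2) fits in an int. Note that it enumerates ordered selections, so each unordered combination shows up 3! = 6 times.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <set>

// Decodes n in [0, N*(N-1)*(N-2)) into an ordered triple of distinct numbers < N.
std::array<int, 3> nthOrderedTriple(int n, int N) {
    assert(N >= 3 && n >= 0 && n < N * (N - 1) * (N - 2));
    int k = n % (N - 2);  n /= (N - 2);   // least significant "digit", base N-2
    int j = n % (N - 1);  n /= (N - 1);   // middle "digit", base N-1
    int i = n;                            // most significant "digit", base N
    if (j >= i) j++;                      // skip over i
    const int lo = std::min(i, j), hi = std::max(i, j);
    if (k >= lo) k++;                     // skip over the smaller chosen number
    if (k >= hi) k++;                     // then over the larger one
    return {i, j, k};
}

// Counts how many distinct valid triples the decoding produces for a given N.
int countDistinctTriples(int N) {
    std::set<std::array<int, 3>> seen;
    for (int n = 0; n < N * (N - 1) * (N - 2); ++n) {
        const std::array<int, 3> t = nthOrderedTriple(n, N);
        if (t[0] != t[1] && t[0] != t[2] && t[1] != t[2]) seen.insert(t);
    }
    return static_cast<int>(seen.size());
}
```

For N = 5, countDistinctTriples returns 60 = 5*4*3, so every index maps to a different valid ordered triple, and index 0 decodes to [0,1,2].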
