So, I am trying to solve the following question: https://www.codechef.com/TSTAM15/problems/ACM14AM3
The Mars Orbiter Mission probe lifted-off from the First Launch Pad at Satish Dhawan Space Centre (Sriharikota Range SHAR), Andhra
Pradesh, using a Polar Satellite Launch Vehicle (PSLV) rocket C25 at
09:08 UTC (14:38 IST) on 5 November 2013.
The secret behind this successful launch was the launch pad that ISRO
used. An important part of the launch pad is the launch tower. It is
the long vertical structure which supports the rocket.
ISRO now wants to build a better launch pad for their next mission.
For this, ISRO has acquired a long steel bar, and the launch tower can
be made by cutting a segment from the bar. As part of saving the cost,
the bar they have acquired is not homogeneous.
The bar is made up of several blocks, where the ith block has
durability S[i], which is a number between 0 and 9. A segment is
defined as any contiguous group of one or more blocks.
If they cut out a segment of the bar from ith block to jth block
(i<=j), then the durability of the resultant segment is given by (S[i]*10^(j-i) + S[i+1]*10^(j-i-1) + S[i+2]*10^(j-i-2) + … + S[j]*10^0) % M. In other words, if W(i,j) is the base-10 number formed by
concatenating the digits S[i], S[i+1], S[i+2], …, S[j], then
the durability of the segment (i,j) is W(i,j) % M.
For technical reasons that ISRO will not disclose, the durability of
the segment used for building the launch tower should be exactly L.
Given S and M, find the number of ways ISRO can cut out a segment from
the steel bar whose durability is L.
Input
The first line contains a string S. The ith character of this string
represents the durability of ith segment. The next line contains a
single integer Q, denoting the number of queries. Each of the next Q
lines contains two space separated integers, denoting M and L.
Output
For each query, output the number of ways of cutting the bar on a
separate line.
Constraints
1 ≤ |S| ≤ 2 * 10^4
Q ≤ 5
0 < M < 500
0 ≤ L < M
Example
Input:
23128765
3
7 2
9 3
15 5
Output:
9
4
5
Explanation
For M=9, L=3, the substrings whose remainder is 3 when divided by
9 are: 3, 31287, 12 and 876.
Now, what I did was this: I generated all possible substrings of the given string, converted each one to a number, and checked whether its remainder modulo the given number matched, adding one to the answer if so. My code for this was:
string s;
cin>>s;
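int m,l,ans=0;
cin>>m>>l;
for ( size_t i = 0; i < s.length(); i++ )
{
for ( size_t j = i; j < s.length(); j++ )
{
// substr takes (start, length), so the segment i..j has length j-i+1
string p = s.substr(i,j-i+1);
// stoll still overflows once the segment is longer than 18 digits
long long num = stoll(p);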
if (num%m == l)
ans++;
}
}
cout<<ans<<"\n";
return 0;
But obviously, since the input length is up to 2 * 10^4, this doesn't run within the required time. How can I make it faster?
A small piece of advice: store s.length() in a variable so the function is not called again on every loop iteration.
Ok, here goes, with a working program at the bottom
Major optimization #1
Do not (ever) work with strings when it comes to integer arithmetic. You're converting string => integer over and over and over again (this is an O(n^2) problem), which is painstakingly slow. Besides, it also misses the point.
Solution: first convert your array-of-characters (string) to array-of-numbers. Integer arithmetic is fast.
Major optimization #2
Use a smart conversion from "substring" to number. After transforming the characters to actual integers, they become the coefficients in the polynomial a_n * 10^n. To convert a substring of n segments into a number, it is enough to compute sum(a_i * 10^i) for 0 <= i < n.
And nicely enough, if the coefficients a_i are arranged the way they are in the problem's statement, you can use Horner's method (https://en.wikipedia.org/wiki/Horner%27s_method) to very quickly evaluate the numerical value of the substring.
In short: keep a running value for the current substring; growing it by one element is just value * 10 + new element.
Example: string "128472373".
First substring = "1", value = 1.
For the second substring we need to add the digit "2" as follows: value = value * 10 + "2", thus: value = 1 * 10 + 2 = 12.
For the third substring we need to add the digit "8": value = value * 10 + "8", thus: value = 12 * 10 + 8 = 128.
Etcetera.
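The running-value idea can be sketched as a small helper. This is only an illustration: the name prefix_remainders is hypothetical and not part of the linked program, and it reduces modulo m at every step (an addition to the plain Horner update described above) so that the value stays small for long strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: remainder modulo m of every prefix of `digits`,
// built with the "value * 10 + new digit" update.
std::vector<int> prefix_remainders(const std::string& digits, int m) {
    assert(m > 0);
    std::vector<int> out;
    int value = 0;
    for (char c : digits) {
        assert(c >= '0' && c <= '9');
        // Horner step; taking % m here keeps value below m.
        value = (value * 10 + (c - '0')) % m;
        out.push_back(value);
    }
    return out;
}
```

With a modulus larger than the number itself, prefix_remainders("128", 1000) gives 1, 12, 128, which matches the hand calculation above.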
I had some issues with formatting the C++ code inline so I stuck it in IDEone: https://ideone.com/TbJiqK
The gist of the program:
In main loop, loop over all possible start points:
// For all startpoints in the segments array ...
for(int* f=segments; f<segments+n_segments; f++)
// add up the substrings that fulfil the question
n += count_segments(f, segments+n_segments, m, l);
// Output the answer for this question
cout << n << endl;
Implementation of the count_segments() function:
// Find all substrings that % m == l
// Use Horner's algorithm to quickly evaluate sum(a_n*10^n) where
// a_n are the segments' durabilities
int count_segments(int* first, int* last, int m, int l) {
int n = 0, number = 0;
while( first<last ) {
number = (number * 10 + *first) % m; // Horner's method, reduced mod m at each step so the value never overflows
if( number==l ) {
n++;
// If you don't believe - enable this line of output and
// see the numbers matching the combinations of the
//cout << "[" << m << ", " << l << "]: " << number << endl;
}
first++;
}
return n;
}
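The loop above still looks at every (start, end) pair, so it stays quadratic in |S|. A different technique, which is not the one used in this answer, counts segments by remainder: for each position keep, for every remainder r, how many segments ending there have that remainder. Appending a digit d moves each of them to (r * 10 + d) % m and starts one new single-digit segment. The sketch below assumes that reading of the problem; the name count_by_remainder is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: number of segments of s whose value % m == l,
// in O(|s| * m) time instead of O(|s|^2).
long long count_by_remainder(const std::string& s, int m, int l) {
    assert(m > 0 && l >= 0 && l < m);
    // ending[r] = number of segments ending at the current block with remainder r
    std::vector<long long> ending(m, 0), next(m, 0);
    long long total = 0;
    for (char c : s) {
        int d = c - '0';
        std::fill(next.begin(), next.end(), 0);
        for (int r = 0; r < m; ++r)
            if (ending[r]) next[(r * 10 + d) % m] += ending[r]; // extend old segments
        next[d % m] += 1;                                       // segment made of this block only
        ending.swap(next);
        total += ending[l];
    }
    return total;
}
```

On the sample from the question, count_by_remainder("23128765", 9, 3) returns 4, in line with the four substrings listed in the explanation.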
Can anyone explain how this code for computing e works? It looks very simple for such a complicated task, but I can't follow the process. It was created by Xavier Gourdon in 1999.
#include <stdio.h>
int main() {
int N = 9009, a[9009], x = 0;
for (int n = N - 1; n > 0; --n) {
a[n] = 1;
}
a[1] = 2, a[0] = 0;
while (N > 9) {
int n = N--;
while (--n) {
a[n] = x % n;
x = 10 * a[n-1] + x/n;
}
printf("%d", x);
}
return 0;
}
I traced the algorithm back to a 1995 paper by Stanley Rabinowitz and Stan Wagon. It's quite interesting.
A bit of background first. Start with the ordinary decimal representation of e:
e = 2.718281828...
This can be expressed as an infinite sum as follows:
e = 2 + 1⁄10(7 + 1⁄10(1 + 1⁄10(8 + 1⁄10(2 + 1⁄10(8 + 1⁄10(1 ...
Obviously this isn't a particularly useful representation; we just have the same digits of e wrapped up inside a complicated expression.
But look what happens when we replace these 1⁄10 factors with the reciprocals of the natural numbers:
e = 2 + 1⁄2(1 + 1⁄3(1 + 1⁄4(1 + 1⁄5(1 + 1⁄6(1 + 1⁄7(1 ...
This so-called mixed-radix representation gives us a sequence consisting of the digit 2 followed by a repeating sequence of 1's. It's easy to see why this works. When we expand the brackets, we end up with the well-known Taylor series for e:
e = 1 + 1 + 1/2! + 1/3! + 1/4! + 1/5! + 1/6! + 1/7! + ...
So how does this algorithm work? Well, we start by filling an array with the mixed-radix number (0; 2; 1; 1; 1; 1; 1; ...). To generate each successive digit, we simply multiply this number by 10 and spit out the leftmost digit.*
But since the number is represented in mixed-radix form, we have to work in a different base at each digit. To do this, we work from right to left, multiplying the nth digit by 10 and replacing it with the resulting value modulo n. If the result was greater than or equal to n, we carry the value x/n to the next digit to the left. (Dividing by n changes the base from 1/n! to 1/(n-1)!, which is what we want). This is effectively what the inner loop does:
while (--n) {
a[n] = x % n;
x = 10 * a[n-1] + x/n;
}
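Here, x is initialized to zero at the start of the program. Because a[0] is 0, the last step of the inner loop (n = 1) leaves x equal to the carry out of the first position, which is exactly the digit that gets printed. On the next pass this leftover x is smaller than n, so it is simply stored in the topmost element a[n] and contributes a carry of x/n = 0; that element is never read again. In effect the array loses one position from the right on every pass, which is why n can be initialized with the decreasing value N-- at each step of the outer loop.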
The additional 9 digits at the end of the array are presumably included to safeguard against rounding errors. When this code is run, x reaches a maximum value of 89671, which means the quotients will be carried across multiple digits.
Notes:
This is a type of spigot algorithm, because it outputs successive digits of e using simple integer arithmetic.
As noted by Rabinowitz and Wagon in their paper, this algorithm was actually invented 50 years ago by A.H.J. Sale
* Except at the first iteration where it outputs two digits ("27")
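The multiply-by-10 step described above can be written out in a more explicit form. This is a sketch rather than Gourdon's program: the function name e_digits is hypothetical, the array keeps a fixed length instead of shrinking on every pass, the integer part 2 is emitted separately, and the 10 guard terms are an assumption that is comfortable for short outputs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: first `count` decimal digits of e as a string ("2718...").
std::string e_digits(int count) {
    assert(count >= 1);
    const int guard = 10;                 // extra series terms (assumed safety margin)
    const int terms = count + guard;
    // a[n] is the mixed-radix digit with weight 1/n!, for n = 2..terms.
    // All ones encodes 1/2! + 1/3! + ... which is the fractional part of e.
    std::vector<int> a(terms + 1, 1);
    std::string out = "2";                // integer part of e
    for (int k = 1; k < count; ++k) {
        int carry = 0;
        // Multiply the whole number by 10, working from right to left.
        for (int n = terms; n >= 2; --n) {
            int v = 10 * a[n] + carry;
            a[n] = v % n;                 // digit stays below its base n
            carry = v / n;                // carried into the 1/(n-1)! position
        }
        out += static_cast<char>('0' + carry); // carry out of the fraction is the next digit
    }
    return out;
}
```

For example, e_digits(16) yields "2718281828459045".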
Here is a short piece of code that computes a sum over all square numbers (not the plain sum of squares) up to n, where n can be up to 10^20.
long long res=0;
long long sm=0;
for (long long i = 1; res <=n; i=i+2)
{
res = (res+i);
sm = sm+(res*(n/res));
}
How can the above code be made faster? The computation of sm takes too long for very large n such as 10^20.
Is there any way that the computation of sm can be made faster?
Here res takes the values of all the square numbers: 1, 4, 9, 16, 25, ...
Let's say n=10; then the squares are 1, 4, 9, and by the above code sm is (1)(10/1)+(4)(10/4)+(9)(10/9)=27.
1*10+4*2+9*1=27.
Here the division is integer division.
edit1:
I need to compute sm as in the code above.
Here sm is the summation of i^2 * floor(n/i^2) for i = 1 to floor(sqrt(n)).
We can find the sum of the first n square numbers using the formula:
n * (n + 1) * (2*n + 1) / 6
long summation(long n)
{
return (n * (n + 1) *
(2 * n + 1)) / 6;
}
Is there any way that the computation of sm can be made faster?
If you notice the pattern plus apply some mathematics, yes.
The next perfect square after your very first perfect square (1 in all cases except for n==0) will be the square of ceil(sqrt(first number)).
In other words, the square root of say the nth number, in correspondence to your first number will be given by pow(ceil(sqrt(L)), n).
Now, notice the pattern between squares: 0 1 4 9 16 25...
Difference between 0 and 1 is 1
Difference between 1 and 4 is 3
Difference between 4 and 9 is 5
Difference between 9 and 16 is 7
Difference between 16 and 25 is 9, and so on.
This makes it clear that the difference between two perfect squares is always an odd number.
Proceeding with this knowledge, you'll need to know what must be added to get the next number, the answer to which is (sqrt(square) * 2) + 1.
i.e., current_square + (sqrt(current_square)*2+1) = next_square.
For instance and to prove this equation, consider the perfect square 25. Applying this logic, the next perfect square will be 25 + (sqrt(25) * 2 + 1) = 36, which is correct. Here 11 is added to 25, which is an odd number.
Similarly, if you follow this trend, you'll observe that all these differences are odd and grow by 2 each time. To get from the square of 2 to the next square, add (sqrt(2^2)*2+1) = 5 (4+5=9); to get from the square of 3 to the next one, add (sqrt(3^2)*2+1) = 7 (9+7=16). The step between successive differences is always +2.
Moreover, summing the odd number or applying addition is computationally less expensive than performing multiplication or finding square roots of every number, so your complexity should be fine.
Following that, do this:
Collect the first square (which ideally should be 1, but if the condition n>0 is not guaranteed, add the check if(n!=0) to my logic).
Assign the next term's difference as sqrt(first_square)*2+1. This is not the next square itself but the difference between the next square and the current one, so it still has to be added to the current square; the loop below does that.
Run a loop up to your required number. Collect your required sum, given by square*floor(n/square), in a variable within the loop.
Follow the approach I mentioned above, i.e. add the difference to the current square to get the next square, then increase the difference by 2.
A working example for the above logic:
#include <iostream>
#include <cmath>
#define ll long long
int main()
{
ll int n;
std::cin>>n;
// Start from 1: (add case for 0 if input is not >0)
// you can also start from any other square or define a range.
ll int first = 1;
// Square it:
ll int first_square = first * first;
// Difference between this square and the next one (2*first + 1):
ll int next = (first * 2) + 1;
// Initialize variable to collect your required sum:
ll int sum = 0;
ll int square = first_square;
while ((square >= 0 && square <= n))
{
sum += square * (n / square); // integer division already floors; avoids rounding through double
// Add the perfect square:
square += next;
// Next odd number to be added:
next += 2;
}
std::cout<<sum;
return 0;
}
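The loop above still runs about sqrt(n) times. One possible way to cut that down, which goes beyond what this answer does, is to combine it with the closed form quoted in the question: floor(n / i^2) takes the same value q for whole runs of consecutive i, so each run can be added in one step as q times a difference of two sum-of-squares values. The sketch below is an assumption-laden illustration: the names are hypothetical, and n is limited to 10^12 so that every intermediate value fits in 64 bits.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// 1^2 + 2^2 + ... + m^2, using the closed form from the question.
static std::uint64_t sum_squares(std::uint64_t m) {
    return m * (m + 1) * (2 * m + 1) / 6;
}

// Largest r with r * r <= v (floating-point estimate, then corrected).
static std::uint64_t isqrt(std::uint64_t v) {
    std::uint64_t r = static_cast<std::uint64_t>(std::sqrt(static_cast<long double>(v)));
    while (r * r > v) --r;
    while ((r + 1) * (r + 1) <= v) ++r;
    return r;
}

// Hypothetical sketch: sum of i^2 * floor(n / i^2) for i = 1..floor(sqrt(n)).
std::uint64_t weighted_square_sum(std::uint64_t n) {
    assert(n <= 1000000000000ULL);          // assumed bound to stay within 64 bits
    std::uint64_t total = 0;
    std::uint64_t i = 1;
    while (i * i <= n) {
        std::uint64_t q = n / (i * i);      // common quotient for this run
        std::uint64_t j = isqrt(n / q);     // last index with the same quotient
        total += q * (sum_squares(j) - sum_squares(i - 1));
        i = j + 1;
    }
    return total;
}
```

For n = 10 the runs are i = 1, i = 2 and i = 3, giving 10 + 8 + 9 = 27 as in the question. Handling n up to 10^20 would need a wider integer type than the standard 64-bit ones.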
Given an array A with size N. Value of a subset of Array A is defined as product of all numbers in that subset. We have to return the product of values of all possible non-empty subsets of array A %(10^9+7).
E.G. array A {3,5}
Value{3} = 3,
Value{5} = 5,
Value{3,5} = 5*3 = 15
answer = 3*5*15 %(10^9+7).
Can someone explain the mathematics behind the problem? I am thinking of using combinatorics to solve it efficiently.
I have tried brute force; it gives the correct answer but is way too slow.
My next approach uses combinatorics. If we take all the subsets and multiply all the numbers in them together, we get the correct answer. So I have to find out how many times each number occurs in the calculation of the answer. In the example, 5 and 3 both occur 2 times. If we look closely, each number in A occurs the same number of times.
You're heading in the right direction.
Let x be an element of the given array A. In our final answer, x appears p number of times, where p is equivalent to the number of subsets of A possible that include x.
How to calculate p? Once we have decided that we will definitely include x in our subset, we have two choices for the rest N-1 elements: either include them in set or do not. So, we conclude p = 2^(N-1).
So, each element of A appears exactly 2^(N-1) times in the final product. All that remains is to calculate the answer: (a1 * a2 * ... * an)^p. Since the exponent is very large, you can use binary exponentiation for fast calculation.
As Matt Timmermans suggested in comments below, we can obtain our answer without actually calculating p = 2^(N-1). We first calculate the product a1 * a2 * ... * an. Then, we simply square this product n-1 times.
The corresponding code in C++:
int func(vector<int> &a) {
int n = a.size();
int m = 1e9+7;
if(n==0) return 0;
if(n==1) return (m + a[0]%m)%m;
long long ans = 1;
//first calculate ans = (a1*a2*...*an)%m
for(int x:a){
//negative sign does not matter since we're squaring
if(x<0) x *= -1;
x %= m;
ans *= x;
ans %= m;
}
//now calculate ans = [ ans^(2^(n-1)) ]%m
//we do this by squaring ans n-1 times
for(int i=1; i<n; i++){
ans = ans*ans;
ans %= m;
}
return (int)ans;
}
Let,
A={a,b,c}
All possible subsets of A are {{},{a},{b},{c},{a,b},{b,c},{c,a},{a,b,c}}
Here each element occurs 4 times.
So if A={a,b,c,d}, then each element occurs 2^3 times.
So if the size of A is n, each element occurs 2^(n-1) times.
So the final result will be a1^p * a2^p * a3^p * ... * an^p
where p is 2^(n-1)
We need to compute x^(2^(n-1)) % mod.
By Euler's theorem, when x is not divisible by mod we can write x^(2^(n-1)) % mod as x^(2^(n-1) % phi(mod)) % mod.
As mod is a prime then phi(mod)=mod-1.
So at first find p= 2^(n-1) %(mod-1).
Then find Ai^p % mod for each of the number and multiply with the final result.
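The steps of this answer can be sketched as follows. The function names pow_mod and subset_product are hypothetical, the modulus 10^9+7 is taken from the question, and treating an element divisible by the modulus as an immediate 0 result is an added safeguard, since the exponent reduction only applies to elements that are not divisible by mod.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// (base^exp) % m by binary exponentiation.
std::int64_t pow_mod(std::int64_t base, std::int64_t exp, std::int64_t m) {
    assert(exp >= 0 && m > 0);
    std::int64_t result = 1 % m;
    base %= m;
    if (base < 0) base += m;
    while (exp > 0) {
        if (exp & 1) result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

// Hypothetical sketch: product of the values of all non-empty subsets, mod 1e9+7.
std::int64_t subset_product(const std::vector<int>& a) {
    assert(!a.empty());
    const std::int64_t MOD = 1000000007;
    // p = 2^(n-1) reduced modulo phi(MOD) = MOD - 1, since MOD is prime.
    const std::int64_t p =
        pow_mod(2, static_cast<std::int64_t>(a.size()) - 1, MOD - 1);
    std::int64_t ans = 1;
    for (int x : a) {
        std::int64_t r = ((x % MOD) + MOD) % MOD;
        if (r == 0) return 0;               // a factor divisible by MOD makes the product 0
        ans = ans * pow_mod(r, p, MOD) % MOD;
    }
    return ans;
}
```

For the array {3, 5} from the question this gives 3^2 * 5^2 = 225, the same as 3 * 5 * 15.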
I read the previous answers while trying to understand how the sets are formed. Here I try to put it as simply as possible so that people can apply it to similar problems.
Let i be an element of array A. Following the approach given in the question, i appears p number of times in final answer.
Now, how do we make different sets. We take sets containing only one element, then sets containing group of two, then group of 3 ..... group of n elements.
Now we want to know for every time when we are making set of certain numbers say group of 3 elements, how many of these sets contain i?
There are n elements, so the number of 3-element sets that contain i is (n-1)C(3-1), because the other 3-1 elements are chosen from the remaining n-1 elements.
If we do this for every group size, p = sum of (n-1)C(x-1) for x going from 1 to n. Thus, p = 2^(n-1).
Similarly, p is the same for every element i. Thus we get
final answer = A[0]^p * A[1]^p * ... * A[n-1]^p
I am trying to solve a problem asked in TCS MockVita 2019 Round 2:
Problem Description
Dr Felix Kline, the Math teacher at Gauss School introduced the following game to teach his students problem solving. He places a series of “hopping stones” (pieces of paper) in a line with points (a positive number) marked on each of the stones.
Students start from one end and hop to the other end. One can step on a stone and add the number on the stone to their cumulative score or jump over a stone and land on the next stone. In this case, they get twice the points marked on the stone they land but do not get the points marked on the stone they jumped over.
At most once in the journey, the student is allowed (if they choose) to do a “double jump”– that is, they jump over two consecutive stones – where they would get three times the points of the stone they land on, but not the points of the stone they jump over.
The teacher expected his students to do some thinking and come up with a plan to get the maximum score possible. Given the numbers on the sequence of stones, write a program to determine the maximum score possible.
Constraints
The number of stones in the sequence< 30
Input Format
The first line contains N, the number of integers (this is a positive integer)
The next line contains the N points (each a positive integer) separated by commas. These are the points on the stones in the order the stones are placed.
Output
One integer representing the maximum score
Test Case
Explanation
Example 1
Input
3
4,2,3
Output
10
Explanation
There are 3 stones (N=3), and the points (in the order laid out) are 4,2 and 3 respectively.
If we step on the first stone and skip the second to get 4 + 2 x 3 = 10. A double jump to the third stone will get only 9. Hence the result is 10, and the double jump is not used
Example 2
Input
6
4,5,6,7,4,5
Output
35
Explanation
N=6, and the sequence of points is given.One way of getting 35 is to start with a double jump to stone 3 (3 x 6=18), go to stone 4 (7) and jump to stone 6 (10 points) for a total of 35. The double jump was used only once, and the result is 35.
I found that it's a Dynamic programming problem, but I don't know what I did wrong because my solution is not able to pass all the test cases. My code passed all the tests I created.
unordered_map<int, int> lookup;
int res(int *arr, int n, int i){
if(i == n-1){
return 0;
}
if(i == n-2){
return arr[i+1];
}
if(lookup.find(i) != lookup.end())
return lookup[i];
int maxScore = 0;
if(i< n-3 && flag == false){
flag = true;
maxScore = max(maxScore, 3 * (arr[i+3]) + res(arr, n, i+3));
flag = false;
}
maxScore = max(maxScore, (arr[i+1] + res(arr,n,i+1)));
lookup[i] = max(maxScore, 2 * (arr[i+2]) + res(arr, n, i+2));
return lookup[i];
}
cout << res(arr, n, 0) + arr[0]; // It is inside the main()
I expect you to find the mistake in my code and give the correct solution, and any test case which fails this solution. Thanks :)
You don't need any map. All you need to remember are the last few maximal values. At every move (except the first two) you have two options: end with the double jump (dj) already made, or without it. If you don't want to make a dj, then your best choice is the maximum of last stone + current and stone before last + 2 * current: max(no_dj[2] + arr[i], no_dj[1] + 2 * arr[i]).
On the other hand, if you want to have the dj made, then you have three options: step one stone after some previous dj, dj[2] + arr[i]; jump over the last stone after some dj, dj[1] + 2 * arr[i]; or do the double jump in the current move, no_dj[0] + 3 * arr[i].
int res(int *arr, int n){
int no_dj[3]{ 0, 0, arr[0]};
int dj[3]{ 0, 0, 0};
for(int i = 1; i < n; i++){
int best_nodj = max(no_dj[1] + 2 * arr[i], no_dj[2] + arr[i]);
int best_dj = 0;
if(i > 1) best_dj = max(max(dj[1] + 2 * arr[i], dj[2] + arr[i]), no_dj[0] + 3 * arr[i]);
no_dj[0] = no_dj[1];
no_dj[1] = no_dj[2];
no_dj[2] = best_nodj;
dj[0] = dj[1];
dj[1] = dj[2];
dj[2] = best_dj;
}
return max(no_dj[2], dj[2]);
}
All you have to remember are two arrays of three elements. Last three maximum values after double jump and last three maximum values without double jump.
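As for what goes wrong in the question's recursive version, two things stand out from the code itself: lookup is keyed on i alone although the best score from a stone also depends on whether the double jump has already been used, and arr[0] is always added although Example 2 starts by jumping over the first stones. A memoized sketch that keys the cache on both pieces of state could look like this. The names best_from and max_score are hypothetical, and it assumes the student starts just before the first stone and must land on the last one, as in the iterative code above.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

// Best additional score from position pos (-1 = before the first stone)
// to the last stone; used == 1 means the double jump is already spent.
static int best_from(const std::vector<int>& pts, int pos, int used,
                     std::vector<std::array<int, 2>>& memo) {
    const int n = static_cast<int>(pts.size());
    if (pos == n - 1) return 0;
    int& slot = memo[pos + 1][used];        // cache key is (position, used)
    if (slot >= 0) return slot;
    int best = pts[pos + 1] + best_from(pts, pos + 1, used, memo);          // step
    if (pos + 2 < n)
        best = std::max(best, 2 * pts[pos + 2] + best_from(pts, pos + 2, used, memo)); // jump
    if (!used && pos + 3 < n)
        best = std::max(best, 3 * pts[pos + 3] + best_from(pts, pos + 3, 1, memo));    // double jump
    slot = best;
    return best;
}

int max_score(const std::vector<int>& pts) {
    assert(!pts.empty());
    std::vector<std::array<int, 2>> memo(pts.size() + 1, std::array<int, 2>{-1, -1});
    return best_from(pts, -1, 0, memo);
}
```

On the two examples from the problem statement this returns 10 and 35.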
Say I have a set of numbers from [0, ....., 499]. Combinations are currently being generated sequentially using the C++ std::next_permutation. For reference, the size of each tuple I am pulling out is 3, so I am returning sequential results such as [0,1,2], [0,1,3], [0,1,4], ... [497,498,499].
Now, I want to parallelize the code that this is sitting in, so a sequential generation of these combinations will no longer work. Are there any existing algorithms for computing the ith combination of 3 from 500 numbers?
I want to make sure that each thread, regardless of the iterations of the loop it gets, can compute a standalone combination based on the i it is iterating with. So if I want the combination for i=38 in thread 1, I can compute [1,2,5] while simultaneously computing i=0 in thread 2 as [0,1,2].
EDIT Below statement is irrelevant, I mixed myself up
I've looked at algorithms that utilize factorials to narrow down each individual element from left to right, but I can't use these as 500! sure won't fit into memory. Any suggestions?
Here is my shot:
int k = 527; //The kth combination is calculated
int N=500; //Number of Elements you have
int a=0,b=1,c=2; //a,b,c are the numbers you get out
while(k >= (N-a-1)*(N-a-2)/2){
k -= (N-a-1)*(N-a-2)/2;
a++;
}
b= a+1;
while(k >= N-1-b){
k -= N-1-b;
b++;
}
c = b+1+k;
cout << "["<<a<<","<<b<<","<<c<<"]"<<endl; //The result
Got this thinking about how many combinations there are until the next number is increased. However it only works for three elements. I can't guarantee that it is correct. Would be cool if you compare it to your results and give some feedback.
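The same counting idea extends to tuples of any size: at each position, skip candidate values while the remaining rank is at least the number of combinations that start with that candidate. The sketch below is a generalisation added for illustration, with the hypothetical names binom and nth_combination; it assumes the binomial coefficients involved fit in 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// C(n, r), computed so that every intermediate value is itself a binomial coefficient.
std::uint64_t binom(std::uint64_t n, std::uint64_t r) {
    if (r > n) return 0;
    std::uint64_t result = 1;
    for (std::uint64_t i = 1; i <= r; ++i)
        result = result * (n - r + i) / i;
    return result;
}

// Hypothetical sketch: the rank-th k-combination of {0, ..., n-1} in lexicographic order.
std::vector<int> nth_combination(int n, int k, std::uint64_t rank) {
    assert(k >= 0 && k <= n && rank < binom(n, k));
    std::vector<int> combo;
    int x = 0;                                  // next candidate value
    for (int remaining = k; remaining > 0; --remaining) {
        while (true) {
            // Combinations that use x here and fill the rest from values above x.
            std::uint64_t count = binom(n - x - 1, remaining - 1);
            if (rank < count) break;
            rank -= count;
            ++x;
        }
        combo.push_back(x);
        ++x;
    }
    return combo;
}
```

For k = 527 and N = 500, the value used in the snippet above, this gives [0,2,32], which agrees with tracing that snippet by hand.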
If you are looking for a way to obtain the lexicographic index or rank of a unique combination instead of a permutation, then your problem falls under the binomial coefficient. The binomial coefficient handles problems of choosing unique combinations in groups of K with a total of N items.
I have written a class in C# to handle common functions for working with the binomial coefficient. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters.
Converts the K-indexes to the proper lexicographic index or rank of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle and is very efficient compared to iterating over the set.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes. I believe it is also faster than older iterative solutions.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to use the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.
The following tested code will iterate through each unique combinations:
public void Test10Choose5()
{
String S;
int Loop;
int N = 500; // Total number of elements in the set.
int K = 3; // Total number of elements in each group.
// Create the bin coeff object required to get all
// the combos for this N choose K combination.
BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
// The Kindexes array specifies the indexes for a lexicographic element.
int[] KIndexes = new int[K];
StringBuilder SB = new StringBuilder();
// Loop thru all the combinations for this N choose K case.
for (int Combo = 0; Combo < NumCombos; Combo++)
{
// Get the k-indexes for this combination.
BC.GetKIndexes(Combo, KIndexes);
// Verify that the Kindexes returned can be used to retrieve the
// rank or lexicographic order of the KIndexes in the table.
int Val = BC.GetIndex(true, KIndexes);
if (Val != Combo)
{
S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
Console.WriteLine(S);
}
SB.Remove(0, SB.Length);
for (Loop = 0; Loop < K; Loop++)
{
SB.Append(KIndexes[Loop].ToString());
if (Loop < K - 1)
SB.Append(" ");
}
S = "KIndexes = " + SB.ToString();
Console.WriteLine(S);
}
}
You should be able to port this class over fairly easily to C++. You probably will not have to port over the generic part of the class to accomplish your goals. Your test case of 500 choose 3 yields 20,708,500 unique combinations, which will fit in a 4 byte int. If 500 choose 3 is simply an example case and you need to choose combinations greater than 3, then you will have to use longs or perhaps fixed point int.
You can describe a particular selection of 3 out of 500 objects as a triple (i, j, k), where i is a number from 0 to 499 (the index of the first number), j ranges from 0 to 498 (the index of the second, skipping over whichever number was first), and k ranges from 0 to 497 (index of the last, skipping both previously-selected numbers). Given that, it's actually pretty easy to enumerate all the possible selections: starting with (0,0,0), increment k until it gets to its maximum value, then increment j and reset k to 0, and so on, until j gets to its own maximum value; then increment i and reset both j and k and continue.
If this description sounds familiar, it's because it's exactly the same way that incrementing a base-10 number works, except that the base is much funkier, and in fact the base varies from digit to digit. You can use this insight to implement a very compact version of the idea: for any integer n from 0 up to (but not including) 500*499*498, you can get:
struct triple {
int i, j, k;
};
triple AsTriple(int n) {
triple result;
result.k = n % 498;
n = n / 498;
result.j = n % 499;
n = n / 499;
result.i = n % 500; // unnecessary, any legal n will already be between 0 and 499
return result;
}
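void PrintSelections(triple t) {
int i = t.i;
// Skip over the already-selected first element.
int j = t.j + (t.j >= i ? 1 : 0);
// Skip over both selected elements, the smaller one first.
int lo = i < j ? i : j;
int hi = i < j ? j : i;
int k = t.k;
if (k >= lo) ++k;
if (k >= hi) ++k;
std::cout << "[" << i << "," << j << "," << k << "]" << std::endl;
}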
void PrintRange(int start, int end) {
for (int i = start; i < end; ++i) {
PrintSelections(AsTriple(i));
}
}
Now to shard, you can just take the numbers from 0 to 500*499*498, divide them into subranges in any way you'd like, and have each shard compute the permutation for each value in its subrange.
This trick is very handy for any problem in which you need to enumerate subsets.
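One simple way to divide the index range into subranges is to give shard number index out of shards the half-open interval [total * index / shards, total * (index + 1) / shards). The helper below is a hypothetical sketch of that split and assumes total * shards does not overflow 64 bits; each thread would then call PrintRange on its own interval.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: half-open range [first, second) of indices handled by one shard.
std::pair<std::uint64_t, std::uint64_t>
shard_range(std::uint64_t total, std::uint64_t shards, std::uint64_t index) {
    assert(shards > 0 && index < shards);
    std::uint64_t begin = total * index / shards;
    std::uint64_t end = total * (index + 1) / shards;   // equals the next shard's begin
    return {begin, end};
}
```

Because each shard's end is the next shard's begin, the intervals cover every index exactly once, and their sizes differ by at most one.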