Intuition behind storing the remainders? - c++

I am trying to solve a question on LeetCode.com:
Given a list of non-negative numbers and a target integer k, write a function to check if the array has a continuous subarray of size at least 2 that sums up to the multiple of k, that is, sums up to n*k where n is also an integer. For e.g., if [23, 2, 4, 6, 7], k=6, then the output should be True, since [2, 4] is a continuous subarray of size 2 and sums up to 6.
I am trying to understand the following solution:
class Solution {
public:
    bool checkSubarraySum(vector<int>& nums, int k) {
        int n = nums.size(), sum = 0, pre = 0;
        unordered_set<int> modk;
        for (int i = 0; i < n; ++i) {
            sum += nums[i];
            int mod = k == 0 ? sum : sum % k;
            if (modk.count(mod)) return true;
            modk.insert(pre);
            pre = mod;
        }
        return false;
    }
};
I understand that we are trying to store 0, a%k, (a+b)%k, (a+b+c)%k, etc. in the hash set (where k != 0), and that we insert each value one iteration late since we want the subarray size to be at least 2.
But how does this guarantee that we get a subarray whose elements sum up to a multiple of k? What mathematical property guarantees this?

The set modk is gradually populated with all sums (considered modulo k) of contiguous sub-arrays starting at the beginning of the array.
The key observation is that:
a-b = n*k for some integer n iff
a-b ≡ 0 mod k iff
a ≡ b mod k
so if a contiguous sub-array nums[i_0]..nums[i_1] sums up to 0 modulo k, then the two prefixes nums[0]..nums[i_0 - 1] and nums[0]..nums[i_1] have the same sum modulo k (for i_0 = 0 the first prefix is empty and its sum is 0).
Thus it's enough if two distinct sub-arrays starting at the beginning of the array have the same sum modulo k.
Luckily, there are only k such values, so you only need to use a set of size k.
Some nitpicks:
if n > k, you're going to have an appropriate sub-array anyway (the pigeon-hole principle), so the loop will actually never iterate more than k+1 times.
There should not be any sort of class involved here, that makes no sense.
contiguous, not continuous. Arrays and sub-arrays are discrete and can't be continuous...

The remainder of a sum modulo k equals the remainder modulo k of the sum of the individual remainders:
(a+b)%k = (a%k + b%k) % k
(23 + 2) % 6 = 1
( (23%6) + (2%6) ) % 6 = (5 + 2) % 6 = 1
modk stores all the remainders that you have calculated iteratively. If at iteration i you get a remainder that was already calculated at iteration i-m, that means that you have added a subsequence of m elements whose sum is a multiple of k:
i=0 nums[0] = 23 sum = 23 sum%6 = 5 modk = [5]
i=1 nums[1] = 2 sum = 25 sum%6 = 1 modk = [5, 1]
i=2 nums[2] = 4 sum = 29 sum%6 = 5 5 already exists in modk (4+2)%6 =0
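The repeated-remainder argument can be sketched as a standalone function. This is a minimal sketch, not the solution from the question: the name hasMultipleOfKSubarray and the choice to store the earliest prefix length per remainder (instead of delaying the insert by one iteration) are assumptions made for illustration.

```cpp
#include <unordered_map>
#include <vector>

// Returns true if some contiguous subarray of length >= 2 sums to a multiple
// of k (for k == 0 the sum itself has to be 0).
bool hasMultipleOfKSubarray(const std::vector<int>& nums, int k) {
    // Maps a prefix remainder to the shortest prefix length that produced it.
    std::unordered_map<long long, int> firstLength;
    firstLength[0] = 0;  // the empty prefix has sum 0
    long long sum = 0;
    for (int i = 0; i < static_cast<int>(nums.size()); ++i) {
        sum += nums[i];
        long long key = (k == 0) ? sum : sum % k;
        auto it = firstLength.find(key);
        if (it == firstLength.end()) {
            firstLength[key] = i + 1;          // first time this remainder appears
        } else if (i + 1 - it->second >= 2) {  // same remainder, at least 2 elements apart
            return true;
        }
    }
    return false;
}
```

Called with {23, 2, 4, 6, 7} and k = 6, the prefixes of length 1 and 3 both leave remainder 5, so the two elements between them (2 and 4) sum to a multiple of 6.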

Related

Find number of ranges (a[i], a[i+1]) which contains the value k for i in range a, b

Given an array of n elements, how do I find the number of ranges [min(A[i], A[i+1]), max(A[i], A[i+1])] which contains a given value k. Here i lies between 0 <= a < b < n for zero-based indexing, where a and b are zero-based indexes.
For example, for the below array with a = 1, and b = 3;
6 3 2 8 5
Suppose for k = 3, it has to be found in range [min(A[1], A[2]), max(A[1], A[2])] and [min(A[2], A[3]), max(A[2], A[3])]. Here k=3 appears in both the ranges [2, 3] and [2, 8] so the answer is 2.
How could I find the count of ranges in less than linear time with certain pre-computation?
I don't need the exact code, just a high level overview of the approach/data structure would do.
Thanks in advance!
You can save some time by first removing every element A[i] that fulfils (A[i-1] <= A[i] and A[i] <= A[i+1]) or (A[i-1] >= A[i] and A[i] >= A[i+1]). Counting the valid intervals should then still give the same number, as long as k is not equal to one of the removed elements (such a k lies in both neighbouring ranges and would otherwise be counted twice).
int ctr = 0;
for (int i = a; i < b; i++)   // pairs (A[a],A[a+1]) .. (A[b-1],A[b])
{
    if ((A[i] <= k && A[i+1] >= k) || (A[i] >= k && A[i+1] <= k))
        ctr++;
}
printf(" Count: %i", ctr);
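For reference, the linear scan can be written as a self-contained function. This is only a sketch of the baseline count, not the sub-linear precomputation the question asks for; the name countRangesContaining is hypothetical, and it assumes a < b < A.size() as in the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts the indices i in [a, b) for which k lies in
// [min(A[i], A[i+1]), max(A[i], A[i+1])]. Assumes b < A.size().
int countRangesContaining(const std::vector<int>& A, std::size_t a,
                          std::size_t b, int k) {
    int count = 0;
    for (std::size_t i = a; i < b; ++i) {
        const int lo = std::min(A[i], A[i + 1]);
        const int hi = std::max(A[i], A[i + 1]);
        if (lo <= k && k <= hi) ++count;
    }
    return count;
}
```

With the array 6 3 2 8 5, a = 1, b = 3 and k = 3 this checks the ranges [2, 3] and [2, 8], matching the example in the question.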

All pair Bitwise OR sum

Is there an algorithm to find the bitwise OR sum of an array in linear time complexity?
Suppose the array is {1,2,3}; then the all-pairs sum is 1|2 + 2|3 + 1|3 = 9.
I can find the all-pairs AND sum in O(n) using the following algorithm. How can I change this to get the all-pairs OR sum?
int ans = 0; // Initialize result
// Traverse over all bits
for (int i = 0; i < 32; i++)
{
    // Count number of elements with i'th bit set
    int k = 0; // Initialize the count
    for (int j = 0; j < n; j++)
        if ( (arr[j] & (1 << i)) )
            k++;
    // There are k set bits, means k(k-1)/2 pairs.
    // Every pair adds 2^i to the answer. Therefore,
    // we add "2^i * [k*(k-1)/2]" to the answer.
    ans += (1<<i) * (k*(k-1)/2);
}
From here: http://www.geeksforgeeks.org/calculate-sum-of-bitwise-and-of-all-pairs/
You can do it in linear time. The idea is as follows:
For each bit position, record the number of entries in your array that have that bit set to 1. In your example, you have 2 entries (1 and 3) with the ones bit set, and 2 entries with the two's bit set (2 and 3).
For each number, compute the sum of the number's bitwise OR with all other numbers in the array by looking at each bit individually and using your cached sums. For example, consider the sum 1|1 + 1|2 + 1|3 = 1 + 3 + 3 = 7.
Because 1's last bit is 1, the result of a bitwise or with 1 will have the last bit set to 1. Thus, all three of the numbers 1|1, 1|2, and 1|3 will have last bit equal to 1. Two of those numbers have the two's bit set to 1, which is precisely the number of elements which have the two's bit set to 1. By grouping the bits together, we can obtain the sum 3*1 (three ones bits) + 2*2 (two two's bits) = 7.
Repeating this procedure for each element lets you compute the sum of all bitwise ors for all ordered pairs of elements in the array. So in your example, 1|2 and 2|1 will be computed, as will 1|1. So you'll have to subtract off all cases like 1|1 and then divide by 2 to account for double counting.
Let's try this algorithm out for your example.
Writing the numbers in binary, {1,2,3} = {01, 10, 11}. There are 2 numbers with the one's bit set, and 2 with the two's bit set.
For 01 we get 3*1 + 2*2 = 7 for the sum of ors.
For 10 we get 2*1 + 3*2 = 8 for the sum of ors.
For 11 we get 3*1 + 3*2 = 9 for the sum of ors.
Summing these, 7+8+9 = 24. We need to subtract off 1|1 = 1, 2|2 = 2 and 3|3 = 3, as we counted these in the sum. 24-1-2-3 = 18.
Finally, as we counted things like 1|3 twice, we need to divide by 2. 18/2 = 9, the correct sum.
This algorithm is O(n * max number of bits in any array element).
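The per-element procedure walked through above can be sketched as follows. The function name pairOrSum, the fixed width of 32 bits and the use of unsigned inputs with a 64-bit accumulator are assumptions made for this illustration.

```cpp
#include <vector>

// Sum of a[i] | a[j] over all unordered pairs i < j.
long long pairOrSum(const std::vector<unsigned>& a) {
    const int BITS = 32;
    const long long n = static_cast<long long>(a.size());
    long long setCount[BITS] = {0};  // setCount[b] = number of elements with bit b set
    for (unsigned x : a)
        for (int b = 0; b < BITS; ++b)
            if ((x >> b) & 1u) ++setCount[b];

    long long total = 0;
    for (unsigned x : a) {
        for (int b = 0; b < BITS; ++b) {
            // If x has bit b set, every x|y has it; otherwise only those y that have it.
            const long long ones = ((x >> b) & 1u) ? n : setCount[b];
            total += ones << b;
        }
        total -= x;  // remove the x|x term
    }
    return total / 2;  // each unordered pair was counted twice
}
```

For {1, 2, 3} the per-element sums are 7, 8 and 9; after subtracting 1, 2 and 3 and halving, the result is 9, as in the walk-through.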
Edit: We can modify your posted algorithm by simply subtracting the count of all 0-0 pairs from all pairs to get all 0-1 or 1-1 pairs for each bit position. Like so:
int ans = 0; // Initialize result
// Traverse over all bits
for (int i = 0; i < 32; i++)
{
    // Count number of elements with i'th bit not set
    int k = 0; // Initialize the count
    for (int j = 0; j < n; j++)
        if ( !(arr[j] & (1 << i)) )
            k++;
    // There are k elements with the bit not set, which means k(k-1)/2 pairs
    // that don't contribute to the total sum, out of n*(n-1)/2 pairs.
    // So we subtract the pairs that don't contribute from all pairs.
    ans += (1<<i) * (n*(n-1)/2 - k*(k-1)/2);
}

Finding the smallest possible number which cannot be represented as sum of 1,2 or other numbers in the sequence

I am a newbie in C++ and need logical help in the following task.
Given a sequence of n positive integers (n < 10^6; each given integer is less than 10^6), write a program to find the smallest positive integer, which cannot be expressed as a sum of 1, 2, or more items of the given sequence (i.e. each item could be taken 0 or 1 times). Examples: input: 2 3 4, output: 1; input: 1 2 6, output: 4
I cannot seem to construct the logic out of it, why the last output is 4 and how to implement it in C++, any help is greatly appreciated.
Here is my code so far:
#include<iostream>
using namespace std;
const int SIZE = 3;
int main()
{
    //Lowest integer by default
    int IntLowest = 1;
    int x = 0;
    //Our sequence numbers
    int seq;
    int sum = 0;
    int buffer[SIZE];
    //Loop through array inputting sequence numbers
    for (int i = 0; i < SIZE; i++)
    {
        cout << "Input sequence number: ";
        cin >> seq;
        buffer[i] = seq;
        sum += buffer[i];
    }
    int UpperBound = sum + 1;
    int a = buffer[x] + buffer[x + 1];
    int b = buffer[x] + buffer[x + 2];
    int c = buffer[x + 1] + buffer[x + 2];
    int d = buffer[x] + buffer[x + 1] + buffer[x + 2];
    for (int y = IntLowest - 1; y < UpperBound; y++)
    {
        //How should I proceed from here?
    }
    return 0;
}
What Voreno's answer suggests is in fact solving the 0-1 knapsack problem (http://en.wikipedia.org/wiki/Knapsack_problem#0.2F1_Knapsack_Problem). If you follow the link you can read how it can be done without constructing all subsets of the initial set (there are too many of them, 2^n). And it would work if the constraints were a bit smaller, like 10^3.
But with n = 10^6 it still requires too much time and space. There is no need to solve the knapsack problem, though: we just need to find the first number we can't get.
The better solution would be to sort the numbers and then iterate through them once, finding for each prefix of your array a number x, such that with that prefix you can get all numbers in interval [1..x]. The minimal number that we cannot get at this point is x + 1. When you consider the next number a[i] you have two options:
a[i] <= x + 1, then you can get all numbers up to x + a[i],
a[i] > x + 1, then you cannot get x + 1 and you have your answer.
Example:
you are given numbers 1, 4, 12, 2, 3.
You sort them (and get 1, 2, 3, 4, 12), start with x = 0, consider each element and update x the following way:
1 <= x + 1, so x = 0 + 1 = 1.
2 <= x + 1, so x = 1 + 2 = 3.
3 <= x + 1, so x = 3 + 3 = 6.
4 <= x + 1, so x = 6 + 4 = 10.
12 > x + 1, so we have found the answer and it is x + 1 = 11.
(Edit: fixed off-by-one error, added example.)
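The sort-and-scan idea above can be sketched in a few lines of C++. The function name smallestUnreachable and the use of unsigned long long (the total can reach about 10^12 under the stated limits) are assumptions for this sketch.

```cpp
#include <algorithm>
#include <vector>

// Smallest positive integer that is not a sum of a subset of `a`
// (each item used at most once). The vector is taken by value so it can be sorted.
unsigned long long smallestUnreachable(std::vector<unsigned long long> a) {
    std::sort(a.begin(), a.end());
    unsigned long long x = 0;  // every value in [1..x] is reachable so far
    for (unsigned long long v : a) {
        if (v > x + 1) break;  // x + 1 can no longer be formed
        x += v;                // the reachable range extends to [1..x + v]
    }
    return x + 1;
}
```

For the inputs 1, 4, 12, 2, 3 this walks through x = 1, 3, 6, 10 and stops at 12, returning 11 as in the example.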
I think this can be done in O(n) time and O(log2(MAXINT)) memory.
This assumes an O(1) implementation of BSR (index of the highest set bit, i.e. floor(log2(x))).
Algorithm:
1 Create an array of log2(MAXINT) buckets (20 in the case of 10^6). Each bucket contains the sum and min of its values (init: min = 2^(i+1)-1, sum = 0). Lazy init may be used for small n.
2 Make one pass over the input, storing each value in buckets[bsr(x)]:
for (x : buffer) // iterate input
    buckets[bsr(x)].min = min(buckets[bsr(x)].min, x)
    buckets[bsr(x)].sum += x
3 Iterate over the buckets, maintaining unreachable:
int unreachable = 1 // 0 is always reachable
for (b : buckets)
    if (unreachable >= b.min)
        unreachable += b.sum
    else
        break
return unreachable
This works because, assuming we are at bucket i, there are two cases to consider:
unreachable >= b.min is true: because this bucket contains values in the range [2^i...2^(i+1)-1], we have 2^i <= b.min, and in turn b.min <= unreachable. Therefore unreachable + b.min >= 2^(i+1). This means that all values in the bucket may be added (after adding b.min, all the other values in the bucket are smaller than unreachable), i.e. unreachable += b.sum.
unreachable >= b.min is false: this means that b.min (the smallest number in the remaining sequence) is greater than unreachable, so we need to return unreachable.
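A possible C++ rendering of the bucket algorithm is sketched below. The names highestBit and smallestUnreachableBuckets are hypothetical, 32 buckets are used so that any 32-bit input works, and the portable loop in highestBit stands in for the O(1) BSR instruction the answer assumes. Inputs are assumed to be positive.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

// Index of the highest set bit, i.e. floor(log2(x)) for x > 0.
int highestBit(std::uint32_t x) {
    int b = 0;
    while (x >>= 1) ++b;
    return b;
}

std::uint64_t smallestUnreachableBuckets(const std::vector<std::uint32_t>& values) {
    struct Bucket { std::uint64_t min; std::uint64_t sum; };
    std::array<Bucket, 32> buckets;
    for (int i = 0; i < 32; ++i)  // bucket i covers [2^i, 2^(i+1) - 1]
        buckets[i] = Bucket{(std::uint64_t{1} << (i + 1)) - 1, 0};

    for (std::uint32_t x : values) {  // single pass over the input
        Bucket& b = buckets[highestBit(x)];
        b.min = std::min<std::uint64_t>(b.min, x);
        b.sum += x;
    }

    std::uint64_t unreachable = 1;  // 0 is always reachable
    for (const Bucket& b : buckets) {
        if (unreachable >= b.min) unreachable += b.sum;
        else break;
    }
    return unreachable;
}
```

For the input 1, 2, 6 the buckets hold (min 1, sum 1), (min 2, sum 2) and (min 6, sum 6); unreachable grows from 1 to 2 to 4 and stops because 4 < 6.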
The output of the second input is 4 because that is the smallest positive number that cannot be expressed as a sum of 1,2 or 6 if you can take each item only 0 or 1 times. I hope this can help you understand more:
You have 3 items in that list: 1,2,6
Starting from the smallest positive integer, you start checking if that integer can be the result of the sum of 1 or more numbers of the given sequence.
1 = 1+0+0
2 = 0+2+0
3 = 1+2+0
4 cannot be expressed as the sum of any selection of the items in the list (1,2,6). Thus 4 is the smallest positive integer which cannot be expressed as a sum of items of the given sequence.
The last output is 4 because:
1 = 1
2 = 2
1 + 2 = 3
1 + 6 = 7
2 + 6 = 8
1 + 2 + 6 = 9
Therefore, the lowest integer that cannot be represented by any combination of your inputs (1, 2, 6) is 4.
What the question is asking:
Part 1. Find the largest possible integer that can be represented by your input numbers (ie. the sum of all the numbers you are given), that gives the upper bound
UpperBound = sum(all_your_inputs) + 1
Part 2. Find all the integers you can get, by combining the different integers you are given. Ie if you are given a, b and c as integers, find:
a + b, a + c, b + c, and a + b + c
The sums from Part 2, together with the input integers themselves, give you all the integers you can get using your numbers.
Then cycle through each integer from 1 to UpperBound:
for i = 1 to UpperBound
    if i is not a number in the list from Part 2
        i = your smallest integer
        break
This is a clumsy way of doing it, but I'm sure that with some maths it's possible to find a better way?
EDIT: Improved solution
//sort your input numbers from smallest to largest
input_numbers = sort(input_numbers)
//create a list of the integers that have been reached so far
tried_ints = //empty list
for each input in input_numbers
    //build combinations of sums of this input and any of the previous inputs
    //add the combinations to tried_ints, if not tried before
    for 1 to input
        //check whether there is a gap in tried_ints
        if there_is_gap
            //stop the program, return the smallest integer
            //the first gap number is the smallest integer

Maximum subset which has no sum of two divisible by K

I am given the set {1, 2, 3, ... ,N}. I have to find the maximum size of a subset of the given set so that the sum of any 2 numbers from the subset is not divisible by a given number K. N and K can be up to 2*10^9 so i need a very fast algorithm. I only came up with an algorithm of complexity O(K), which is slow.
First reduce all of the set elements mod k. That leaves a simpler problem:
find the maximum size of a subset of the given set so that the sum of any 2 numbers from the subset is not equal to K.
I divide the set into pairs of residue classes (i and k-i), where you cannot choose elements from class i and class k-i simultaneously.
int myset[]
int modclass[k]
for(int i=0; i< size of myset ;i++)
{
    modclass[(myset[i] mod k)] ++;
}
Then choose:
for(int i=1; i< (k+1)/2 ;i++)
{
    if (modclass[i] > modclass[k-i])
    {
        choose all of the set elements whose value mod k equals i
    }
    else
    {
        choose all of the set elements whose value mod k equals k-i
    }
}
Finally, you can add one element whose value mod k equals 0 and (if k is even) one whose value mod k equals k/2.
This solution has complexity O(K).
You can improve on this idea with a dynamic array:
for(int i=0; i< size of myset ;i++)
{
    x= myset[i] mod k;
    set=false;
    for(int j=0; j< size of newset ;j++)
    {
        if(newset[j][1]==x or newset[j][2]==x)
        {
            if (x < k/2)
            {
                newset[j][1]++;
                set=true;
            }
            else
            {
                newset[j][2]++;
                set=true;
            }
        }
    }
    if(set==false)
    {
        if (x < k/2)
        {
            newset.add(1,0);
        }
        else
        {
            newset.add(0,1);
        }
    }
}
Now you can choose with an algorithm of complexity O(myset.count). Any algorithm needs at least O(myset.count) anyway, because it takes O(myset.count) just to read your set.
The complexity of this solution is O(myset.count^2), so you can choose the algorithm depending on your input by comparing O(myset.count^2) with O(k).
For a better solution you can sort myset based on mod k.
I'm assuming that the set of numbers is always 1 through N for some N.
Consider the first N-(N mod K) numbers. They form floor(N/K) sequences of K consecutive numbers, with reductions mod K from 0 through K-1. For each group, floor(K/2) have to be dropped for having a reduction mod K that is the negation mod K of another subset of floor(K/2). You can keep ceiling(K/2) from each set of K consecutive numbers.
Now consider the remaining N mod K numbers. They have reductions mod K starting at 1. I have not worked out the exact limits, but if N mod K is less than about K/2 you will be able to keep all of them. If not, you will be able to keep about the first ceiling(K/2) of them.
==========================================================================
I believe the concept here is correct, but I have not yet worked out all the details.
==========================================================================
Here is my analysis of the problem and answer. In what follows |x| is floor(x). This solution is similar to the one in @Constantine's answer, but differs in a few cases.
Consider the first K*|N/K| elements. They consist of |N/K| repeats of the reductions modulo K.
In general, we can include |N/K| elements that are k modulo K subject to the following limits:
If (k+k)%K is zero, we can include only one element that is k modulo K. That is the case for k=0 and k=(K/2)%K, which can only happen for even K.
That means we get |N/K| * |(K-1)/2| elements from the repeats.
We need to correct for the omitted elements. If N >= K we need to add 1 for the 0 mod K elements. If K is even and N>=K/2 we also need to add 1 for the (K/2)%K elements.
Finally, if N%K != 0 we need to add a partial or complete copy of the repeat elements, min(N%K,|(K-1)/2|).
The final formula is:
|N/K| * |(K-1)/2| +
(N>=K ? 1 : 0) +
((N>=K/2 && (K%2)==0) ? 1 : 0) +
min(N%K,|(K-1)/2|)
This differs from @Constantine's version in some cases involving even K. For example, consider N=4, K=6. The correct answer is 3, the size of the set {1, 2, 3}. @Constantine's formula gives |(6-1)/2| = |5/2| = 2. The formula above gets 0 for each of the first two lines, 1 from the third line, and 2 from the final line, giving the correct answer.
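The four-line formula above translates directly into constant-time code. This is a minimal sketch; the function name maxSubsetSize is hypothetical, and unsigned long long is used so that N and K up to 2*10^9 fit comfortably.

```cpp
#include <algorithm>

// Maximum size of a subset of {1, ..., N} in which no two elements
// sum to a multiple of K. Assumes N >= 1 and K >= 1.
unsigned long long maxSubsetSize(unsigned long long N, unsigned long long K) {
    const unsigned long long half = (K - 1) / 2;           // |(K-1)/2|
    unsigned long long result = (N / K) * half;            // full repeats of the residues
    if (N >= K) result += 1;                               // one element that is 0 mod K
    if (K % 2 == 0 && N >= K / 2) result += 1;             // one element that is K/2 mod K
    result += std::min(N % K, half);                       // partial copy at the end
    return result;
}
```

For N=4, K=6 the terms are 0, 0, 1 and 2, giving 3, which matches the set {1, 2, 3} from the example.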
The formula is
|N/K| * |(K-1)/2| + ost
where ost is:
if n<k:
    ost = 0
else if n%k == 0:
    ost = 1
else if n%k < |(K-1)/2|:
    ost = n%k
else:
    ost = |(K-1)/2|
Here |a/b| denotes integer division,
for example |9/2| = 4 and |7/2| = 3.
Example: n = 30, k = 7:
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30
1 2 3 |4| 5 6 7 is the first line.
8 9 10 |11| 12 13 14 is the second line.
If we take the first 3 numbers in each line, we get the size of this subset. We may also add one number from (7 14 21 28).
The count of these first numbers (1 2 3) is |(k-1)/2|.
The number of such lines is |n/k|.
If there is no residue, we may add one number (for example the last number).
If the residue < |(k-1)/2|, we take all the numbers in the last line;
otherwise we take |(K-1)/2| of them.
Thanks for pointing out the exception case:
ost = 0 if k>n
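n, k = input().split(' ')
n = int(n)
k = int(k)
l = [0 for x in range(k)]
d = [int(x) for x in input().split(' ')]
for x in d:
    l[x % k] = l[x % k] + 1
count = 0
# at most one element divisible by k can be kept
if l[0] != 0:
    count += 1
# for even k, at most one element with remainder k/2 can be kept
if k % 2 == 0 and l[k // 2] != 0:
    count += 1
# for every other pair of buckets (i, k-i), keep the larger bucket
i = 1
j = k - 1
while i < j:
    count = count + (l[i] if l[i] >= l[j] else l[j])
    i = i + 1
    j = j - 1
print(count)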
This is explanation to ABRAR TYAGI and amin k's solution.
The approach to this solution is:
Create an array L with K buckets and group all the elements from the
input array D into the K buckets. Each bucket L[i] contains D's elements such that ( element % K ) = i.
All the elements that are individually divisible by K are in L[0]. So
only one of these elements (if any) can belong in our final (maximal)
subset. Sum of any two of these elements is divisible by K.
If we add an element from L[i] to an element in L[K-i] then the sum is divisible by K. Hence we can add elements from only one of these buckets to
our final set. We pick the largest bucket.
Code:
d is the array containing the initial set of numbers of size n. The goal of this code is to find the size of the largest subset of d such that the sum of no two integers is divisible by k.
l is an array that will contain k integers. The idea is to reduce each (element) in array d to (element % k) and save the frequency of their occurrences in array l.
For example, l[1] contains the frequency of all elements % k = 1
We know that (1 + (k-1)) % k = 0, so either l[1] or l[k-1] has to be discarded to meet the criterion that the sum of no two numbers % k should be 0.
But as we need the largest subset of d, we choose the larger of l[1] and l[k-1]
We loop through array l such that for (i=1; i<=k/2 && i < k-i; i++) and do the above step.
There are two outliers. The sum of any two numbers in the l[0] group % k = 0. So add 1 if l[0] is non-zero.
if k is even, the loop does not handle i=k/2, and using the same logic as above increment the count by one.
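The bucket approach explained above can also be sketched in C++. The function name maxNonDivisibleSubset is hypothetical; the array names d and l follow the explanation, and, as an assumption of this sketch, the k/2 bucket is only counted when it is non-empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Size of the largest subset of d in which no two elements sum to a multiple of k.
// Assumes k >= 1.
std::size_t maxNonDivisibleSubset(const std::vector<long long>& d, long long k) {
    std::vector<std::size_t> l(static_cast<std::size_t>(k), 0);
    for (long long x : d)
        ++l[static_cast<std::size_t>(((x % k) + k) % k)];  // bucket by remainder

    std::size_t count = 0;
    if (l[0] > 0) ++count;                                  // at most one multiple of k
    if (k % 2 == 0 && l[static_cast<std::size_t>(k / 2)] > 0)
        ++count;                                            // at most one element with remainder k/2
    for (long long i = 1; i < k - i; ++i)                   // pair bucket i with bucket k-i
        count += std::max(l[static_cast<std::size_t>(i)],
                          l[static_cast<std::size_t>(k - i)]);
    return count;
}
```

For d = {1, 7, 2, 4} and k = 3 the buckets are l[1] = 3 and l[2] = 1, so the larger bucket is kept and the result is 3.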

Porting optimized Sieve of Eratosthenes from Python to C++

Some time ago I used the (blazing fast) primesieve in python that I found here: Fastest way to list all primes below N
To be precise, this implementation:
def primes2(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    n, correction = n-n%6+6, 2-(n%6>1)
    sieve = [True] * (n/3)
    for i in xrange(1,int(n**0.5)/3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[ k*k/3 ::2*k] = [False] * ((n/6-k*k/6-1)/k+1)
            sieve[k*(k-2*(i&1)+4)/3::2*k] = [False] * ((n/6-k*(k-2*(i&1)+4)/6-1)/k+1)
    return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]
Now I can roughly grasp the idea of optimizing by automatically skipping multiples of 2, 3 and so on, but when it comes to porting this algorithm to C++ I get stuck (I have a good understanding of Python and a reasonable/bad understanding of C++, but good enough for rock 'n roll).
What I currently have rolled myself is this (isqrt() is just a simple integer square root function):
template <class T>
void primesbelow(T N, std::vector<T> &primes) {
    T sievemax = (N-3 + (1-(N % 2))) / 2;
    T i;
    T sievemaxroot = isqrt(sievemax) + 1;
    boost::dynamic_bitset<> sieve(sievemax);
    sieve.set();
    primes.push_back(2);
    for (i = 0; i <= sievemaxroot; i++) {
        if (sieve[i]) {
            primes.push_back(2*i+3);
            // valid indices are 0..sievemax-1
            for (T j = 3*i+3; j < sievemax; j += 2*i+3) sieve[j] = 0; // filter multiples
        }
    }
    for (; i < sievemax; i++) {
        if (sieve[i]) primes.push_back(2*i+3);
    }
}
This implementation is decent and automatically skips multiples of 2, but if I could port the Python implementation I think it could be much faster (30%-50% or so).
To compare the results (in the hope this question will be successfully answered), the current execution time with N=100000000, g++ -O3 on a Q6600 Ubuntu 10.10 is 1230ms.
Now I would love some help with either understanding what the above Python implementation does or that you would port it for me (not as helpful though).
EDIT
Some extra information about what I find difficult.
I have trouble with the techniques used, like the correction variable, and in general with how it all comes together. A link to a site explaining different Eratosthenes optimizations (apart from the simple sites that say "well you just skip multiples of 2, 3 and 5" and then slam you with a 1000-line C file) would be awesome.
I don't think I would have issues with a 100% direct and literal port, but since after all this is for learning that would be utterly useless.
EDIT
After looking at the code in the original numpy version, it actually is pretty easy to implement and, with some thinking, not too hard to understand. This is the C++ version I came up with. I'm posting it here in full to help future readers in case they need a pretty efficient prime sieve that is not two million lines of code. This prime sieve finds all primes under 100000000 in about 415 ms on the same machine as above. That's a 3x speedup, better than I expected!
#include <cstdint>
#include <vector>
#include <boost/dynamic_bitset.hpp>
// http://vault.embedded.com/98/9802fe2.htm - integer square root
// Works on a 32-bit input: 16 iterations, each consuming the top two bits of a.
unsigned short isqrt(std::uint32_t a) {
    std::uint32_t rem = 0;
    std::uint32_t root = 0;
    for (short i = 0; i < 16; i++) {
        root <<= 1;
        rem = ((rem << 2) + (a >> 30));
        a <<= 2;
        root++;
        if (root <= rem) {
            rem -= root;
            root++;
        } else root--;
    }
    return static_cast<unsigned short> (root >> 1);
}
// https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
// https://stackoverflow.com/questions/5293238/porting-optimized-sieve-of-eratosthenes-from-python-to-c/5293492
template <class T>
void primesbelow(T N, std::vector<T> &primes) {
    T i, j, k, l, sievemax, sievemaxroot;
    sievemax = N/3;
    if ((N % 6) == 2) sievemax++;
    sievemaxroot = isqrt(N)/3;
    boost::dynamic_bitset<> sieve(sievemax);
    sieve.set();
    primes.push_back(2);
    primes.push_back(3);
    for (i = 1; i <= sievemaxroot; i++) {
        if (sieve[i]) {
            k = (3*i + 1) | 1;
            l = (4*k-2*k*(i&1)) / 3;
            for (j = k*k/3; j < sievemax; j += 2*k) {
                sieve[j] = 0;
                if (j+l < sievemax) sieve[j+l] = 0; // second residue class; stay in bounds
            }
            primes.push_back(k);
        }
    }
    for (i = sievemaxroot + 1; i < sievemax; i++) {
        if (sieve[i]) primes.push_back((3*i+1)|1);
    }
}
I'll try to explain as much as I can. The sieve array has an unusual indexing scheme; it stores a bit for each number that is congruent to 1 or 5 mod 6. Thus, a number 6*k + 1 will be stored in position 2*k and k*6 + 5 will be stored in position 2*k + 1. The 3*i+1|1 operation is the inverse of that: it takes numbers of the form 2*n and converts them into 6*n + 1, and takes 2*n + 1 and converts it into 6*n + 5 (the +1|1 thing converts 0 to 1 and 3 to 5). The main loop iterates k through all numbers with that property, starting with 5 (when i is 1); i is the corresponding index into sieve for the number k. The first slice update to sieve then clears all bits in the sieve with indexes of the form k*k/3 + 2*m*k (for m a natural number); the corresponding numbers for those indexes start at k^2 and increase by 6*k at each step. The second slice update starts at index k*(k-2*(i&1)+4)/3 (number k * (k+4) for k congruent to 1 mod 6 and k * (k+2) otherwise) and similarly increases the number by 6*k at each step.
Here's another attempt at an explanation: let candidates be the set of all numbers that are at least 5 and are congruent to either 1 or 5 mod 6. If you multiply two elements in that set, you get another element in the set. Let succ(k) for some k in candidates be the next element (in numerical order) in candidates that is larger than k. In that case, the inner loop of the sieve is basically (using normal indexing for sieve):
for k in candidates:
    for (l = k; ; l += 6) sieve[k * l] = False
    for (l = succ(k); ; l += 6) sieve[k * l] = False
Because of the limitations on which elements are stored in sieve, that is the same as:
for k in candidates:
    for l in candidates where l >= k:
        sieve[k * l] = False
which will remove all multiples of k in candidates (other than k itself) from the sieve at some point (either when the current k was used as l earlier or when it is used as k now).
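To make the indexing scheme concrete, here is a small unoptimized sketch of the "candidates" formulation. The names indexToNumber, numberToIndex and wheelPrimesBelow are hypothetical, and std::vector<bool> replaces boost::dynamic_bitset only to keep the example dependency-free; it is meant to illustrate the index mapping, not to compete with the tuned version above.

```cpp
#include <cstdint>
#include <vector>

// Index i corresponds to the number (3*i + 1) | 1: 0->1, 1->5, 2->7, 3->11, 4->13, ...
inline std::uint64_t indexToNumber(std::uint64_t i) { return (3 * i + 1) | 1; }
// Inverse mapping, valid for numbers congruent to 1 or 5 mod 6.
inline std::uint64_t numberToIndex(std::uint64_t m) { return m / 3; }

// Returns all primes p with p < n.
std::vector<std::uint64_t> wheelPrimesBelow(std::uint64_t n) {
    std::vector<std::uint64_t> primes;
    if (n > 2) primes.push_back(2);
    if (n > 3) primes.push_back(3);
    // Number of candidates (numbers congruent to 1 or 5 mod 6) below n, including 1.
    const std::uint64_t size = n / 3 + (n % 6 == 2 ? 1 : 0);
    std::vector<bool> sieve(size, true);
    for (std::uint64_t i = 1; i < size; ++i) {
        if (!sieve[i]) continue;
        const std::uint64_t k = indexToNumber(i);
        primes.push_back(k);
        // Strike k * l for every candidate l >= k; the product is again a candidate.
        for (std::uint64_t j = i; ; ++j) {
            const std::uint64_t product = k * indexToNumber(j);
            if (product >= n) break;
            sieve[numberToIndex(product)] = false;
        }
    }
    return primes;
}
```

For n = 30 the sieve holds the candidates 1, 5, 7, 11, 13, 17, 19, 23, 25, 29; only 25 = 5 * 5 is struck, leaving the ten primes below 30 once 2 and 3 are added.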
Piggy-Backing onto Howard Hinnant's response, Howard, you don't have to test numbers in the set of all natural numbers not divisible by 2, 3 or 5 for primality, per se. You need simply multiply each number in the array (except 1, which self-eliminates) times itself and every subsequent number in the array. These overlapping products will give you all the non-primes in the array up to whatever point you extend the deterministic-multiplicative process. Thus the first non-prime in the array will be 7 squared, or 49. The 2nd, 7 times 11, or 77, etc. A full explanation here: http://www.primesdemystified.com
As an aside, you can "approximate" prime numbers. Call the approximate prime P. Here are a few formulas:
P = 2*k+1 // not divisible by 2
P = 6*k + {1, 5} // not divisible 2, 3
P = 30*k + {1, 7, 11, 13, 17, 19, 23, 29} // not divisible by 2, 3, 5
The property of the sets of numbers produced by these formulas is that P may not be prime, but all primes (other than the small primes the formula excludes, such as 2, 3 and 5) are in the set P. I.e. if you only test numbers in the set P for primality, you won't miss any.
You can reformulate these formulas to:
P = X*k + {-i, -j, -k, k, j, i}
if that is more convenient for you.
Here is some code that uses this technique with a formula for P not divisible by 2, 3, 5, 7.
This link may represent the extent to which this technique can be practically leveraged.