Construct mirror vector around the centre element in C++

I have a for-loop that is constructing a vector with 101 elements, using (let's call it equation 1) for the first half of the vector, with the centre element using equation 2, and the latter half being a mirror of the first half.
Like so,
double fc = 0.25;
const double PI = 3.1415926;
// initialise vectors
int M = 50;
int N = 101;
std::vector<double> fltr;
fltr.resize(N);
std::vector<int> mArr;
mArr.resize(N);
// Creating vector mArr of 101 elements, going from -50 to +50
int count;
for(count = 0; count < N; count++)
mArr[count] = count - M;
// using these elements, enter in to equations to form vector 'fltr'
int n;
for(n = 0; n < M+1; n++)
// for elements 0 to 50 --> use equation 1
fltr[n] = (sin((fc*mArr[n])-M))/((mArr[n]-M)*PI);
// for element 51 --> use equation 2
fltr[M] = fc/PI;
This part of the code works fine and does what I expect, but for elements 52 to 101 I would like to mirror the first half around element 51 (the output value of equation 2).
For a basic example;
1 2 3 4 5 6 0.2 6 5 4 3 2 1
This is what I have so far, but it just outputs 0's as the elements:
for(n = N; n > M; n--){
for(i = 0; n < M+1; i++)
fltr[n] = fltr[i];
}
I feel like there is an easier way to mirror part of a vector but I'm not sure how.
I would expect the values to plot like this:

After you have inserted the middle element, you can get a reverse iterator to the mid point and copy that range back into the vector through std::back_inserter. The vector is named vec in the example.
auto rbeg = vec.rbegin(), rend = vec.rend();
++rbeg;
copy(rbeg, rend, back_inserter(vec));
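One caveat with the snippet above: back_inserter grows vec while the reverse iterators are still reading from it, and a reallocation would invalidate them. A minimal sketch that reserves the final size first (the function name mirror_about_last is made up for illustration, and the vector is assumed to end with the centre element):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Appends a mirror image of everything before the last element, so that
// {1, 2, 3, 0.2} becomes {1, 2, 3, 0.2, 3, 2, 1}. The name is hypothetical.
std::vector<double> mirror_about_last(std::vector<double> vec) {
    if (vec.size() < 2) return vec;  // nothing to mirror
    const std::size_t finalSize = 2 * vec.size() - 1;
    // Reserve up front: without this, push_back could reallocate and
    // invalidate the reverse iterators that copy is still reading from.
    vec.reserve(finalSize);
    std::copy(vec.rbegin() + 1, vec.rend(), std::back_inserter(vec));
    assert(vec.size() == finalSize);
    return vec;
}
```

Taking the vector by value keeps the caller's data untouched; pass by reference instead if the copy matters.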

Let's look at your code:
for(n = N; n > M; n--)
for(i = 0; n < M+1; i++)
fltr[n] = fltr[i];
And let's make things shorter, N = 5, M = 3,
array is 1 2 3 0 0 and should become 1 2 3 2 1
We start your first outer loop with n = 3, pointing us to the first zero. Then, in the inner loop, we set i to 0 and call fltr[3] = fltr[0], leaving us with the array as
1 2 3 1 0
We could now continue, but it should be obvious that this first assignment was useless.
With this I want to give you a simple way how to go through your code and see what it actually does. You clearly had something different in mind. What should be clear is that we do need to assign every part of the second half once.
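As posted, the inner condition is n < M+1, which is never true while n > M, so the inner loop body never runs and the second half keeps its zeros. Even with i < M+1, which is presumably what you meant, the code would change fltr[n] once per value of i for every n, always ending with fltr[n] = fltr[M] regardless of what value n has. All values in the second half of the array would then be the same as the center; in my example it ends with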
1 2 3 3 3
Note that there is also a direct error: starting with n = N and then accessing fltr[n]. N is out of bounds for an array of size N.
To give you a very simple working solution:
for(int i=0; i<M; i++)
{
fltr[N-i-1] = fltr[i];
}
N-i-1 is the mirrored address of i (i = 0 -> N-i-1 = 101-0-1 = 100, last valid address in an array with 101 entries).
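For reference, the same loop wrapped into a small self-contained function. This is only a sketch: the name mirror_first_half is hypothetical, and it assumes the vector has an odd length N = 2*M + 1 with the centre at index M, as in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies element i onto element N-i-1 for every i in the first half.
// The centre element (index M) is left untouched.
void mirror_first_half(std::vector<double>& v) {
    assert(v.size() % 2 == 1);  // assumption: odd length, N = 2*M + 1
    const std::size_t N = v.size();
    const std::size_t M = N / 2;
    for (std::size_t i = 0; i < M; ++i) {
        v[N - i - 1] = v[i];
    }
}
```

Each element of the second half is written exactly once, which is the property the nested loops in the question were missing.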
Now, I saw several guys answering with a more elaborate code, but I thought that as a beginner, it might be beneficial for you to do this in a very simple manner.
Other than that, as @Pzc already said in the comments, you could do this assignment in the loop where the data is generated.
Another thing, with your code
for(n = 0; n < M+1; n++)
// for elements 0 to 50 --> use equation 1
fltr[n] = (sin((fc*mArr[n])-M))/((mArr[n]-M)*PI);
// for element 51 --> use equation 2
fltr[M] = fc/PI;
I have two issues:
First, the indentation makes it look like fltr[M]=.. would be in the loop. Don't do that, not even if this should have been a mistake when you wrote the question and is not like this in the code. This will lead to errors in the future. Indentation is important. Using the auto-indentation of your IDE is an easy way to go. And try to use brackets, even if it is only one command.
Second, n < M+1 as a condition includes the center. The center is located at address 50, and 50 < 50+1. You haven't seen any problem because you overwrite it after the loop, but in a different situation this can easily produce errors.
There are other small things I'd change, and I recommend that, when your code works, you post it on CodeReview.

Let's use std::iota, std::transform, and std::copy instead of raw loops:
const double fc = 0.25;
constexpr double PI = 3.1415926;
const std::size_t M = 50;
const std::size_t N = 2 * M + 1;
std::vector<double> mArr(M);
std::iota(mArr.rbegin(), mArr.rend(), 1.); // = [M, M - 1, ..., 1]
const auto fn = [=](double m) { return std::sin((fc * m) + M) / ((m + M) * PI); };
std::vector<double> fltr(N);
std::transform(mArr.begin(), mArr.end(), fltr.begin(), fn);
fltr[M] = fc / PI;
std::copy(fltr.begin(), fltr.begin() + M, fltr.rbegin());

Related

Efficiently inserting column/row into a matrix stored in row/col-major vector in-place

It's not hard to efficiently insert a row or column into a matrix stored in a row- or col-major (respectively) vector. The problem of inserting a row into a col-major vector or a column into a row-major vector is slightly more interesting.
For example, given a 2x3 matrix stored in row-major in vector:
1 2 3 <=> 1 2 3 4 5 6
4 5 6
and a column 7 8 that is inserted after column 1 of the original matrix, we get:
1 7 2 3 <=> 1 7 2 3 4 8 5 6
4 8 5 6
[Inserting a row into a col-major vector is similar.]
The sample setup in C++:
auto m = 2; // #rows
auto n = 3; // #cols
// row-major vector
auto x = std::vector<double>{1,2,3,4,5,6};
auto const colIndex = 1;
auto const col = std::vector<double>{7,8};
// insert column {7,8} into the 2nd position
// =>{1,7,2,3,4,8,5,6}
There could be various options to achieve this algorithmically and in C++, but we're looking for the efficiency and scalability to large matrices and multiple inserts.
The first obvious option that I can think of is to use std::vector<double>::insert to insert new elements to the correct positions:
//option 1: insert in-place
x.reserve(m*(n+1));
for(auto i = 0; i < col.size(); i++)
x.insert(begin(x) + colIndex + i * (n + 1), col[i]);
This is valid but extremely slow even for moderate data sizes because of the resizing and shifting on each iteration.
Another, more direct option is to create another vector, populate all the columns in the ranges [0,colIndex),colIndex,(colIndex,n+1], and swap it with the original vector:
// option 2: temp vec and swap
{
auto tmp = std::vector<double>(m*(n+1));
for(auto i = 0; i < m; i++)
{
for(auto j = 0; j < colIndex; j++)
tmp[j + i * (n + 1)] = x[j + i * n];
tmp[colIndex + i * (n + 1)] = col[i];
for(auto j = colIndex + 1; j < n + 1; j++)
tmp[j + i * (n + 1)] = x[(j - 1) + i * n];
}
std::swap(tmp, x);
};
This is much faster than option 1, but requires extra space for the matrix copy and iterates over all elements.
Are there any other ways to achieve this that would beat the above in speed/space or both?
Example code on ideone: https://ideone.com/iXrPfF
This version is likely to be much faster, especially at scale, and could be the basis for further micro-optimization (if [and only if] really necessary):
// one-time reallocation of the vector to get space for the new column
x.resize(x.size() + col.size());
// we'll start shifting elements over from the right
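double *from = x.data() + m * n; // one past the last original element
const double *src = col.data() + m; // one past the last value of the new column
double *to = from + m; // one past the last slot of the resized vector
size_t R = n - colIndex; // number of cols right of the insert
size_t L = colIndex; // number of cols left of the insert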
while (to != &x[0]) {
for (size_t i = 0; i < R; ++i) *(--to) = *(--from);
*(--to) = *(--src); // insert value from new column
for (size_t i = 0; i < L; ++i) *(--to) = *(--from);
}
This doesn't require any temporary allocation and aside from possible micro-optimizations of the loop it's probably about as fast as it gets. To understand how it works, we can start by observing that the bottom-right element of the original matrix is being shifted m elements to the right in the source vector. Working backwards from the last element, at some point a value from the inserted column vector gets inserted, and subsequent elements from the source vector are now shifted only m - 1 elements to the right. Using that logic we simply construct a 3-phase loop that works from right to left on the source array. The loop iterates m times, once for each row. The three phases of the loop, corresponding to its three lines of code, are:
1. Shift row elements that are "to the right" of the insertion point.
2. Insert the row value from the new column.
3. Shift row elements that are "to the left" of the insertion point (shifting one less place than in phase 1).
There's also serious room for improvement in the naming of the variables, and the algorithm should certainly be encapsulated in its own function with proper input parameters. One possible signature would be:
void insert_column(std::vector<double>& matrix,
size_t rows, size_t columns, size_t insertBefore,
const std::vector<double>& column);
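As a rough sketch of what a body for that signature could look like, here is the same right-to-left 3-phase loop written with indices instead of raw pointers. The precondition asserts and the local names are my own additions, not part of the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inserts `column` in front of column index `insertBefore` of a row-major
// rows x columns matrix, in place. Works from the last row backwards so
// that every original element is moved at most once.
void insert_column(std::vector<double>& matrix,
                   std::size_t rows, std::size_t columns, std::size_t insertBefore,
                   const std::vector<double>& column) {
    assert(matrix.size() == rows * columns);
    assert(column.size() == rows);
    assert(insertBefore <= columns);

    matrix.resize(rows * (columns + 1));
    std::size_t from = rows * columns;   // one past the last original element
    std::size_t to = matrix.size();      // one past the last slot
    const std::size_t right = columns - insertBefore;  // elements right of the insert
    const std::size_t left = insertBefore;             // elements left of the insert
    for (std::size_t r = rows; r-- > 0;) {
        for (std::size_t i = 0; i < right; ++i) matrix[--to] = matrix[--from];
        matrix[--to] = column[r];        // value of the new column for row r
        for (std::size_t i = 0; i < left; ++i) matrix[--to] = matrix[--from];
    }
}
```

With the sample setup from the question, insert_column(x, 2, 3, 1, col) turns {1,2,3,4,5,6} into {1,7,2,3,4,8,5,6}.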
From here there's further room for improvement in making it generic using templates.
And from there, you might observe that the algorithm has possible application beyond matrices. What's really happening is that you're "zippering" two vectors together with a skip and an offset (i.e., starting at element i, insert an element from B into A after every n'th element).
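So what I would go with is something like this (completely untested (tm)), with n the old number of columns and colIndex the insert position from the question:
const size_t rows = col.size();
size_t unplacedEnd = x.size(); // end of the old elements that are not yet in their final place
x.resize(x.size() + rows);
for (size_t processed = 0; processed < rows; ++processed) {
const size_t row = rows - 1 - processed;
// shift the old elements that follow this row's insertion point
// past the remaining placeholder slots, to their new location
auto start = x.begin() + row * n + colIndex;
auto middle = x.begin() + unplacedEnd;
auto end = middle + (rows - processed);
std::rotate(start, middle, end);
// replace the placeholder just in front of the shifted elements with the new value
x[row * n + colIndex + row] = col[row];
unplacedEnd = row * n + colIndex;
}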
The idea being that you go from
[1,2,3,4,5,6] & adding [a,b,c]
Resize:
[1,2,3,4,5,6,x,x,x]
First loop shift:
[1,2,3,4,x,x,x,5,6]
First loop replace
[1,2,3,4,x,x,c,5,6]
Second loop shift
[1,2,x,x,3,4,c,5,6]
and so on.
Since std::rotate is linear and each original element only ever gets moved once, this should be close to linear (only the placeholder slots are moved more than once).
This differs from your option #1, where every insertion has to move everything after it, so the elements of the last row are shifted col.size() times.
An alternative solution is to transpose, insert, and transpose again. However, in-place transposition is non-trivial (https://en.wikipedia.org/wiki/In-place_matrix_transposition). See the implementation here: https://stackoverflow.com/a/9320349

How to reduce execution time in C++ for the following code?

I have written this code which has an execution time of 3.664 sec but the time limit is 3 seconds.
The question is this-
N teams participate in a league cricket tournament on Mars, where each
pair of distinct teams plays each other exactly once. Thus, there are a total
of (N × (N-1))/2 matches. An expert has assigned a strength to each team,
a positive integer. Strangely, the Martian crowds love one-sided matches
and the advertising revenue earned from a match is the absolute value of
the difference between the strengths of the two teams. Given the
strengths of the N teams, find the total advertising revenue earned from all
the matches.
Input format
Line 1 : A single integer, N.
Line 2 : N space-separated integers, the strengths of the N teams.
#include<iostream>
using namespace std;
int main()
{
int n;
cin>>n;
int stren[200000];
for(int a=0;a<n;a++)
cin>>stren[a];
long long rev=0;
for(int b=0;b<n;b++)
{
int pos=b;
for(int c=pos;c<n;c++)
{
if(stren[pos]>stren[c])
rev+=(long long)(stren[pos]-stren[c]);
else
rev+=(long long)(stren[c]-stren[pos]);
}
}
cout<<rev;
}
Can you please give me a solution??
Rewrite your loop as:
std::sort(stren, stren + n); // needs #include <algorithm>
for(int b=0;b<n;b++)
{
rev += (2 * b - n + 1) * static_cast<long long>(stren[b]);
}
Why does it work? Your loops make all pairs of two numbers and add the difference to rev. So in a sorted array, the bth item is subtracted (n-1-b) times and added b times, hence the factor 2 * b - n + 1.
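A self-contained sketch of that idea, for anyone who wants to check it against the brute-force loops. The function name total_revenue and the use of std::vector are my own choices for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sum of |a - b| over all unordered pairs, in O(N log N).
long long total_revenue(std::vector<int> strengths) {
    std::sort(strengths.begin(), strengths.end());
    const long long n = static_cast<long long>(strengths.size());
    long long revenue = 0;
    for (long long b = 0; b < n; ++b) {
        // After sorting, element b is the larger one in b pairs and the
        // smaller one in n-1-b pairs, so its net coefficient is 2b - n + 1.
        revenue += (2 * b - n + 1) * strengths[static_cast<std::size_t>(b)];
    }
    assert(revenue >= 0);  // a sum of absolute differences is never negative
    return revenue;
}
```

For the strengths 3 10 3 5 this gives 23, the same value as summing the six pairwise differences by hand.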
There can be 1 micro optimization that possibly is not needed:
std::sort(stren, stren + n);
for(int b = 0, m = 1 - n; b < n; b++, m += 2)
{
rev += m * static_cast<long long>(stren[b]);
}
In place of the if statement, use
rev += std::abs(stren[pos]-stren[c]);
abs returns the positive difference between two integers. This will be much quicker than an if test and ensuing branching. The (long long) cast is also unnecessary although the compiler will probably optimise that out.
There are other optimisations you could make, but this one should do it. If your abs function is poorly implemented on your system, you could always make use of this fast version for computing the absolute value of i:
(i + (i >> 31)) ^ (i >> 31) for a 32 bit int.
This has no branching at all and would beat even an inline ternary! (But you should use int32_t as your data type; if you have 64 bit int then you'll need to adjust my formula.) But we are in the realms of micro-optimisation here.
for(int b = 0; b < n; b++)
{
for(int c = b; c < n; c++)
{
rev += abs(stren[b]-stren[c]);
}
}
This should give you a speed increase, might be enough.
An interesting approach might be to collapse down the strengths from an array - if that distribution is pretty small.
So:
std::unordered_map<int, int> strengths;
for (int i = 0; i < n; ++i) {
int next;
cin >> next;
++strengths[next];
}
This way, we can reduce the number of things we have to sum:
long long rev = 0;
for (auto a = strengths.begin(); a != strengths.end(); ++a) {
for (auto b = std::next(a); b != strengths.end(); ++b) {
// strength difference times the number of such pairs, computed in long long to avoid int overflow
rev += static_cast<long long>(std::abs(a->first - b->first)) * a->second * b->second;
}
}
cout << rev;
If the strengths tend to be repeated a lot, this could save a lot of cycles.
What exactly we are doing in this problem is: For all combinations of pairs of elements, we are adding up the absolute values of the differences between the elements of the pair. i.e. Consider the sample input
3 10 3 5
Ans (Take only absolute values) = (3-10) + (3-3) + (3-5) + (10-3) + (10-5) + (3-5) = 7 + 0 + 2 + 7 + 5 + 2 = 23
Notice that I have fixed 3, iterated through the remaining elements, found the differences and added them to Ans, then fixed 10, iterated through the remaining elements and so on till the last element
Unfortunately, N(N-1)/2 iterations are required for the above procedure, which wouldn't be ok for the time limit.
Could we better it?
Let's sort the array and repeat this procedure. After sorting, the sample input is now 3 3 5 10
Let's start by fixing the greatest element, 10 and iterating through the array like how we did before (of course, the time complexity is the same)
Ans = (10-3) + (10-3) + (10-5) + (5-3) + (5-3) + (3-3) = 7 + 7 + 5 + 2 + 2 = 23
We could rearrange the above as
Ans = (10)(3)-(3+3+5) + 5(2) - (3+3) + 3(1) - (3)
Notice a pattern? Let's generalize it.
Suppose we have an array of strengths arr[N] of size N indexed from 0
Ans = (arr[N-1])(N-1) - (arr[0] + arr[1] + ... + arr[N-2]) + (arr[N-2])(N-2) - (arr[0] + arr[1] + ... + arr[N-3]) + (arr[N-3])(N-3) - (arr[0] + arr[1] + ... + arr[N-4]) + ... and so on
Right. So let's put this new idea to work. We'll introduce a 'sum' variable. Some basic DP to the rescue.
For i=0 to N-2
sum = sum + arr[i]
Ans = Ans + (arr[i+1]*(i+1)-sum)
That's it: you just have to sort the array and iterate through it once. Excluding the sort, it's down to about N iterations from N(N-1)/2, which is O(N) time. EDIT: With the sort included, that is O(N log N) time overall.
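The running-sum loop can be sketched in C++ as follows. The function name pairwise_abs_diff_sum is hypothetical; note that the loop stops one element early so that arr[i+1] stays inside the array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

long long pairwise_abs_diff_sum(std::vector<long long> arr) {
    std::sort(arr.begin(), arr.end());
    long long sum = 0;  // arr[0] + ... + arr[i]
    long long ans = 0;
    for (std::size_t i = 0; i + 1 < arr.size(); ++i) {
        sum += arr[i];
        // arr[i+1] is paired with the i+1 smaller-or-equal elements before it.
        const long long term = arr[i + 1] * static_cast<long long>(i + 1) - sum;
        assert(term >= 0);  // holds because the array is sorted
        ans += term;
    }
    return ans;
}
```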
Hope it helped!

Number of parallelograms on a NxM grid

I have to solve a problem where, given a grid of size N x M, I have to find the number of parallelograms that "can be put in it" in such a way that every vertex coordinate is an integer.
Here is my code:
/*
~Keep It Simple!~
*/
#include <cstdio>
#define MaxN 2005
int N,M;
long long Paras[MaxN][MaxN]; // Number of parallelograms of Height i and Width j
long long Rects; // Final Number of Parallelograms
int cmmdc(int a,int b)
{
while(b)
{
int aux = b;
b = a -(( a/b ) * b);
a = aux;
}
return a;
}
int main()
{
freopen("paralelograme.in","r",stdin);
freopen("paralelograme.out","w",stdout);
scanf("%d%d",&N,&M);
for(int i=2; i<=N+1; i++)
for(int j=2; j<=M+1; j++)
{
if(!Paras[i][j])
Paras[i][j] = Paras[j][i] = 1LL*(i-2)*(j-2) + i*j - cmmdc(i-1,j-1) -2; // number of parallelograms with all edges on the grid + number of parallelograms with only 2 edges on the grid.
Rects += 1LL*(M-j+2)*(N-i+2) * Paras[j][i]; // each parallelogram can be moved in (M-j+2)(N-i+2) places.
}
printf("%lld", Rects);
}
Example : For a 2x2 grid we have 22 possible parallelograms.
My algorithm works and is correct, but I need to make it a little bit faster. I'd like to know how that is possible.
P.S. I've heard that I should pre-process the greatest common divisor and save it in an array which would reduce the run-time to O(n*m), but I'm not sure how to do that without using the cmmdc ( greatest common divisor ) function.
Make sure N is not smaller than M:
if( N < M ){ swap( N, M ); }
Leverage the symmetry in your loops, you only need to run j from 2 to i:
for(int j=2; j<=min( i, M+1); j++)
you don't need an extra array Paras, drop it. Instead use a temporary variable.
long long temparas = 1LL*(i-2)*(j-2) + i*j - cmmdc(i-1,j-1) -2;
long long t1 = temparas * (M-j+2)*(N-i+2);
Rects += t1;
// check if the inverse case i <-> j must be considered; it has its own placement count
if( i != j && i <= M+1 ) // j <= N+1 is always true because of j <= i <= N+1
Rects += temparas * (M-i+2)*(N-j+2);
Replace this line: b = a -(( a/b ) * b); using the remainder operator:
b = a % b;
Caching the cmmdc results would probably be possible, you can initialize the array using sort of sieve algorithm: Create an 2d array indexed by a and b, put "2" at each position where a and b are multiples of 2, then put a "3" at each position where a and b are multiples of 3, and so on, roughly like this:
int gcd_cache[N][N];
void init_cache(){
for (int u = 1; u < N; ++u){
for (int i = u; i < N; i+=u ) for (int k = u; k < N ; k+=u ){
gcd_cache[i][k] = u;
}
}
}
Not sure if it helps a lot though.
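A compilable sketch of that sieve-style cache, using std::vector instead of a global array. The names GcdTable, build_gcd_table and cached_gcd are made up for this example; because u increases, the last value written to a cell is the greatest common divisor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using GcdTable = std::vector<std::vector<int>>;

// table[a][b] == gcd(a, b) for 1 <= a, b < limit; row and column 0 stay 0.
GcdTable build_gcd_table(std::size_t limit) {
    GcdTable table(limit, std::vector<int>(limit, 0));
    for (std::size_t u = 1; u < limit; ++u) {
        // Every pair of multiples of u has u as a common divisor; larger
        // values of u overwrite smaller ones later on.
        for (std::size_t a = u; a < limit; a += u) {
            for (std::size_t b = u; b < limit; b += u) {
                table[a][b] = static_cast<int>(u);
            }
        }
    }
    return table;
}

int cached_gcd(const GcdTable& table, std::size_t a, std::size_t b) {
    assert(a >= 1 && b >= 1 && a < table.size() && b < table.size());
    return table[a][b];
}
```

For MaxN = 2005 such a table of int takes roughly 16 MB, so whether it pays off depends on the memory limit as much as on the time limit.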
The first comment in your code states "keep it simple", so, in the light of that, why not try solving the problem mathematically and printing the result.
If you select two lines of length N from your grid, you would find the number of parallelograms in the following way:
Select two points next to each other in both lines: there are (N-1)^2 ways of doing this, since you can position the two points in N-1 positions on each of the lines.
Select two points with one space between them in both lines: there are (N-2)^2 ways of doing this.
Select two points with two, three and up to N-2 spaces between them.
The resulting number of combinations would be (N-1)^2+(N-2)^2+(N-3)^2+...+1.
By solving the sum, we get the formula: 1/6*N*(2*N^2-3*N+1). Check WolframAlpha to verify.
Now that you have a solution for two lines, you simply need to multiply it by the number of combinations of order 2 of M, which is M!/(2*(M-2)!).
Thus, the whole formula would be: 1/12*N*(2*N^2-3*N+1)*M!/(M-2)!, where the ! mark denotes factorial, and the ^ denotes a power operator (note that the same sign is not the power operator in C++, but the bitwise XOR operator).
This calculation requires fewer operations than iterating through the matrix.

Porting optimized Sieve of Eratosthenes from Python to C++

Some time ago I used the (blazing fast) primesieve in python that I found here: Fastest way to list all primes below N
To be precise, this implementation:
def primes2(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    n, correction = n-n%6+6, 2-(n%6>1)
    sieve = [True] * (n/3)
    for i in xrange(1,int(n**0.5)/3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[ k*k/3 ::2*k] = [False] * ((n/6-k*k/6-1)/k+1)
            sieve[k*(k-2*(i&1)+4)/3::2*k] = [False] * ((n/6-k*(k-2*(i&1)+4)/6-1)/k+1)
    return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]
Now I can slightly grasp the idea of optimizing by automatically skipping multiples of 2, 3 and so on, but when it comes to porting this algorithm to C++ I get stuck (I have a good understanding of python and a reasonable/bad understanding of C++, but good enough for rock 'n roll).
What I currently have rolled myself is this (isqrt() is just a simple integer square root function):
template <class T>
void primesbelow(T N, std::vector<T> &primes) {
T sievemax = (N-3 + (1-(N % 2))) / 2;
T i;
T sievemaxroot = isqrt(sievemax) + 1;
boost::dynamic_bitset<> sieve(sievemax);
sieve.set();
primes.push_back(2);
for (i = 0; i <= sievemaxroot; i++) {
if (sieve[i]) {
primes.push_back(2*i+3);
for (T j = 3*i+3; j <= sievemax; j += 2*i+3) sieve[j] = 0; // filter multiples
}
}
for (; i <= sievemax; i++) {
if (sieve[i]) primes.push_back(2*i+3);
}
}
This implementation is decent and automatically skips multiples of 2, but if I could port the Python implementation I think it could be much faster (50%-30% or so).
To compare the results (in the hope this question will be successfully answered), the current execution time with N=100000000, g++ -O3 on a Q6600 Ubuntu 10.10 is 1230ms.
Now I would love some help with either understanding what the above Python implementation does or that you would port it for me (not as helpful though).
EDIT
Some extra information about what I find difficult.
I have trouble with the techniques used, like the correction variable, and in general with how it all comes together. A link to a site explaining different Eratosthenes optimizations (apart from the simple sites that say "well you just skip multiples of 2, 3 and 5" and then slam you with a 1000 line C file) would be awesome.
I don't think I would have issues with a 100% direct and literal port, but since after all this is for learning that would be utterly useless.
EDIT
After looking at the code in the original numpy version, it actually is pretty easy to implement and with some thinking not too hard to understand. This is the C++ version I came up with. I'm posting it here in full to help future readers in case they need a pretty efficient primesieve that is not two million lines of code. This primesieve does all primes under 100000000 in about 415 ms on the same machine as above. That's a 3x speedup, better than I expected!
#include <vector>
#include <boost/dynamic_bitset.hpp>
// http://vault.embedded.com/98/9802fe2.htm - integer square root
unsigned short isqrt(unsigned long a) {
unsigned long rem = 0;
unsigned long root = 0;
for (short i = 0; i < 16; i++) {
root <<= 1;
rem = ((rem << 2) + (a >> 30));
a <<= 2;
root++;
if (root <= rem) {
rem -= root;
root++;
} else root--;
}
return static_cast<unsigned short> (root >> 1);
}
// https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
// https://stackoverflow.com/questions/5293238/porting-optimized-sieve-of-eratosthenes-from-python-to-c/5293492
template <class T>
void primesbelow(T N, std::vector<T> &primes) {
T i, j, k, l, sievemax, sievemaxroot;
sievemax = N/3;
if ((N % 6) == 2) sievemax++;
sievemaxroot = isqrt(N)/3;
boost::dynamic_bitset<> sieve(sievemax);
sieve.set();
primes.push_back(2);
primes.push_back(3);
for (i = 1; i <= sievemaxroot; i++) {
if (sieve[i]) {
k = (3*i + 1) | 1;
l = (4*k-2*k*(i&1)) / 3;
for (j = k*k/3; j < sievemax; j += 2*k) {
sieve[j] = 0;
if (j + l < sievemax) sieve[j+l] = 0; // j+l can run past the end of the sieve
}
primes.push_back(k);
}
}
for (i = sievemaxroot + 1; i < sievemax; i++) {
if (sieve[i]) primes.push_back((3*i+1)|1);
}
}
I'll try to explain as much as I can. The sieve array has an unusual indexing scheme; it stores a bit for each number that is congruent to 1 or 5 mod 6. Thus, a number 6*k + 1 will be stored in position 2*k and k*6 + 5 will be stored in position 2*k + 1. The 3*i+1|1 operation is the inverse of that: it takes numbers of the form 2*n and converts them into 6*n + 1, and takes 2*n + 1 and converts it into 6*n + 5 (the +1|1 thing converts 0 to 1 and 3 to 5). The main loop iterates k through all numbers with that property, starting with 5 (when i is 1); i is the corresponding index into sieve for the number k. The first slice update to sieve then clears all bits in the sieve with indexes of the form k*k/3 + 2*m*k (for m a natural number); the corresponding numbers for those indexes start at k^2 and increase by 6*k at each step. The second slice update starts at index k*(k-2*(i&1)+4)/3 (number k * (k+4) for k congruent to 1 mod 6 and k * (k+2) otherwise) and similarly increases the number by 6*k at each step.
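The index scheme described above can be written down as two tiny helpers. This is only an illustration of the mapping; the function names are hypothetical and do not appear in either implementation.

```cpp
#include <cassert>

// Number stored at sieve index i: 1 -> 5, 2 -> 7, 3 -> 11, 4 -> 13, ...
// Even indices 2*n map to 6*n + 1, odd indices 2*n + 1 map to 6*n + 5.
inline unsigned long index_to_number(unsigned long i) {
    return (3 * i + 1) | 1;
}

// Sieve index of a number that is congruent to 1 or 5 mod 6.
// This is why the code can use k*k/3 as the index of k squared.
inline unsigned long number_to_index(unsigned long n) {
    assert(n % 6 == 1 || n % 6 == 5);
    return n / 3;
}
```

For example, k = 5 lives at index 1, and its square 25 lives at index 25 / 3 = 8, which is where the first slice update starts.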
Here's another attempt at an explanation: let candidates be the set of all numbers that are at least 5 and are congruent to either 1 or 5 mod 6. If you multiply two elements in that set, you get another element in the set. Let succ(k) for some k in candidates be the next element (in numerical order) in candidates that is larger than k. In that case, the inner loop of the sieve is basically (using normal indexing for sieve):
for k in candidates:
    for (l = k; ; l += 6) sieve[k * l] = False
    for (l = succ(k); ; l += 6) sieve[k * l] = False
Because of the limitations on which elements are stored in sieve, that is the same as:
for k in candidates:
    for l in candidates where l >= k:
        sieve[k * l] = False
which will remove all multiples of k in candidates (other than k itself) from the sieve at some point (either when the current k was used as l earlier or when it is used as k now).
Piggy-Backing onto Howard Hinnant's response, Howard, you don't have to test numbers in the set of all natural numbers not divisible by 2, 3 or 5 for primality, per se. You need simply multiply each number in the array (except 1, which self-eliminates) times itself and every subsequent number in the array. These overlapping products will give you all the non-primes in the array up to whatever point you extend the deterministic-multiplicative process. Thus the first non-prime in the array will be 7 squared, or 49. The 2nd, 7 times 11, or 77, etc. A full explanation here: http://www.primesdemystified.com
As an aside, you can "approximate" prime numbers. Call the approximate prime P. Here are a few formulas:
P = 2*k+1 // not divisible by 2
P = 6*k + {1, 5} // not divisible by 2, 3
P = 30*k + {1, 7, 11, 13, 17, 19, 23, 29} // not divisible by 2, 3, 5
The property of the set of numbers found by these formulas is that P may not be prime, but all primes (apart from the small primes that were divided out) are in the set P. I.e., if you only test numbers in the set P for primality, you won't miss any.
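To illustrate the second formula, here is a small trial-division test that only tries divisors of the form 6*k + 5 and 6*k + 1. It is a sketch of the candidate-set idea rather than a sieve, and is_prime is just a name chosen for this example.

```cpp
#include <cassert>

bool is_prime(unsigned long n) {
    if (n < 2) return false;
    if (n < 4) return true;  // 2 and 3 are handled separately
    if (n % 2 == 0 || n % 3 == 0) return false;
    // Everything that reaches this point is in the set P = 6*k + {1, 5}.
    assert(n % 6 == 1 || n % 6 == 5);
    // Candidate divisors: d = 5, 11, 17, ... (6*k + 5) and d + 2 (6*k + 1).
    for (unsigned long d = 5; d <= n / d; d += 6) {
        if (n % d == 0 || n % (d + 2) == 0) return false;
    }
    return true;
}
```

Only a third of all integers are ever tried as divisors here; the 30*k formula reduces that further to 8 out of every 30.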
You can reformulate these formulas to:
P = X*k + {-i, -j, -k, k, j, i}
if that is more convenient for you.
Here is some code that uses this technique with a formula for P not divisible by 2, 3, 5, 7.
This link may represent the extent to which this technique can be practically leveraged.

Efficiently computing vector combinations

I'm working on a research problem out of curiosity, and I don't know how to program the logic that I've in mind. Let me explain it to you:
I've four vectors, say for example,
v1 = 1 1 1 1
v2 = 2 2 2 2
v3 = 3 3 3 3
v4 = 4 4 4 4
Now what I want to do is to add them combination-wise, that is,
v12 = v1+v2
v13 = v1+v3
v14 = v1+v4
v23 = v2+v3
v24 = v2+v4
v34 = v3+v4
Up to this step everything is fine. The problem is that I now want to add to each of these vectors one vector from v1, v2, v3, v4 that it does not contain yet. For example:
v3 and v4 haven't been added to v12, so I want to create v123 and v124. Similarly for all the other vectors:
v12 should become:
v123 = v12+v3
v124 = v12+v4
v13 should become:
v132 // This should not occur because I already have v123
v134
v14 should become:
v142 // Cannot occur because I've v124 already
v143 // Cannot occur
v23 should become:
v231 // Cannot occur
v234 ... and so on.
It is important that I do not do all at one step at the start. Like for example, I can do (4 choose 3) 4C3 and finish it off, but I want to do it step by step at each iteration.
How do I program this?
P.S.: I'm trying to work on an modified version of an apriori algorithm in data mining.
In C++, given the following routine:
template <typename Iterator>
inline bool next_combination(const Iterator first,
Iterator k,
const Iterator last)
{
/* Credits: Thomas Draper */
if ((first == last) || (first == k) || (last == k))
return false;
Iterator itr1 = first;
Iterator itr2 = last;
++itr1;
if (last == itr1)
return false;
itr1 = last;
--itr1;
itr1 = k;
--itr2;
while (first != itr1)
{
if (*--itr1 < *itr2)
{
Iterator j = k;
while (!(*itr1 < *j)) ++j;
std::iter_swap(itr1,j);
++itr1;
++j;
itr2 = k;
std::rotate(itr1,j,last);
while (last != j)
{
++j;
++itr2;
}
std::rotate(k,itr2,last);
return true;
}
}
std::rotate(first,k,last);
return false;
}
You can then proceed to do the following:
int main()
{
unsigned int vec_idx[] = {0,1,2,3,4};
const std::size_t vec_idx_size = sizeof(vec_idx) / sizeof(unsigned int);
{
// All unique combinations of two vectors, for example, 5C2
std::size_t k = 2;
do
{
std::cout << "Vector Indices: ";
for (std::size_t i = 0; i < k; ++i)
{
std::cout << vec_idx[i] << " ";
}
std::cout << "\n"; // one combination per line
}
while (next_combination(vec_idx,
vec_idx + k,
vec_idx + vec_idx_size));
}
std::sort(vec_idx,vec_idx + vec_idx_size);
{
// All unique combinations of three vectors, for example, 5C3
std::size_t k = 3;
do
{
std::cout << "Vector Indices: ";
for (std::size_t i = 0; i < k; ++i)
{
std::cout << vec_idx[i] << " ";
}
std::cout << "\n"; // one combination per line
}
while (next_combination(vec_idx,
vec_idx + k,
vec_idx + vec_idx_size));
}
return 0;
}
Note 1: Because of the iterator-oriented interface of the next_combination routine, any STL container that supports forward iteration via iterators can also be used, such as std::vector, std::deque and std::list, just to name a few.
Note 2: This problem is well suited for the application of memoization techniques. In this problem, you can create a map and fill it in with vector sums of given combinations. Prior to computing the sum of a given set of vectors, you can lookup to see if any subset of the sums have already been calculated and use those results. Though you're performing summation which is quite cheap and fast, if the calculation you were performing was to be far more complex and time consuming, this technique would definitely help bring about some major performance improvements.
I think this problem can be solved by marking which combinations have occurred.
My first thought is that you may use a 3-dimension array to mark what combination has happened. But that is not very good.
How about a bit-array (such as an integer) for flagging? Such as:
Num 1 = 2^0 for vector 1
Num 2 = 2^1 for vector 2
Num 4 = 2^2 for vector 3
Num 8 = 2^3 for vector 4
When you compose vectors, just add up their representative numbers. For example, vector 124 will have the value 1 + 2 + 8 = 11. This value is unique for every combination.
This is just my thought. Hope it helps you in some way.
EDIT: Maybe I wasn't clear enough about my idea. I'll try to explain it a bit more clearly:
1) Assign each vector a representative number. This number is the id of the vector, and it is unique. Moreover, the sum of every subset of those numbers is unique, which means that if the sum of k representative numbers is M, we can easily tell which vectors take part in the sum.
We do that by assigning 2^0 to vector 1, 2^1 to vector 2, 2^2 to vector 3, and so on.
For every M = sum (2^x + 2^y + 2^z + ... ) = (2^x OR 2^y OR 2^z OR ...), we know that the vectors (x + 1), (y + 1), (z + 1), ... take part in the sum. This can easily be checked by writing the number in binary.
For example, we know that:
2^0 = 1 (binary)
2^1 = 10 (binary)
2^2 = 100 (binary)
...
So if the sum is 10010 (binary), we know that vector(number: 10) and vector(number: 10000) take part in the sum.
Best of all, the sum here can be calculated with the "OR" operator, which is also easy to see if you write the numbers in binary.
2) Using the above facts, every time before you compute the sum of your vectors, you can add/OR their representative numbers first and keep track of them in something like a lookup array. If the sum already exists in the lookup array, you can skip it. That way you can solve the problem.
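A minimal sketch of how the step-by-step generation could look with such bit masks, assuming at most 32 vectors. The function name next_level and the use of std::set as the lookup structure are my own choices for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Given the masks of one level (combinations of k vectors), returns the
// unique masks of the next level (combinations of k+1 vectors).
// Bit v of a mask is set when vector v+1 is part of the combination.
std::vector<std::uint32_t> next_level(const std::vector<std::uint32_t>& level,
                                      unsigned vectorCount) {
    assert(vectorCount <= 32);
    std::set<std::uint32_t> seen;  // lookup of combinations already produced
    std::vector<std::uint32_t> result;
    for (std::uint32_t mask : level) {
        for (unsigned v = 0; v < vectorCount; ++v) {
            const std::uint32_t bit = std::uint32_t{1} << v;
            if (mask & bit) continue;  // this vector is already in the sum
            const std::uint32_t combined = mask | bit;
            if (seen.insert(combined).second) result.push_back(combined);
        }
    }
    return result;
}
```

Starting from the single-vector masks {1, 2, 4, 8}, one call yields the six pairs, a second call the four triples (v132 is skipped because it has the same mask as v123), and a third call the single mask 15.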
Maybe I am misunderstanding, but isn't this equivalent to generating all subsets (power set) of 1, 2, 3, 4 and then for each element of the power set, summing the vector? For instance:
//This is pseudo C++ since I'm too lazy to type everything
//push back the vectors or pointers to vectors, etc.
vector< vector< int > > v = v1..v4;
//Populate a vector with 1 to 4
vector< int > n = 1..4
//Function that generates the power set {nil, 1, (1,2), (1,3), (1,4), (1,2,3), etc.
vector< vector < int > > power_vec = generate_power_set(n);
//One might want to make a string key by doing a Perl-style join of the subset together by a comma or something...
map< vector < int >,vector< int > > results;
//For each subset, we sum the original vectors together
for subset_iter over power_vec{
//Assumes all the vectors have the same length, can be modified carefully if not.
//Size and zero the result up front (reserve alone would leave it empty)
vector<int> result(length(v1), 0);
for ii=0 to length(v1){
for iter over subset from subset_iter{
result[ii]+=v[iter][ii];
}
}
results[*subset_iter] = result;
}
If that is the idea you had in mind, you still need a power set function, but that code is easy to find if you search for power set. For example,
Obtaining a powerset of a set in Java.
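For completeness, a compilable sketch of the power-set idea in C++. Instead of a separate generate_power_set function it enumerates the subsets as bit masks, which is a substitution on my part; the function name sum_all_subsets and the mask-keyed map are likewise assumptions for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Maps every non-empty subset (encoded as a bit mask, bit k = vector k+1)
// to the element-wise sum of the vectors in that subset.
std::map<unsigned, std::vector<int>>
sum_all_subsets(const std::vector<std::vector<int>>& v) {
    assert(v.size() < 32);  // the masks are plain unsigned values
    std::map<unsigned, std::vector<int>> results;
    if (v.empty()) return results;
    const std::size_t length = v[0].size();
    for (unsigned mask = 1; mask < (1u << v.size()); ++mask) {
        std::vector<int> sum(length, 0);
        for (std::size_t k = 0; k < v.size(); ++k) {
            if (mask & (1u << k)) {
                assert(v[k].size() == length);  // all vectors share one length
                for (std::size_t ii = 0; ii < length; ++ii) sum[ii] += v[k][ii];
            }
        }
        results[mask] = sum;
    }
    return results;
}
```

This computes everything in one go, so it fits the "4C3 and finish it off" case better than the step-by-step requirement in the question.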
1. Maintain a list of all the ways of choosing two values.
2. Create a vector of sets such that the set consists of elements from the original vector with the 4C2 elements. Iterate over the original vectors and for each one, add/create a set with elements from step 1. Maintain a vector of sets and only if the set is not present, add the result to the vector.
3. Sum up the vector of sets you obtained in step 2.
But as you indicated, the easiest is 4C3.
Here is something written in Python. You can adapt it to C++.
import itertools
l1 = ['v1','v2','v3','v4']
res = []
for e in itertools.combinations(l1,2):
    res.append(e)
fin = []
for e in res:
    for l in l1:
        aset = set((e[0],e[1],l))
        if aset not in fin and len(aset) == 3:
            fin.append(aset)
print fin
This would result:
[set(['v1', 'v2', 'v3']), set(['v1', 'v2', 'v4']), set(['v1', 'v3', 'v4']), set(['v2', 'v3', 'v4'])]
This is the same result as 4C3.