Divide array into smaller consecutive parts such that NEO value is maximal - c++

At this year's Bubble Cup (now finished) there was a problem called NEO, which I couldn't solve. It asks:
Given an array of n integers, divide it into one or more parts, where each part is a run of consecutive elements. The NEO value of a division is the sum of the values of its parts, and the value of a part is the sum of its elements multiplied by its length.
Example: for the array [ 2 3 -2 1 ], the division [2 3] [-2 1] gives NEO = (2 + 3) * 2 + (-2 + 1) * 2 = 10 - 2 = 8.
The number of elements in the array is smaller than 10^5, and the numbers are integers between -10^6 and 10^6.
I tried a divide-and-conquer approach: keep splitting the array into two parts as long as the split increases the NEO value, otherwise return the NEO of the whole array. Unfortunately the algorithm has O(N^2) worst-case complexity (my implementation is below), so I'm wondering whether there is a better solution.
EDIT: My (greedy) algorithm doesn't work. For example, for [1,2,-6,2,1] it returns the whole array, whereas the maximal NEO value comes from the parts [1,2],[-6],[2,1], which give (1+2)*2+(-6)+(1+2)*2=6.
#include <iostream>
long long int maxInterval(long long int suma[],int first,int N) // returns long long: NEO values do not fit in int
{
long long int max = -1000000000000000000LL;
long long int curr;
if(first==N) return 0;
int k;
for(int i=first;i<N;i++)
{
if(first>0) curr = (suma[i]-suma[first-1])*(i-first+1)+(suma[N-1]-suma[i])*(N-1-i); // Split the array into elements from [first..i] and [i+1..N-1] store the corresponding NEO value
else curr = suma[i]*(i-first+1)+(suma[N-1]-suma[i])*(N-1-i); // Same except that here first = 0 so suma[first-1] doesn't exist
if(curr > max) max = curr,k=i; // find the maximal NEO value for splitting into two parts
}
if(k==N-1) return max; // If the max when we take the whole array then return the NEO value of the whole array
else
{
return maxInterval(suma,first,k+1)+maxInterval(suma,k+1,N); // Split the 2 parts further if needed and return their sum
}
}
int main() {
int T;
std::cin >> T;
for(int j=0;j<T;j++) // Iterate over all the test cases
{
int N;
long long int NEO[100010]; // Values, could be long int but just to be safe
long long int suma[100010]; // sum[i] = sum of NEO values from NEO[0] to NEO[i]
long long int sum=0;
int k;
std::cin >> N;
for(int i=0;i<N;i++)
{
std::cin >> NEO[i];
sum+=NEO[i];
suma[i] = sum;
}
std::cout << maxInterval(suma,0,N) << std::endl;
}
return 0;
}

This is not a complete solution but should provide some helpful direction.
1. Combining two groups that each have a positive sum (or where one sum is positive and the other non-negative) always yields a bigger NEO than leaving them separate:
m * a + n * b < (m + n) * (a + b) where a, b > 0 (or a > 0, b >= 0); m and n are the subarray lengths and a and b their sums
2. Combining a group with a negative sum with an entire group of non-negative numbers always yields a greater NEO than combining it with only part of the non-negative group. But excluding the group with the negative sum could yield an even greater NEO:
[1, 1, 1, 1] [-2] => m * a + 1 * (-b)
Now, imagine we gradually move the dividing line to the left, increasing the sum that -b is combined with. While the sum of the right group is negative, the NEO of the left group keeps decreasing. But once the sum of the right group becomes positive, then by our first assertion (see 1.) combining the two groups always gives a greater NEO than leaving them separate.
3. Combining negative numbers alone in sequence will always yield a smaller NEO than leaving them separate:
-a - b - c ... = -1 * (a + b + c ...)
l * (-a - b - c ...) = -l * (a + b + c ...)
-l * (a + b + c ...) < -1 * (a + b + c ...) where l > 1; a, b, c ... > 0
O(n^2) time, O(n) space JavaScript code:
function f(A){
A.unshift(0);
let negatives = [];
let prefixes = new Array(A.length).fill(0);
let m = new Array(A.length).fill(0);
for (let i=1; i<A.length; i++){
if (A[i] < 0)
negatives.push(i);
prefixes[i] = A[i] + prefixes[i - 1];
m[i] = i * (A[i] + prefixes[i - 1]);
for (let j=negatives.length-1; j>=0; j--){
let negative = prefixes[negatives[j]] - prefixes[negatives[j] - 1];
let prefix = (i - negatives[j]) * (prefixes[i] - prefixes[negatives[j]]);
m[i] = Math.max(m[i], prefix + negative + m[negatives[j] - 1]);
}
}
return m[m.length - 1];
}
console.log(f([1, 2, -5, 2, 1, 3, -4, 1, 2]));
console.log(f([1, 2, -4, 1]));
console.log(f([2, 3, -2, 1]));
console.log(f([-2, -3, -2, -1]));
Update
This blog shows that we can transform the dp queries from
dp_i = sum_i*i + max(for j < i) of ((dp_j + sum_j*j) + (-j*sum_i) + (-i*sum_j))
to
dp_i = sum_i*i + max(for j < i) of (dp_j + sum_j*j, -j, -sum_j) ⋅ (1, sum_i, i)
which means that at each iteration we could look for the already seen vector that generates the largest dot product with our current information. The math alluded to involves a convex hull and farthest-point queries, which are beyond my reach to implement at this point, but I will make a study of them.
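As a baseline for checking any faster solution, the recurrence behind these formulas, dp_i = max over j < i of dp_j + (sum_i - sum_j) * (i - j), can be evaluated directly in O(n^2) time. The following is a minimal C++ sketch of that plain quadratic DP only, not of the convex-hull optimization the blog describes; the function name maxNeo is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Quadratic DP: dp[i] = max over j < i of dp[j] + (sum[i] - sum[j]) * (i - j),
// where sum[i] is the sum of the first i elements and dp[0] = 0.
// maxNeo is a hypothetical name chosen for this sketch.
long long maxNeo(const std::vector<long long>& a) {
    assert(!a.empty());
    const std::size_t n = a.size();
    std::vector<long long> sum(n + 1, 0), dp(n + 1, 0);
    for (std::size_t i = 1; i <= n; ++i) {
        sum[i] = sum[i - 1] + a[i - 1];
        // Choice j = 0: a single part covering the first i elements.
        dp[i] = sum[i] * static_cast<long long>(i);
        for (std::size_t j = 1; j < i; ++j) {
            const long long len = static_cast<long long>(i - j);
            dp[i] = std::max(dp[i], dp[j] + (sum[i] - sum[j]) * len);
        }
    }
    return dp[n];
}
```

For the array [1, 2, -6, 2, 1] from the question this returns 6. With n up to 10^5 the double loop is too slow for the contest limits, so it is only useful for validating an optimized version on small inputs.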

Related

Compute series without being able to store values?

Problem statement[here]
Let S be an infinite sequence of integers:
S0 = a;
S1 = b;
Si = |Si-2 - Si-1| for all i >= 2.
You are given two integers a and b. You must answer some queries about the n-th element of the sequence (i.e. print the n-th number in the sequence, S(n)).
( 0 <= a,b <= 10^18),( 1 <= q <= 100000 )
What I Tried(This would give a runtime error) :
#include <bits/stdc++.h>
using namespace std;
long long int q,a,b,arr[100002];/*Can't declare an array of required size */
int main() {
// your code goes here
scanf("%lld%lld",&a,&b);
arr[0]=a,arr[1]=b;
scanf("%d",&q);
int p[100002];
long long int m = -1;//stores max index asked
for(int i=0;i<q;i++)
{
scanf("%lld",&p[i]);
m = (m>p[i])?m:p[i];
}
for(int i=2;i<=m;i++)//calculates series upto that index
{
arr[i]=abs(arr[i-1]-arr[i-2]);
}
for(int i=0;i<q;i++)
{
printf("%lld\n",arr[p[i]]);
}
return 0;
}
Given: q_i fits in a 64-bit integer. Since the index can be very large, I can't declare an array that big. How should I approach this problem (brute force would give TLE)? Thanks!
HA! There is a solution that doesn't require (complete) iteration:
Consider some values Si and Sj, where i, j > 1. From the way the numbers of the sequence are built (using the absolute value), we can conclude that both numbers are non-negative.
Then the absolute value of their difference is guaranteed to be less than or equal to the larger of the two.
Assuming it is strictly less than the larger of the two, within the next two steps, the larger value of the original values will go "out of scope". From that we can conclude that in this case, the numbers of the sequence are getting smaller and smaller.
(*) If the difference is equal to the larger one, then the other number must have been 0. In the next step, one of two things might happen:
a) The larger goes out of scope, then the next two numbers are the calculated difference (which is equal to the larger) and 0, which will yield again the larger value. Then we have the same situation as in ...
b) The zero goes out of scope. Then the next step will compute the difference between the larger and the calculated difference (which is equal to the larger), resulting in 0. In the next step, this leads back to the original (*) situation.
Result: A repeating pattern of L, L, 0, ...
Some examples:
3, 1, 2, 1, 1, 0, 1, 1, 0, ...
1, 3, 2, 1, 1, 0, 1, 1, 0, ...
3.5, 1, 2.5, 1.5, 1, .5, .5, 0, .5, .5, 0, ...
.1, 1, .9, .1, .8, .7, .1, .6, .5, .1, .4, .3, .1, .2, .1, .1, 0, ...
Applying that to the code: As soon as one value is 0, no more iteration is required, the next two numbers will be the same as the previous, then there will be again a 0 and so on:
// A and B could also be negative, that wouldn't change the algorithm,
// but this way the implementation is easier
uint64_t sequence(uint64_t A, uint64_t B, size_t n) {
if (n == 0) {
return A;
}
uint64_t prev[2] = {A, B};
for (size_t it = 1u; it < n; ++it) {
uint64_t next =
(prev[0] > prev[1]) ?
(prev[0] - prev[1]) :
(prev[1] - prev[0]);
if (next == 0) {
size_t remaining = n - it - 1;
if (remaining % 3 == 0) {
return 0;
}
return prev[0]; // same as prev[1]
}
prev[0] = prev[1];
prev[1] = next;
}
return prev[1];
}
Live demo here (play with the a and b values if you like).
If you have repeated queries for the same A and B, you could cache all values until next == 0 in a std::vector, giving you really constant time for the following queries.
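The caching idea could look roughly like the sketch below. The class name SequenceCache and its interface are assumptions made for illustration, and the sketch also assumes that the prefix up to the first computed zero is short enough to store, which is not guaranteed for inputs such as a = 10^18, b = 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Caches S(0..z), where z is the index of the first computed zero, and answers
// later indices from the repeating 0, L, L pattern. SequenceCache is a
// hypothetical name; the prefix is assumed to be short enough to fit in memory.
class SequenceCache {
public:
    SequenceCache(std::uint64_t a, std::uint64_t b) {
        values_.push_back(a);
        values_.push_back(b);
        for (;;) {
            const std::uint64_t x = values_[values_.size() - 2];
            const std::uint64_t y = values_[values_.size() - 1];
            const std::uint64_t next = (x > y) ? (x - y) : (y - x);
            values_.push_back(next);
            if (next == 0) break;  // from here on the pattern is 0, L, L, 0, ...
        }
    }

    std::uint64_t at(std::uint64_t n) const {
        assert(values_.size() >= 3);  // a, b and at least one computed term
        const std::uint64_t zeroIndex = values_.size() - 1;
        if (n <= zeroIndex) return values_[static_cast<std::size_t>(n)];
        // values_[zeroIndex - 1] is L, the value repeated just before the zero.
        return ((n - zeroIndex) % 3 == 0) ? 0 : values_[zeroIndex - 1];
    }

private:
    std::vector<std::uint64_t> values_;
};
```

After construction every query is a single index computation, so a batch of queries for the same a and b costs one pass over the prefix plus constant work per query.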
I'm also pretty sure that there's a pattern before the sequence reaches 0, but I wasn't able to find it.
I just noticed that I missed that it should be the absolute value of the difference ...
If it's fast enough, here is an iterative version:
// deciding on a concrete type is hard ...
uint64_t sequence (uint64_t A, uint64_t B, uint64_t n) {
if (n == 0) {
return A;
}
uint64_t prev[2] = {A, B};
for (auto it = 1u; it < n; ++it) {
auto next =
(prev[0] > prev[1]) ?
(prev[0] - prev[1]) :
(prev[1] - prev[0]);
prev[0] = prev[1];
prev[1] = next;
}
return prev[1];
}
As you see you don't need to store all values, only the last two numbers are needed to compute the next one.
If this isn't fast enough you could add memoization: store the pairs of prev values in an ordered std::map (mapping n to those pairs). You can then start from the entry with the next lower value of n instead of from the beginning. Of course you then need to manage that map, too: keep it small and filled with "useful" values.
This is not a programming problem, it's an algorithmic one. Let's look at the first numbers of that sequence:
a
b
a-b
b-(a-b) = 2b-a
(a-b)-(b-(a-b)) = 2(a-b)-b = 2a-3b
2b-a-(2a-3b) = 5b-3a
2a-3b-(5b-3a) = 5a-8b
...
Looking only at the absolute value of the coefficients shows ...
b: 0 1 1 2 3 5 8 ...
a: (1) 0 1 1 2 3 5 ...
... that this is about the Fibonacci sequence. Then, there's also the sign, but this is pretty easy:
b: - + - + - ...
a: + - + - + ...
So the nth number in your sequence should be equal to
f(0) = a
f(n) = (-1)^n * fib(n-1) * a +
(-1)^(n-1) * fib(n) * b
Of course now we have to calculate the nth Fibonacci number, but fortunately there's already a solution for that:
fib(n) = (phi^n - chi^n) / (phi - chi)
with
phi = (1 + sqrt(5)) / 2
chi = 1 - phi
So, bringing that to code:
unsigned long fib(unsigned n) {
double const phi = (1 + sqrt(5)) / 2.0;
double const chi = 1 - phi;
return (pow(phi, n) - pow(chi, n)) / (phi - chi);
}
long sequence (long A, long B, unsigned n) {
if(n ==0) {
return A;
}
auto part_a = fib(n-1) * A;
auto part_b = fib (n) * B;
return (n % 2 == 0) ? (part_a - part_b) : (part_b - part_a);
}
Some live demo is here, but this gets problematic when approaching larger numbers (I suspect the fib getting incorrect).
The demo contains also the iterative version of the sequence, as control. If that's fast enough for you, use that instead. No need to store anything more than the last two numbers.
To improve this further, you could use a lookup table with holes for the Fibonacci numbers, i.e. remembering every tenth (and their successor) number of the sequence.
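Since the closed form goes through double arithmetic, its results stop being exact once the Fibonacci numbers outgrow the precision of a double. One way to cross-check it is an exact integer version. The sketch below uses plain iteration rather than the closed form; the name fibExact is hypothetical, and the result is only valid for n <= 93, the largest index whose value fits in an unsigned 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>

// Exact Fibonacci numbers by simple iteration (fib(0) = 0, fib(1) = 1).
// fibExact is a hypothetical name; results are exact only for n <= 93.
std::uint64_t fibExact(unsigned n) {
    assert(n <= 93);
    std::uint64_t prev = 0, curr = 1;  // fib(0), fib(1)
    for (unsigned i = 0; i < n; ++i) {
        const std::uint64_t next = prev + curr;  // may wrap only in the unused slot
        prev = curr;
        curr = next;
    }
    return prev;
}
```

Comparing fibExact(n) with the floating-point fib(n) for increasing n shows where the closed form starts to drift.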

How to reduce execution time in C++ for the following code?

I have written this code which has an execution time of 3.664 sec but the time limit is 3 seconds.
The question is this-
N teams participate in a league cricket tournament on Mars, where each
pair of distinct teams plays each other exactly once. Thus, there are a total
of (N × (N-1))/2 matches. An expert has assigned a strength to each team,
a positive integer. Strangely, the Martian crowds love one-sided matches
and the advertising revenue earned from a match is the absolute value of
the difference between the strengths of the two teams. Given the
strengths of the N teams, find the total advertising revenue earned from all
the matches.
Input format
Line 1 : A single integer, N.
Line 2 : N space-separated integers, the strengths of the N teams.
#include<iostream>
using namespace std;
int main()
{
int n;
cin>>n;
int stren[200000];
for(int a=0;a<n;a++)
cin>>stren[a];
long long rev=0;
for(int b=0;b<n;b++)
{
int pos=b;
for(int c=pos;c<n;c++)
{
if(stren[pos]>stren[c])
rev+=(long long)(stren[pos]-stren[c]);
else
rev+=(long long)(stren[c]-stren[pos]);
}
}
cout<<rev;
}
Can you please give me a solution??
Rewrite your loop as:
std::sort(stren, stren + n); // from <algorithm>
for(int b=0;b<n;b++)
{
    rev += (2 * b - n + 1) * static_cast<long long>(stren[b]);
}
Live code here
Why does it work? Your loops make all pairs of 2 numbers and add the absolute difference to rev. So in a sorted array, the bth item (counting from 0) is subtracted (n-1-b) times and added b times. Hence the factor 2 * b - n + 1.
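The counting argument above can be put into a self-contained function. This is a minimal sketch; the name totalRevenue and the use of std::vector instead of the fixed-size array are choices made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sum of |s[i] - s[j]| over all pairs i < j. In sorted order the element at
// index b is added b times and subtracted (n - 1 - b) times.
// totalRevenue is a hypothetical name for this sketch.
long long totalRevenue(std::vector<int> strengths) {
    std::sort(strengths.begin(), strengths.end());
    const long long n = static_cast<long long>(strengths.size());
    long long rev = 0;
    for (long long b = 0; b < n; ++b) {
        rev += (2 * b - n + 1) * strengths[static_cast<std::size_t>(b)];
    }
    assert(rev >= 0);  // a sum of absolute differences is never negative
    return rev;
}
```

For the strengths 3 10 3 5 the sorted order is 3 3 5 10 with factors -3, -1, 1, 3, giving -9 - 3 + 5 + 30 = 23.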
There is one micro-optimization, which is possibly not needed:
std::sort(stren, stren + n);
for(int b = 0, m = 1 - n; b < n; b++, m += 2)
{
    rev += m * static_cast<long long>(stren[b]);
}
In place of the if statement, use
rev += std::abs(stren[pos]-stren[c]);
abs returns the absolute value of its argument, here the non-negative difference between the two integers. This can be quicker than an if test and the ensuing branching. The (long long) cast is also unnecessary, although the compiler will probably optimise that out.
There are other optimisations you could make, but this one should do it. If your abs function is poorly implemented on your system, you could always make use of this fast version for computing the absolute value of i:
(i + (i >> 31)) ^ (i >> 31) for a 32 bit int.
This has no branching at all and would beat even an inline ternary! (But you should use int32_t as your data type; if you have 64 bit int then you'll need to adjust my formula.) But we are in the realms of micro-optimisation here.
for(int b = 0; b < n; b++)
{
for(int c = b; c < n; c++)
{
rev += abs(stren[b]-stren[c]);
}
}
This should give you a speed increase, might be enough.
An interesting approach might be to collapse the strengths from an array down into a map of counts, if the number of distinct strengths is fairly small.
So:
// needs <unordered_map>, <iterator> (std::next) and <cstdlib> (std::abs)
std::unordered_map<int, long long> strengths; // strength -> number of teams
for (int i = 0; i < n; ++i) {
    int next;
    cin >> next;
    ++strengths[next];
}
This way, we can reduce the number of things we have to sum:
long long rev = 0;
for (auto a = strengths.begin(); a != strengths.end(); ++a) {
    for (auto b = std::next(a); b != strengths.end(); ++b) {
        rev += std::abs(a->first - b->first) * (a->second * b->second);
        //     ^^^^ strength difference ^^^^    ^^ number of such pairs ^^
    }
}
cout << rev;
If the strengths tend to be repeated a lot, this could save a lot of cycles.
What exactly we are doing in this problem is: For all combinations of pairs of elements, we are adding up the absolute values of the differences between the elements of the pair. i.e. Consider the sample input
3 10 3 5
Ans (Take only absolute values) = (3-10) + (3-3) + (3-5) + (10-3) + (10-5) + (3-5) = 7 + 0 + 2 + 7 + 5 + 2 = 23
Notice that I have fixed 3, iterated through the remaining elements, found the differences and added them to Ans, then fixed 10, iterated through the remaining elements and so on till the last element
Unfortunately, N(N-1)/2 iterations are required for the above procedure, which wouldn't be ok for the time limit.
Could we better it?
Let's sort the array and repeat this procedure. After sorting, the sample input is now 3 3 5 10
Let's start by fixing the greatest element, 10 and iterating through the array like how we did before (of course, the time complexity is the same)
Ans = (10-3) + (10-3) + (10-5) + (5-3) + (5-3) + (3-3) = 7 + 7 + 5 + 2 + 2 = 23
We could rearrange the above as
Ans = (10)(3)-(3+3+5) + 5(2) - (3+3) + 3(1) - (3)
Notice a pattern? Let's generalize it.
Suppose we have an array of strengths arr[N] of size N indexed from 0
Ans = (arr[N-1])(N-1) - (arr[0] + arr[1] + ... + arr[N-2]) + (arr[N-2])(N-2) - (arr[0] + arr[1] + ... + arr[N-3]) + (arr[N-3])(N-3) - (arr[0] + arr[1] + ... + arr[N-4]) + ... and so on
Right. So let's put this new idea to work. We'll introduce a 'sum' variable. Some basic DP to the rescue.
For i=0 to N-2
    sum = sum + arr[i]
    Ans = Ans + (arr[i+1]*(i+1)-sum)
That's it, you just have to sort the array and iterate only once through it. Excluding the sorting part, it's down to about N iterations from N(N-1)/2, which is O(N) time. EDIT: With the sort included, that is O(N log N) time overall.
Hope it helped!

Finding the smallest possible number which cannot be represented as sum of 1,2 or other numbers in the sequence

I am a newbie in C++ and need logical help in the following task.
Given a sequence of n positive integers (n < 10^6; each given integer is less than 10^6), write a program to find the smallest positive integer, which cannot be expressed as a sum of 1, 2, or more items of the given sequence (i.e. each item could be taken 0 or 1 times). Examples: input: 2 3 4, output: 1; input: 1 2 6, output: 4
I cannot seem to construct the logic out of it, why the last output is 4 and how to implement it in C++, any help is greatly appreciated.
Here is my code so far:
#include<iostream>
using namespace std;
const int SIZE = 3;
int main()
{
//Lowest integer by default
int IntLowest = 1;
int x = 0;
//Our sequence numbers
int seq;
int sum = 0;
int buffer[SIZE];
//Loop through array inputting sequence numbers
for (int i = 0; i < SIZE; i++)
{
cout << "Input sequence number: ";
cin >> seq;
buffer[i] = seq;
sum += buffer[i];
}
int UpperBound = sum + 1;
int a = buffer[x] + buffer[x + 1];
int b = buffer[x] + buffer[x + 2];
int c = buffer[x + 1] + buffer[x + 2];
int d = buffer[x] + buffer[x + 1] + buffer[x + 2];
for (int y = IntLowest - 1; y < UpperBound; y++)
{
//How should I proceed from here?
}
return 0;
}
What the answer of Voreno suggests is in fact solving the 0-1 knapsack problem (http://en.wikipedia.org/wiki/Knapsack_problem#0.2F1_Knapsack_Problem). If you follow the link you can read how it can be done without constructing all subsets of the initial set (there are too many of them, 2^n). And it would work if the constraints were a bit smaller, like 10^3.
But with n = 10^6 it still requires too much time and space. There is no need to solve the knapsack problem, though: we just need to find the first number we can't get.
The better solution would be to sort the numbers and then iterate through them once, finding for each prefix of your array a number x, such that with that prefix you can get all numbers in interval [1..x]. The minimal number that we cannot get at this point is x + 1. When you consider the next number a[i] you have two options:
a[i] <= x + 1, then you can get all numbers up to x + a[i],
a[i] > x + 1, then you cannot get x + 1 and you have your answer.
Example:
you are given numbers 1, 4, 12, 2, 3.
You sort them (and get 1, 2, 3, 4, 12), start with x = 0, consider each element and update x the following way:
1 <= x + 1, so x = 0 + 1 = 1.
2 <= x + 1, so x = 1 + 2 = 3.
3 <= x + 1, so x = 3 + 3 = 6.
4 <= x + 1, so x = 6 + 4 = 10.
12 > x + 1, so we have found the answer and it is x + 1 = 11.
(Edit: fixed off-by-one error, added example.)
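The sort-and-scan procedure above translates almost line for line into code. A minimal sketch, assuming the input is already in a vector; the name smallestUnreachable is made up for illustration, and a 64-bit type is used because the running sum can reach about 10^12 under the stated limits.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Smallest positive integer that is not a subset sum of `a`.
// Invariant: every value in [1..x] is reachable using the elements seen so far.
// smallestUnreachable is a hypothetical name for this sketch.
long long smallestUnreachable(std::vector<long long> a) {
    std::sort(a.begin(), a.end());
    long long x = 0;
    for (long long v : a) {
        assert(v > 0);         // the problem guarantees positive integers
        if (v > x + 1) break;  // x + 1 cannot be formed, so it is the answer
        x += v;                // reachable range grows from [1..x] to [1..x + v]
    }
    return x + 1;
}
```

With the inputs 1, 4, 12, 2, 3 from the example this walks through x = 1, 3, 6, 10 and stops at 12, returning 11.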
I think this can be done in O(n) time and O(log2(n)) memory.
This assumes that an O(1) implementation of BSR (index of the highest set bit, i.e. floor(log2(x))) is used.
Algorithm:
1 Create an array of log2(MAXINT) buckets, 20 in the case of 10^6. Each bucket contains a sum and a min value (init: min = 2^(i+1)-1, sum = 0). (Lazy init may be used for small n.)
2 Make one pass over the input, storing each value in buckets[bsr(x)].
for (x : buffer) // iterate input
    buckets[bsr(x)].min = min(buckets[bsr(x)].min, x)
    buckets[bsr(x)].sum += x
3 Iterate over the buckets, maintaining unreachable:
long long unreachable = 1 // 0 is always reachable; the sums can exceed 32 bits
for(b : buckets)
    if (unreachable >= b.min)
        unreachable += b.sum
    else
        break
return unreachable
This works because, assuming we are at bucket i, there are two cases to consider:
unreachable >= b.min is true: because this bucket contains values in the range [2^i...2^(i+1)-1], we have 2^i <= b.min, and in turn b.min <= unreachable. Therefore unreachable+b.min >= 2^(i+1). This means that all values in the bucket may be added (after adding b.min, all the other values in the bucket are smaller than the new unreachable), i.e. unreachable += b.sum.
unreachable >= b.min is false: this means that b.min (the smallest number in the remaining sequence) is greater than unreachable. Thus we need to return unreachable.
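The three steps above can be sketched in C++ as follows. The names highestBit, Bucket and smallestUnreachableBuckets are assumptions made for illustration, a portable loop stands in for the O(1) BSR instruction, and the sketch assumes every input value is positive and below 2^20, which covers the stated limit of 10^6.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// floor(log2(v)) for v > 0. A portable loop is used here; an O(1) BSR
// instruction, as assumed in the text, would replace it.
static int highestBit(long long v) {
    int bit = 0;
    while (v >>= 1) ++bit;
    return bit;
}

struct Bucket {
    long long min;  // smallest value seen in this bucket
    long long sum;  // sum of all values in this bucket
};

// Hypothetical name; same result as the sort-based scan, without sorting.
long long smallestUnreachableBuckets(const std::vector<long long>& a) {
    const int kBuckets = 20;  // enough for values below 2^20
    std::vector<Bucket> buckets(kBuckets);
    for (int i = 0; i < kBuckets; ++i) {
        buckets[i].min = (1LL << (i + 1)) - 1;  // largest value bucket i can hold
        buckets[i].sum = 0;
    }
    for (long long v : a) {
        assert(v > 0 && v < (1LL << kBuckets));
        Bucket& b = buckets[highestBit(v)];
        b.min = std::min(b.min, v);
        b.sum += v;
    }
    long long unreachable = 1;  // 0 is always reachable
    for (const Bucket& b : buckets) {
        if (unreachable < b.min) break;
        unreachable += b.sum;
    }
    return unreachable;
}
```

An empty bucket keeps its initial min of 2^(i+1)-1 and a sum of 0, so it either adds nothing or ends the scan, which matches the two cases discussed above.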
The output of the second input is 4 because that is the smallest positive number that cannot be expressed as a sum of 1,2 or 6 if you can take each item only 0 or 1 times. I hope this can help you understand more:
You have 3 items in that list: 1,2,6
Starting from the smallest positive integer, you start checking if that integer can be the result of the sum of 1 or more numbers of the given sequence.
1 = 1+0+0
2 = 0+2+0
3 = 1+2+0
4 cannot be expressed as the sum of any selection of the items in the list (1,2,6). Thus 4 is the smallest positive integer which cannot be expressed as a sum of the items of that given sequence.
The last output is 4 because:
1 = 1
2 = 2
1 + 2 = 3
1 + 6 = 7
2 + 6 = 8
1 + 2 + 6 = 9
Therefore, the lowest integer that cannot be represented by any combination of your inputs (1, 2, 6) is 4.
What the question is asking:
Part 1. Find the largest possible integer that can be represented by your input numbers (ie. the sum of all the numbers you are given), that gives the upper bound
UpperBound = sum(all_your_inputs) + 1
Part 2. Find all the integers you can get, by combining the different integers you are given. Ie if you are given a, b and c as integers, find:
a + b, a + c, b + c, and a + b + c
Part 2) + the list of integers, gives you all the integers you can get using your numbers.
cycle for each integer from 1 to UpperBound
for i = 1 to UpperBound
    if i not = a number in the list from point 2)
        i = your smallest integer
        break
This is a clumsy way of doing it, but I'm sure that with some maths it's possible to find a better way?
EDIT: Improved solution
//sort your input numbers from smallest to largest
input_numbers = sort(input_numbers)
//create a list of integers that have been tried
tried_ints = //empty list
for each input in input_numbers
    //build combinations of sums of this input and any of the previous inputs
    //add the combinations to tried_ints, if not tried before
    for 1 to input
        //check whether there is a gap in tried_ints
        if there_is_gap
            //stop the program, return the smallest integer
            //the first gap number is the smallest integer

Find two missing numbers

We have a machine with O(1) memory. In the first pass we feed it n numbers (one by one); then two of the numbers are removed and we feed it the remaining n-2 numbers.
Write an algorithm that finds the missing numbers.
It can be done with O(1) memory.
You only need a few integers to keep track of some running sums. The integers do not require log n bits (where n is the number of input integers), they only require 2b+1 bits, where b is the number of bits in an individual input integer.
When you first read the stream add all the numbers and all of their squares, i.e. for each input number, n, do the following:
sum += n
sq_sum += n*n
Then on the second stream do the same thing with two different variables, sum2 and sq_sum2. Writing s = sum - sum2 and q = sq_sum - sq_sum2, and calling the missing numbers a and b with a >= b, do the following maths:
s = a + b
q = a^2 + b^2
(a + b)(a + b) = a^2 + b^2 + 2ab
(a + b)(a + b) - (a^2 + b^2) = 2ab
s*s - q = 2ab
(a - b)(a - b) = a^2 + b^2 - 2ab
= q - (s*s - q) = 2q - s*s
sqrt(2q - s*s) = sqrt((a - b)(a - b)) = a - b
((a + b) - (a - b)) / 2 = b
(a + b) - b = a
You need 2b+1 bits in all intermediate results because you are storing products of two input integers, and in one case multiplying one of those values by two.
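The derivation above can be sketched as a function that keeps only four running totals. The name findTwoMissing is hypothetical, the two vectors merely stand in for the two passes over the stream, and the totals are assumed to fit in unsigned 64-bit integers.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <utility>
#include <vector>

// Recovers the two missing values {a, b} with a >= b from the full stream and
// the stream with two values removed, using four running totals.
// findTwoMissing is a hypothetical name; totals are assumed to fit in 64 bits.
std::pair<std::uint64_t, std::uint64_t> findTwoMissing(
        const std::vector<std::uint64_t>& first,
        const std::vector<std::uint64_t>& second) {
    std::uint64_t sum = 0, sqSum = 0, sum2 = 0, sqSum2 = 0;
    for (std::uint64_t v : first)  { sum += v;  sqSum += v * v; }
    for (std::uint64_t v : second) { sum2 += v; sqSum2 += v * v; }

    const std::uint64_t s = sum - sum2;      // a + b
    const std::uint64_t q = sqSum - sqSum2;  // a^2 + b^2
    const std::uint64_t d2 = 2 * q - s * s;  // (a - b)^2
    // Integer square root: start from a floating-point estimate, then correct.
    std::uint64_t d = static_cast<std::uint64_t>(
        std::sqrt(static_cast<long double>(d2)));
    while (d * d > d2) --d;
    while ((d + 1) * (d + 1) <= d2) ++d;
    assert(d * d == d2);  // holds when exactly two values were removed

    const std::uint64_t b = (s - d) / 2;
    return std::make_pair(s - b, b);
}
```

For the numbers 1..10 with 3 and 7 removed, s = 10 and q = 58, so (a - b)^2 = 116 - 100 = 16 and the result is the pair (7, 3).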
Assuming the numbers range over 1..N and 2 of them are missing, x and y, you can do the following:
Use Gauss's formula: sum = N(N+1)/2
sum - actual_sum = x + y
Use the product of the numbers: product = 1*2..*N = N!
product / actual_product = x * y
Solve for x and y and you have your missing numbers.
In short: go through the array, summing the elements to get actual_sum and multiplying them to get actual_product. Then solve the two equations for x and y. Note that N! overflows fixed-width integers very quickly, so the product needs arbitrary-precision arithmetic for all but small N.
It cannot be done with O(1) memory.
Assume you have a constant k bits of memory - then you can have 2^k possible states for your algorithm.
However, the input is not limited. Assume there are (2^k) + 1 possible answers for (2^k) + 1 different problem cases. By the pigeonhole principle, your algorithm will end in the same state, and so return the same answer, for two problems with different answers, and thus it is wrong.
The following came to my mind as soon as I finished reading the question. But the answers above suggest that it is not possible with O(1) memory or that there should be a constraint on the range of numbers. Tell me if my understanding of the question is wrong. Ok, so here goes
You have O(1) memory - which means you have constant amount of memory.
When the n numbers are passed to you the first time, just keep adding them in one variable and keep multiplying them in another. So at the end of the first pass you have the sum and product of all the numbers in 2 variables, S1 and P1. You have used 2 variables so far (+1 if you count the one the numbers are read into).
When the (n-2) numbers are passed to you the second time, do the same. Store the sum and product of the (n-2) numbers in 2 other variables, S2 and P2. You have used 4 variables so far (+1 if you count the one the numbers are read into).
If the two missing numbers are x and y, then
x + y = S1 - S2
x*y = P1/P2;
You have two equations in two variables. Solve them.
So you have used a constant amount of memory (independent of n).
#include <stdio.h>
void Missing(int arr[], int size)
{
    int xor_all = arr[0]; /* Will hold xor of the two missing elements */
    int set_bit_no; /* Will have only single set bit of xor_all */
    int i;
    int n = size + 2; /* arr[] holds 1..n with two elements missing */
    int x = 0, y = 0;
    /* Get the xor of all elements in arr[] and {1, 2 .. n} */
    for(i = 1; i < size; i++)
        xor_all ^= arr[i];
    for(i = 1; i <= n; i++)
        xor_all ^= i;
    /* Get the rightmost set bit in set_bit_no */
    set_bit_no = xor_all & ~(xor_all-1);
    /* Now divide elements in two sets by comparing rightmost set
    bit of xor_all with bit at same position in each element. */
    for(i = 0; i < size; i++)
    {
        if(arr[i] & set_bit_no)
            x = x ^ arr[i]; /*XOR of first set in arr[] */
        else
            y = y ^ arr[i]; /*XOR of second set in arr[] */
    }
    for(i = 1; i <= n; i++)
    {
        if(i & set_bit_no)
            x = x ^ i; /*XOR of first set in arr[] and {1, 2, ...n }*/
        else
            y = y ^ i; /*XOR of second set in arr[] and {1, 2, ...n } */
    }
    printf("\n The two missing elements are %d & %d ", x, y);
}
Please look at the solution link below. It explains an XOR method.
This method is more efficient than any of the methods explained above.
It might be the same as Victor above, but there is an explanation as to why this works.
Solution here
Here is a simple solution which does not require any quadratic formula or multiplication:
Let's say B is the sum of the two missing numbers.
The pair of missing numbers will be one of:
(1,B-1),(2,B-2)...(B-1,1)
Therefore, we know that one of those two numbers will be less than or equal to half of B.
We know that we can calculate B (the sum of both missing numbers).
So, once we have B, we find the sum of all numbers in the list which are less than or equal to B/2 and subtract that from the sum of (1 to B/2) to get the first number. Then we get the second number by subtracting the first number from B. In the code below, rem_sum is B.
public int[] findMissingTwoNumbers(int [] list, int N){
if(list.length == 0 || list.length != N - 2)return new int[0];
int rem_sum = (N*(N + 1))/2;
for(int i = 0; i < list.length; i++)rem_sum -= list[i];
int half = rem_sum/2;
if(rem_sum%2 == 0)half--; //both numbers cannot be the same
int rem_half = getRemHalf(list,half);
int [] result = {rem_half, rem_sum - rem_half};
return result;
}
private int getRemHalf(int [] list, int half){
int rem_half = (half*(half + 1))/2;
for(int i = 0; i < list.length; i++){
if(list[i] <= half)rem_half -= list[i];
}
return rem_half;
}

Porting optimized Sieve of Eratosthenes from Python to C++

Some time ago I used the (blazing fast) primesieve in python that I found here: Fastest way to list all primes below N
To be precise, this implementation:
def primes2(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    n, correction = n-n%6+6, 2-(n%6>1)
    sieve = [True] * (n/3)
    for i in xrange(1,int(n**0.5)/3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[ k*k/3 ::2*k] = [False] * ((n/6-k*k/6-1)/k+1)
            sieve[k*(k-2*(i&1)+4)/3::2*k] = [False] * ((n/6-k*(k-2*(i&1)+4)/6-1)/k+1)
    return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]
Now I can slightly grasp the idea of the optimization of automatically skipping multiples of 2, 3 and so on, but when it comes to porting this algorithm to C++ I get stuck (I have a good understanding of Python and a reasonable/bad understanding of C++, but good enough for rock 'n roll).
What I currently have rolled myself is this (isqrt() is just a simple integer square root function):
template <class T>
void primesbelow(T N, std::vector<T> &primes) {
T sievemax = (N-3 + (1-(N % 2))) / 2;
T i;
T sievemaxroot = isqrt(sievemax) + 1;
boost::dynamic_bitset<> sieve(sievemax);
sieve.set();
primes.push_back(2);
for (i = 0; i <= sievemaxroot; i++) {
if (sieve[i]) {
primes.push_back(2*i+3);
for (T j = 3*i+3; j <= sievemax; j += 2*i+3) sieve[j] = 0; // filter multiples
}
}
for (; i <= sievemax; i++) {
if (sieve[i]) primes.push_back(2*i+3);
}
}
This implementation is decent and automatically skips multiples of 2, but if I could port the Python implementation I think it could be much faster (30%-50% or so).
To compare the results (in the hope this question will be successfully answered), the current execution time with N=100000000, g++ -O3 on a Q6600 Ubuntu 10.10 is 1230ms.
Now I would love some help with either understanding what the above Python implementation does or that you would port it for me (not as helpful though).
EDIT
Some extra information about what I find difficult.
I have trouble with the techniques used, like the correction variable, and in general how it all comes together. A link to a site explaining different Eratosthenes optimizations (apart from the simple sites that say "well you just skip multiples of 2, 3 and 5" and then slam you with a 1000 line C file) would be awesome.
I don't think I would have issues with a 100% direct and literal port, but since after all this is for learning that would be utterly useless.
EDIT
After looking at the code in the original numpy version, it actually is pretty easy to implement and, with some thinking, not too hard to understand. This is the C++ version I came up with. I'm posting it here in full to help future readers in case they need a pretty efficient prime sieve that is not two million lines of code. This sieve finds all primes under 100000000 in about 415 ms on the same machine as above. That's a 3x speedup, better than I expected!
#include <vector>
#include <boost/dynamic_bitset.hpp>
// http://vault.embedded.com/98/9802fe2.htm - integer square root
unsigned short isqrt(unsigned long a) {
unsigned long rem = 0;
unsigned long root = 0;
for (short i = 0; i < 16; i++) {
root <<= 1;
rem = ((rem << 2) + (a >> 30));
a <<= 2;
root++;
if (root <= rem) {
rem -= root;
root++;
} else root--;
}
return static_cast<unsigned short> (root >> 1);
}
// https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
// https://stackoverflow.com/questions/5293238/porting-optimized-sieve-of-eratosthenes-from-python-to-c/5293492
template <class T>
void primesbelow(T N, std::vector<T> &primes) {
T i, j, k, l, sievemax, sievemaxroot;
sievemax = N/3;
if ((N % 6) == 2) sievemax++;
sievemaxroot = isqrt(N)/3;
boost::dynamic_bitset<> sieve(sievemax);
sieve.set();
primes.push_back(2);
primes.push_back(3);
for (i = 1; i <= sievemaxroot; i++) {
if (sieve[i]) {
k = (3*i + 1) | 1;
l = (4*k-2*k*(i&1)) / 3;
for (j = k*k/3; j < sievemax; j += 2*k) {
sieve[j] = 0;
if (j + l < sievemax) sieve[j+l] = 0; // guard: j+l can run past the end of the sieve
}
primes.push_back(k);
}
}
for (i = sievemaxroot + 1; i < sievemax; i++) {
if (sieve[i]) primes.push_back((3*i+1)|1);
}
}
I'll try to explain as much as I can. The sieve array has an unusual indexing scheme; it stores a bit for each number that is congruent to 1 or 5 mod 6. Thus, a number 6*k + 1 will be stored in position 2*k and k*6 + 5 will be stored in position 2*k + 1. The 3*i+1|1 operation is the inverse of that: it takes numbers of the form 2*n and converts them into 6*n + 1, and takes 2*n + 1 and converts it into 6*n + 5 (the +1|1 thing converts 0 to 1 and 3 to 5). The main loop iterates k through all numbers with that property, starting with 5 (when i is 1); i is the corresponding index into sieve for the number k. The first slice update to sieve then clears all bits in the sieve with indexes of the form k*k/3 + 2*m*k (for m a natural number); the corresponding numbers for those indexes start at k^2 and increase by 6*k at each step. The second slice update starts at index k*(k-2*(i&1)+4)/3 (number k * (k+4) for k congruent to 1 mod 6 and k * (k+2) otherwise) and similarly increases the number by 6*k at each step.
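The index mapping described above can be checked with a small sketch. The helper names indexToCandidate and candidateToIndex are made up for illustration and do not appear in either implementation.

```cpp
#include <cassert>

// Index -> number: 1 -> 5, 2 -> 7, 3 -> 11, 4 -> 13, ...
// For even i, 3*i + 1 is already odd (6n + 1); for odd i, 3*i + 1 is even
// (6n + 4) and the |1 turns it into 6n + 5.
unsigned long indexToCandidate(unsigned long i) {
    return (3 * i + 1) | 1;
}

// Number -> index, valid only for numbers congruent to 1 or 5 mod 6:
// 6n + 1 maps to 2n and 6n + 5 maps to 2n + 1.
unsigned long candidateToIndex(unsigned long k) {
    assert(k % 6 == 1 || k % 6 == 5);
    return k / 3;
}
```

The division by 3 is why the sieve code uses expressions such as k*k/3: the square of a candidate is itself a candidate, and dividing by 3 gives its position in the bit array.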
Here's another attempt at an explanation: let candidates be the set of all numbers that are at least 5 and are congruent to either 1 or 5 mod 6. If you multiply two elements in that set, you get another element in the set. Let succ(k) for some k in candidates be the next element (in numerical order) in candidates that is larger than k. In that case, the inner loop of the sieve is basically (using normal indexing for sieve):
for k in candidates:
    for (l = k; ; l += 6) sieve[k * l] = False
    for (l = succ(k); ; l += 6) sieve[k * l] = False
Because of the limitations on which elements are stored in sieve, that is the same as:
for k in candidates:
    for l in candidates where l >= k:
        sieve[k * l] = False
which will remove all multiples of k in candidates (other than k itself) from the sieve at some point (either when the current k was used as l earlier or when it is used as k now).
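To make the "normal indexing" view concrete, here is a deliberately simple sketch that stores one flag per integer instead of using the compressed index scheme, and only strikes out products k * l of candidates with l >= k. The function name primesBelowSketch is hypothetical, and the code favours clarity over the memory savings of the real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Primes below n, marking only products k * l of candidates (numbers that are
// 1 or 5 mod 6) with l >= k. primesBelowSketch is a hypothetical name.
std::vector<int> primesBelowSketch(int n) {
    assert(n >= 0);
    std::vector<bool> composite(static_cast<std::size_t>(n), false);
    std::vector<int> primes;
    if (n > 2) primes.push_back(2);
    if (n > 3) primes.push_back(3);
    // k walks the candidates 5, 7, 11, 13, ...; the gap alternates 2, 4.
    for (long long k = 5, gap = 2; k < n; k += gap, gap = 6 - gap) {
        if (composite[static_cast<std::size_t>(k)]) continue;
        primes.push_back(static_cast<int>(k));
        // l walks the candidates >= k, so k * l is always a candidate too.
        for (long long l = k, lgap = gap; k * l < n; l += lgap, lgap = 6 - lgap) {
            composite[static_cast<std::size_t>(k * l)] = true;
        }
    }
    return primes;
}
```

Every composite candidate has a smallest prime factor k >= 5 and a cofactor l >= k that is also a candidate, which is why marking only these products is enough.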
Piggy-Backing onto Howard Hinnant's response, Howard, you don't have to test numbers in the set of all natural numbers not divisible by 2, 3 or 5 for primality, per se. You need simply multiply each number in the array (except 1, which self-eliminates) times itself and every subsequent number in the array. These overlapping products will give you all the non-primes in the array up to whatever point you extend the deterministic-multiplicative process. Thus the first non-prime in the array will be 7 squared, or 49. The 2nd, 7 times 11, or 77, etc. A full explanation here: http://www.primesdemystified.com
As an aside, you can "approximate" prime numbers. Call the approximate prime P. Here are a few formulas:
P = 2*k+1 // not divisible by 2
P = 6*k + {1, 5} // not divisible by 2, 3
P = 30*k + {1, 7, 11, 13, 17, 19, 23, 29} // not divisible by 2, 3, 5
The property of the set of numbers found by these formulas is that a given P may not be prime, but all primes (other than the small primes the formula excludes) are in the set P. I.e. if you only test numbers in the set P for primality, you won't miss any.
You can reformulate these formulas to:
P = X*k + {-c, -b, -a, a, b, c}
if that is more convenient for you.
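The offset sets in these formulas are simply the residues that share no factor with the modulus, so they can be generated rather than typed in. A small sketch, with the hypothetical names gcdOf and wheelOffsets:

```cpp
#include <cassert>
#include <vector>

// Greatest common divisor by Euclid's algorithm.
static unsigned gcdOf(unsigned a, unsigned b) {
    while (b != 0) {
        const unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Offsets r in [1, modulus) with gcd(r, modulus) == 1. For modulus 30 these
// are the offsets of the formula P = 30*k + {...}.
// wheelOffsets is a hypothetical name for this sketch.
std::vector<unsigned> wheelOffsets(unsigned modulus) {
    assert(modulus > 1);
    std::vector<unsigned> offsets;
    for (unsigned r = 1; r < modulus; ++r) {
        if (gcdOf(r, modulus) == 1) offsets.push_back(r);
    }
    return offsets;
}
```

Calling wheelOffsets(210) yields the 48 offsets for the wheel that also skips multiples of 7.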
Here is some code that uses this technique with a formula for P not divisible by 2, 3, 5, 7.
This link may represent the extent to which this technique can be practically leveraged.