How to reduce execution time in C++ for the following code? - c++

I have written this code which has an execution time of 3.664 sec but the time limit is 3 seconds.
The question is this-
N teams participate in a league cricket tournament on Mars, where each
pair of distinct teams plays each other exactly once. Thus, there are a total
of (N × (N-1))/2 matches. An expert has assigned a strength to each team,
a positive integer. Strangely, the Martian crowds love one-sided matches
and the advertising revenue earned from a match is the absolute value of
the difference between the strengths of the two teams. Given the
strengths of the N teams, find the total advertising revenue earned from all
the matches.
Input format
Line 1 : A single integer, N.
Line 2 : N space-separated integers, the strengths of the N teams.
#include <iostream>
using namespace std;
int main()
{
    int n;
    cin >> n;
    int stren[200000];
    for (int a = 0; a < n; a++)
        cin >> stren[a];
    long long rev = 0;
    for (int b = 0; b < n; b++)
    {
        int pos = b;
        for (int c = pos; c < n; c++)
        {
            if (stren[pos] > stren[c])
                rev += (long long)(stren[pos] - stren[c]);
            else
                rev += (long long)(stren[c] - stren[pos]);
        }
    }
    cout << rev;
}
Can you please give me a solution??

Rewrite your loop as:
sort(stren, stren + n);    // std::sort needs a begin/end range; include <algorithm>
for (int b = 0; b < n; b++)
{
    rev += (2 * b - n + 1) * static_cast<long long>(stren[b]);
}
Live code here
Why does it work? Your loops form all pairs of numbers and add the absolute difference to rev. So in a sorted array, the b-th item is subtracted (n-1-b) times and added b times, hence the coefficient 2 * b - n + 1.
There is one micro-optimization that is probably not needed:
sort(stren, stren + n);
for (int b = 0, m = 1 - n; b < n; b++, m += 2)
{
    rev += m * static_cast<long long>(stren[b]);
}
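Putting the pieces together, a complete fast version could look like this (a sketch along the lines of the original code, assuming N <= 200000 and int strengths as above):
#include <algorithm>
#include <iostream>
using namespace std;
int main()
{
    int n;
    cin >> n;
    static int stren[200000];
    for (int a = 0; a < n; a++)
        cin >> stren[a];
    sort(stren, stren + n);         // O(N log N)
    long long rev = 0;
    for (int b = 0; b < n; b++)     // element b contributes with weight 2*b - n + 1
        rev += (2LL * b - n + 1) * stren[b];
    cout << rev << "\n";
}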

In place of the if statement, use
rev += std::abs(stren[pos]-stren[c]);
std::abs returns the absolute value of its argument (include <cstdlib> or <cmath>), so this expression gives the positive difference between the two values. It should be quicker than an if test and the ensuing branching. The (long long) cast is also unnecessary, although the compiler will probably optimise that out.
There are other optimisations you could make, but this one should do it. If your abs function is poorly implemented on your system, you could always make use of this fast version for computing the absolute value of i:
(i + (i >> 31)) ^ (i >> 31) for a 32 bit int.
This has no branching at all and would beat even an inline ternary! (But you should use int32_t as your data type; if you have 64 bit int then you'll need to adjust my formula.) But we are in the realms of micro-optimisation here.
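Wrapped in a function, the branch-free trick looks like this (a sketch; it relies on an arithmetic right shift for negative values, which mainstream compilers perform but which is strictly implementation-defined before C++20):
#include <cstdint>
int32_t branchless_abs(int32_t i)
{
    int32_t mask = i >> 31;   // 0 for non-negative values, -1 (all bits set) for negative ones
    return (i + mask) ^ mask; // negates i when mask is -1, leaves it unchanged otherwise
}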

for (int b = 0; b < n; b++)
{
    for (int c = b; c < n; c++)
    {
        rev += abs(stren[b] - stren[c]);
    }
}
This should give you a speed increase, might be enough.

An interesting approach might be to collapse the strengths into counts instead of storing every element, if the distribution of values is fairly small.
So:
std::unordered_map<int, int> strengths; // needs <unordered_map>
for (int i = 0; i < n; ++i) {
    int next;
    cin >> next;
    ++strengths[next];
}
This way, we can reduce the number of things we have to sum:
long long rev = 0;
for (auto a = strengths.begin(); a != strengths.end(); ++a) {
    for (auto b = std::next(a); b != strengths.end(); ++b) {
        rev += abs(a->first - b->first) * static_cast<long long>(a->second) * b->second;
        //     ^^^^ strength diff ^^^^     ^^^^^^ number of occurrences ^^^^^^
    }
}
cout << rev;
If the strengths tend to be repeated a lot, this could save a lot of cycles.

What exactly we are doing in this problem is: For all combinations of pairs of elements, we are adding up the absolute values of the differences between the elements of the pair. i.e. Consider the sample input
3 10 3 5
Ans (Take only absolute values) = (3-10) + (3-3) + (3-5) + (10-3) + (10-5) + (3-5) = 7 + 0 + 2 + 7 + 5 + 2 = 23
Notice that I have fixed 3, iterated through the remaining elements, found the differences and added them to Ans, then fixed 10, iterated through the remaining elements and so on till the last element
Unfortunately, N(N-1)/2 iterations are required for the above procedure, which wouldn't be ok for the time limit.
Could we better it?
Let's sort the array and repeat this procedure. After sorting, the sample input is now 3 3 5 10
Let's start by fixing the greatest element, 10 and iterating through the array like how we did before (of course, the time complexity is the same)
Ans = (10-3) + (10-3) + (10-5) + (5-3) + (5-3) + (3-3) = 7 + 7 + 5 + 2 + 2 + 0 = 23
We could rearrange the above as
Ans = (10)(3)-(3+3+5) + 5(2) - (3+3) + 3(1) - (3)
Notice a pattern? Let's generalize it.
Suppose we have an array of strengths arr[N] of size N indexed from 0
Ans = (arr[N-1])(N-1) - (arr[0] + arr[1] + ... + arr[N-2]) + (arr[N-2])(N-2) - (arr[0] + arr[1] + ... + arr[N-3]) + (arr[N-3])(N-3) - (arr[0] + arr[1] + ... + arr[N-4]) + ... and so on
Right. So let's put this new idea to work. We'll introduce a 'sum' variable. Some basic DP to the rescue.
For i = 0 to N-2
    sum = sum + arr[i]
    Ans = Ans + (arr[i+1]*(i+1) - sum)
That's it: you just have to sort the array and iterate through it only once. Excluding the sorting part, it's down to N iterations from N(N-1)/2, which I suppose is called O(N) time. EDIT: Including the sort, it is O(N log N) time overall.
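In C++ that loop might look like this (a small sketch; arr is the sorted strengths array and the answer is accumulated in a long long):
long long sum = 0, Ans = 0;
for (int i = 0; i + 1 < N; i++)
{
    sum += arr[i];                                // sum of arr[0..i]
    Ans += (long long)arr[i + 1] * (i + 1) - sum; // arr[i+1] paired with everything before it
}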
Hope it helped!

Related

How to compute first N digits of Euler number in c++? [duplicate]

Can anyone explain how this code for computing e works? It looks very simple for such a complicated task, but I can't even understand the process. It has been created by Xavier Gourdon in 1999.
#include <cstdio>
int main() {
    int N = 9009, a[9009], x = 0;
    for (int n = N - 1; n > 0; --n) {
        a[n] = 1;
    }
    a[1] = 2, a[0] = 0;
    while (N > 9) {
        int n = N--;
        while (--n) {
            a[n] = x % n;
            x = 10 * a[n-1] + x/n;
        }
        printf("%d", x);
    }
    return 0;
}
I traced the algorithm back to a 1995 paper by Stanley Rabinowitz and Stan Wagon. It's quite interesting.
A bit of background first. Start with the ordinary decimal representation of e:
e = 2.718281828...
This can be expressed as an infinite sum as follows:
e = 2 + 1⁄10(7 + 1⁄10(1 + 1⁄10(8 + 1⁄10(2 + 1⁄10(8 + 1⁄10(1 ...
Obviously this isn't a particularly useful representation; we just have the same digits of e wrapped up inside a complicated expression.
But look what happens when we replace these 1⁄10 factors with the reciprocals of the natural numbers:
e = 2 + 1⁄2(1 + 1⁄3(1 + 1⁄4(1 + 1⁄5(1 + 1⁄6(1 + 1⁄7(1 ...
This so-called mixed-radix representation gives us a sequence consisting of the digit 2 followed by a repeating sequence of 1's. It's easy to see why this works. When we expand the brackets, we end up with the well-known Taylor series for e:
e = 1 + 1 + 1/2! + 1/3! + 1/4! + 1/5! + 1/6! + 1/7! + ...
So how does this algorithm work? Well, we start by filling an array with the mixed-radix number (0; 2; 1; 1; 1; 1; 1; ...). To generate each successive digit, we simply multiply this number by 10 and spit out the leftmost digit.*
But since the number is represented in mixed-radix form, we have to work in a different base at each digit. To do this, we work from right to left, multiplying the nth digit by 10 and replacing it with the resulting value modulo n. If the result was greater than or equal to n, we carry the value x/n to the next digit to the left. (Dividing by n changes the base from 1/n! to 1/(n-1)!, which is what we want). This is effectively what the inner loop does:
while (--n) {
    a[n] = x % n;
    x = 10 * a[n-1] + x/n;
}
Here, x is initialized to zero at the start of the program, and the initial 0 at the start of the array ensures that it is reset to zero every time the inner loop finishes. As a result, the array will gradually fill with zeroes from the right as the program runs. This is why n can be initialized with the decreasing value N-- at each step of the outer loop.
The additional 9 digits at the end of the array are presumably included to safeguard against rounding errors. When this code is run, x reaches a maximum value of 89671, which means the quotients will be carried across multiple digits.
Notes:
This is a type of spigot algorithm, because it outputs successive digits of e using simple integer arithmetic.
As noted by Rabinowitz and Wagon in their paper, this algorithm was actually invented 50 years ago by A.H.J. Sale
* Except at the first iteration where it outputs two digits ("27")

sum multiples in a given range

Well, I want to sum up the multiples of 3 and 5. Not too hard if I just want the sum up to a given number, e.g. up to 60 the sum is 870.
But what if I want just the first 15 multiples?
well one way is
void summation (const unsigned long number_n, unsigned long &summe, unsigned int &counter);
void summation (const unsigned long number_n, unsigned long &summe, unsigned int &counter)
{
    unsigned int howoften = 0;
    summe = 0;
    for (unsigned long i = 1; i <= number_n; i++)
        if (howoften <= counter-1)
        {
            if (!(i % 3) || !(i % 5))
            {
                summe += i;
                howoften++;
            }
        }
    counter = howoften;
    return;
}
But as expected the runtime is not acceptable for a counter like 1,500,000 :-/
Hm, I tried a lot of things but I cannot find a solution on my own.
I also tried a faster summation algorithm like (dont care bout overflow at this point):
int sum(int N);
int sum(int N)
{
int S1, S2, S3;
S1 = ((N / 3)) * (2 * 3 + (N / 3 - 1) * 3) / 2;
S2 = ((N / 5)) * (2 * 5 + (N / 5 - 1) * 5) / 2;
S3 = ((N / 15)) *(2 * 15 + (N / 15 - 1) * 15) / 2;
return S1 + S2 - S3;
}
or even
unsigned long sum1000 (unsigned long target);
unsigned long sum1000 (unsigned long target)
{
unsigned int summe = 0;
for (unsigned long i = 0; i<=target; i+=3) summe+=i;
for (unsigned long i = 0; i<=target; i+=5) summe+=i;
for (unsigned long i = 0; i<=target; i+=15) summe-=i;
return summe;
}
But I'm not smart enough to set up an algorithm which is fast enough (I say 5-10 sec. are ok)
The whole sum of the multiples is not my problem, the first N multiples are :)
Thanks for reading, and if u have any ideas, it would be great
Some prerequisites:
(dont care bout overflow at this point)
Ok, so let's ignore that completely.
Next, the sum of all numbers from 1 to n can be calculated as (see eg here):
int sum(int n) {
return (n * (n+1)) / 2;
}
Note that n*(n+1) is an even number for any n, so using integer arithmetic for /2 is not an issue.
How does this help to get the sum of numbers divisible by 3? Let's start with even numbers (divisible by 2). We write out the long form of the sum above:
1 + 2 + 3 + 4 + ... + n
multiply each term by 2:
2 + 4 + 6 + 8 + ... + 2*n
now I hope you see that this sum contains all numbers that are divisible by 2 up to 2*n. Those numbers are the first n numbers that are divisible by 2.
Hence, the sum of the first n numbers that are divisible by 2 is 2 * sum(n). We can generalize that to write a function that returns the sum of the first n numbers that are divisible by m:
int sum_div_m( int n, int m) {
return sum(n) * m;
}
First I want to reproduce your initial example "up to 60 the sum is 870". For that we consider that
60/3 == 20 -> there are 20 numbers divisible by 3 and we get their sum from sum_div_m(20,3)
60/5 == 12 -> there are 12 numbers divisible by 5 and we get their sum from sum_div_m(12,5)
we cannot simply add the above two results because then we would count some numbers twice. Those numbers are the ones divisible by both 3 and 5, i.e. divisible by 15
60/15 == 4 -> there are 4 numbers divisible by both 3 and 5 and we get their sum from sum_div_m(4,15).
Putting it together, the sum of all numbers divisible by 3 or 5 up to 60 is
int x = sum_div_m( 20,3) + sum_div_m( 12,5) - sum_div_m( 4,15);
Finally, back to your actual question:
But what if I want just the first 15 multiples?
Above we saw that there are
n == x/3 + x/5 - x/15
numbers that are divisible by 3 or 5 in the range 0...x. All divisions use integer arithmetic. We already had the example of 60 with 20+12-4 == 28 divisible numbers. Another example is x=10 where there are n = 3 + 2 - 0 = 5 numbers divisible by 3 or 5 (3,5,6,9,10). We have to be a bit careful with integer arithmetic, but it's no big deal:
15*n == 5*x + 3*x - x
-> 15*n == 7*x
-> x == 15*n/7
Quick test: 15*28/7 == 60, looks correct.
Putting it all together the sum of the first n numbers divisible by 3 or 5 is
int sum_div_3_5(int n) {
int x = (15*n)/7;
return sum_div_m(x/3, 3) + sum_div_m(x/5, 5) - sum_div_m(x/15, 15);
}
To check that this is correct we can again try sum_div_3_5(28) to see that it returns 870 (because there are 28 numbers divisible by 3 or 5 up to 60, and that was the initial example).
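Here is a quick self-contained check of that example (just the functions from above plus a main):
#include <iostream>
int sum(int n)              { return (n * (n + 1)) / 2; }
int sum_div_m(int n, int m) { return sum(n) * m; }
int sum_div_3_5(int n)
{
    int x = (15 * n) / 7;
    return sum_div_m(x / 3, 3) + sum_div_m(x / 5, 5) - sum_div_m(x / 15, 15);
}
int main()
{
    std::cout << sum_div_3_5(28) << "\n"; // prints 870, the "up to 60" example
}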
PS: It turned out that the question is really only about doing the maths, though that isn't a big surprise. When you want to write efficient code you should primarily take care to use the right algorithm. Optimizations based on a given algorithm are often less effective than choosing a better algorithm. Once you have chosen an algorithm, it often does not pay off to try to be "clever", because compilers are much better at optimizing. For example this code:
int main(){
int x = 0;
int n = 60;
for (int i=0; i <= n; ++i) x += i;
return x;
}
will be optimized by most compilers to a simple return 1830; when optimizations are turned on, because compilers know how to add all the numbers from 1 to n. See here.
You can do this at compile time, recursively, by using class templates / metafunctions if the value is known at compile time, so there will be no runtime cost.
Ex:
template<int n>
struct Sum{
static const int value = n + Sum<n-1>::value;
};
template<>
struct Sum<0>{
static constexpr int value = 0;
};
int main()
{
constexpr auto x = Sum<100>::value;
// x is known (5050) in compile time
return 0;
}

Divide array into smaller consecutive parts such that NEO value is maximal

On this year's Bubble Cup (now finished) there was the problem NEO (which I couldn't solve), which asks:
Given an array with n integer elements, divide it into several parts (possibly just 1), where each part is a consecutive run of elements. The NEO value is then the sum of the values of the parts, where the value of a part is the sum of all elements in the part multiplied by its length.
Example: We have the array [ 2 3 -2 1 ]. If we divide it like [2 3] [-2 1], then NEO = (2 + 3) * 2 + (-2 + 1) * 2 = 10 - 2 = 8.
The number of elements in the array is smaller than 10^5 and the numbers are integers between -10^6 and 10^6.
I've tried something like divide and conquer: keep splitting the array into two parts as long as that increases the maximal NEO value, otherwise return the NEO of the whole array. Unfortunately the algorithm has worst-case O(N^2) complexity (my implementation is below), so I'm wondering whether there is a better solution.
EDIT: My (greedy) algorithm doesn't work. Taking for example [1,2,-6,2,1], my algorithm returns the whole array, while the maximal NEO value comes from taking the parts [1,2], [-6], [2,1], which gives a NEO value of (1+2)*2 + (-6) + (2+1)*2 = 6.
#include <iostream>
int maxInterval(long long int suma[],int first,int N)
{
long long int max = -1000000000000000000LL;
long long int curr;
if(first==N) return 0;
int k;
for(int i=first;i<N;i++)
{
if(first>0) curr = (suma[i]-suma[first-1])*(i-first+1)+(suma[N-1]-suma[i])*(N-1-i); // Split the array into elements from [first..i] and [i+1..N-1] store the corresponding NEO value
else curr = suma[i]*(i-first+1)+(suma[N-1]-suma[i])*(N-1-i); // Same except that here first = 0 so suma[first-1] doesn't exist
if(curr > max) max = curr,k=i; // find the maximal NEO value for splitting into two parts
}
if(k==N-1) return max; // If the max when we take the whole array then return the NEO value of the whole array
else
{
return maxInterval(suma,first,k+1)+maxInterval(suma,k+1,N); // Split the 2 parts further if needed and return it's sum
}
}
int main() {
int T;
std::cin >> T;
for(int j=0;j<T;j++) // Iterate over all the test cases
{
int N;
long long int NEO[100010]; // Values, could be long int but just to be safe
long long int suma[100010]; // sum[i] = sum of NEO values from NEO[0] to NEO[i]
long long int sum=0;
int k;
std::cin >> N;
for(int i=0;i<N;i++)
{
std::cin >> NEO[i];
sum+=NEO[i];
suma[i] = sum;
}
std::cout << maxInterval(suma,0,N) << std::endl;
}
return 0;
}
This is not a complete solution but should provide some helpful direction.
Combining two groups that each have a positive sum (or one of the sums is non-negative) would always yield a bigger NEO than leaving them separate:
m * a + n * b < (m + n) * (a + b) where a, b > 0 (or a > 0, b >= 0); m and n are subarray lengths
Combining a group with a negative sum with an entire group of non-negative numbers always yields a greater NEO than combining it with only part of the non-negative group. But excluding the group with the negative sum could yield an even greater NEO:
[1, 1, 1, 1] [-2] => m * a + 1 * (-b)
Now, imagine we gradually move the dividing line to the left, increasing the sum b is combined with. While the expression on the right is negative, the NEO for the left group keeps decreasing. But if the expression on the right gets positive, relying on our first assertion (see 1.), combining the two groups would always be greater than not.
Combining negative numbers alone in sequence will always yield a smaller NEO than leaving them separate:
-a - b - c ... = -1 * (a + b + c ...)
l * (-a - b - c ...) = -l * (a + b + c ...)
-l * (a + b + c ...) < -1 * (a + b + c ...) where l > 1; a, b, c ... > 0
O(n^2) time, O(n) space JavaScript code:
function f(A){
    A.unshift(0);
    let negatives = [];
    let prefixes = new Array(A.length).fill(0);
    let m = new Array(A.length).fill(0);
    for (let i=1; i<A.length; i++){
        if (A[i] < 0)
            negatives.push(i);
        prefixes[i] = A[i] + prefixes[i - 1];
        m[i] = i * (A[i] + prefixes[i - 1]);
        for (let j=negatives.length-1; j>=0; j--){
            let negative = prefixes[negatives[j]] - prefixes[negatives[j] - 1];
            let prefix = (i - negatives[j]) * (prefixes[i] - prefixes[negatives[j]]);
            m[i] = Math.max(m[i], prefix + negative + m[negatives[j] - 1]);
        }
    }
    return m[m.length - 1];
}
console.log(f([1, 2, -5, 2, 1, 3, -4, 1, 2]));
console.log(f([1, 2, -4, 1]));
console.log(f([2, 3, -2, 1]));
console.log(f([-2, -3, -2, -1]));
Update
This blog shows that we can transform the dp queries from
dp_i = sum_i*i + max(for j < i) of ((dp_j + sum_j*j) + (-j*sum_i) + (-i*sum_j))
to
dp_i = sum_i*i + max(for j < i) of (dp_j + sum_j*j, -j, -sum_j) ⋅ (1, sum_i, i)
which means we could then look, at each iteration, for an already seen vector that would generate the largest dot product with our current information. The math alluded to involves convex hull and farthest point queries, which are beyond my reach to implement at this point but which I will make a study of.
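For reference, a direct O(n^2) C++ rendering of that recurrence, before any convex-hull speed-up, could look like this (a sketch; prefix and dp are names introduced here):
#include <algorithm>
#include <vector>
long long maxNEO(const std::vector<long long>& a)
{
    int n = (int)a.size();
    std::vector<long long> prefix(n + 1, 0), dp(n + 1, 0);
    for (int i = 1; i <= n; i++)
        prefix[i] = prefix[i - 1] + a[i - 1];
    for (int i = 1; i <= n; i++)
    {
        dp[i] = -(1LL << 62);       // effectively minus infinity
        for (int j = 0; j < i; j++) // the last part covers elements j+1 .. i
            dp[i] = std::max(dp[i], dp[j] + (prefix[i] - prefix[j]) * (i - j));
    }
    return dp[n];
}
For [1, 2, -6, 2, 1] this returns 6, matching the edit in the question above.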

C++: What are some general ways to make code more efficient for use with large numbers?

Please when answering this question try to be as general as possible to help the wider community, rather than just specifically helping my issue (although helping my issue would be great too ;) )
I seem to be encountering this problem time and time again with the simple problems on Project Euler. Most common are the problems that require computing prime numbers; these almost always fail to terminate for numbers greater than about 60,000.
My most recent issue is with Problem 12:
The sequence of triangle numbers is generated by adding the natural numbers. So the 7th triangle number would be 1 + 2 + 3 + 4 + 5 + 6 + 7 = 28. The first ten terms would be:
1, 3, 6, 10, 15, 21, 28, 36, 45, 55, ...
Let us list the factors of the first seven triangle numbers:
1: 1
3: 1,3
6: 1,2,3,6
10: 1,2,5,10
15: 1,3,5,15
21: 1,3,7,21
28: 1,2,4,7,14,28
We can see that 28 is the first triangle number to have over five divisors.
What is the value of the first triangle number to have over five hundred divisors?
Here is my code:
#include <iostream>
#include <vector>
#include <cmath>
using namespace std;
int main() {
    int numberOfDivisors = 500;
    //I begin by looping from 1, with 1 being the 1st triangular number, 2 being the second, and so on.
    for (long long int i = 1;; i++) {
        long long int triangularNumber = (pow(i, 2) + i)/2;
        //Once I have the i-th triangular, I loop from 1 to itself, and add 1 to count each time I encounter a divisor, giving the total number of divisors for each triangular.
        int count = 0;
        for (long long int j = 1; j <= triangularNumber; j++) {
            if (triangularNumber%j == 0) {
                count++;
            }
        }
        //If the number of divisors is 500, print out the triangular and break the code.
        if (count == numberOfDivisors) {
            cout << triangularNumber << endl;
            break;
        }
    }
}
This code gives the correct answers for smaller numbers, and then either fails to terminate or takes an age to do so!
So firstly, what can I do with this specific problem to make my code more efficient?
Secondly, what are some general tips both for myself and other new C++ users for making code more efficient? (I.e. applying what we learn here in the future.)
Thanks!
The key problem is that your end condition is bad. You are supposed to stop when count > 500, but you look for an exact match of count == 500, therefore you are likely to blow past the correct answer without detecting it, and keep going ... maybe forever.
If you fix that, you can post it to code review. They might say something like this:
Break it down into separate functions for finding the next triangle number, and counting the factors of some number.
When you find the next triangle number, you call pow; a single addition would do (see the sketch below the list).
For counting the number of factors in a number, a google search might help. (e.g. http://www.cut-the-knot.org/blue/NumberOfFactors.shtml ) You can build a list of prime numbers as you go, and use that to quickly find a prime factorization, from which you can compute the number of factors without actually counting them. When the numbers get big, that loop gets big.
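For example, instead of calling pow for every triangle number, the next one can be produced with a single addition (a minimal sketch):
long long triangularNumber = 0;
for (long long i = 1;; i++)
{
    triangularNumber += i; // T(i) = T(i-1) + i, no floating point needed
    // ... count the divisors of triangularNumber here and stop once the count exceeds 500 ...
}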
Tldr: 76576500.
About your Euler problem, some math:
Preliminary 1:
Let's call the n-th triangle number T(n).
T(n) = 1 + 2 + 3 + ... + n = (n^2 + n)/2 (sometimes attributed to Gauss, sometimes someone else). It's not hard to figure it out:
1+2+3+4+5+6+7+8+9+10 =
(1+10) + (2+9) + (3+8) + (4+7) + (5+6) =
11 + 11 + 11 + 11 + 11 =
55 =
110 / 2 =
(10*10 + 10)/2
Because of its definition, it's trivial that T(n) + n + 1 = T(n+1), and that with a<b, T(a)<T(b) is true too.
Preliminary 2:
Let's call the divisor count D. D(1)=1, D(4)=3 (because 1 2 4).
For an n with c distinct prime factors (not just any divisors, but prime factors, eg. n = 42 = 2 * 3 * 7 has c = 3), D(n) is 2^c: for each prime factor there are two possibilities (use it or not). The 8 possible divisors for the example are: 1, 2, 3, 7, 6 (2*3), 14 (2*7), 21 (3*7), 42 (2*3*7).
More generally, with repeated prime factors, D(n) is the product of (power + 1) over all prime factors. Example 126 = 2^1 * 3^2 * 7^1: because it has two 3s, the question is not "use 3 or not", but "use it 1 time, 2 times or not at all" (if one time, whether it is the "first" or "second" 3 doesn't change the result). With the powers 1 2 1, D(126) is 2*3*2 = 12.
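That rule translates directly into a small stand-alone divisor-counting routine (a sketch using plain trial division, separate from the factorization cache used in the program below):
unsigned long long countDivisors(unsigned long long n)
{
    unsigned long long count = 1;
    for (unsigned long long p = 2; p * p <= n; p++)
    {
        unsigned int power = 0;
        while (n % p == 0) { n /= p; power++; }
        count *= power + 1; // (power + 1) choices for this prime: use it 0..power times
    }
    if (n > 1) count *= 2;  // whatever is left is a prime with power 1
    return count;
}
For example, countDivisors(126) returns 12, matching the calculation above.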
Preliminary 3:
A number n and n+1 can't have any common prime factor x other than 1 (technically, 1 isn't a prime, but whatever). Because if both n/x and (n+1)/x are natural numbers, (n+1)/x - n/x has to be too, but that is 1/x.
Back to Gauss: If we know the prime factors for a certain n and n+1 (needed to calculate D(n) and D(n+1)), calculating D(T(n)) is easy. T(N) = (n^2 + n) / 2 = n * (n+1) / 2. As n and n+1 don't have common prime factors, just throwing together all factors and removing one 2 because of the "/2" is enough. Example: n is 7, factors 7 = 7^1, and n+1 = 8 = 2^3. Together it's 2^3 * 7^1, removing one 2 is 2^2 * 7^1. Powers are 2 1, D(T(7)) = 3*2 = 6. To check, T(7) = 28 = 2^2 * 7^1, the 6 possible divisors are 1 2 4 7 14 28.
What the program could do now: Loop through all n from 1 to something, always factorize n and n+1, use this to get the divisor count of the n-th triangle number, and check if it is >500.
There's just the tiny problem that there are no efficient algorithms for prime factorization. But for somewhat small numbers, today's computers are still fast enough, and keeping all found factorizations from 1 to n also helps with finding the next one (for n+1). A second potential problem is numbers too large for long long, but again, this is no problem here (as can be found out by trying).
With the described process and the program below, I got
the 12375th triangle number is 76576500 and has 576 divisors
#include <iostream>
#include <vector>
#include <cstdint>
using namespace std;
const int limit = 500;
vector<uint64_t> knownPrimes; //2 3 5 7...
//eg. [14] is 1 0 0 1 ... because 14 = 2^1 * 3^0 * 5^0 * 7^1
vector<vector<uint32_t>> knownFactorizations;
void init()
{
knownPrimes.push_back(2);
knownFactorizations.push_back(vector<uint32_t>(1, 0)); //factors for 0 (dummy)
knownFactorizations.push_back(vector<uint32_t>(1, 0)); //factors for 1 (dummy)
knownFactorizations.push_back(vector<uint32_t>(1, 1)); //factors for 2
}
void addAnotherFactorization()
{
uint64_t number = knownFactorizations.size();
size_t len = knownPrimes.size();
for(size_t i = 0; i < len; i++)
{
if(!(number % knownPrimes[i]))
{
//dividing with a prime gets a already factorized number
knownFactorizations.push_back(knownFactorizations[number / knownPrimes[i]]);
knownFactorizations[number][i]++;
return;
}
}
//if this failed, number is a newly found prime
//because a) it has no known prime factors, so it must have others
//and b) if it is not a prime itself, then it's factors should've been
//found already (because they are smaller than the number itself)
knownPrimes.push_back(number);
len = knownFactorizations.size();
for(size_t s = 0; s < len; s++)
{
knownFactorizations[s].push_back(0);
}
knownFactorizations.push_back(knownFactorizations[0]);
knownFactorizations[number][knownPrimes.size() - 1]++;
}
uint64_t calculateDivisorCountOfN(uint64_t number)
{
//factors for number must be known
uint64_t res = 1;
size_t len = knownFactorizations[number].size();
for(size_t s = 0; s < len; s++)
{
if(knownFactorizations[number][s])
{
res *= (knownFactorizations[number][s] + 1);
}
}
return res;
}
uint64_t calculateDivisorCountOfTN(uint64_t number)
{
//factors for number and number+1 must be known
uint64_t res = 1;
size_t len = knownFactorizations[number].size();
vector<uint32_t> tmp(len, 0);
size_t s;
for(s = 0; s < len; s++)
{
tmp[s] = knownFactorizations[number][s]
+ knownFactorizations[number+1][s];
}
//remove /2
tmp[0]--;
for(s = 0; s < len; s++)
{
if(tmp[s])
{
res *= (tmp[s] + 1);
}
}
return res;
}
int main()
{
init();
uint64_t number = knownFactorizations.size() - 2;
uint64_t DTn = 0;
while(DTn <= limit)
{
number++;
addAnotherFactorization();
DTn = calculateDivisorCountOfTN(number);
}
uint64_t tn;
if(number % 2) tn = ((number+1)/2)*number;
else tn = (number/2)*(number+1);
cout << "the " << number << "th triangle number is "
<< tn << " and has " << DTn << " divisors" << endl;
return 0;
}
About your general question about speed:
1) Algorithms.
How to know them? For (relatively) simple problems, either read a book/Wikipedia/etc. or figure it out yourself if you can. For harder stuff, learning more basic things and gaining experience is necessary before it's even possible to understand them, eg. studying CS and/or maths ... number theory helps a lot for your Euler problem. (It will help less for understanding how an MP3 file is compressed ... there are many areas, and it's not possible to know everything.)
2a) Automated compiler optimizations of frequently used code parts / patterns
2b) Manually timing which program parts are the slowest, and (when not replacing them with another algorithm) changing them so that they eg. require less data sent to slow devices (HDD, network...), less RAM access, fewer CPU cycles, work better with the OS scheduler and memory management strategies, use the CPU pipeline/caches better, etc. ... this is both education and experience (and a big topic).
And because long variables have a limited size, sometimes it is necessary to use custom types that use eg. a byte array to store a single digit in each byte. That way, it's possible to use the whole RAM for a single number if you want to, but the downside is you/someone has to reimplement stuff like addition and so on for this kind of number storage. (Of course, libs for that exist already, without writing everything from scratch).
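As a toy illustration of such a custom type, addition for numbers stored with one decimal digit per byte (least significant digit first) could look like this (a sketch only, not a real big-integer library):
#include <cstdint>
#include <vector>
std::vector<uint8_t> addDigits(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b)
{
    std::vector<uint8_t> result;
    uint8_t carry = 0;
    for (std::size_t i = 0; i < a.size() || i < b.size() || carry; i++)
    {
        uint8_t d = carry;
        if (i < a.size()) d += a[i];
        if (i < b.size()) d += b[i];
        result.push_back(d % 10); // store one decimal digit per byte
        carry = d / 10;
    }
    return result;
}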
Btw., pow is a floating point function and may get you inaccurate results. It's not appropriate to use it in this case.

Find two missing numbers

We have a machine with O(1) memory. We pass it n numbers (one by one) in a first pass; then two of the numbers are excluded and the remaining n-2 numbers are passed in a second pass.
Write an algorithm that finds the two missing numbers.
It can be done with O(1) memory.
You only need a few integers to keep track of some running sums. The integers do not require log n bits (where n is the number of input integers), they only require 2b+1 bits, where b is the number of bits in an individual input integer.
When you first read the stream add all the numbers and all of their squares, i.e. for each input number, n, do the following:
sum += n
sq_sum += n*n
Then on the second stream do the same thing for two different values, sum2 and sq_sum2. Now do the following maths:
sum - sum2 = a + b
sq_sum - sq_sum2 = a^2 + b^2
(a + b)(a + b) = a^2 + b^2 + 2ab
(a + b)(a + b) - (a^2 + b^2) = 2ab
(a - b)(a - b) = a^2 + b^2 - 2ab
              = (a^2 + b^2) - ((a + b)(a + b) - (a^2 + b^2))
              = 2(a^2 + b^2) - (a + b)(a + b)
sqrt((a - b)(a - b)) = a - b
((a + b) - (a - b)) / 2 = b
(a + b) - b = a
You need 2b+1 bits in all intermediate results because you are storing products of two input integers, and in one case multiplying one of those values by two.
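A sketch of that approach in C++ (assuming the two passes are available as containers and that all sums fit in 64-bit integers):
#include <cmath>
#include <cstdint>
#include <utility>
#include <vector>
// first: the full set of n numbers; second: the same numbers with two of them removed
std::pair<int64_t, int64_t> findMissing(const std::vector<int64_t>& first,
                                        const std::vector<int64_t>& second)
{
    int64_t sum = 0, sq_sum = 0, sum2 = 0, sq_sum2 = 0;
    for (int64_t v : first)  { sum  += v; sq_sum  += v * v; }
    for (int64_t v : second) { sum2 += v; sq_sum2 += v * v; }
    int64_t s = sum - sum2;       // a + b
    int64_t q = sq_sum - sq_sum2; // a^2 + b^2
    int64_t d = std::llround(std::sqrt((double)(2 * q - s * s))); // a - b
    int64_t a = (s + d) / 2;
    return { a, s - a };
}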
Assuming the numbers are ranging from 1..N and 2 of them are missing - x and y, you can do the following:
Use Gauss formula: sum = N(N+1)/2
sum - actual_sum = x + y
Use product of numbers: product = 1*2*...*N = N!
product / actual_product = x * y
Resolve x,y and you have your missing numbers.
In short: go through the array and sum up each element to get actual_sum, and multiply the elements together to get actual_product. Then solve the two equations for x and y.
It cannot be done with O(1) memory.
Assume you have a constant k bits of memory - then you can have 2^k possible states for your algorithm.
However, the input is not limited; if there are (2^k) + 1 possible answers for (2^k) + 1 different problem cases, then by the pigeonhole principle you will return the same answer for two problems with different answers, and thus your algorithm is wrong.
The following came to my mind as soon as I finished reading the question. But the answers above suggest that it is not possible with O(1) memory or that there should be a constraint on the range of numbers. Tell me if my understanding of the question is wrong. Ok, so here goes
You have O(1) memory - which means you have constant amount of memory.
When the n numbers are passed to you the first time, just keep adding them in one variable and keep multiplying them in another. So at the end of the first pass you have the sum and product of all the numbers in two variables, S1 and P1. You have used 2 variables so far (+1 if you are reading the numbers into memory).
When the (n-2) numbers are passed to you the second time, do the same. Store the sum and product of the (n-2) numbers in two other variables, S2 and P2. You have used 4 variables so far (+1 if you are reading the numbers into memory).
If the two missing numbers are x and y, then
x + y = S1 - S2
x*y = P1/P2
You have two equations in two variables. Solve them (x and y are the roots of t^2 - (S1 - S2)*t + P1/P2 = 0).
So you have used a constant amount of memory (independent of n).
#include <cstdio>
void Missing(int arr[], int size) /* arr[] holds the n-2 remaining numbers */
{
    int xor_all = arr[0]; /* Will hold xor of all elements */
    int set_bit_no;       /* Will have only a single set bit of xor_all */
    int i;
    int n = size + 2;
    int x = 0, y = 0;
    /* Get the xor of all elements in arr[] and {1, 2 .. n} */
    for (i = 1; i < size; i++)
        xor_all ^= arr[i];
    for (i = 1; i <= n; i++)
        xor_all ^= i;
    /* Get the rightmost set bit in set_bit_no */
    set_bit_no = xor_all & ~(xor_all - 1);
    /* Now divide elements into two sets by comparing the rightmost set
       bit of xor_all with the bit at the same position in each element. */
    for (i = 0; i < size; i++)
    {
        if (arr[i] & set_bit_no)
            x = x ^ arr[i]; /* XOR of first set in arr[] */
        else
            y = y ^ arr[i]; /* XOR of second set in arr[] */
    }
    for (i = 1; i <= n; i++)
    {
        if (i & set_bit_no)
            x = x ^ i; /* XOR of first set in arr[] and {1, 2, ...n} */
        else
            y = y ^ i; /* XOR of second set in arr[] and {1, 2, ...n} */
    }
    printf("\n The two missing elements are %d & %d ", x, y);
}
Please look at the solution link below. It explains an XOR method.
This method is more efficient than any of the methods explained above.
It might be the same as Victor above, but there is an explanation as to why this works.
Solution here
Here is a simple solution which does not require any quadratic formula or multiplication:
Let's say B is the sum of the two missing numbers.
The set of two missing numbers will be one of:
(1, B-1), (2, B-2), ..., (B-1, 1)
Therefore, we know that one of the two numbers will be less than or equal to half of B.
We know that we can calculate B (the sum of both missing numbers).
So, once we have B, we find the sum of all numbers in the list which are less than or equal to B/2 and subtract that from the sum of (1 to B/2) to get the first number. Then we get the second number by subtracting the first number from B. In the code below, rem_sum is B.
public int[] findMissingTwoNumbers(int [] list, int N){
if(list.length == 0 || list.length != N - 2)return new int[0];
int rem_sum = (N*(N + 1))/2;
for(int i = 0; i < list.length; i++)rem_sum -= list[i];
int half = rem_sum/2;
if(rem_sum%2 == 0)half--; //both numbers cannot be the same
int rem_half = getRemHalf(list,half);
int [] result = {rem_half, rem_sum - rem_half};
return result;
}
private int getRemHalf(int [] list, int half){
int rem_half = (half*(half + 1))/2;
for(int i = 0; i < list.length; i++){
if(list[i] <= half)rem_half -= list[i];
}
return rem_half;
}