Finding GCD of a set of numbers? - c++

So, I was asked this question in an interview. Given a group of numbers (not necessarily distinct), I have to find the product of the GCDs of all possible subsets of the given group of numbers.
My approach, which I told the interviewer:
1. Recursively generate all possible subsets of the given set.
2. For each subset, find its GCD using Euclid's algorithm.
3. Multiply it into the answer being accumulated.
Assume the GCD of an empty set to be 1.
However, there are 2^n subsets, so this becomes infeasible when n is large. How can I optimise it?

Assume that each array element is an integer in the range 1..U for some U.
Let f(x) be the number of non-empty subsets whose GCD is exactly x. The solution to the problem is then the product of d^f(d) over all 1 <= d <= U.
Let g(x) be the number of array elements divisible by x.
We have
f(x) = 2^g(x) - 1 - SUM(y > x and x | y, f(y))
because 2^g(x) - 1 counts the non-empty subsets whose elements are all divisible by x, and the sum removes those whose GCD is a larger multiple of x.
We can compute g(x) in O(n * sqrt(U)) by enumerating all divisors of every array element. f(x) can be computed in O(U log U) total, going from high to low values of x and enumerating every multiple of x in the straightforward manner.
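The recurrence above can be sketched in C++ as follows. This is only a minimal sketch under assumptions that are not part of the answer: the product is reported modulo the prime 1000000007 (so exponents are reduced modulo MOD - 1 via Fermat's little theorem), every element is a positive integer smaller than MOD, only non-empty subsets are counted (hence the "- 1"), and g(x) is taken from a frequency table by walking the multiples of x instead of enumerating divisors. The names power_mod and product_of_subset_gcds are made up for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

const std::int64_t MOD = 1000000007;  // assumed prime modulus, not given in the question

// Modular exponentiation by repeated squaring.
std::int64_t power_mod(std::int64_t base, std::int64_t exp, std::int64_t mod) {
    std::int64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// Product of GCDs over all non-empty subsets, modulo MOD.
// Assumes every element is a positive integer smaller than MOD.
std::int64_t product_of_subset_gcds(const std::vector<int>& a) {
    if (a.empty()) return 1;
    const int U = *std::max_element(a.begin(), a.end());
    std::vector<std::int64_t> freq(U + 1, 0), f(U + 1, 0);
    for (int v : a) ++freq[v];

    std::int64_t ans = 1;
    for (int x = U; x >= 1; --x) {
        std::int64_t g = 0;       // number of elements divisible by x
        std::int64_t bigger = 0;  // sum of f(y) over multiples y > x, mod MOD-1
        for (int y = x; y <= U; y += x) {
            g += freq[y];
            if (y > x) bigger = (bigger + f[y]) % (MOD - 1);
        }
        // f(x) = 2^g(x) - 1 - sum of f(y), kept modulo MOD-1 because it is an exponent
        std::int64_t fx = (power_mod(2, g, MOD - 1) - 1 - bigger) % (MOD - 1);
        if (fx < 0) fx += MOD - 1;
        f[x] = fx;
        ans = ans * power_mod(x, fx, MOD) % MOD;
    }
    return ans;
}
```

For the array {6, 10, 15} the seven non-empty subsets have GCDs 6, 10, 15, 2, 3, 5 and 1, so the function returns 27000.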

Prerequisites:
Fermat's little theorem (there is a generalised theorem too), simple maths, modular exponentiation.
Explanation. Notation: A[] stands for our input array.
Clearly the constraint 1 <= N <= 10^5 tells me that you need something like an O(N log N) solution. Don't try to think of DP, as its complexity, as far as I can see, would be N * max(A[i]), i.e. approx. 10^5 * 10^6. Why? Because you need the GCD of the subsets to make a transition.
OK, moving on.
We can think of clubbing together the subsets with the same GCD so as to bring the complexity down.
So, let's decrement an iterator i from 10^6 to 1, trying to count the subsets with GCD i.
To make a subset with GCD i, I can only use elements of the form i*j, where j is a positive integer. Why?
GCD(i, i*j) = i
Now, we can build a frequency table over the values, as the maximum value (10^6) is small enough.
During the contest, what I did was keep the number of subsets with GCD i in F2[i].
Hence what we do is sum the frequencies of all elements j*i, where j varies from 1 to floor(max_ele / i).
The number of non-empty subsets having i as a common divisor (not necessarily as GCD) is then 2^sum - 1.
From this count we have to subtract the subsets whose GCD is greater than i, i.e. a proper multiple of i.
This can be done within the same loop by taking the summation of F2[i*j], where j varies from 2 to floor(max_ele / i).
So the number of subsets with GCD exactly i equals 2^sum - 1 - summation of F2[i*j]. Just multiply the answer by i that many times, i.e. by power(i, 2^sum - 1 - summation of F2[i*j]). The exponent can overflow, but since MOD is prime you can reduce it modulo MOD-1 (Fermat's little theorem) and then use modular exponentiation.
Here is a snippet of my code, as I am unsure whether we can post the full code now!
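for (i = max_ele; i >= 1; --i)
{
    to_add = F[i];       // F[x]: frequency of value x in A[]
    to_subtract = 0;     // subsets whose GCD is a larger multiple of i
    for (j = 2; j * i <= max_ele; ++j)
    {
        to_add += F[j * i];
        to_subtract += F2[j * i];
        if (to_subtract >= MOD - 1)
            to_subtract %= (MOD - 1);
    }
    // exponents are kept modulo MOD-1 (Fermat's little theorem)
    subsets = ((power(2, to_add, MOD - 1) - 1) - to_subtract) % (MOD - 1);
    if (subsets < 0)
        subsets = (subsets % (MOD - 1) + MOD - 1) % (MOD - 1);
    ans = ans * power(i, subsets, MOD);
    F2[i] = subsets;     // F2[i]: number of subsets with GCD i, modulo MOD-1
    ans %= MOD;
}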
I feel like I complicated things by using F2; I suspect we can do it without F2 by not taking j = 1, but I haven't thought it through, and this is how I managed to get AC.

Related

Count Divisors of Product from L to R

I have been solving a problem but then got stuck upon its subpart which is as follows:
Given an array of N elements whose ith element is A[i] and we are given Q queries of the type [L,R].
For each query output the number of divisors of product from Lth element to Rth element.
More formally, for each query lets define P as P = A[L] * A[L+1] * A[L+2] * ...* A[R].
Output the number of divisors of P modulo 998244353.
Constraints : 1<= N,Q <= 100000, 1<= A[i] <= 1000000.
My Approach,
For each index i, I have defined a map< int, int > which stores each prime divisor and its count in the product of the prefix [1, i].
I am extracting the prime divisors of a number in O(LogN) using a sieve.
Then for each query (let's say {L,R}), I am iterating through the map of the Lth element and subtracting the count of each key from the map of the Rth element.
And then I am answering the query using the result:
if N = a^p * b^q * c^r ...(a,b,c being primes)
the number of divisors = (p+1)(q+1)(r+1)..
The time complexity of above solution is O(ND + QD), where D = number of distinct prime numbers upto 1000000. In worst case D = 78498.
Is there more efficient solution than this?
There is a more efficient solution for this. But it is slightly complicated. Here are steps to get to the necessary data structure.
1. Define a data type prime_factor that is a struct that contains a prime and a count.
2. Define a data type prime_factorization that is a vector of the first data type in ascending size of the primes. This can store the factorization of a number.
3. Write a function that takes a number, and turns its prime factorization into a prime_factorization.
4. Write a function that takes 2 prime_factorization vectors and merges them into the factorization of the product of the two.
5. For each number in your array, compute its prime factorization. That gets stored in an array.
6. For each pair in your array, compute the prime factorization of the product. We will only need half of them. So elements 0, 1 go into one factorization, 2, 3 into the next and so on.
7. Repeat step 6 O(log(N)) times. So you have a vector of the factorization of each number, pairs, fours, eights, and so on. This results in approximately 2N precomputed factorization vectors. Most vectors are small though a few can be up to O(D) in size (where D is the number of distinct primes). Most of the merges should be very, very fast.
And now you have all of your data prepared. It can't take more than O(log(N)) times the space that storing the prime factors required by itself. (Less than that normally, though, because repeats among the small primes get gathered together in one prime_factor.)
Any range is the union of at most O(log(N)) of these computed vectors. For example the range 10..25 can be broken up into 10..11, 12..15, 16..23, 24..25. Arrange these intervals from smallest to largest and merge them. Then compute your answer from the result.
An exact analysis is complicated. But I assure you that query time is bounded above by O(Q * D * log(N)) and normally is much less than that.
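The data types and the merge function described in the first steps could look roughly like the following sketch. The type names prime_factor and prime_factorization come from the answer; the function names factorize, merge_factorizations and divisor_count are assumptions for illustration, and plain trial division stands in for the sieve-based factorization the question mentions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct prime_factor {
    int prime;
    int count;
};
using prime_factorization = std::vector<prime_factor>;  // sorted by ascending prime

// Trial division; a smallest-prime-factor sieve would be faster for many numbers.
prime_factorization factorize(int n) {
    prime_factorization result;
    for (int p = 2; static_cast<long long>(p) * p <= n; ++p) {
        if (n % p != 0) continue;
        int count = 0;
        while (n % p == 0) { n /= p; ++count; }
        result.push_back({p, count});
    }
    if (n > 1) result.push_back({n, 1});  // leftover prime factor
    return result;
}

// Factorization of the product: a sorted merge that adds counts of equal primes.
prime_factorization merge_factorizations(const prime_factorization& a,
                                         const prime_factorization& b) {
    prime_factorization out;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].prime < b[j].prime) {
            out.push_back(a[i++]);
        } else if (b[j].prime < a[i].prime) {
            out.push_back(b[j++]);
        } else {
            out.push_back({a[i].prime, a[i].count + b[j].count});
            ++i;
            ++j;
        }
    }
    while (i < a.size()) out.push_back(a[i++]);
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}

// Number of divisors (p+1)(q+1)(r+1)... modulo mod.
std::int64_t divisor_count(const prime_factorization& f, std::int64_t mod) {
    std::int64_t result = 1 % mod;
    for (const prime_factor& pf : f) {
        result = result * ((pf.count + 1) % mod) % mod;
    }
    return result;
}
```

Merging the factorizations of 12 and 10 gives 2^3 * 3 * 5, the factorization of 120, which has 4 * 2 * 2 = 16 divisors.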
UPDATE:
How do you find those intervals?
The answer is that you need to identify the number divisible by the highest power of 2 in the range, and then fill out both sides from there. You figure that out by dividing both ends by 2 (rounding down) until they differ by 1. Then multiply the top boundary by that power of 2 to find the mid-point.
For example, if your range was 35-53 you would repeatedly divide by 2 to get 35-53, 17-26, 8-13, 4-6, 2-3. That was 2^4 we divided by, so our power-of-2 mid-point is 3*2^4 = 48. Our intervals from that midpoint upwards are then 48-51, 52-53. Our intervals below are 40-47, 36-39, 35-35. Each of them has a length that is a power of 2 and starts at a number divisible by that power of 2.
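One way to produce such a decomposition in code is sketched below. Note that it swaps in a simpler greedy walk from the left end instead of the midpoint procedure described above: at each position it takes the largest block that starts there, is aligned to its own length, and still fits in the range. The function name aligned_blocks is hypothetical, and the sketch assumes 0 <= lo.

```cpp
#include <utility>
#include <vector>

// Splits [lo, hi] into blocks whose length is a power of two and whose
// start is divisible by that length. Assumes 0 <= lo.
std::vector<std::pair<long long, long long>> aligned_blocks(long long lo, long long hi) {
    std::vector<std::pair<long long, long long>> blocks;
    while (lo <= hi) {
        long long len = 1;
        // Double the block while it stays aligned and inside the range.
        while (lo % (2 * len) == 0 && lo + 2 * len - 1 <= hi) {
            len *= 2;
        }
        blocks.emplace_back(lo, lo + len - 1);
        lo += len;
    }
    return blocks;
}
```

For the range 35..53 this yields five blocks, each of which corresponds to one precomputed factorization vector.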

A problem of taking combination for set theory

Given an array A with size N. Value of a subset of Array A is defined as product of all numbers in that subset. We have to return the product of values of all possible non-empty subsets of array A %(10^9+7).
E.G. array A {3,5}
Value{3} = 3,
Value{5} = 5,
Value{3,5} = 5*3 = 15
answer = 3*5*15 %(10^9+7).
Can someone explain the mathematics behind the problem? I am thinking of using combinations to solve it efficiently.
I have tried brute force; it gives the correct answer but it is way too slow.
The next approach is using combinations. I think that if we take all the sets and multiply all the numbers in those sets, we will get the correct answer. Thus I have to find out how many times a number appears in the calculation of the answer. In the example, 5 and 3 both appear 2 times. If we look closely, each number in A appears the same number of times.
You're heading in the right direction.
Let x be an element of the given array A. In our final answer, x appears p number of times, where p is equivalent to the number of subsets of A possible that include x.
How to calculate p? Once we have decided that we will definitely include x in our subset, we have two choices for the rest N-1 elements: either include them in set or do not. So, we conclude p = 2^(N-1).
So, each element of A appears exactly 2^(N-1) times in the final product. All that remains is to calculate the answer: (a1 * a2 * ... * an)^p. Since the exponent is very large, you can use binary exponentiation for fast calculation.
As Matt Timmermans suggested in comments below, we can obtain our answer without actually calculating p = 2^(N-1). We first calculate the product a1 * a2 * ... * an. Then, we simply square this product n-1 times.
The corresponding code in C++:
int func(vector<int> &a) {
    int n = a.size();
    int m = 1e9+7;
    if(n==0) return 0;
    if(n==1) return (m + a[0]%m)%m;
    long long ans = 1;
    //first calculate ans = (a1*a2*...*an)%m
    for(int x:a){
        //negative sign does not matter since we're squaring
        if(x<0) x *= -1;
        x %= m;
        ans *= x;
        ans %= m;
    }
    //now calculate ans = [ ans^(2^(n-1)) ]%m
    //we do this by squaring ans n-1 times
    for(int i=1; i<n; i++){
        ans = ans*ans;
        ans %= m;
    }
    return (int)ans;
}
Let
A = {a,b,c}
All possible subsets of A are {{},{a},{b},{c},{a,b},{b,c},{c,a},{a,b,c}}
Here each element occurs 4 times.
So if A = {a,b,c,d}, then each element occurs 2^3 times.
So if the size of A is n, each element occurs 2^(n-1) times.
So the final result is a1^p * a2^p * a3^p * ... * an^p
where p is 2^(n-1).
We need to compute x^(2^(n-1)) % mod.
By Fermat's little theorem (Euler's theorem in general), when x is not divisible by mod we can write x^(2^(n-1)) % mod as x^(2^(n-1) % phi(mod)) % mod.
As mod is prime, phi(mod) = mod-1.
So first find p = 2^(n-1) % (mod-1).
Then compute Ai^p % mod for each number and multiply it into the final result.
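A minimal sketch of this exponent-reduction approach, assuming mod = 10^9+7 as in the question. The names power_mod and product_of_subset_products are hypothetical. Elements divisible by mod are still handled correctly here, because the reduced exponent p is never 0 for this particular modulus (mod - 1 has an odd factor greater than 1, so it never divides a power of 2) and 0^p is 0.

```cpp
#include <cstdint>
#include <vector>

// Modular exponentiation by repeated squaring.
std::int64_t power_mod(std::int64_t base, std::int64_t exp, std::int64_t mod) {
    std::int64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// Product of the values of all non-empty subsets, modulo 10^9+7.
std::int64_t product_of_subset_products(const std::vector<std::int64_t>& a) {
    const std::int64_t mod = 1000000007;
    if (a.empty()) return 1;  // assumption: empty product when there are no subsets
    // p = 2^(n-1) reduced modulo phi(mod) = mod - 1
    const std::int64_t p =
        power_mod(2, static_cast<std::int64_t>(a.size()) - 1, mod - 1);
    std::int64_t ans = 1;
    for (std::int64_t x : a) {
        const std::int64_t r = ((x % mod) + mod) % mod;  // normalise into [0, mod)
        ans = ans * power_mod(r, p, mod) % mod;
    }
    return ans;
}
```

For A = {3,5} this gives 3^2 * 5^2 = 225, matching 3 * 5 * 15 from the question.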
I read the previous answers and worked through the process of forming the sets. Here I am trying to put it as simply as possible so that people can apply it to similar problems.
Let i be an element of array A. Following the approach given in the question, i appears p times in the final answer.
Now, how do we form the different sets? We take the sets containing only one element, then the sets of two elements, then of 3, ..., up to n elements.
For each size, say sets of 3 elements, we want to know how many of these sets contain i.
There are n elements, so the number of 3-element sets that contain i is (n-1)C(3-1), because we choose the remaining 3-1 elements from the other n-1 elements.
If we do this for every size x, p = sum of (n-1)C(x-1) for x going from 1 to n. Thus p = 2^(n-1).
Similarly, p is the same for every element i. Thus we get
final answer = A[0]^p * A[1]^p * ... * A[n-1]^p

Printing sum of non abundant numbers in haskell

This is Project Euler Problem 23: Non-abundant Sums.
A perfect number is a number for which the sum of its proper divisors is exactly equal to the number. For example, the sum of the proper divisors of 28 would be 1 + 2 + 4 + 7 + 14 = 28, which means that 28 is a perfect number.
A number n is called deficient if the sum of its proper divisors is less than n and it is called abundant if this sum exceeds n.
As 12 is the smallest abundant number, 1 + 2 + 3 + 4 + 6 = 16, the smallest number that can be written as the sum of two abundant numbers is 24. By mathematical analysis, it can be shown that all integers greater than 28123 can be written as the sum of two abundant numbers. However, this upper limit cannot be reduced any further by analysis even though it is known that the greatest number that cannot be expressed as the sum of two abundant numbers is less than this limit.
Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.
Here the sumOfPD function returns the sum of proper divisors.
I wrote the following code which doesn't work.
sumOfPD :: Integral a => a -> a
sumOfPD x = sum([y | y <- [1..x], rem x y == 0]) - x

main = do
    print (sum ([x + y | x <- [1..], y <- [1..], x + y < 28124, sumOfPD x <= x, sumOfPD y <= y]))
I'm new to Haskell. Please help me resolve the error.
You have two problems. One is largely mathematical and one is largely about Haskell semantics. Both stem from a lack of care and clarity of thought; you should think more carefully and slowly about how to write a program which does less work to get to the answer. I'm not going to write down any solution or correct version (indeed Project Euler discourages sharing solutions) as that won't help you and it won't help anyone who comes across this via Google.
In your sum in main you're counting some numbers multiple times. For example $1+2+4+5+10=22>20$ so 20 is abundant. Your list includes $32=12+20=20+12$ at least twice. Note [32,32] /= [32]. Also note that this isn't just an issue with counting $x+y$ and $y+x$: there might be some numbers which are the sum of two abundant numbers in two (non-trivially) different ways.
Due to the nature of list comprehensions in Haskell, in main, x will only ever take a value of 1, as the values considered are (x,y)=(1,1),(1,2),(1,3),(1,4),... and then each of those values is tested. There is a point after which all values are rejected as x+y>=28124, but you never move on to the next x value. Indeed all values are rejected, as 1 is not abundant. Try changing [1..] to [1..n] where n is something you should decide on. Alternatively, change it to a list of abundant numbers up to some limit. Cf. takeWhile and filter.

Sum of products of Fibonacci numbers [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
Given a series
Fib(1) * Fib(n+2) + Fib(2) * Fib(n+1) + Fib(3) * Fib(n) + ... + Fib(n-1) * Fib(4)
or Summation Fib(x) * Fib(n-x+3) where x varies from 1 to n-1
where Fib(n) is the nth number of the Fibonacci series.
To evaluate this series, Fib(n) can be calculated using matrix exponentiation.
But the complexity of that is O(log n) per term, so for the n terms it would be O(n log n).
I want this series to be reduced to a single term, or some other optimization that improves the time complexity.
Any suggestions?
I can't reduce the sum to a single term, but it can be reduced to a sum of five terms, which reduces the complexity to O(log n) arithmetic operations.
However, Fib(n) has Θ(n) bits, so the number of bit-operations is not logarithmic. There is a multiplication of a number the size of Fib(n) with n-1, so the number of bit-operations is M(n,log n), where M(a,b) is the bit-operation complexity of a multiplication of an a-bit number with a b-bit number. For the naive algorithm, M(a,b) = a*b, so the number of bit-operations in the below algorithm is O(n*log n).
The fact that allows this reduction is that Fibonacci numbers (like all numbers in a sequence defined by a linear recurrence) can be written as the sum of pure exponential terms, in particular
Fib(n) = (α^n - β^n) / (α - β)
where
α = (1 + √5)/2; β = (1 - √5)/2.
In addition to the Fibonacci numbers, I also use the Lucas numbers, which follow the same recurrence as the Fibonacci numbers,
Luc(n) = α^n + β^n
so the sequence of Lucas numbers (starting from index 0) begins with
2 1 3 4 7 11 18 29 47 ...
The relation Luc(n) = Fib(n+1) + Fib(n-1) allows an easy conversion between Fibonacci and Lucas numbers, and computation of Luc(n) in O(log n) steps can reuse the Fibonacci code.
So with the representation of Fibonacci numbers given above, we find
(α - β)^2 * Fib(k) * Fib(n+3-k) = (α^k - β^k) * (α^(n+3-k) - β^(n+3-k))
= α^(n+3) + β^(n+3) - (α^k * β^(n+3-k)) - (α^(n+3-k) * β^k)
= Luc(n+3) - ((-1)^k * α^(2k) * β^(n+3)) - ((-1)^k * α^(n+3) * β^(2k))
using the relation α * β = -1.
Now, since α - β = √5 the summation k = 1, ..., n-1 yields
5 * ∑_{k=1}^{n-1} Fib(k)*Fib(n+3-k) = (n-1)*Luc(n+3) - β^(n+3) * ∑_{k=1}^{n-1} (-α²)^k - α^(n+3) * ∑_{k=1}^{n-1} (-β²)^k
The geometric sums can be written in closed form, and a bit of juggling yields the formula
∑_{k=1}^{n-1} Fib(k)*Fib(n+3-k) = [5*(n-1)*Luc(n+3) + Luc(n+2) + 2*Luc(n+1) - 2*Luc(n-3) + Luc(n-4)]/25
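As a sanity check, the five-term formula can be compared against the direct sum. The sketch below is an illustration with several assumptions: it uses exact 64-bit integers, so it is only meant for small n (a real implementation for large n would need big integers or modular arithmetic); it computes Fibonacci numbers by fast doubling rather than matrix exponentiation; and it extends the Lucas numbers to negative indices with Luc(-m) = (-1)^m * Luc(m), which the formula needs when n < 4. All function names are hypothetical.

```cpp
#include <utility>

// Fast doubling: returns (Fib(n), Fib(n+1)) for n >= 0.
std::pair<long long, long long> fib_pair(long long n) {
    if (n == 0) return std::make_pair(0LL, 1LL);
    const std::pair<long long, long long> half = fib_pair(n / 2);
    const long long a = half.first, b = half.second;
    const long long c = a * (2 * b - a);  // Fib(2k)
    const long long d = a * a + b * b;    // Fib(2k+1)
    if (n % 2 == 0) return std::make_pair(c, d);
    return std::make_pair(d, c + d);
}

long long fib(long long n) { return fib_pair(n).first; }

// Luc(n) = Fib(n+1) + Fib(n-1) = 2*Fib(n+1) - Fib(n); Luc(-m) = (-1)^m * Luc(m).
long long lucas(long long n) {
    if (n < 0) {
        const long long v = lucas(-n);
        return (n % 2 == 0) ? v : -v;
    }
    const std::pair<long long, long long> p = fib_pair(n);
    return 2 * p.second - p.first;
}

// Closed form: O(log n) arithmetic operations.
long long fib_product_sum_closed(long long n) {
    return (5 * (n - 1) * lucas(n + 3) + lucas(n + 2) + 2 * lucas(n + 1)
            - 2 * lucas(n - 3) + lucas(n - 4)) / 25;
}

// Direct evaluation of the sum, for comparison.
long long fib_product_sum_direct(long long n) {
    long long total = 0;
    for (long long k = 1; k <= n - 1; ++k) total += fib(k) * fib(n + 3 - k);
    return total;
}
```

For n = 5 both functions return 1*13 + 1*8 + 2*5 + 3*3 = 40.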
Steps to follow:
Define std::vector<int> and fill it with all fibonacci numbers which you need. Compute these numbers using dynamic programming; that is, make use of results which you already have computed. Don't compute the same value more than once!
Once you have the vector filled with all the numbers you need, apply the formula Fib(x) * Fib (n-x+3) in a loop and compute the sum of the products.
Assuming f(n)=f(n-1)+f(n-2) for n>2 and f(1)=f(2)=1, f(0)=0
Let E(n)=sum(k=1 .. n-1, f(k)f(n-k+3)) and E'(n)=sum(k=0 .. n, f(k)f(n-k))
Clearly E(n)=E'(n+3) - ( f(0)f(n+3) + f(n)f(3) + f(n+1)f(2) + f(n+2)f(1) + f(n+3)f(0) )
On Wolfram you'll find: E'(n) = (nL(n)-f(n))/5, with L(n)=f(n+1)+f(n-1)
From this, using f(0)=0, f(1)=f(2)=1 and f(3)=2:
5E(n) = (n+3)( f(n+4)+f(n+2) ) - f(n+3) - 5(2f(n) + f(n+1) + f(n+2))
5E(n) = (n+3)( 4f(n+1)+3f(n) ) - 12f(n+1) - 16f(n)
5E(n) = 4n*f(n+1) + (3n-7)f(n)
For example, n = 5 gives 5E(5) = 20*8 + 8*5 = 200, so E(5) = 40.
This should be simple to evaluate; the complexity is the same as that of the Fibonacci algorithm used.
If I know what a Lucas number is, then I can use the identity...
F(n)*F(m) = [L(m+n) - (-1)^n*L(m-n)]/5
See, this reduces the sum of products of Fibonacci numbers to a sum of Lucas numbers.
Even better, since n+m is constant for each term, the sum reduces further. There are n-1 terms in this sum, so the L(m+n) parts collapse to (n-1)*L(n+3) and the whole sum becomes
[(n-1)*L(n+3) + L(n+1) - L(n-1) + L(n-3) - ... + (-1)^n*L(5-n)]/5
As a test, for n = 5, I get 40, which is consistent with the direct sum of products:
1*13 + 1*8 + 2*5 + 3*3 = 40
For n = 1000, I get a sum of:
82283375600539014079871356568026158421560221654785733943009487102720
211767741849325389562067921531531130739623611293922046989610820831567088
516047002196966545744637588824274730947688693969572937880383134671205375
Even better, some extra work will let that sum of Lucas numbers contract further since the Lucas numbers of negative index have a simple relation to those with positive index.
As far as computing the k'th Lucas (as well as the k'th Fibonacci) number, it can be done quite efficiently as shown here.

Find {E1,..En} (E1+E2+..En=N, N is given) with the following property that E1* E2*..En is Maximum

Given the number N, write a program that computes the numbers E1, E2, ...En with the following properties:
1) N = E1 + E2 + ... + En;
2) E1 * E2 * ... En is maximum.
3) E1..En, are integers. No negative values :)
How would you do that? I have a solution based on divide et impera (divide and conquer), but I want to check whether it is optimal.
Example: N=10
5,5 S=10,P=25
3,2,3,2 S=10,P=36
No need for an algorithm, mathematical intuition can do it on its own:
Step 1: prove that a result set with numbers higher than 3 is at most as good as a result set with only 3's and 2's
Given any number x in your result set, one might consider whether it would be better to divide it into two numbers t and x - t.
The sum should still be x.
When x is even, the maximum of t(x - t) is reached at t = x/2, where it equals x²/4. This is greater than x for every x > 4, equal to x for the special case x = 4 (see note 1), and smaller than x only for the special case x = 2.
When x is odd, the maximum of t(x - t) is reached at t = (x ± 1)/2, where it equals (x² - 1)/4. This is greater than x for every x >= 5 and smaller than x for x = 3.
What does this show? Only that you should only have 3's and 2's in your final set, because otherwise it is suboptimal (or equivalent to an optimal set).
Step 2: you should have as many 3's as possible
Now, as 3² > 2³, replacing three 2's with two 3's keeps the sum and increases the product, so you should have as many 3's as possible as long as the remainder is not 1.
Conclusion: for every N >= 3:
If N = 0 mod 3, then the result set is only 3's
If N = 1 mod 3, then the result set has one pair of 2's (or a 4) and the rest is 3's
If N = 2 mod 3, then the result set has one 2 and the rest is 3's
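The three cases translate directly into code. This is a small sketch with a hypothetical function name, max_product_parts; it assumes N >= 1 and returns N itself as the single part when N <= 3, which falls outside or at the edge of the N >= 3 conclusion stated above.

```cpp
#include <vector>

// Parts E1..En (3's first, then 2's) that sum to N and maximise the product.
// Assumes N >= 1; for N <= 3 the single part N is returned.
std::vector<int> max_product_parts(int N) {
    if (N <= 3) return {N};
    int twos = 0;
    if (N % 3 == 1) twos = 2;       // one pair of 2's (equivalently a 4)
    else if (N % 3 == 2) twos = 1;  // a single 2
    const int threes = (N - 2 * twos) / 3;
    std::vector<int> parts(threes, 3);
    parts.insert(parts.end(), twos, 2);
    return parts;
}
```

For N = 10 this returns 3, 3, 2, 2 with product 36, the same multiset as the 3,2,3,2 example in the question.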
Please correct this post. The times when I was writing well-structured mathematical proofs is far away...
Note 1: (2,4) is the only pair of distinct positive integers such that x^y = y^x. You can prove that with:
x^y = y^x
y ln(x) = x ln(y)
ln(x)/x = ln(y) / y
and the function ln(t)/t is strictly decreasing after its global maximum, reached at t = e, between 2 and 3, so if you want two distinct integers such that ln(x)/x = ln(y)/y, one of them must be lower than or equal to 2. From that you can infer that only (2,4) works.
This is not a complete solution, but might help.
First off, note that if you fix n, and two of the terms E_i and E_j differ by more than one (for example 3 and 8), then you can do better by "equalizing" them as much as possible, i.e., if the number p = E_i + E_j is even, you do better by replacing both terms with p/2. If p is odd, you do better by replacing them with p/2 and p/2+1 (where / is integer division).
That said, if you knew the optimal number of terms, n, you'd be done: let every E_i equal either N/n or N/n+1 (again integer division), with N mod n of them equal to N/n+1, so that their sum is still N (this is now a straightforward problem).
So the question now is what the optimal n is. Suppose for the moment that you are allowed to use real numbers. Then the solution would be N/n for each term and you could write the product as (N/n)^n. If you differentiate this with respect to n and find the root, you find that n should be equal to N/e (where e is Napier's constant, also known as Euler's number, e = 2.71828...). Therefore, I'd look for a solution where either n = floor(N/e) or n = floor(N/e)+1, and then choose all the E_i's equal to either N/n or N/n+1, as above.
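For reference, the differentiation step can be written out as follows, treating n as a real variable and writing P(n) = (N/n)^n (the symbol P is introduced here only for this derivation). Working with the logarithm keeps it short, and the negative second derivative confirms that the stationary point is a maximum.

```latex
\[
\ln P(n) = n\,(\ln N - \ln n), \qquad
\frac{d}{dn}\ln P(n) = \ln N - \ln n - 1 = 0
\;\Longrightarrow\; n = \frac{N}{e},
\qquad
\frac{d^2}{dn^2}\ln P(n) = -\frac{1}{n} < 0 .
\]
```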
Hope that helps.
The Online Encyclopedia of Integer Sequences gives a recurrence relation for the solution to this problem.
I'll leave it up to someone else to compare complexities. Not sure I can figure out the complexity of OP's method.