Printing sum of non abundant numbers in haskell - list

This is a Project Euler Problem 23: Non-abundant Sums.
A perfect number is a number for which the sum of its proper divisors is exactly equal to the number. For example, the sum of the proper divisors of 28 would be 1 + 2 + 4 + 7 + 14 = 28, which means that 28 is a perfect number.
A number n is called deficient if the sum of its proper divisors is less than n and it is called abundant if this sum exceeds n.
As 12 is the smallest abundant number, 1 + 2 + 3 + 4 + 6 = 16, the smallest number that can be written as the sum of two abundant numbers is 24. By mathematical analysis, it can be shown that all integers greater than 28123 can be written as the sum of two abundant numbers. However, this upper limit cannot be reduced any further by analysis even though it is known that the greatest number that cannot be expressed as the sum of two abundant numbers is less than this limit.
Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.
Here the sumOfPD function returns the sum of proper divisors.
I wrote the following code which doesn't work.
sumOfPD :: Integral a => a -> a
sumOfPD x = sum([y | y <- [1..x], rem x y == 0]) - x
main = do
print (sum ([x + y | x <- [1..], y <- [1..], x + y < 28124, sumOfPD x <= x, sumOfPD y <= y]))
I'm new to Haskell. Please help me resolve error.

You have two problems. One is largely mathematical and one is largely about Haskell semantics. Both stem from a lack of care and clarity of thought; you should think more carefully and slowly about how to write a program which does less work to get to the answer. I'm not going to write down any solution or correct version (indeed project Euler discourages sharing solutions) as that won't help you and it won't help anyone who comes across this by google.
In your sum in main you're counting some numbers multiple times. For example $1+2+4+5+10=21>20$ so 20 is abundant. Your list includes $32=12+20=20+12$ at least twice. Note [32,32] /= [32]. Also note that this isn't just an issue with counting $x+y$ and $y+x$, there might be some numbers which are the sum of two ambiguous in two (non-trivially) different ways.
Due to the nature of list comprehensions in Haskell, in main, x will only ever take a value of 1 as the values considered are (x,y)=(1,1),(1,2),(1,3),(1,4),... and then each of those values is tested. There is a point after which all values are rejected as x+y>=28124 but you never move on to the next x value. Indeed all values are rejected as 1 is not abundant. Try changing [1..] to [1..n] where n is something you should decide on. Alternatively, change it to a list of abundant numbers up to some limit. Cf takeWhile and filter

Related

how to find the minimum number of primatics that sum to a given number

Given a number N (<=10000), find the minimum number of primatic numbers which sum up to N.
A primatic number refers to a number which is either a prime number or can be expressed as power of prime number to itself i.e. prime^prime e.g. 4, 27, etc.
I tried to find all the primatic numbers using seive and then stored them in a vector (code below) but now I am can't see how to find the minimum of primatic numbers that sum to a given number.
Here's my sieve:
#include<algorithm>
#include<vector>
#define MAX 10000
typedef long long int ll;
ll modpow(ll a, ll n, ll temp) {
ll res=1, y=a;
while (n>0) {
if (n&1)
res=(res*y)%temp;
y=(y*y)%temp;
n/=2;
}
return res%temp;
}
int isprimeat[MAX+20];
std::vector<int> primeat;
//Finding all prime numbers till 10000
void seive()
{
ll i,j;
isprimeat[0]=1;
isprimeat[1]=1;
for (i=2; i<=MAX; i++) {
if (isprimeat[i]==0) {
for (j=i*i; j<=MAX; j+=i) {
isprimeat[j]=1;
}
}
}
for (i=2; i<=MAX; i++) {
if (isprimeat[i]==0) {
primeat.push_back(i);
}
}
isprimeat[4]=isprimeat[27]=isprimeat[3125]=0;
primeat.push_back(4);
primeat.push_back(27);
primeat.push_back(3125);
}
int main()
{
seive();
std::sort(primeat.begin(), primeat.end());
return 0;
}
One method could be to store all primatics less than or equal to N in a sorted list - call this list L - and recursively search for the shortest sequence. The easiest approach is "greedy": pick the largest spans / numbers as early as possible.
for N = 14 you'd have L = {2,3,4,5,7,8,9,11,13}, so you'd want to make an algorithm / process that tries these sequences:
13 is too small
13 + 13 -> 13 + 2 will be too large
11 is too small
11 + 11 -> 11 + 4 will be too large
11 + 3 is a match.
You can continue the process by making the search function recurse each time it needs another primatic in the sum, which you would aim to have occur a minimum number of times. To do so you can pick the largest -> smallest primatic in each position (the 1st, 2nd etc primatic in the sum), and include another number in the sum only if the primatics in the sum so far are small enough that an additional primatic won't go over N.
I'd have to make a working example to find a small enough N that doesn't result in just 2 numbers in the sum. Note that because you can express any natural number as the sum of at most 4 squares of natural numbers, and you have a more dense set L than the set of squares, so I'd think it rare you'd have a result of 3 or more for any N you'd want to compute by hand.
Dynamic Programming approach
I have to clarify that 'greedy' is not the same as 'dynamic programming', it can give sub-optimal results. This does have a DP solution though. Again, i won't write the final process in code but explain it as a point of reference to make a working DP solution from.
To do this we need to build up solutions from the bottom up. What you need is a structure that can store known solutions for all numbers up to some N, this list can be incrementally added to for larger N in an optimal way.
Consider that for any N, if it's primatic then the number of terms for N is just 1. This applies for N=2-5,7-9,11,13,16,17,19. The number of terms for all other N must be at least two, which means either it's a sum of two primatics or a sum of a primatic and some other N.
The first few examples that aren't trivial:
6 - can be either 2+4 or 3+3, all the terms here are themselves primatic so the minimum number of terms for 6 is 2.
10 - can be either 2+8, 3+7, 4+6 or 5+5. However 6 is not primatic, and taking that solution out leaves a minimum of 2 terms.
12 - can be either 2+10, 3+9, 4+8, 5+7 or 6+6. Of these 6+6 and 2+10 contain non-primatics while the others do not, so again 2 terms is the minimum.
14 - ditto, there exist two-primatic solutions: 3+11, 5+9, 7+7.
The structure for storing all of these solutions needs to be able to iterate across solutions of equal rank / number of terms. You already have a list of primatics, this is also the list of solutions that need only one term.
Sol[term_length] = list(numbers). You will also need a function / cache to look up some N's shortest-term-length, eg S(N) = term_length iif N in Sol[term_length]
Sol[1] = {2,3,4,5 ...} and Sol[2] = {6,10,12,14 ...} and so on for Sol[3] and onwards.
Any solution can be found using one term from Sol[1] that is primatic. Any solution requiring two primatics will be found in Sol[2]. Any solution requiring 3 will be in Sol[3] etc.
What you need to recognize here is that a number S(N) = 3 can be expressed Sol[1][a] + Sol[1][b] + Sol[1][c] for some a,b,c primatics, but it can also be expressed as Sol[1][a] + Sol[2][d], since all Sol[2] must be expressible as Sol[1][x] + Sol[1][y].
This algorithm will in effect search Sol[1] for a given N, then look in Sol[1] + Sol[K] with increasing K, but to do this you will need S and Sol structures roughly in the form shown here (or able to be accessed / queried in a similar manner).
Working Example
Using the above as a guideline I've put this together quickly, it even shows which multi-term sum it uses.
https://ideone.com/7mYXde
I can explain the code in-depth if you want but the real DP section is around lines 40-64. The recursion depth (also number of additional terms in the sum) is k, a simple dual-iterator while loop checks if a sum is possible using the kth known solutions and primatics, if it is then we're done and if not then check k+1 solutions, if any. Sol and S work as described.
The only confusing part might be the use of reverse iterators, it's just to make != end() checking consistent for the while condition (end is not a valid iterator position but begin is, so != begin would be written differently).
Edit - FYI, the first number that takes at least 3 terms is 959 - had to run my algorithm to 1000 numbers to find it. It's summed from 6 + 953 (primatic), no matter how you split 6 it's still 3 terms.

Finding GCD of a set of numbers?

So, I was asked this question in an interview. Given a group of numbers (not necessarily distinct), I have to find the multiplication of GCD's of all possible subsets of the given group of numbers.
My approach which I told the interviewer:
1. Recursively generate all possible subsets of the given set.
2a. For a particular subset of the given set:
2b. Find GCD of that subset using the Euclid's Algorithm.
3. Multiply it in the answer being obtained.
Assume GCD of an empty set to be 1.
However, there will be 2^n subsets and this won't work optimally if the n is large. How can I optimise it?
Assume that each array element is an integer in the range 1..U for some U.
Let f(x) be the number of subsets with GCD(x). The solution to the problem is then the sum of d^f(d) for all distinct factors 1 <= d <= U.
Let g(x) be the number of array elements divisible by x.
We have
f(x) = 2^g(x) - SUM(x | y, f(y))
We can compute g(x) in O(n * sqrt(U)) by enumerating all divisors of every array element. f(x) can be computed in O(U log U) from high to low values, by enumerating every multiple of x in the straightforward manner.
Pre - Requisite :
Fermat's little theorem (there is a generalised theorem too) , simple maths , Modular exponentiation
Explanation : Notations : A[] stands for our input array
Clearly the constraints 1<=N<=10^5 , tell me that either you need a O(N * LOG N ) solution , dont try to think DP as its complexity according to me will be N * max(A[i]) i.e. approx. 10^5 * 10 ^ 6 . Why? because you need the GCD of the subsets to make a transition.
Ok , moving on
We can think of clubbing the subsets with the same GCD so as to make the complexity.
So , lets decrement an iterator i from 10^6 to 1 trying to make the set with GCD i !
Now to make the subset with GCD(i) I can club it with any i*j where j is a non negative Integer. Why ?
GCD(i , i*j ) = i
Now ,
We can build a frequency table for any element as the number is quite reachable!
Now , during the contest what I did was , keep the number of subsets with gcd(i) at f2[i]
hence what we do is sum frequency of all elements from j*i where j varies from 1 to floor(i/j)
now the subsets with a common divisor(and not GCD) as i is (2^sum - 1) .
Now we have to subtract from this sum the subsets with GCD greater than i and having i as a common divisor of gcd as i.
This can also be done within the same loop by taking summation of f2[i*j] where j varies from 1 to floor(i/j)
Now the subsets with GCD i equal to 2^sum -1 - summation of f2[ij] Just multiply i ( No . of subsets with GCD i times ) i.e. power ( i , 2^sum -1 - summation of f2[ij] ) . But now to calculate this the exponent part can overflow but you can take its % with given MOD-1 as MOD was prime! (Fermat little theorem) using modular exponentiation
Here is a snippet of my code as I am unsure that can we post the code now!
for(i=max_ele; i >= 1;--i)
{
to_add=F[i];
to_subtract = 0 ;
for(j=2 ;j*i <= max_ele;++j)
{
to_add+=F[j*i];
to_subtract+=F2[j*i];
to_subtract>=(MOD-1)?(to_subtract%=(MOD-1)):0;
}
subsets = (((power(2 , to_add , MOD-1) ) - 1) - to_subtract)%(MOD-1) ;
if(subsets<0)
subsets = (subsets%(MOD-1) +MOD-1)%(MOD-1);
ans = ans * power(i , subsets , MOD);
F2[i]= subsets;
ans %=MOD;
}
I feel like I had complicated the things by using F2, I feel like we can do it without F2 by not taking j = 1. but it's okay I haven't thought about it and this is how I managed to get AC .

Find {E1,..En} (E1+E2+..En=N, N is given) with the following property that E1* E2*..En is Maximum

Given the number N, write a program that computes the numbers E1, E2, ...En with the following properties:
1) N = E1 + E2 + ... + En;
2) E1 * E2 * ... En is maximum.
3) E1..En, are integers. No negative values :)
How would you do that ? I have a solution based on divide et impera but i want to check if is optimal.
Example: N=10
5,5 S=10,P=25
3,2,3,2 S=10,P=36
No need for an algorithm, mathematic intuition can do it on its own:
Step 1: prove that a result set with numbers higher than 3 is at most as good as a result set with only 3's and 2's
Given any number x in your result set, one might consider whether it would be better to divide it into two numbers.
The sum should still be x.
When x is even, The maximum for t (x - t) is reached when t = x/2 , and except for the special case x = 2, then it is greater than x, and for the special case x = 4, equal to x (see note 1).
When x is odd, The maximum for t (x - t) is reached when t = (x ± 1)/2.
What does this show? Only that you should only have 3's and 2's in your final set, because otherwise it is suboptimal (or equivalent to an optimal set).
Step 2: you should have as many 3's as possible
Now, as 3² > 2³, you should have as many 3's as possible as long as the remainder is not 1.
Conclusion: for every N >= 3:
If N = 0 mod 3, then the result set is only 3's
If N = 1 mod 3, then the result set has one pair of 2's (or a 4) and the rest is 3's
If N = 2 mod 3, then the result set has one 2 and the rest is 3's
Please correct this post. The times when I was writing well-structured mathematical proofs is far away...
Note 1: (2,4) is the only pair of distinct integers such that x^y = y^x. You can prove that with:
x^y = y^x
y ln(x) = x ln(y)
ln(x)/x = ln(y) / y
and the function ln(t)/t is strictly decreasing after its global maximum, reached between 2 and 3, so if you want two distinct integers such that ln(x)/x = ln(y)/y, one of them must be lower or equal to 2. From that you can infer that only (2,4) works
This is not a complete solution, but might help.
First off note that if you fix n, and two of the terms E_i and E_j differ by more than one (for example 3 and 8), then you can do better by "equalizing" them as much as possible, i.e., if the number p = E_i + E_j is even, you do better both terms by p/2. If p is odd, you do better by replacing them with p/2 and p/2+1 (where / is integer division).
That said, then if you knew what the optimal number of terms, n, was, you'd be done: let all E_i's equal N/n and N/n+1 (again integer division), so that their sum is still N (this is now a straightforward problem).
So the question now is what is the optimal n. Suppose for the moment that you are allowed to use real numbers. Then the solution would be N/n for each term and you could write the product as (N/n)^n. If you differentiate this with respect to n and find its root you find that n should be equal to N/e (where e is the Neper number, also known as Euler's number, e = 2.71828....). Therefore, I'd look for a solution where either n = floor(N/e) or n = floor(N/e)+1, and then choose all the E_i's equal to either N/n or N/n+1, as above.
Hope that helps.
The Online Encycolpedia of Integer Sequences gives a recurrence relation for the solution to this problem.
I'll leave it up to someone else to compare complexities. Not sure I can figure out the complexity of OP's method.

haskell, counting how many prime numbers are there in a list

i m a newbie to haskell, currently i need a function 'f' which, given two integers, returns the number of prime numbers in between them (i.e., greater than the first integer but smaller than the second).
Main> f 2 4
1
Main> f 2 10
3
here is my code so far, but it dosent work. any suggestions? thanks..
f :: Int -> Int -> Int
f x y
| x < y = length [ n | n <- [x..y], y 'mod' n == 0]
| otherwise = 0
Judging from your example, you want the number of primes in the open interval (x,y), which in Haskell is denoted [x+1 .. y-1].
Your primality testing is flawed; you're testing for factors of y.
To use a function name as an infix operator, use backticks (`), not single quotes (').
Try this instead:
-- note: no need for the otherwise, since [x..y] == [] if x>y
nPrimes a b = length $ filter isPrime [a+1 .. b-1]
Exercise for the reader: implement isPrime. Note that it only takes one argument.
Look at what your list comprehension does.
n <- [x..y]
Draw n from a list ranging from x to y.
y `mod` n == 0
Only select those n which evenly divide y.
length (...)
Find how many such n there are.
What your code currently does is find out how many of the numbers between x and y (inclusive) are factors of y. So if you do f 2 4, the list will be [2, 4] (the numbers that evenly divide 4), and the length of that is 2. If you do f 2 10, the list will be `[2, 5, 10] (the numbers that evenly divide 10), and the length of that is 3.
It is important to try to understand for yourself why your code doesn't work. In this case, it's simply the wrong algorithm. For algorithms that find whether a number is prime, among many other sources, you can check the wikipedia article: Primality test.
I you want to work with large intervals, then it might be a better idea to compute a list of primes once (instead of doing a isPrime test for every number):
primes = -- A list with all prime numbers
candidates = [a+1 .. b-1]
myprimes = intersectSortedLists candidates primes
nPrimes = length $ myprimes

Array: mathematical sequence

An array of integers A[i] (i > 1) is defined in the following way: an element A[k] ( k > 1) is the smallest number greater than A[k-1] such that the sum of its digits is equal to the sum of the digits of the number 4* A[k-1] .
You need to write a program that calculates the N th number in this array based on the given first element A[1] .
INPUT:
In one line of standard input there are two numbers seperated with a single space: A[1] (1 <= A[1] <= 100) and N (1 <= N <= 10000).
OUTPUT:
The standard output should only contain a single integer A[N] , the Nth number of the defined sequence.
Input:
7 4
Output:
79
Explanation:
Elements of the array are as follows: 7, 19, 49, 79... and the 4th element is solution.
I tried solving this by coding a separate function that for a given number A[k] calculates the sum of it's digits and finds the smallest number greater than A[k-1] as it says in the problem, but with no success. The first testing failed because of a memory limit, the second testing failed because of a time limit, and now i don't have any possible idea how to solve this. One friend suggested recursion, but i don't know how to set that.
Anyone who can help me in any way please write, also suggest some ideas about using recursion/DP for solving this problem. Thanks.
This has nothing to do with recursion and almost nothing with dynamic programming. You just need to find viable optimizations to make it fast enough. Just a hint, try to understand this solution:
http://codepad.org/LkTJEILz
Here is a simple solution in python. It only uses iteration, recursion is unnecessary and inefficient even for a quick and dirty solution.
def sumDigits(x):
sum = 0;
while(x>0):
sum += x % 10
x /= 10
return sum
def homework(a0, N):
a = [a0]
while(len(a) < N):
nextNum = a[len(a)-1] + 1
while(sumDigits(nextNum) != sumDigits(4 * a[len(a)-1])):
nextNum += 1
a.append(nextNum)
return a[N-1]
PS. I know we're not really supposed to give homework answers, but it appears the OP is in an intro to C++ class so probably doesn't know python yet, hopefully it just looks like pseudo code. Also the code is missing many simple optimizations which would probably make it too slow for a solution as is.
It is rather recursive.
The kernel of the problem is:
Find the smallest number N greater than K having digitsum(N) = J.
If digitsum(K) == J then test if N = K + 9 satisfies the condition.
If digitsum(K) < J then possibly N differs from K only in the ones digit (if the digitsum can be achieved without exceeding 9).
Otherwise if digitsum(K) <= J the new ones digit is 9 and the problem recurses to "Find the smallest number N' greater than (K/10) having digitsum(N') = J-9, then N = N'*10 + 9".
If digitsum(K) > J then ???
In every case N <= 4 * K
9 -> 18 by the first rule
52 -> 55 by the second rule
99 -> 189 by the third rule, the first rule is used during recursion
25 -> 100 requires the fourth case, which I had originally not seen the need for.
Any more counterexamples?