I have a problem that asks for the sum of the positive even numbers and negative odd numbers from 1 to 100 (so 1+2-3+4...+98-99+100). Here is what I have done so far. The correct sum should be 52 if I am doing my math correctly, but I come out with a sum of 50. Any suggestions?
>>> lst = range(1,101)
>>> total = 0
>>> for x in lst:
...     if x % 2:
...         total -= x
...     else:
...         total += x
...
>>> total
50
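I believe your code is right and your math is wrong. The code subtracts every odd number, including 1, so it computes -1+2-3+4...-99+100 = 50; the series as you wrote it starts with +1, which is where the extra 2 comes from. Here are three ways to solve the problem.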
your solution:
lst = range(1,101)
total = 0
for x in lst:
    if x % 2:
        total -= x
    else:
        total += x
print(total)
50
Sum of the even numbers minus the sum of the odd numbers:
def sumForLoop(max):
    positiveEven = sum(range(2,max+1,2))
    negativeOdd = -sum(range(1,max+1, 2))
    print(positiveEven + negativeOdd)
sumForLoop(100)
50
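A formula for the total:
def sumFormula(max):
    # Parentheses matter: -1**max parses as -(1**max), which is always -1.
    # Ceiling division, (max + 1) // 2, also keeps odd limits correct.
    print((-1)**max * ((max + 1) // 2))
sumFormula(100)
50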
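As a quick cross-check of the approaches above, here is a minimal Python 3 sketch that compares the explicit loop with a closed form. The function names are hypothetical and not from the original answers; the closed form assumes the series is -1+2-3+4... up to the limit.

```python
def alternating_sum_loop(limit):
    """Add the even numbers and subtract the odd numbers from 1 to limit."""
    total = 0
    for x in range(1, limit + 1):
        total += x if x % 2 == 0 else -x
    return total


def alternating_sum_closed_form(limit):
    """Closed form: the sign is (-1)**limit, the magnitude is ceil(limit / 2)."""
    return (-1) ** limit * ((limit + 1) // 2)


print(alternating_sum_loop(100))         # 50
print(alternating_sum_closed_form(100))  # 50
```

Pairing the terms as (-1+2), (-3+4), ..., (-99+100) gives 50 pairs that each contribute 1, which is why the total is 50.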
I've written code in Python to calculate sum of amicable numbers below 10000:
def amicable(a, b):
    total = 0
    result = 0
    for i in range(1, a):
        if a % i == 0:
            total += i
    for j in range(1, b):
        if b % j == 0:
            result += j
    if total == b and result == a:
        return True
    return False
sum_of_amicables = 0
for m in range (1, 10001):
    for n in range (1, 10001):
        if amicable(m, n) == True and m != n:
            sum_of_amicables = sum_of_amicables + m + n
The code has been running for more than 20 minutes in Python 2.7.11. Is that OK? How can I improve it?
Optimized to a single pass over the numbers (O(n * sqrt(n)) overall):
def sum_factors(n):
    result = []
    for i in xrange(1, int(n**0.5) + 1):
        if n % i == 0:
            result.extend([i, n//i])
    return sum(set(result)-set([n]))
def amicable_pair(number):
    result = []
    for x in xrange(1,number+1):
        y = sum_factors(x)
        if sum_factors(y) == x and x != y:
            result.append(tuple(sorted((x,y))))
    return set(result)
Run it:
import time
start = time.time()
print (amicable_pair(10000))
print time.time()-start
Result:
set([(2620, 2924), (220, 284), (6232, 6368), (1184, 1210), (5020, 5564)])
0.180204153061
It takes only about 0.2 seconds on a MacBook Pro.
Let's break down the code and improve the parts that are taking so much time.
1- If you replace if amicable(m, n) == True and m != n: with if m != n and amicable(m, n) == True:, it will save you the 10000 calls to the amicable method (the most expensive method) for which m != n is false.
2- In the amicable method you are looping from 1 to n to find all the factors of both numbers. You need a better algorithm to find the factors. You can use the one mentioned here. It will reduce the complexity of finding factors from O(n) to O(sqrt(n)).
def factors(n):
    return set(reduce(list.__add__,
               ([i, n//i] for i in range(1, int(n**0.5) + 1) if n % i == 0)))
Considering both the points above your code will be
def amicable(a, b):
    if sum(factors(a) - {a}) == b and sum(factors(b) - {b}) == a:
        return True
    return False
sum_of_amicables = 0
for m in range (1, 10001):
    for n in range (1, 10001):
        if m!= n and amicable(m, n) == True:
            sum_of_amicables = sum_of_amicables + m + n
This final code took 10 minutes to run for me, which is half the time you mentioned.
I was able to bring it down further, to 1:30 minutes, by optimizing the factors method.
There are 10000 * 10000 calls to the factors method, and factors is called 10000 times for each number. That is, it calculates the factors of the same number 10000 times. We can avoid this by caching the results of previous factors calculations instead of recomputing them on every call.
Here is how I modified factors to cache the results.
def factors(n, cache={}):
    if cache.get(n) is not None:
        return cache[n]
    cache[n] = set(reduce(list.__add__,
                   ([i, n//i] for i in range(1, int(n**0.5) + 1) if n % i == 0)))
    return cache[n]
Full Code: (Runtime 1:30 minutes)
So the full and final code becomes
def factors(n, cache={}):
    if cache.get(n) is not None:
        return cache[n]
    cache[n] = set(reduce(list.__add__,
                   ([i, n//i] for i in range(1, int(n**0.5) + 1) if n % i == 0)))
    return cache[n]
def amicable(a, b):
    if sum(factors(a) - {a}) == b and sum(factors(b) - {b}) == a:
        return True
    return False
sum_of_amicables = 0
for m in range (1, 10001):
    for n in range (1, 10001):
        if m!= n and amicable(m, n) == True:
            sum_of_amicables = sum_of_amicables + m + n
You can still further improve it.
Hint: sum is also called 10000 times for each number.
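One way to act on that hint is to cache the divisor sum itself rather than the set of factors. The following is a minimal Python 3 sketch, not the answerer's code: the name proper_divisor_sum is hypothetical, and functools.lru_cache is swapped in for the hand-written dictionary cache used above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def proper_divisor_sum(n):
    """Sum of the divisors of n, excluding n itself; cached per argument."""
    if n < 2:
        return 0
    total = 1  # 1 divides every n >= 2
    i = 2
    while i * i <= n:
        if n % i == 0:
            total += i
            if i != n // i:  # do not count the root of a perfect square twice
                total += n // i
        i += 1
    return total


def amicable(a, b):
    """True when a and b are distinct and each equals the other's divisor sum."""
    return a != b and proper_divisor_sum(a) == b and proper_divisor_sum(b) == a


print(amicable(220, 284))  # True
```

With the cache in place, each number's divisor sum is computed once, no matter how many times the double loop asks for it.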
Note that you don't need to have a double loop. Just loop M from 1 to 10000, factorize each M and calculate the sum of divisors: S(M). Then check that N = S(M)-M has the same sum of divisors. This is a straightforward algorithm derived from the definition of an amicable pair.
There are a lot of further tricks to optimize amicable pairs search. It's possible to find all amicable numbers below 1,000,000,000 in just a fraction of a second. Read this in-depth article, you can also check reference C++ code from that article.
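The single-loop idea can be sketched as follows in Python 3. This is an illustration under stated assumptions rather than the code from the linked article: the function names are hypothetical, and the divisor sums are precomputed with a simple sieve instead of factorizing each number. A partner at or above the limit is ignored, which loses nothing for a limit of 10000.

```python
def divisor_sums(limit):
    """sums[n] is the sum of the proper divisors of n, for every n < limit."""
    sums = [0] * limit
    for d in range(1, limit // 2 + 1):
        # d is a proper divisor of each of its multiples 2d, 3d, ...
        for multiple in range(2 * d, limit, d):
            sums[multiple] += d
    return sums


def amicable_numbers(limit):
    """All numbers below limit that belong to an amicable pair below limit."""
    sums = divisor_sums(limit)
    result = []
    for a in range(2, limit):
        b = sums[a]
        if b != a and b < limit and sums[b] == a:
            result.append(a)
    return result


print(amicable_numbers(10000))
print(sum(amicable_numbers(10000)))  # 31626
```

The sieve does roughly limit * log(limit) additions in total, so the whole search finishes almost instantly for a limit of 10000.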
Adding to the answer:
import math
def sum_factors(self, n):
    # Sum of the proper divisors of n (1 is always counted, n itself is not).
    s = 1
    for i in range(2, int(math.sqrt(n))+1):
        if n % i == 0:
            s += i
            if i != n // i:  # do not count the root of a perfect square twice
                s += n // i
    return s
def amicable_pair(self, number):
    result = 0
    for x in range(1,number+1):
        y = self.sum_factors(x)
        if self.sum_factors(y) == x and x != y:
            result += x
    return result
No need for sets or arrays, which improves storage and clarity.
# fetch two numbers from the user
num1 = int(input("Enter first number"))
num2 = int(input("enter the second number"))
fact1 = []
fact2 = []
factsum1 = 0
factsum2 = 0
# find the proper divisors of both numbers
for i in range(1, num1):
    if num1 % i == 0:
        fact1.append(i)
for j in range(1, num2):
    if num2 % j == 0:
        fact2.append(j)
print("factors of {} is {}".format(num1, fact1))
print("factors of {} is {}".format(num2, fact2))
# add the elements in each list
for k in range(len(fact1)):
    factsum1 = factsum1 + fact1[k]
for l in range(len(fact2)):
    factsum2 = factsum2 + fact2[l]
print(factsum1)
print(factsum2)
# compare them
if factsum1 == num2 and factsum2 == num1:
    print("both are amicable")
else:
    print("not amicable")
This is my own understanding of the concept.
Hi all. Read the code and its comments carefully and it should be easy to follow.
def amicable_number(number):
    list_of_tuples = []
    amicable_pair = []
    for i in range(2, number+1):  # range in which to look for amicable numbers
        divisors = 1  # current candidate divisor
        sum_of_divisors = 0  # running sum of the proper divisors of i
        while divisors < i:  # try every candidate below i
            if i % divisors == 0:  # divisors divides i exactly
                sum_of_divisors += divisors
            divisors += 1
        list_of_tuples.append((i, sum_of_divisors))  # store the number with its divisor sum
    for i in list_of_tuples:
        for j in list_of_tuples:
            # amicable condition; i[0] < j[0] keeps (x, y) and skips the duplicate (y, x),
            # so no separate pass over the list is needed to remove duplicates
            if i[0] == j[1] and i[1] == j[0] and i[0] < j[0]:
                amicable_pair.append((i[0], j[0]))  # append the amicable pair
    print('list of amicable pairs number are: \n', amicable_pair)
amicable_number(284)  # call the function
Simple solution to find amicable numbers with loops
I found all the friendly pairs in 9 seconds using this algorithm:
sum_of, friendly, sum_them_all = 0, 0, 0
friendly_list = []
for k in range(1, 10001):
    # Let's find the sum of divisors (k not included)
    for l in range(1, k):
        if k%l == 0:
            sum_of += l
    # Let's find the sum of divisors for previously found sum of divisors
    for m in range(1, sum_of):
        if sum_of%m == 0:
            friendly += m
    # If the sum of divisors of sum of divisors of the first number equals
    # with the first number then we add it to the friendly list
    # (no continue here, so the reset below always runs)
    if k == friendly and k != sum_of:
        if [sum_of, k] not in friendly_list:
            friendly_list.append([k, sum_of])
    # Reset the variables for the next round
    sum_of = 0
    friendly = 0
# Let's loop through the list, print out the items and also sum all of them
for n in friendly_list:
    print(n)
    for m in n:
        sum_them_all += m
print(sum_them_all)
Full code runtime 10 seconds in Lenovo IdeaPad5 (Ryzen5)
I am trying to write a function in Python that returns a list of all the Fibonacci numbers in a certain range, but my code won't work; it simply returns [0]. What is the problem?
from math import sqrt
def F(n):
    return int(((1+sqrt(5))**n-(1-sqrt(5))**n)/(2**n*sqrt(5)))
def Frange(x):
    A = [0]
    while max(A) < x:
        H = 1
        for i in range(H):
            A.append(F(i))
        H = H+1
    return A
You set H = 1 as the first statement in your while loop, so every time you enter the for loop H is 1 and you only ever get the Fibonacci number for n=0.
You need to set H = 1 outside the while loop:
def Frange(x):
    A = [0]
    H = 1
    while max(A) < x:
        for i in range(H):
            A.append(F(i))
        H = H+1
    return A
You could have solved this yourself very easily by printing various values inside the loops, such as print H.
I found another error and the improved code is:
from math import sqrt
def F(n):
    return int(((1+sqrt(5))**n-(1-sqrt(5))**n)/(2**n*sqrt(5)))
def Frange2(x):
    A = [0]
    H = 1
    while max(A) < x:
        if F(H) < x:
            A.append(F(H))
        else:
            break
        H = H+1
    return A
A simple and fast way to build a list of Fibonacci numbers is to compute them bottom-up. Note that this function returns F(0) through F(n), not the Fibonacci numbers below a given value:
def fib3(n):
    fibs = [0,1] # list from bottom up
    for i in range(2, n+1):
        fibs.append(fibs[-1]+fibs[-2])
    return fibs
This function stores the computed Fibonacci numbers in a list and later uses them as 'cached' values to compute the following ones.
Hope it helps!
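If the goal is the Fibonacci numbers below a given bound, as in the question, the same bottom-up idea can be sketched like this in Python 3. The name fib_below is hypothetical, and the bound is treated as exclusive, which is an assumption about what "in a certain range" means.

```python
def fib_below(limit):
    """Return every Fibonacci number strictly below limit, in order."""
    fibs = []
    a, b = 0, 1
    while a < limit:
        fibs.append(a)
        a, b = b, a + b  # advance the pair (F(n), F(n+1)) by one step
    return fibs


print(fib_below(100))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

Working with integers only also avoids the floating-point rounding that the closed-form F(n) runs into for larger n.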
So I need to find an efficient way to iterate over a big list in Python.
Given: an array of integers and a number (the length of a sublist)
Constraints: array of up to 100K elements, elements in range(1,2**31)
Task: For every sublist, find the difference between its max and min number. Print out the biggest difference.
Ex: [4,6,3,4,8,1,9], number = 3
As far as I understand I have to go through every sublist:
[4,6,3] max - min = 6 - 3 = 3
[6,3,4] 3
[3,4,8] 5
[4,8,1] 7
[8,1,9] 8
final max = 8
So my solution is:
import time
import random
def difference(arr, number):
    maxDiff = 0
    i = 0
    while i+number != len(arr)+1:
        diff = max(arr[i:i+number]) - min(arr[i:i+number])
        if diff > maxDiff:
            maxDiff = diff
        i += 1
    print maxDiff
length = 2**31
arr = random.sample(xrange(length),100000) #array wasn't given. My sample
t0 = time.clock()
difference(arr,3)
print 'It took :',time.clock() - t0
Answer:
2147101251
It took : 5.174262
I also did the same with for loops, which gives a worse time:
def difference(arr,d):
    maxDiff = 0
    if len(arr) == 0:
        maxDiff = 0
    elif len(arr) == 1:
        maxDiff = arr[0]
    else:
        i = 0
        while i + d != len(arr)+1:
            array = []
            for j in xrange(d):
                array.append(arr[i + j])
            diff = max(array) - min(array)
            if diff > maxDiff:
                maxDiff = diff
            i += 1
    print maxDiff
length = 2**31
arr = random.sample(xrange(length),100000) #array wasn't given. My sample
t0 = time.clock()
difference(arr,1000)
print 'It took :',time.clock() - t0
Answer:
2147331163
It took : 14.104639
My challenge was to reduce the time to 2 sec.
What would be the most efficient way to do this?
Based on the answer and comment of @rchang and @gknicker I was able to get an improvement. I'm wondering if there is something else I can do?
def difference(arr,d):
    window = arr[:d]
    arrayLength = len(arr)
    maxArrayDiff = max(arr) - min(arr)
    maxDiff = 0
    while d < arrayLength:
        localMax = max(window)
        if localMax > maxDiff:
            diff = localMax - min(window)
            if diff == maxArrayDiff:
                return diff
            elif diff > maxDiff:
                maxDiff = diff
        window.pop(0)
        window.append(arr[d])
        d += 1
    # the last window is built by the final append and still has to be checked
    return max(maxDiff, max(window) - min(window))
#arr = [3,4,6,15,7,2,14,8,1,6,1,2,3,10,1]
length = 2**31
arr = random.sample(xrange(length),100000)
t0 = time.clock()
print difference(arr,1000)
print 'It took :',time.clock() - t0
Answer:
2147274599
It took : 2.54171
Not bad. Any other suggestions?
Here is my attempt at solving this.
I have experimented and measured quite a bit and have come to the following conclusions:
- The subset_length has a significant influence on the performance.
- numpy min/max is much faster than the built-in functions, but only for large arrays; below, let's say, 50 elements the built-ins are faster.
This has the effect that for a subset_length of
- below 10, your latest version is the fastest
- between 10 and 50, a version of my algorithm without numpy (not posted (yet)) is fastest
- above 50, my algorithm is the fastest
- at 1000, this algorithm outperforms yours by a factor of 100
Be aware that array has to be a numpy.array() and subset_length must be 3 or more.
def difference_np(array, subset_length):
    assert subset_length > 2, "subset_length must be larger than 2"
    length = array.size
    total_diff = array.max()-array.min()
    current_min = array[:subset_length].min()
    current_max = array[:subset_length].max()
    max_diff = current_max - current_min
    max_diff_index = 0
    index = subset_length
    while index < length:
        i_new = index
        i_old = index-subset_length
        index += 1
        new = array[i_new]
        old = array[i_old]
        # the idea here is to avoid calculating the
        # min/max over the entire subset as much as possible,
        # so we treat every edge case separately.
        # the current window is array[i_old+1:i_new+1]
        if new < current_min:
            current_min = new
            if old == current_max:
                # new cannot be the max, so it can be left out of the slice
                current_max = array[i_old+1:i_new].max()
        elif new > current_max:
            current_max = new
            if old == current_min:
                current_min = array[i_old+1:i_new].min()
        elif old == current_min:
            # new may be the new min, so the slice has to include it
            current_min = array[i_old+1:i_new+1].min()
        elif old == current_max:
            current_max = array[i_old+1:i_new+1].max()
        else:
            continue
        current_diff = current_max-current_min
        if current_diff > max_diff:
            max_diff = current_diff
            max_diff_index = i_old+1
        # shortcut-condition
        if max_diff == total_diff:
            print('shortcut at', 100.0*(index-1)/(length-subset_length), '%' )
            break
    return max_diff, max_diff_index
I'm not certain if the shortcut-condition is all that effective, as it rarely applies and costs two full iterations of the input array.
EDIT
Another margin for improvement exists if the algorithm uses list.pop(0). As list is optimized for right-hand-side operations, list.pop(0) is relatively expensive. collections.deque is an alternative that provides a fast left-hand-side pop: deque.popleft(). It brings quite a bit of improvement to the overall speed.
Here is the non-numpy, collections.deque based version of my algorithm:
import collections
def difference_deque(array, subset_length):
    assert subset_length > 1, "subset_length must be larger than 1"
    length = len(array)
    total_diff = max(array)-min(array)
    current_slice = collections.deque(array[:subset_length])
    current_min = min(current_slice)
    current_max = max(current_slice)
    max_diff = current_max - current_min
    max_diff_index = 0
    index = subset_length
    while index < length:
        i_new = index
        i_old = index-subset_length
        index += 1
        new = array[i_new]
        old = current_slice.popleft()
        if new < current_min:
            current_min = new
            if old == current_max:
                current_max = max(current_slice)
            current_slice.append(new)
        elif new > current_max:
            current_max = new
            if old == current_min:
                current_min = min(current_slice)
            current_slice.append(new)
        elif old == current_min:
            current_slice.append(new)
            current_min = min(current_slice)
        elif old == current_max:
            current_slice.append(new)
            current_max = max(current_slice)
        else:
            current_slice.append(new)
            continue
        current_diff = current_max-current_min
        if current_diff > max_diff:
            max_diff = current_diff
            max_diff_index = i_old+1
        # shortcut-condition
        if max_diff == total_diff:
            print('shortcut at', 100.0*(index-1)/(length-subset_length), '%' )
            break
    return max_diff, max_diff_index
It skews the runtime rankings a bit:
- up to 10 your algorithm (with deque) is best
- up to 100 my algorithm (with deque) is best
- above 100 my algorithm (with numpy) is best
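For comparison, the textbook technique for sliding-window extremes is a pair of monotonic deques, which does O(n) work in total regardless of the window size. The following is a minimal Python 3 sketch and is not taken from any of the answers here; the name max_window_range is hypothetical.

```python
from collections import deque


def max_window_range(values, k):
    """Largest (max - min) over all contiguous windows of length k."""
    if k <= 0 or k > len(values):
        raise ValueError("k must be between 1 and len(values)")
    max_q = deque()  # indices whose values decrease from front to back
    min_q = deque()  # indices whose values increase from front to back
    best = 0
    for i, v in enumerate(values):
        # Drop candidates that can never be the max (or min) again.
        while max_q and values[max_q[-1]] <= v:
            max_q.pop()
        max_q.append(i)
        while min_q and values[min_q[-1]] >= v:
            min_q.pop()
        min_q.append(i)
        # Drop the front index once it has slid out of the window.
        if max_q[0] <= i - k:
            max_q.popleft()
        if min_q[0] <= i - k:
            min_q.popleft()
        if i >= k - 1:  # the first full window ends at index k - 1
            best = max(best, values[max_q[0]] - values[min_q[0]])
    return best


print(max_window_range([4, 6, 3, 4, 8, 1, 9], 3))  # 8
```

Each index is appended and removed at most once per deque, so no window is ever rescanned with min() or max().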
I came up with this optimization that might shave some time off your first implementation. Instead of using slices to isolate the numbers to consider for each iteration, I used a slice one time to initialize the "window". On each iteration, the "rightmost" element gets added to the window and the "leftmost" element gets evicted.
import time
import random
def difference(arr, number):
    thisSlice = arr[:number-1]
    arrSize = len(arr)
    maxDiff = -1000
    right = number - 1  # index of the next element to enter the window
    while right < arrSize:
        # Put the new element onto the window's tail
        thisSlice.append(arr[right])
        thisDiff = max(thisSlice) - min(thisSlice)
        if thisDiff > maxDiff: maxDiff = thisDiff
        right += 1
        # Get rid of the "leftmost" element, we won't need it for next iteration
        thisSlice.pop(0)
    print maxDiff
if __name__ == '__main__':
    length = 2**31
    arr = random.sample(xrange(length),100000)
    t0 = time.clock()
    difference(arr, 1000)
    print 'It took :', time.clock() - t0
At least on my laptop, this doesn't come down to below 2 seconds, but I did see some gains compared to the first implementation you posted. On average, your first solution ran on my laptop in between 4.2 and 4.3 seconds. This piecemeal window construction version ran on average in between 3.5 and 3.6 seconds.
Hope it helps.
I think you can use one of the various numpy rolling window functions using as_strided magic -- say the one I've just stolen from here:
import numpy as np
def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
Using your original difference but with return instead of print, and with arr being a numpy array:
>>> w = 3
>>> %timeit old_d = difference(arr, w)
1 loops, best of 3: 718 ms per loop
>>> %timeit q = rolling_window(arr, w); ma=q.max(1);mi=q.min(1); new_d=(ma-mi).max()
100 loops, best of 3: 5.68 ms per loop
and
>>> w = 1000
>>> %timeit old_d = difference(arr, w)
1 loops, best of 3: 25.1 s per loop
>>> %timeit q = rolling_window(arr, w); ma=q.max(1);mi=q.min(1); new_d=(ma-mi).max()
1 loops, best of 3: 326 ms per loop
You might have guessed from the title that I'm doing Project Euler #12. My brute force solution took much too long, so I went looking for optimizations that I could understand.
I'm interested in extending the strategy outlined here
The way I've tried to tackle this is by using the Sieve of Eratosthenes to get prime factors like this:
def prime_factors(n):  # header lost from the original post; the name is assumed
    divs = []
    multiples = set()
    for i in xrange(2, n + 1):
        if i not in multiples:
            if n % i == 0:
                divs.append(i)
            multiples.update(xrange(2*i, n+1, i))
    return divs
This itself is a problem because line 8 will yield an overflow error long before the program gets within the range of the answer (76576500).
Now, assuming I'm able to get the prime factors, how can I find their respective multiplicities efficiently?
Borrowing from the other answer:
The number a1^k1 * a2^k2 * ... * an^kn has (k1+1) * (k2+1) * ... * (kn+1) factors.
You can get the prime numbers below a certain number using the following code (courtesy of "Fastest way to list all primes below N"):
import numpy
n = number  # number is the integer you want to factor
def primesfrom2to(n):
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = numpy.ones(n/3 + (n%6==2), dtype=numpy.bool)
    for i in xrange(1,int(n**0.5)/3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[ k*k/3 ::2*k] = False
            sieve[k*(k-2*(i&1)+4)/3::2*k] = False
    return numpy.r_[2,3,((3*numpy.nonzero(sieve)[0][1:]+1)|1)]
primes = primesfrom2to(n).tolist() # list of primes.
primes = map(int, primes)
factors = {}
for prime in primes:
    n = number
    factor = 0
    # divide by this prime as often as possible; the count is its multiplicity
    while True:
        if n%prime == 0:
            factor += 1
            n /= prime
            factors[prime] = factor
        else: break
factors will give you the multiplicity of the prime factors.
My standard prime-numbers script is appended below; it provides the Sieve of Eratosthenes to generate primes, a Miller-Rabin primality test, a function that factors integers using a 2,3,5-wheel and Pollard's rho method, the number-theoretic function sigma that calculates the sum of the x'th powers of the divisors of an integer, using the method that you reference in your post, and a function that computes the aliquot sequence starting from a given integer. Given that script, it is easy to solve Project Euler 12, remembering that sigma with x=0 returns the count of the divisors of an integer:
$ python
Python 2.6.8 (unknown, Jun 9 2012, 11:30:32)
[GCC 4.5.3] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> execfile('primes.py')
>>> factors(76576500)
[2, 2, 3, 3, 5, 5, 5, 7, 11, 13, 17]
>>> sigma(0,76576500)
576
>>> i, t = 1, 1
>>> while sigma(0, t) < 500:
...     i += 1; t += i
...
>>> print t
76576500
You can run the program at http://programmingpraxis.codepad.org/V5LiI8V9, and you'll find lots of prime-number stuff at my blog. Here's the code:
# prime numbers
def primes(n): # sieve of eratosthenes
    i, p, ps, m = 0, 3, [2], n // 2
    sieve = [True] * m
    while p <= n:
        if sieve[i]:
            ps.append(p)
            for j in range((p*p-3)/2, m, p):
                sieve[j] = False
        i, p = i+1, p+2
    return ps
# from random import randint
seed = 17500728 # RIP j s bach
def random(): # float on range [0,1)
    global seed
    seed = (69069 * seed + 1234567) % 4294967296
    return seed / 4294967296.0
def randint(lo,hi): # int on range [lo,hi)
    return int((hi - lo) * random()) + lo
def isPrime(n, k=5): # miller-rabin
    if n < 2: return False
    for p in [2,3,5,7,11,13,17,19,23,29]:
        if n % p == 0: return n == p
    s, d = 0, n-1
    while d % 2 == 0:
        s, d = s+1, d/2
    for i in range(k):
        x = pow(randint(2, n-1), d, n)
        if x == 1 or x == n-1: continue
        for r in range(1, s):
            x = (x * x) % n
            if x == 1: return False
            if x == n-1: break
        else: return False
    return True
# from fractions import gcd
def gcd(a,b): # greatest common divisor
    if b == 0: return a
    return gcd(b, a % b)
def insertSorted(x, xs): # insert x in order
    i, ln = 0, len(xs)
    while i < ln and xs[i] < x: i += 1
    xs.insert(i,x)
    return xs
def factors(n, b2=-1, b1=10000): # 2,3,5-wheel, then rho
    if -1 <= n <= 1: return [n]
    if n < -1: return [-1] + factors(-n)
    wheel = [1,2,2,4,2,4,2,4,6,2,6]
    w, f, fs = 0, 2, []
    while f*f <= n and f < b1:
        while n % f == 0:
            fs.append(f)
            n /= f
        f, w = f + wheel[w], w+1
        if w == 11: w = 3
    if n == 1: return fs
    h, t, g, c = 1, 1, 1, 1
    while not isPrime(n):
        while b2 <> 0 and g == 1:
            h = (h*h+c)%n # the hare runs
            h = (h*h+c)%n # twice as fast
            t = (t*t+c)%n # as the tortoise
            g = gcd(t-h, n); b2 -= 1
        if b2 == 0: return fs
        if isPrime(g):
            while n % g == 0:
                fs = insertSorted(g, fs)
                n /= g
        h, t, g, c = 1, 1, 1, c+1
    return insertSorted(n, fs)
def sigma(x, n, fs=[]): # sum of x'th powers of divisors of n
    def add(s, p, m):
        if x == 0: return s * (m+1)
        return s * (p**(x*(m+1))-1) / (p**x-1)
    if fs == []: fs = factors(n)
    prev, mult, sum = fs.pop(0), 1, 1
    while len(fs) > 0:
        fact = fs.pop(0)
        if fact <> prev:
            sum, prev, mult = add(sum, prev, mult), fact, 1
        else: mult += 1
    return add(sum, prev, mult)
def aliquot(n): # print aliquot sequence
    s, ss, k, fs = n, [n], 0, factors(n)
    print n, k, s, fs
    while s > 1:
        s, k = sigma(1,s,fs) - s, k + 1
        fs = factors(s)
        print n, k, s, fs
        if s in ss: return "cycle"
        ss.append(s)
    return ss.pop(-2)
Your approach to factorization is far from optimal, even if you limit yourself to relatively simple algorithms (i.e. not Brent's algorithm or anything more advanced).
- Each time you find a prime factor, divide by that factor until the number is no longer divisible by it. The number of times you can do that is the multiplicity.
- Continue with the quotient after division, not your original number.
- Find factors and divide until the remaining quotient is less than the square of your divisor. In that case the quotient is 1 or a prime (the last prime factor, with multiplicity 1).
- To find the factors it is enough to do trial division by 2 and by the odd numbers starting from 3. Non-prime divisors will not be a problem, because their prime factors will already have been removed before they are reached.
- Use the correct data structure to represent the prime factors together with their multiplicity (a map or multiset).
- You can also compute the number of divisors directly, without storing the factorization. Each time you find a prime factor and its multiplicity, you can accumulate the result by multiplying by the corresponding factor from the formula for the number of divisors.
- If you need to do many factorizations of numbers that are not too big, you can precompute an array with the smallest divisor for each array index, and use that to quickly find divisors.
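The steps above can be sketched as follows in Python 3. This is a minimal illustration using plain trial division; the function names are hypothetical, and a dict stands in for the map of prime factors to multiplicities.

```python
def prime_factorization(n):
    """Map each prime factor of n (n >= 2) to its multiplicity."""
    factors = {}
    d = 2
    while d * d <= n:
        # Divide d out completely; the number of divisions is its multiplicity.
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1 if d == 2 else 2  # try 2, then only the odd numbers
    if n > 1:
        # What remains is below d*d, so it is the last prime factor.
        factors[n] = factors.get(n, 0) + 1
    return factors


def count_divisors(n):
    """Number of divisors: the product of (multiplicity + 1) over the primes."""
    count = 1
    for multiplicity in prime_factorization(n).values():
        count *= multiplicity + 1
    return count


print(prime_factorization(28))   # {2: 2, 7: 1}
print(count_divisors(76576500))  # 576
```

Because the loop continues with the quotient, the trial divisor never needs to exceed the square root of what is left, which is what keeps this fast for numbers in the range of the answer.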
Example: 28 = 2^2 * 7^1
number of factors = (2 + 1) * (1 + 1) = 6
In general, for n = a1^k1 * a2^k2 * ... * an^kn:
number of factors = (k1 + 1) * (k2 + 1) * ... * (kn + 1)