I have a list of lists and I want to process the information inside.
lis = [[1,2,3,4],[1,5,6]]
I want to loop through this list of lists, such that I get 1 * (2*1/1) * (3*2/2) * (4*3/3) and so on. Also, I know that multiplying and dividing by the same number returns the number you started with, but I want to state it explicitly in the code. Performing this operation on the list should return
list = [[24],[30]]
I don't know what you are trying to say with the arithmetic you wrote out, but this is probably what you're looking for.
from functools import reduce

lis = [[1, 2, 3, 4], [1, 5, 6]]
result = [[reduce(lambda x, y: x * y, l)] for l in lis]  # named result to avoid shadowing the built-in list
In a loop:
result = []
for l in lis:
    # multiply all the numbers in this sublist together,
    # wrapping the product in a one-element list
    result.append([reduce(lambda x, y: x * y, l)])
print(result)
output:
[[24], [30]]
See list comprehensions, lambda expressions, and reduce.
NOTE: for those doing this in Python 2, reduce is a built-in function, so the import is unnecessary.
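As a side note, on Python 3.8 or newer the standard library's math.prod computes the same product without reduce; a minimal sketch:

import math

lis = [[1, 2, 3, 4], [1, 5, 6]]
print([[math.prod(l)] for l in lis])  # [[24], [30]]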
The multiplication and division by the same number, *1/1 or *2/2, in 1 * (2*1/1) * (3*2/2) * (4*3/3) does not make much sense to me.
Using this function you get the result that you are striving for, i.e. [[24], [30]]:
def list_mult(list_in):
    list_out = []
    for i in list_in:
        result = 1
        sub_list = []
        for j in i:
            result = result * j
        sub_list.append(result)
        list_out.append(sub_list)
    print(list_out)
Calling the function with list_mult([[1,2,3,4],[5,6]]) will give [[24], [30]].
When trying to flatten a list of lists using Python 2.7's built-in sum function, I ran across some performance issues: the computation was slow, and a simple iterative approach yielded much faster results.
The short code below seems to illustrate this performance gap:
import timeit

def sum1(arrs):
    return sum(arrs, [])

def sum2(arrs):
    s = []
    for arr in arrs:
        s += arr
    return s

def main():
    array_of_arrays = [[0] for _ in range(1000)]
    print timeit.timeit(lambda: sum1(array_of_arrays), number=100)
    print timeit.timeit(lambda: sum2(array_of_arrays), number=100)

if __name__ == '__main__':
    main()
On my laptop, I get as output:
>> 0.247241020203
>> 0.0043830871582
Could anyone explain why this is so?
Your sum2 uses +=:
for arr in arrs:
    s += arr
sum does not use +=. sum is defined to use +. The difference is that s += arr is allowed to perform the operation by mutating the existing s list, while s = s + arr must construct a new list, copying the buffers of the old lists.
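A quick way to see this difference is to check object identity before and after each operation (a minimal illustration, not from the original question):

s = [1, 2]
original_id = id(s)

s += [3, 4]                   # in-place: the same list object is extended
print(id(s) == original_id)   # True

s = s + [5, 6]                # a brand-new list is built and bound to s
print(id(s) == original_id)   # False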
With +=, Python can use an efficient list resizing strategy that requires an amount of copying proportional to the size of the final list. For N lists of length K each, this takes time proportional to N*K.
With +, Python cannot do that. For every s = s + arr, Python must copy the entire s and arr lists to construct the new s. For N lists of size K each, the total time spent copying is proportional to N**2 * K, much worse.
Because of this, you should pretty much never use sum to concatenate sequences.
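If you do need a fast, general-purpose flatten, itertools.chain.from_iterable is the usual idiom; a sketch on the question's data:

from itertools import chain

def flatten(arrs):
    # every element is copied exactly once, so this is linear in the total size
    return list(chain.from_iterable(arrs))

array_of_arrays = [[0] for _ in range(1000)]
flat = flatten(array_of_arrays)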
I have an array of integers and need to apply a variant of the subset sum algorithm on it, except that instead of finding a set of integers whose sum is 0 I am trying to find a set of integers whose sum is n. I am unclear as to how to adapt one of the standard subset sum algorithms to this variant and was hoping for any insight into the problem.
This is the subset sum problem, which is NP-complete (no efficient solution to NP-complete problems is known), but if your numbers are relatively small integers, there is an efficient pseudo-polynomial solution that follows the recurrence:
D(x, i) = false                                  if x < 0
D(0, i) = true
D(x, 0) = false                                  if x != 0
D(x, i) = D(x, i-1) OR D(x - arr[i], i-1)        otherwise
Afterwards, you need to trace back through your choices on the generated matrix, to see where you decided to take the element and where you decided not to take it.
This thread and this thread discuss how to get the elements for similar problems.
Here is Python code (taken from the thread I linked to) that does the trick.
If you are not familiar with Python, read it as pseudocode; it is pretty easy to understand!
from random import randint

arr = [1, 2, 4, 5]
n = len(arr)
SUM = 6

# pre-processing: row 0 (sum 0) is reachable with any prefix of elements
D = [[True] * (n + 1)]
for x in range(1, SUM + 1):
    D.append([False] * (n + 1))

# DP solution to populate D:
# D[x][i] is True iff some subset of the first i elements sums to x
for x in range(1, SUM + 1):
    for i in range(1, n + 1):
        D[x][i] = D[x][i - 1]
        if x >= arr[i - 1]:
            D[x][i] = D[x][i] or D[x - arr[i - 1]][i - 1]
print(D)

# get a random solution:
if D[SUM][n] == False:
    print('no solution')
else:
    sol = []
    x = SUM
    i = n
    while x != 0:
        possibleVals = []
        if D[x][i - 1] == True:
            possibleVals.append(x)
        if x >= arr[i - 1] and D[x - arr[i - 1]][i - 1] == True:
            possibleVals.append(x - arr[i - 1])
        # by here possibleVals contains 1 or 2 choices, depending on how many we have;
        # choose one of them at random
        r = possibleVals[randint(0, len(possibleVals) - 1)]
        # if we decided to take the element:
        if r != x:
            sol.append(x - r)
        # modify i and x accordingly
        x = r
        i = i - 1
    print(sol)
You can solve this by using dynamic programming.
Let's define:
N - the required sum (your first input).
M - the number of available summands (your second input).
a1...aM - the available summands.
f[x] - true when you can reach the sum of x, and false otherwise.
Now the solution:
Initially f[0] = true and f[1..N] = false: without taking any summand we can reach only the sum of zero.
Now iterate over all ai, for i in [1..M], and for each of them perform the following update:
f[x + ai] = f[x + ai] || f[x], for each x from N - ai down to 0. The order of processing is relevant: iterating x downwards ensures each summand is used at most once.
Finally you output f[N].
This solution has complexity O(N*M), so it is not practical when either the target sum or the number of summands is large.
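A minimal Python sketch of this recurrence (the name can_reach and the example values are illustrative, not from the answer above):

def can_reach(n, summands):
    # f[x] is True when some subset of the summands sums to exactly x
    f = [False] * (n + 1)
    f[0] = True  # the empty subset reaches zero
    for a in summands:
        # iterate x downwards so each summand is used at most once
        for x in range(n - a, -1, -1):
            if f[x]:
                f[x + a] = True
    return f[n]

print(can_reach(6, [1, 2, 4, 5]))  # True, e.g. 2 + 4 == 6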
I'll use a simple example for what I'm trying to do.
Say I have the list:
nums = []
Now I have the function:
allNums n = nums.append(n)
So if I run the function:
allNums 6
The list nums should have the values
[6]
I know nums.append doesn't work, but what code could replace it?
Simple Answer:
You can't do that. Haskell is a pure functional language, which means:
A function does not have any side effect.
A function does always return the same result when called with the same parameters.
A function may or may not actually be called, but you don't have to care about that: if it wasn't called, its result wasn't needed, and because the function has no side effects, you can't tell the difference.
Complex answer:
You could use the State Monad to implement something that behaves a bit like this, but this is probably out of reach for you yet.
I suggest using an infinite list instead of appending to a global variable.
It's true that Haskell is purely functional. But it is also lazy: no piece of data is calculated until it is really needed. This also applies to collections, so you can even define a collection whose elements are based on previous elements of the same collection.
Consider the following code:
isPrime n = all (\p -> (n `mod` p) /= 0) $ takeWhile (\p -> p * p <= n) primes
primes = 2 : ( filter isPrime $ iterate (+1) 3 )
main = putStrLn $ show $ take 100 primes
The definition of isPrime is simple once the primes list is defined. It takes the prefix of primes whose members are at most the square root of the number being examined:
takeWhile (\p -> p * p <= n) primes
and then checks that the number has a non-zero remainder when divided by each of them:
all (\p -> (n `mod` p) /= 0 )
(The $ here is the function application operator.)
Next, using this definition, we take all numbers starting from 3:
iterate (+1) 3
and filter the primes out of them:
filter isPrime
Then we prepend the first prime to it:
primes = 2 : ( ... )
So primes becomes an infinite, self-referential list.
You may ask: why do we prepend 2 instead of just filtering all numbers starting from 2, like this:
primes = filter isPrime $ iterate (+1) 2
You can check that this leads to a non-terminating computation, because isPrime needs at least one known member of primes before takeWhile can produce anything.
As you can see, primes is well defined and immutable, yet it supplies as many elements as your logic will ever need.
I have a list:
values = [[6.23234121,6.23246575],[1.352672,1.352689],[6.3245,123.35323,2.3]]
What is a way I can go through this list and remove all items that are within, say, 0.01 of other elements in the same list?
I know how to do it for a specific set of lists using del, but I want it to be general for when values has n lists in it and each list has n elements.
What I want to happen is perform some operation on this list
values = [[6.23234121,6.23246575],[1.352672,1.352689],[6.3245,123.35323,2.3]]
and get this output
new_values = [[6.23234121],[1.352672],[6.3245,123.35323,2.3]]
I'm going to write a function to do this for a single list, e.g.
>>> compact([6.23234121,6.23246575], tol=.01)
[6.23234121]
You can then get it to work on your nested structure with just [compact(l, tol=.01) for l in lst].
Each of these methods will keep the first element that doesn't have anything closer to it in the list; for @DSM's example of [0, 0.005, 0.01, 0.015, 0.02] they'd all return [0, 0.015] (or, if you switch > to >=, [0, 0.01, 0.02]). If you want something different, you'll have to define exactly what it is more carefully.
First, the easy approach, similar to David's answer. This is O(n^2):
def compact(lst, tol):
    new = []
    for el in lst:
        # keep el only if nothing already kept is within tol of it
        if all(abs(el - x) > tol for x in new):
            new.append(el)
    return new
On three-element lists, that's perfectly nice. If you want to do it on three million-element lists, though, that's not going to cut it. Let's try something different:
import collections
import math

def compact(lst, tol):
    round_digits = int(-math.log10(tol)) - 1
    seen = collections.defaultdict(set)
    new = []
    for el in lst:
        rounded = round(el, round_digits)
        if all(abs(el - x) > tol for x in seen[rounded]):
            seen[rounded].add(el)
            new.append(el)
    return new
If your tol is 0.01, then round_digits is 1. So 6.23234121 is indexed in seen as just 6.2. When we then see 6.23246575, we round it to 6.2 and look that up in the index, which should contain all numbers that could possibly be within tol of the number we're looking up. (Strictly speaking, a number within tol can land in an adjacent bin, so a fully robust version would check the two neighbouring bins as well.) Then we still have to check distances to those numbers, but only on the very few numbers that are in that index bin, instead of the entire list.
This approach is O(n k), where k is the average number of elements that'll fall within one such bin. It'll only be helpful if k << n (as it typically would be, but that depends on the distribution of the numbers you're using relative to tol). Note that it also uses probably more than twice as much memory as the other approach, which could be an issue for very large lists.
Another option would be to sort the list first; then you only have to look at the previous and following elements to check for a conflict.
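A rough sketch of that sorted approach (note that sorting changes which of two close elements survives compared with the first-seen rule above, and it compares each element only against the last one kept):

def compact_sorted(lst, tol):
    out = []
    for el in sorted(lst):
        # in sorted order, the last kept element is the closest kept candidate
        if not out or el - out[-1] > tol:
            out.append(el)
    return out

print(compact_sorted([6.23234121, 6.23246575], tol=0.01))  # [6.23234121]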
Given a list, I would like to apply some set of operations to a subset(slice) of the list, and store the result of each transformation in the original list.
My background is in Ada, which led me to make the following mistake:
Number_List = [0,1,2,3,4,5,6,7,8,9]
for Index, Number in enumerate(Number_List[1:]):
    Number_List[Index] = Number + 1
This gives a new Number_List of [2, 3, 4, 5, 6, 7, 8, 9, 10, 9], teaching me that a slice of a list is re-indexed to start at 0.
I've moved to the following, which is cumbersome but functional.
Number_List = [0,1,2,3,4,5,6,7,8,9]
for Index in range(1, len(Number_List)):
    Number_List[Index] = Number_List[Index] + 1
I am looking for a more elegant way to do this.
enumerate takes a start parameter:
Number_List = [0,1,2,3,4,5,6,7,8,9]
for Index, Number in enumerate(Number_List[1:], start=1):
    Number_List[Index] = Number + 1
You can also write
Number_List[1:] = [x+1 for x in Number_List[1:]]
from itertools import islice
number_list[start:stop] = (x + 1 for x in islice(number_list, start, stop))
Alternatively, use number_list[start:stop] instead of islice, but that creates another list needlessly. Either way, this updates the list in place thanks to slice assignment and avoids an explicit loop.
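For the question's concrete list (with start and stop chosen to match the original loop), usage might look like this:

from itertools import islice

number_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
start, stop = 1, len(number_list)
# the generator is fully consumed before the slice assignment mutates the list
number_list[start:stop] = (x + 1 for x in islice(number_list, start, stop))
print(number_list)  # [0, 2, 3, 4, 5, 6, 7, 8, 9, 10]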
You can use list comprehensions and slices to great effect:
vals = list(range(10))  # gives your example numbers (list() is needed on Python 3)
vals[1:] = [v + 1 for v in vals[1:]]