I am trying to create a program that will find the difference between each pair of adjacent elements in a list. For example,
[2,4,6]
Would then make a list containing the difference
[2,2]
Is there a way to do this?
Itertools Recipes: pairwise
from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def diffs(iterable):
    return [b - a for a, b in pairwise(iterable)]

print(diffs([2, 4, 6]))
[L[i+1] - L[i] for i in range(len(L)-1)] will do it.
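For example, with the list from the question (a quick sanity check):

L = [2, 4, 6]
print([L[i+1] - L[i] for i in range(len(L) - 1)])  # [2, 2]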
Some other ways, also using list comprehensions:
[L[i+1] - L[i] for i in range(len(L[:-1]))]
[L[i] - L[i-1] for i in range(1, len(L))]
Using map:
list(map(lambda i: L[i+1]-L[i], range(len(L[:-1]))))
list(map(lambda i: L[i]-L[i-1], range(1, len(L))))
Using map and the operator module:
import operator
list(map(operator.sub, L[1:], L[:-1]))
Using zip (this one is probably the nicest way, imo):
[x - y for x, y in zip(L[1:], L[:-1])]
A more verbose approach if you aren't familiar with list comprehensions or with map (GET FAMILIAR!):
def differences(L1, L2):
    L = []
    for V1, V2 in zip(L1, L2):
        L.append(V2 - V1)
    return L

diffs = differences(L[:-1], L[1:])
And a similar, but much better way to do it using a generator:
def differences(L1, L2):
    for V1, V2 in zip(L1, L2):
        yield V2 - V1

diffs = list(differences(L[:-1], L[1:]))
And here is the generator expression equivalent of the above generator (notice it's almost exactly the same as the last list comprehension above, except it uses the list function instead of brackets):
list(V2-V1 for V1,V2 in zip(L[:-1],L[1:]))
Study all of these ways of doing it very closely and you will learn a lot of Python.
I'm trying to compare elements of two lists of lists in Python. I want to create a new list (ph) which has a 1 if the elements of the lists from the 1st list of lists are in the elements of the 2nd list of lists.
However, this seems to compare the whole list and not individual elements. The code is below. Many thanks for the help! :)
import numpy as np
import pandas as pd
abc = [[1,800000,3],[4,5,6],[100000,7,8]]
l = [[
    [i for i in range(0, 100000)],
    [i for i in range(200000,300000)],
    [i for i in range(400000,500000)],
    [i for i in range(600000,700000)],
    [i for i in range(800000,900000)],
    [i for i in range(1000000,1100000)]
]]
ph = []
for i in abc:
    for j in l:
        if l[0] == abc[0]:
            ph.append(1)
        else:
            ph.append(0)
print(ph)
The goal of your problem is somewhat unclear to me. Correct me if I'm wrong, but what you want is: for each sublist of abc, get a boolean describing whether all its elements appear anywhere in l. Is that it?
If that is indeed the case, here's my answer.
First of all, your second list is not a list of lists but a list of lists of lists. Hence, I removed one level of nesting in my code.
abc = [[1,800000,3],[4,5,6],[100000,7,8]]
L = [
    [i for i in range(0, 100000)],
    [i for i in range(200000,300000)],
    [i for i in range(400000,500000)],
    [i for i in range(600000,700000)],
    [i for i in range(800000,900000)],
    [i for i in range(1000000,1100000)]
]
flattened_L = sum(L, [])

print(
    list(map(lambda sublist: all(x in flattened_L for x in sublist), abc))
)
# returns [True, True, False]
My code first flattens L so that it becomes easy to check whether any element is in it or not. Then, for each sublist in abc, it checks whether all elements are in this flattened list.
Note: my code returns a list of booleans. If you absolutely need integer values (0 and 1), which you shouldn't, you can wrap the call to all in int.
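Since flattened_L holds 600,000 items, each in test scans the whole list. A minimal variation on the same idea that converts it to a set once, so the membership checks are effectively constant time (assuming the abc and flattened_L defined above):

flattened_set = set(flattened_L)  # one-time conversion for O(1) membership tests
print([all(x in flattened_set for x in sublist) for sublist in abc])
# [True, True, False]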
I am trying to use zip in a Pythonic way, but in Julia. Given two lists:
a =[2;3;4;5;6]
b =[0;7;8;9;10]
I would like to create the following list comprehension,
c = [x for (x,y) in zip(a, b) if (x<y) else y]
that returns c = [0;3;4;5;6]. Instead I get the error syntax: expected "]".
You have to rewrite your comprehension such that the condition is in the generator's "body":
c = [x < y ? x : y for (x, y) in zip(a, b)]
The if-condition in comprehensions is purely for filtering at the moment (although it might be possible to add the meaning you want).
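For reference, the Python comprehension being translated would put the conditional expression in the body in the same way (a sketch with the question's a and b written as Python lists):

a = [2, 3, 4, 5, 6]
b = [0, 7, 8, 9, 10]
c = [x if x < y else y for x, y in zip(a, b)]
print(c)  # [0, 3, 4, 5, 6]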
If I have a nested list like:
l = [['AB','BCD','TGH'], ['UTY','AB','WEQ'],['XZY','LIY']]
In this example, 'AB' is common to the first two nested lists. How can I remove 'AB' from both lists while keeping the other elements as is? In general, how can I remove every element that occurs in two or more nested lists from each nested list, so that each nested list is unique?
l = [['BCD','TGH'],['UTY','WEQ'],['XZY','LIY']]
Is it possible to do this with a for loop?
Thanks
from collections import Counter
from itertools import chain

counts = Counter(chain(*l))  # count occurrences across all sublists
result = [[e for e in sub if counts[e] == 1] for sub in l]  # keep only the unique elements
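Applied to the l from the question (a quick check):

print(result)
# [['BCD', 'TGH'], ['UTY', 'WEQ'], ['XZY', 'LIY']]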
One option is to do something like this:
from collections import Counter
counts = Counter([b for a in l for b in a])
for a in l:
    for b in a[:]:  # iterate over a copy, since we remove elements from a
        if counts[b] > 1:
            a.remove(b)
Edit: If you want to avoid the (awfully useful standard library) collections module (cf. the comment), you could replace counts above by the following custom counter:
counts = {}
for a in l:
    for b in a:
        if b in counts:
            counts[b] += 1
        else:
            counts[b] = 1
A somewhat short solution without imports would be to create a reduced version of the original list first, then iterate through the original list and remove elements with counts greater than 1:
lst = [['AB','BCD','TGH'], ['UTY','AB','WEQ'], ['XZY','LIY']]
reduced_lst = [y for x in lst for y in x]
output_lst = []
for chunk in lst:
    chunk_copy = chunk[:]
    for elm in chunk:
        if reduced_lst.count(elm) > 1:
            chunk_copy.remove(elm)
    output_lst.append(chunk_copy)
print(output_lst)
Should print:
[['BCD', 'TGH'], ['UTY', 'WEQ'], ['XZY', 'LIY']]
I hope this proves useful.
I have an rdd1 like this in pySpark (please excuse any minor syntax errors):
[(id1,(1,2,3)), (id2,(3,4,5))]
I have another rdd2 holding this: (2,3,4).
Now I want to see, for each element of rdd2, which rdd1 sublists it occurs in. An example of the expected output rdd (or collected list, I don't care):
(2, [id1]),(3,[id1,id2]),(4,[id2])
This is what I have so far (note that rdd2 must be the first item in the line/algorithm):
rdd2.map(lambda x: (x, x in rdd1.map(lambda y: y[1])))
Even though this would give me only true/false as the second item of the pair tuple, I could live with it, but even this does not work. It fails when trying to perform a map on rdd1 inside the anonymous function of the rdd2 map.
Any idea how to get this going in the right direction?
If rdd2 is relatively small (fits in memory):
pairs1 = rdd1.flatMap(lambda (k, vals): ((v, k) for v in vals))
vals_set = sc.broadcast(set(rdd2.collect()))

(pairs1
 .filter(lambda (k, v): k in vals_set.value)
 .groupByKey())
If not, you can take pairs1 from a previous part and use join:
pairs2 = rdd2.map(lambda x: (x, None))

(pairs2
 .leftOuterJoin(pairs1)
 .map(lambda (k, (_, v)): (k, v))
 .groupByKey())
As always, if this is only an intermediate structure you should consider reduceByKey, aggregateByKey or combineByKey instead of groupByKey. If it is the final structure you can add .mapValues(list).
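For example, if the grouped pairs are the final result, a minimal sketch along those lines (assuming the pairs1 and vals_set defined above, and using indexing rather than tuple-unpacking lambdas so it also runs on Python 3):

result = (pairs1
          .filter(lambda kv: kv[0] in vals_set.value)  # keep only values present in rdd2
          .groupByKey()
          .mapValues(list))                            # materialize the grouped ids
print(result.collect())  # e.g. [(2, ['id1']), (3, ['id1', 'id2']), (4, ['id2'])]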
Finally, you can try to use Spark DataFrames:
df1 = sqlContext.createDataFrame(
    rdd1.flatMap(lambda (v, keys): ({'k': k, 'v': v} for k in keys)))
df2 = sqlContext.createDataFrame(rdd2.map(lambda k: {'k': k}))

(df1
 .join(df2, df1.k == df2.k, 'leftsemi')
 .map(lambda r: (r.k, r.v)).groupByKey())
I have a problem with nested lists. I want to compute the length of the intersection of two nested lists in Python. My lists are composed as follows:
list1 = [[1,2], [2,3], [3,4]]
list2 = [[1,2], [6,7], [4,5]]
output_list = [[1,2]]
How can I compute the intersection of the two lists?
I think there are two reasonable approaches to solving this issue.
If you don't have very many items in your top level lists, you can simply check if each sub-list in one of them is present in the other:
intersection = [inner_list for inner_list in list1 if inner_list in list2]
The in operator tests for equality, so different list objects with the same contents will be found as expected. This is not very efficient, however, since each membership test has to iterate over all of the sublists of list2. In other words, its performance is O(len(list1)*len(list2)). If your lists are long, it may take more time than you want.
A more asymptotically efficient alternative approach is to convert the inner lists to tuples and turn the top level lists into sets. You don't actually need to write any loops yourself for this, as map and the set type's & operator will take care of it all for you:
intersection_set = set(map(tuple, list1)) & set(map(tuple, list2))
If you need your result to be a list of lists, you can of course, convert the set of tuples back into a list of lists:
intersection_list = list(map(list, intersection_set))
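With the lists from the question, this gives (a quick check):

list1 = [[1, 2], [2, 3], [3, 4]]
list2 = [[1, 2], [6, 7], [4, 5]]
intersection_set = set(map(tuple, list1)) & set(map(tuple, list2))
print(list(map(list, intersection_set)))  # [[1, 2]]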
What about using sets in Python?
>>> set1={(1,2),(2,3),(3,4)}
>>> set2={(1,2),(6,7),(4,5)}
>>> set1 & set2
set([(1, 2)])
>>> len(set1 & set2)
1
import json

list1 = [[1,2], [2,3], [3,4]]
list2 = [[1,2], [6,7], [4,5]]

# serialize the inner lists to strings so they become hashable and can go in sets
list1_str = map(json.dumps, list1)
list2_str = map(json.dumps, list2)
output_set_str = set(list1_str) & set(list2_str)
output_list = list(map(json.loads, output_set_str))
print(output_list)  # [[1, 2]]