Intersection of a set and list of dictionaries - python-2.7

I have a set my_set = ("a","b","c","d","z") and a list my_list=[{"a",0.5},{"c",0.6},{"b",0.9},{"z",0.5},{"m",0.0}]. I would like a list containing only the items whose keys are in my_set. In this case the result I would like is new_list=[{"a",0.5},{"c",0.6},{"b",0.9},{"z",0.5}]
The list and the set are both large. Is there an efficient way to accomplish this?

Assuming that that's actually a set and a list of dicts, as stated in the question, you can try this:
In [1]: my_set = set(["a","b","c","d","z"])
In [2]: my_list=[{"a":0.5},{"c":0.6},{"b":0.9},{"z":0.5},{"m":0.0}]
In [3]: [d for d in my_list if all(k in my_set for k in d)]
Out[3]: [{'a': 0.5}, {'c': 0.6}, {'b': 0.9}, {'z': 0.5}]
This simply uses a list comprehension to check that all the keys in the dicts are contained in the set. This will have complexity of O(nm), for n dicts in the list, with m keys each (m being 1 in your case) and assuming that set-lookup is always O(1).
Note, however, that you do not really need a list of dictionaries, since all the keys seem to be different (in this example, at least), so a single dictionary would be enough.
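To illustrate the single-dictionary idea, here is a minimal sketch. The names scores and filtered are made up for this example, and it assumes every key occurs only once across the original list.

```python
my_set = set(["a", "b", "c", "d", "z"])

# Hypothetical merged form of my_list: one dict instead of a list of one-key dicts.
scores = {"a": 0.5, "c": 0.6, "b": 0.9, "z": 0.5, "m": 0.0}

# Keep only the entries whose key is in my_set (one set lookup per key).
filtered = dict((k, v) for k, v in scores.items() if k in my_set)
```

The dict(...) call over a generator is used instead of a dict comprehension only so the snippet reads the same on older 2.x interpreters; on 2.7 a dict comprehension works as well.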

Related

Removing tuple from nested list with tuple and lists

I have a nested output in the form of this:
[(3, [['Avg_Order_Insert_Size', '572.9964086553654'], ['Avg_Order_Insert_Size', '34.858670832934195'],
['Avg_Order_Insert_Size', '22.09531308137768']])]
And I want to get rid of the int (3) which is in a tuple, and save the lists containing strings and ints.
How do I accomplish this in the best way?
My goal is to use the lists within the tuple for creating a dictionary later on, but while these ints are there within the tuple I don't know what to do. Basically I think I want an output such as:
[([['Avg_Order_Insert_Size', '572.9964086553654'], ['Avg_Order_Insert_Size', '34.858670832934195'],
['Avg_Order_Insert_Size', '22.09531308137768']])]
I think this is enough:
>>> value = [(3, [['Avg_Order_Insert_Size', ...
>>> value[0] = value[0][1:]
>>> value
[([['Avg_Order_Insert_Size', '572.9964086553654'], ['Avg_Order_Insert_Size', '34.858670832934195'], ['Avg_Order_Insert_Size', '22.09531308137768']],)]
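If the tuple wrapper is not needed at all, unpacking each tuple is another option. This is only a sketch, not the method shown above: it drops the tuple entirely rather than leaving a one-element tuple, and inner_lists is an illustrative name.

```python
value = [(3, [['Avg_Order_Insert_Size', '572.9964086553654'],
              ['Avg_Order_Insert_Size', '34.858670832934195'],
              ['Avg_Order_Insert_Size', '22.09531308137768']])]

# Unpack each (count, pairs) tuple and keep only the list of pairs.
inner_lists = [pairs for _count, pairs in value]
```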

Sort nested dictionary in ascending order and grab outer key?

I have a dictionary that looks like:
dictionary = {'article1.txt': {'harry': 3, 'hermione': 2, 'ron': 1},
'article2.txt': {'dumbledore': 1, 'hermione': 3},
'article3.txt': {'harry': 5}}
And I'm interested in picking the article with the most occurrences of Hermione. I already have code that selects the outer keys (article1.txt, article2.txt) and inner key hermione.
Now I want code that sorts the articles into a list in ascending order of the number of occurrences of the word hermione. In this case, I want the list ['article1.txt', 'article2.txt']. I tried it with the following code:
#these keys are generated from another part of the program
keys1 = ['article1.txt', 'article2.txt']
keys2 = ['hermione', 'hermione']
place = 0
for i in range(len(keys1)-1):
    for j in range(len(keys2)-1):
        if articles[keys1[i]][keys2[j]] > articles[keys1[i+1]][keys2[j+1]]:
            ordered_articles.append(keys1[i])
            place += 1
        else:
            ordered_articles.append(place, keys1[i])
But obviously (I'm realizing now) it doesn't make sense to iterate through the keys to check if dictionary[key] > dictionary[next_key]. This is because we would never be able to compare things not in sequence, like dictionary[key[1]] > dictionary[key[3]].
Help would be much appreciated!
It seems that what you're trying to do is sort the articles by the number of occurrences of 'hermione' in them. Python has a built-in function, sorted, that does exactly that. You can use it to sort the dictionary keys by the hermione count each of them points to.
Here's some code you can use as an example:
# filters out articles without hermione from the dictionary
# value here is the inner dict (for example: {'harry': 5})
dictionary = {key: value for key, value in dictionary.items() if 'hermione' in value}
# this function just returns the amount of hermiones in an article
# it will be used for sorting
def hermione_count(key):
    return dictionary[key]['hermione']
# dictionary.keys() is a list of the keys of the dictionary (the articles)
# key=... here means we use hermione_count as the function to sort the list
article_list = sorted(dictionary.keys(), key=hermione_count)
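The same idea can be written more compactly with a lambda as the sort key. The following is a sketch using the sample data from the question; with_hermione and most_first are illustrative names.

```python
dictionary = {'article1.txt': {'harry': 3, 'hermione': 2, 'ron': 1},
              'article2.txt': {'dumbledore': 1, 'hermione': 3},
              'article3.txt': {'harry': 5}}

# Keep only the articles that mention hermione at all.
with_hermione = [name for name in dictionary if 'hermione' in dictionary[name]]

# Ascending by hermione count.
article_list = sorted(with_hermione, key=lambda name: dictionary[name]['hermione'])

# reverse=True puts the article with the highest count first.
most_first = sorted(with_hermione, key=lambda name: dictionary[name]['hermione'],
                    reverse=True)
```

If only the single article with the most occurrences is needed, max(with_hermione, key=...) with the same key function avoids sorting the whole list.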

Select duplicated lists from a list of lists (Python 2.7.13)

I have two lists, one of which is a list of lists. Both have the same number of entries (so the second holds half as many values as the first), like this:
list1=[['47', '43'], ['299', '295'], ['47', '43'], etc.]
list2=[9.649, 9.612, 9.42, etc.]
I want to detect the repeated pairs of values in the first list (and delete them), and sum the values at the corresponding indexes in the second list, producing output like this:
list1=[['47', '43'], ['299', '295'], etc.]
list2=[19.069, 9.612, etc.]
The main problem is that the order of the values is important and I'm really stuck.
You could create a collections.defaultdict to sum the values together, using the sublists as keys (converted to tuples so that they are hashable):
list1=[['47', '43'], ['299', '295'], ['47', '43']]
list2=[9.649, 9.612, 9.42]
import collections
c = collections.defaultdict(float)
for l,v in zip(list1,list2):
    c[tuple(l)] += v
print(c)
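An alternative using collections.Counter, which does the same thing (passing the (key, value) pairs straight to the constructor would count the pairs rather than sum the values, so the totals are accumulated in a loop):
c = collections.Counter()
for l,v in zip(list1,list2):
    c[tuple(l)] += v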
At this point, we have the related data:
defaultdict(<class 'float'>, {('299', '295'): 9.612, ('47', '43'): 19.069})
Now, if needed (not sure, since the dictionary holds the data very well), we can rebuild the lists. The two lists stay aligned with each other, but their original order is not preserved; that shouldn't be a problem since the entries are still linked:
list1=[]
list2=[]
for k,v in c.items():
    list1.append(list(k))
    list2.append(v)
print(list1,list2)
result:
[['299', '295'], ['47', '43']]
[9.612, 19.069]
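Since the question says the order matters, one way to keep each pair at the position where it first appears is collections.OrderedDict. This is a sketch of that variation, not part of the answer above; totals, merged1 and merged2 are illustrative names.

```python
from collections import OrderedDict

list1 = [['47', '43'], ['299', '295'], ['47', '43']]
list2 = [9.649, 9.612, 9.42]

# OrderedDict remembers the order in which keys were first inserted.
totals = OrderedDict()
for pair, v in zip(list1, list2):
    key = tuple(pair)  # lists are unhashable, tuples are not
    totals[key] = totals.get(key, 0.0) + v

# Rebuild the two parallel lists in first-seen order.
merged1 = [list(k) for k in totals]
merged2 = list(totals.values())
```

Here merged1 starts with ['47', '43'] because that pair occurs first in the input. The sums are floating-point values, so they may differ from the exact decimal total in the last digits.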

How to find the union of multiple lists of sub-lists

I have 6 different lists of lists, similar to:
list1=[['hello',1,2,'b3'],['world',1,2,'b4']]
list2=[['yo',4,5,'ba'],['lolz',1,4.35,'b4']]
list3=[['yo',4,5,'ba'],['world',3,4.35,'b6']]
list4=[['test',4,5,'b6'],['test',4,5,'b6']]
Each list can have around 100 sub-lists, but every sub-list always has 4 entries. I want to find all the sub-lists that occur more than once and put them into a final list, so it would look something like:
final=[['yo',4,5,'ba'],['test',4,5,'b6']]
The pattern is important, so the entries in the sub-lists need to stay in order, but the order of the sub-lists doesn't matter. What is the best way I could do this? Thank you for your help.
Assuming that there are no unhashable elements in the sublists, I would convert them to tuples and feed them to collections.Counter:
from collections import Counter
big_list = [list1, list2, ...]
c = Counter(tuple(sublist) for l in big_list for sublist in l)
final = [list(i) for i in c if c[i] > 1]
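As a concrete run of the snippet above, here it is with big_list filled in using the four sample lists shown in the question (the question mentions six lists but only shows four, so the rest are left out).

```python
from collections import Counter

list1 = [['hello', 1, 2, 'b3'], ['world', 1, 2, 'b4']]
list2 = [['yo', 4, 5, 'ba'], ['lolz', 1, 4.35, 'b4']]
list3 = [['yo', 4, 5, 'ba'], ['world', 3, 4.35, 'b6']]
list4 = [['test', 4, 5, 'b6'], ['test', 4, 5, 'b6']]

big_list = [list1, list2, list3, list4]

# Count every sub-list (as a tuple) across all of the lists.
c = Counter(tuple(sublist) for l in big_list for sublist in l)

# Keep the sub-lists seen more than once, converted back to lists.
final = [list(i) for i in c if c[i] > 1]
```

Note that a sub-list repeated inside a single list, as in list4, also counts as a duplicate. If only repeats across different lists should count, one option is to deduplicate each list first, for example by iterating over set(tuple(s) for s in l) instead of l.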

Making two lists identical in python 2.7

I have two lists in python.
a=[1,4,5]
b=[4,1,5]
What I need is to order b according to a. Is there a simple way to do this without any loops?
The easiest way to do this would be to use zip to combine the elements of the two lists into tuples:
a, b = zip(*sorted(zip(a, b)))
sorted compares the tuples by their first element (the element from a) first; zip(*...) then "unzips" the sorted list. Note that a and b come back as tuples, not lists.
Or just check that b contains the same elements as a and, if so, copy a into b:
if all(x in b for x in a) and len(a)==len(b):
    b=a[:]
If you want to make list2 identical to list1, you don't need to mess with order or re-arrange anything, just replace list2 with a copy of list1:
list2 = list(list1)
list() takes any iterable and produces a new list from it, so we can use this to copy list1, thus creating two lists that are exactly the same.
It might also be possible to just do list2 = list1, but do note that this will cause any changes to either to affect the other (as they point to the same object), so this is probably not what you want.
If list2 is referenced elsewhere, and thus needs to remain the same object, it's possible to replace every value in the list using list2[:] = list1.
In general, you probably want the first solution.
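The difference between the three options can be seen in a short sketch. The names alias, copied and other_ref are made up for the illustration.

```python
list1 = [1, 4, 5]

alias = list1         # plain assignment: both names point to the same object
copied = list(list1)  # list() builds a new, independent list

list1.append(9)       # alias sees this change, copied does not

list2 = [4, 1, 5]
other_ref = list2     # some other reference to list2 held elsewhere
list2[:] = list1      # slice assignment replaces the contents in place
```

After this runs, alias and list1 are the same object, copied still holds the original three items, and other_ref reflects the new contents because list2 was modified in place rather than rebound.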
Sort b based on items' index in a, with all items not in a at the end.
>>> a=[1,4,5,2]
>>> b=[4,3,1,5]
>>> sorted(b, key=lambda x:a.index(x) if x in a else len(a))
[1, 4, 5, 3]
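For long lists, a.index(x) rescans a for every item of b. A possible variation is to precompute each item's position once. This sketch assumes the items of a are hashable and unique; position and ordered_b are illustrative names.

```python
a = [1, 4, 5, 2]
b = [4, 3, 1, 5]

# Map each item of a to its index, built in a single pass.
position = dict((x, i) for i, x in enumerate(a))

# Items missing from a get the key len(a), so they sort to the end.
ordered_b = sorted(b, key=lambda x: position.get(x, len(a)))
```

If a contained duplicates, the dictionary would keep the last index of each item, whereas a.index returns the first, so the two versions could order such items differently.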