Compare dictionaries and delete key:value pairs - python-2.7

I have two dictionaries of lists.
big_dict = defaultdict(list)
small_dict = defaultdict(list)
big_dict = {StepOne:[{PairOne:{key1: value1}}, {PairTwo:{key2: value2}}, {PairThree: {key3: value3}}]}
small_dict = {key1: value1}
Is it possible to find the sub-dictionary in 'StepOne' that matches small_dict and delete the other sub-dictionaries under the 'StepOne' key?

I bet there is a more pythonic way to do it, but this should solve your problem:
big_dict = {'A0':[{'a':{'ab':1}, 'b':{'bb':2}, 'c':{'cc':3}}], 'A1':[{'b':{'bb':1}, 'c':{'bb':5}, 'd':{'cc':3}}]}
small_dict = {'bb':2, 'cc':3}
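for big_key in big_dict.keys():
    # list(...) copies the keys so entries can be deleted inside the loop
    for nested_key in list(big_dict[big_key][0].keys()):
        # keys of small_dict that also occur in this subdictionary
        ls_small = [x for x in small_dict if x in big_dict[big_key][0][nested_key]]
        if not ls_small:
            del big_dict[big_key][0][nested_key]
        else:
            # of those, keep the keys whose values match too (== compares values, is compares identity)
            ls_small = [y for y in ls_small if small_dict[y] == big_dict[big_key][0][nested_key][y]]
            if not ls_small:
                del big_dict[big_key][0][nested_key]
        ls_small = []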
I've added another main dictionary key, 'A1', to make it more representative. The code loops through the keys of the main dictionary ('A0', 'A1') and then through the keys of the first set of nested dictionaries ('a', 'b',...). The nested dictionaries are taken from the first element of each list, the lists being the values of the main dictionary.
For each nested dictionary it checks whether any of the keys in small_dict are part of its subdictionary. The subdictionary is fetched by big_dict[big_key][0][nested_key] since it's the value of the nested dictionary. If small_dict keys are found in the subdictionary, they are temporarily stored in ls_small.
If ls_small for that nested dictionary is empty after the key-checking step, it means no keys from small_dict are present in that nested dictionary, and the nested dictionary is deleted. If it is not empty, the else part checks whether the values match, again deleting the entry if they don't.
The output for this example is:
{'A1': [{'d': {'cc': 3}}], 'A0': [{'c': {'cc': 3}, 'b': {'bb': 2}}]}
Note - as it is right now, the approach will keep the nested dictionary if only one small_dict key:value pair matches, meaning that an input of this form
big_dict = {'A0':[{'a':{'bb':2}, 'b':{'bb':2, 'cc': 5}, 'c':{'cc':3}}], 'A1':[{'b':{'bb':1}, 'c':{'bb':5}, 'd':{'cc':3}}]}
will produce
{'A1': [{'d': {'cc': 3}}], 'A0': [{'a': {'bb': 2}, 'c': {'cc': 3}, 'b': {'cc': 5, 'bb': 2}}]}
Is this the desired behavior?
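A more compact way to express the same filter, offered as a sketch rather than a drop-in replacement: it builds a new dictionary instead of deleting in place, and it keeps a nested dictionary as soon as one key:value pair of small_dict matches, which is the same behavior as the loop above. The function name prune is made up for this illustration.

```python
def prune(big, small):
    """Return a filtered copy of big (prune is a hypothetical helper name).

    A sub-dictionary is kept when it shares at least one key:value
    pair with small.
    """
    return {
        big_key: [
            {
                name: sub
                for name, sub in entry.items()
                if any(k in sub and sub[k] == v for k, v in small.items())
            }
            for entry in entries
        ]
        for big_key, entries in big.items()
    }


big_dict = {'A0': [{'a': {'ab': 1}, 'b': {'bb': 2}, 'c': {'cc': 3}}],
            'A1': [{'b': {'bb': 1}, 'c': {'bb': 5}, 'd': {'cc': 3}}]}
small_dict = {'bb': 2, 'cc': 3}

pruned = prune(big_dict, small_dict)
print(pruned)
```

Because nothing is deleted during iteration, this version behaves the same on Python 2.7 and Python 3, and big_dict itself is left untouched.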

Related

Dictionary update overwrites duplicate keys

I have a table that has 6982 records that I am reading through to make a dictionary. I used a literal to create the dictionary
fld_zone_dict = dict()
fields = ['uniqueid', 'FLD_ZONE', 'FLD_ZONE_1']
...
for row in cursor:
    uid = row[0]
    old_zone_value = row[1]
    new_zone_value = row[2]
    fld_zone_dict[uid] = [old_zone_value, new_zone_value]
However, I noticed that with this method, if a uid has the same value as a previous uid (theoretically, there could be duplicates), the entry gets overwritten. So, if I had 2 entries I wanted to add, 'CA10376036': ['AE', 'X'] and 'CA10376036': ['V', 'D'], the first one gets overwritten and I only get 'CA10376036': ['V', 'D']. How can I add to my dictionary without overwriting the duplicate keys so that I get something like this?
fld_zone_dict = {'CA10376036': ['AE', 'X'], 'CA9194089':['D', 'X'],'CA10376036': ['V', 'D']....}
Short answer: There is no way to have duplicate keys in a dictionary object in Python.
However, if you were to restructure your data and take that key and put it inside of a dictionary that is nested in a list, you could have duplicate IDs. EX:
[
    {
        "id": "CA10376036",
        "data": ['AE', 'X']
    },
    {
        "id": "CA10376036",
        "data": ['V', 'D']
    },
]
Doing this though will negate any benefits of lookup speed and ease.
edit: blhsing also has a good example of how to restructure data with a reduced initial lookup time, though you would still have to iterate through data to get the record you wanted.
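To make the lookup cost concrete, here is a small sketch of what retrieving records from such a list looks like. The 'CA9194089' row is borrowed from the question; the variable names records and matches are only illustrative.

```python
records = [
    {"id": "CA10376036", "data": ["AE", "X"]},
    {"id": "CA9194089", "data": ["D", "X"]},
    {"id": "CA10376036", "data": ["V", "D"]},
]

# There is no key to index by, so every lookup has to walk the whole list.
matches = [rec["data"] for rec in records if rec["id"] == "CA10376036"]
print(matches)  # [['AE', 'X'], ['V', 'D']]
```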
Dicts are not allowed to have duplicate keys in Python. You can use the dict.setdefault method to convert existing keys to a list instead:
for row in cursor:
    uid = row[0]
    old_zone_value = row[1]
    new_zone_value = row[2]
    fld_zone_dict.setdefault(uid, []).append([old_zone_value, new_zone_value])
Starting from an empty fld_zone_dict, every value then becomes a list of pairs:
{'CA10376036': [['AE', 'X'], ['V', 'D']], 'CA9194089': [['D', 'X']], ...}
If fld_zone_dict already holds flat [old, new] values from an earlier pass, those entries are not lists of lists yet, so you should probably convert them all first:
for k, v in fld_zone_dict.items():
    fld_zone_dict[k] = [v]
for row in cursor:
    uid = row[0]
    old_zone_value = row[1]
    new_zone_value = row[2]
    fld_zone_dict[uid].append([old_zone_value, new_zone_value])
so that fld_zone_dict will become like:
{'CA10376036': [['AE', 'X'], ['V', 'D']], 'CA9194089': [['D', 'X']], ...}
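An alternative sketch of the same grouping idea uses collections.defaultdict, which creates the empty list on first access. The rows list below is a stand-in for the database cursor and is an assumption made for the example; only the two 'CA10376036' pairs and the 'CA9194089' pair come from the question.

```python
from collections import defaultdict

# Stand-in for the cursor: (uniqueid, FLD_ZONE, FLD_ZONE_1) rows.
rows = [
    ("CA10376036", "AE", "X"),
    ("CA9194089", "D", "X"),
    ("CA10376036", "V", "D"),
]

fld_zone_dict = defaultdict(list)
for uid, old_zone_value, new_zone_value in rows:
    # A missing uid starts out as [], so duplicates accumulate instead of overwriting.
    fld_zone_dict[uid].append([old_zone_value, new_zone_value])

print(dict(fld_zone_dict))
```

Every value is a list of pairs here, including uids that occur only once, so downstream code can treat all entries uniformly.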

Indexing a list of dictionaries for a relating value

I have 4 dictionaries which have been put into a list
dict1 = {'A':'B'}
dict2 = {'C':'D'}
dict3 = {'E':'F'}
dict4 = {'G':'H'}
list = [dict1, dict2, dict3, dict4]
value = 'D'
print (the relating value to D)
Using the list of dictionaries, I would like to look up the key related to the value 'D' (which is 'C').
Is this possible?
Note: the list doesn't have to be used; the program just needs to find that related key by going through the 4 dictionaries one way or another.
Thanks!
You have a list of dictionaries. A straightforward way would be to loop over the list and search each dictionary for the desired value using
dict.iteritems()
which iterates over the dictionary and yields each key:value pair as a tuple (key, value). So all that's left to do is compare each value with the one you are searching for and return the associated key. Here is some quick code I tried. It should also work for dictionaries with any number of key:value pairs (I hope).
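dict1 = {'A':'B'}
dict2 = {'C':'D'}
dict3 = {'E':'F'}
dict4 = {'G':'H'}
list = [dict1, dict2, dict3, dict4]  # note: this name shadows the built-in list

def find_in_dict(dictx, search_parameter):
    # Python 2: iteritems() yields (key, value) tuples; use items() on Python 3
    for x, y in dictx.iteritems():
        if y == search_parameter:
            return x
    # falls through and returns None when the value is not present

for d in list:
    my_key = find_in_dict(d, "D")
    print my_key or "No key found"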
On a different note, such a usage of dictionaries is a little awkward to me, as it defeats the purpose of having a KEY as an index for an associated VALUE. But anyway, it's just my opinion and I am not aware of your use case. Hope it helps.
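For readers on Python 3, where iteritems() no longer exists, the same reverse lookup can be sketched with items() and next(). The helper name find_key and its default parameter are made up for this example; it stops at the first match across all dictionaries.

```python
dicts = [{'A': 'B'}, {'C': 'D'}, {'E': 'F'}, {'G': 'H'}]


def find_key(dicts, wanted, default=None):
    """Return the first key whose value equals wanted, else default."""
    return next(
        (key for d in dicts for key, value in d.items() if value == wanted),
        default,
    )


print(find_key(dicts, 'D'))
print(find_key(dicts, 'Z', 'No key found'))
```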

Python3 Removing dictionary key if value contained in list is blank

So I have a dictionary filled with lots of useful stuff. I would like to remove a key (build a new dict without the key) if any value within a list is empty.
The dictionary:
>>>print(vaar123)
{'moo': 'cora', 'ham': ['', 'test'], 'bye': 2, 'pigeon': '', 'heloo': 1}
I can remove the 'pigeon' key with its empty value with something along the lines of.
>>>dict((k, v) for k, v in vaar123.items() if v)
{'moo': 'cora', 'ham': ['', 'test'], 'heloo': 1, 'bye': 2}
But try as I might, I cannot seem to come up with a method to remove 'ham' as it has an empty value in its list.
Thanks in advance for any suggestions,
Frank
Info: The dictionary is built with a value on creation (set by admin) the additional value is added to the list by user input. The value pair is used as output. Having a single value in the list produces undesirable output.
This function recursively checks sized iterables to see if they are empty and returns a false value if it finds one that is:
from collections.abc import Sized, Iterable  # On Python >= 3.6 you can use
                                             # collections.abc.Collection instead

def all_nonempty(v):
    if isinstance(v, (Sized, Iterable)):
        # We do the check against str because 'a' is a sized iterable that
        # contains 'a'. I don't think there's an abstract class for
        # containers like that
        return v and (all(map(all_nonempty, v)) if not isinstance(v, str) else True)
    return True
Then we can use this to winnow the dict:
print({k: v for k, v in vaar123.items() if all_nonempty(v)})
outputs:
{'moo': 'cora', 'bye': 2, 'heloo': 1}
Perhaps like this:
>>> d = {'moo': 'cora', 'ham': ['', 'test'], 'heloo': 1, 'bye': 2}
>>> {k:v for k,v in d.items() if not(isinstance(v,list) and len(v) > 0 and v[0] == '')}
{'heloo': 1, 'moo': 'cora', 'bye': 2}
Or maybe just:
>>> {k:v for k,v in d.items() if not(isinstance(v,list) and '' in v)}
{'heloo': 1, 'moo': 'cora', 'bye': 2}
The first expression removes items whose value is a list with '' as its first element. The second removes any item whose value is a list in which '' occurs anywhere.
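The difference only shows up when the empty string is not the first element. A small sketch, where the 'spam' entry is an assumption added purely to expose that case:

```python
d = {'moo': 'cora', 'ham': ['', 'test'], 'spam': ['test', ''], 'bye': 2}

# Drops a list only when its first element is ''.
first_only = {k: v for k, v in d.items()
              if not (isinstance(v, list) and len(v) > 0 and v[0] == '')}

# Drops a list when '' occurs at any position.
anywhere = {k: v for k, v in d.items()
            if not (isinstance(v, list) and '' in v)}

print(sorted(first_only))  # ['bye', 'moo', 'spam']
print(sorted(anywhere))    # ['bye', 'moo']
```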
Assuming all values in the lists are strings:
{k: v
 for k, v in vaar123.items()
 if (not hasattr(v, '__iter__')) or
    (hasattr(v, '__iter__') and v and all(elem for elem in v))}
Explanation: Keep non-iterable values because they can't be empty (doesn't make sense). Otherwise, if a value is iterable, discard it if it's empty or if it contains any false values (i.e., empty string per the assumption above).

PYTHON 2.7 - Modifying List of Lists and Re-Assembling Without Mutating

I currently have a list of lists that looks like this:
My_List = [[This, Is, A, Sample, Text, Sentence], [This, too, is, a, sample, text], [finally, so, is, this, one]]
Now what I need to do is "tag" each of these words with one of 3, in this case arbitrary, tags such as "EE", "FF", or "GG" based on which list the word is in and then reassemble them into the same order they came in. My final code would need to look like:
GG_List = [This, Sentence]
FF_List = [Is, A, Text]
EE_List = [Sample]
My_List = [[(This, GG), (Is, FF), (A, FF), (Sample, EE), (Text, FF), (Sentence, GG)], [*same with this sentence*], [*and this one*]]
I tried this by using for loops to turn each item into a dict but the dicts then got rearranged by their tags which sadly can't happen because of the nature of this thing... the experiment needs everything to stay in the same order because eventually I need to measure the proximity of tags relative to others but only in the same sentence (list).
I thought about doing this with NLTK (which I have little experience with) but it looks like that is much more sophisticated than what I need and the tags aren't easily customized by a novice like myself.
I think this could be done by iterating through each of these items, using an if statement as I have to determine what tag they should have, and then making a tuple out of the word and its associated tag so it doesn't shift around within its list.
I've devised this.. but I can't figure out how to rebuild my list-of-lists and keep them in order :(.
for i in My_List: #For each list in the list of lists
    for h in i: #For each item in each list
        if h in GG_List: # Check for the tag
            MyDicts = {"GG":h for h in i} #Make Dict from tag + word
Thank you so much for your help!
Putting the tags in a dictionary would work:
My_List = [['This', 'Is', 'A', 'Sample', 'Text', 'Sentence'],
['This', 'too', 'is', 'a', 'sample', 'text'],
['finally', 'so', 'is', 'this', 'one']]
GG_List = ['This', 'Sentence']
FF_List = ['Is', 'A', 'Text']
EE_List = ['Sample']
zipped = zip((GG_List, FF_List, EE_List), ('GG', 'FF', 'EE'))
tags = {item: tag for tag_list, tag in zipped for item in tag_list}
res = [[(word, tags[word]) for word in entry if word in tags] for entry in My_List]
Now:
>>> res
[[('This', 'GG'),
('Is', 'FF'),
('A', 'FF'),
('Sample', 'EE'),
('Text', 'FF'),
('Sentence', 'GG')],
[('This', 'GG')],
[]]
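Note that the comprehension above drops every word that has no tag, so the second and third sentences shrink. If each sentence should keep its full length and order, one possible variant (a sketch, with None chosen here as an arbitrary placeholder for "untagged") uses dict.get instead of filtering:

```python
My_List = [['This', 'Is', 'A', 'Sample', 'Text', 'Sentence'],
           ['This', 'too', 'is', 'a', 'sample', 'text']]
tags = {'This': 'GG', 'Sentence': 'GG',
        'Is': 'FF', 'A': 'FF', 'Text': 'FF',
        'Sample': 'EE'}

# tags.get(word) returns None for unknown words instead of skipping them,
# so every sentence keeps its original length and word order.
res_all = [[(word, tags.get(word)) for word in entry] for entry in My_List]
print(res_all[1])
```

The lookup is case-sensitive, which is why 'is', 'a', 'sample' and 'text' in the second sentence come back untagged.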
A dictionary works by key-value pairs. Each key is assigned a value. To search the dictionary, you look up an entry by its key, e.g.
>>> d = {1:'a', 2:'b', 3:'c'}
>>> d[1]
'a'
In the above case, we always search the dictionary by its keys, i.e. the integers.
In the case that you want to assign the tag/label to each word, you are searching by the key word and finding the "value", i.e. the tag/label, so your dictionary would have to look something like this (assuming that the strings are words and numbers as tag/label):
>>> d = {'a':1, 'b':1, 'c':3}
>>> d['a']
1
>>> sent = 'a b c a b'.split()
>>> sent
['a', 'b', 'c', 'a', 'b']
>>> [d[word] for word in sent]
[1, 1, 3, 1, 1]
This way the order of the tags follows the order of the words when you use a list comprehension to iterate through the words and find the appropriate tags.
So the problem comes when the initial dictionary is indexed the wrong way round, i.e. key -> labels, value -> words, e.g.:
>>> d = {1:['a', 'd'], 2:['b', 'h'], 3:['c', 'x']}
>>> [d[word] for word in sent]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'a'
Then you would have to invert your dictionary. Assuming that all elements in your value lists are unique, you can do this:
>>> from collections import ChainMap
>>> d = {1:['a', 'd'], 2:['b', 'h'], 3:['c', 'x']}
>>> d_inv = dict(ChainMap(*[{value:key for value in values} for key, values in d.items()]))
>>> d_inv
{'h': 2, 'c': 3, 'a': 1, 'x': 3, 'b': 2, 'd': 1}
But the caveat is that ChainMap is only available in Python 3.3 and later (yet another reason to upgrade your Python ;P). For solutions on older versions, see How do I merge a list of dicts into a single dict?.
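If ChainMap is not available, the same inversion can also be sketched with a single nested dict comprehension, which works on Python 2.7 as well. As before, this assumes every word appears under exactly one label; a word listed under several labels would keep only the last one seen.

```python
d = {1: ['a', 'd'], 2: ['b', 'h'], 3: ['c', 'x']}

# Walk every (label, words) pair and emit one word -> label entry per word.
d_inv = {value: key for key, values in d.items() for value in values}
print(d_inv)
```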
So going back to the problem of assigning labels/tags to words, let's say we have this input:
>>> d = {1:['a', 'd'], 2:['b', 'h'], 3:['c', 'x']}
>>> sent = 'a b c a b'.split()
First, we invert the dictionary (assuming that there is a one-to-one mapping between every word and its tag/label):
>>> d_inv = dict(ChainMap(*[{value:key for value in values} for key, values in d.items()]))
Then, we apply the tags to the words through a list comprehension:
>>> [d_inv[word] for word in sent]
[1, 2, 3, 1, 2]
And for multiple sentences:
>>> sentences = ['a b c'.split(), 'h a x'.split()]
>>> [[d_inv[word] for word in sent] for sent in sentences]
[[1, 2, 3], [2, 1, 3]]

Python remove method mutates the dictionary keys

I want to use a dictionary representing a directed graph with a certain number of nodes (num) and with all possible edges (output).
Examples:
if num = 1, output: {0: set([])}
if num = 2, output: {0: set([1]), 1: set([0])}
if num = 3, output: {0: set([1,2]), 1: set([0,2]), 2: set([0,1])}
if num = 4, output: {0: set([1,2,3]), 1: set([0,2,3]), 2: set([0,1,3]), 3: set([0,1,2])}
My code iterates through the keys and creates each set by removing the key from a temporary list:
num = 3
keys = range(0,num)
mydict ={}
for key in keys:
    temp = keys
    value_list = temp.remove(key)
    mydict[key] = set([value_list])
But it seems that by using temp.remove(key), not only temp but also keys gets mutated. Why is that?
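Names in Python are references to objects, not containers holding their own copy of the data. Assigning temp = keys therefore copies nothing: in your example, temp and keys both refer to the same list, and because lists are mutable, a change made through one name is visible through the other. (You never notice this with immutable objects such as ints.)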
keys = range(0,num) # Bind keys to a new list instance = [0, 1, 2, ..., num - 1]
mydict = {}
for key in keys:
    temp = keys # Bind temp to the same list as keys
    value_list = temp.remove(key) # Removes key from the list both names point to; remove() itself returns None
    ...
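The aliasing can be checked directly with the is operator. This is a minimal sketch using a literal list so that it runs the same on Python 2 and Python 3:

```python
keys = [0, 1, 2]
temp = keys            # no copy is made: both names refer to one list object

temp.remove(0)         # mutate the list through the name temp
print(keys)            # [1, 2]
print(temp is keys)    # True
```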
If you want temp to point to a unique list, there are several ways to do it, but I prefer something like:
temp = list(keys)
EDIT:
According to the analysis done here by cryo, this strange syntax (known as slicing) is slightly faster:
temp = keys[:]
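Copying fixes the aliasing, but the original loop has a second problem: remove() returns None, so value_list never holds the remaining nodes. One way to sidestep both issues, sketched here with a made-up function name complete_digraph, is to build each set by set difference so that no list is mutated at all:

```python
def complete_digraph(num):
    """Map each node 0..num-1 to the set of all other nodes."""
    nodes = range(num)
    # set(nodes) is a fresh set on every iteration, so nothing is shared.
    return {node: set(nodes) - {node} for node in nodes}


print(complete_digraph(3))
```

This also works on Python 3, where range() returns a range object that has no remove() method.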