Create dict try comprehension - python-2.7

This:
index = {}
for item in args:
    for array in item:
        for k, v in json.loads(array).iteritems():
            for value in v:
                index.setdefault(k, []).append({'values': value['id']})
works.
But when I try this:
index ={}
filt = {index.setdefault(k,[]).append(value['id']) for item in args for array in item for (k,v) in json.loads(array).iteritems() for value in v}
print filt
Output:
result set([None])
What's wrong?

list.append is an in-place method that returns None (dict.setdefault itself returns the list), so you are creating a set of Nones. Because a set cannot contain duplicates, you are left with set([None]):
In [27]: d = {}
In [28]: print(d.setdefault(1,[]).append(1)) # returns None
None
In [35]: d = {}
In [36]: {d.setdefault(k,[]).append(1) for k in range(2)} # a set comprehension
Out[36]: {None}
In [37]: d
Out[37]: {0: [1], 1: [1]}
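The index dict, like d above, would get updated, but using any comprehension for its side effects is not a good approach. You also cannot replicate the for loops/setdefault logic with a dict comprehension, because a dict comprehension assigns each key a single value and cannot accumulate values across iterations.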
What you could do is use a defaultdict with list.extend:
import json
from collections import defaultdict
index = defaultdict(list)
for item in args:
    for array in item:
        for k, v in json.loads(array).iteritems():
            index[k].extend({'values': value['id']} for value in v)
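As a rough, self-contained sketch of the defaultdict approach: the args value below is made-up sample data (the question never shows it), assumed to be a list of lists of JSON strings, and .items() is used instead of .iteritems() so that the snippet also runs on Python 3.

```python
import json
from collections import defaultdict

# Hypothetical input: the real structure of args is not shown in the question.
args = [
    ['{"a": [{"id": 1}, {"id": 2}]}', '{"b": [{"id": 3}]}'],
    ['{"a": [{"id": 4}]}'],
]

index = defaultdict(list)
for item in args:
    for array in item:
        for k, v in json.loads(array).items():
            # extend() consumes the generator and appends each dict in order
            index[k].extend({'values': value['id']} for value in v)

print(dict(index))
```

Values for a key that appears in several JSON strings end up in one list, which is what the nested setdefault loop in the question does.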

Related

Can't merge two lists into a dictionary

I can't merge two lists into a dictionary. I tried the following:
Map two lists into a dictionary in Python
I tried all the solutions and I still get an empty dictionary.
from sklearn.feature_extraction import DictVectorizer
from itertools import izip
import itertools
text_file = open("/home/vesko_/evnt_classification/bag_of_words", "r")
text_fiel2 = open("/home/vesko_/evnt_classification/sdas", "r")
lines = text_file.read().split('\n')
words = text_fiel2.read().split('\n')
diction = dict(itertools.izip(words,lines))
new_dict = {k: v for k, v in zip(words, lines)}
print new_dict
I get the following :
{'word': ''}
['word=']
The two lists are not empty.
I'm using python2.7
EDIT :
Output from the two lists (I'm only showing a few because it's a vector with 11k features)
//lines
['change', 'I/O', 'fcnet2', 'ifconfig',....
//words
['word', 'word', 'word', .....
EDIT :
Now at least I have some output, @DamianLattenero:
{'word\n': 'XXAMSDB35:XXAMSDB35_NGCEAC_DAT_L_Drivei\n'}
['word\n=XXAMSDB35:XXAMSDB35_NGCEAC_DAT_L_Drivei\n']
I think the root of a lot of confusion is code in the example that is not relevant.
Try this:
text_file = open("/home/vesko_/evnt_classification/bag_of_words", "r")
text_fiel2 = open("/home/vesko_/evnt_classification/sdas", "r")
lines = text_file.read().split('\n')
words = text_fiel2.read().split('\n')
# remove any extra newline or whitespace from what was read in;
# the results must be assigned back, since rstrip() returns new strings
lines = [line.rstrip() for line in lines]
words = [word.rstrip() for word in words]
new_dict = dict(zip(words,lines))
print new_dict
Python's built-in zip() pairs up the items of its arguments into tuples (a list of tuples in Python 2). Passing these tuples to the dict() constructor creates a dictionary in which each item in words is a key and the item at the same position in lines is the corresponding value.
Also note that zip() stops at the end of the shorter list, so if words and lines have different lengths, the extra items of the longer one are silently dropped.
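Judging from the EDIT in the question, words seems to contain the same string over and over. The sketch below uses short made-up stand-ins for the two files (assumed to end with a newline) to illustrate why such input collapses to very few dictionary entries, and how split('\n') differs from splitlines().

```python
# Illustrative stand-ins for the file contents (assumed to end with a newline).
lines_text = "change\nI/O\nfcnet2\n"
words_text = "word\nword\nword\n"

# split('\n') leaves a trailing empty string when the text ends with '\n'
lines = lines_text.split('\n')   # ['change', 'I/O', 'fcnet2', '']
words = words_text.split('\n')   # ['word', 'word', 'word', '']

# Dictionary keys are unique, so each repeated 'word' overwrites the previous
# value and only the last pair for each distinct key survives.
collapsed = dict(zip(words, lines))
print(collapsed)

# splitlines() does not produce the trailing empty string
clean_lines = lines_text.splitlines()
clean_words = words_text.splitlines()
cleaned = dict(zip(clean_words, clean_lines))
print(cleaned)
```

If the goal is to keep every line, one option (an assumption about the intent) is to swap the roles with dict(zip(lines, words)), since the entries in lines appear to be distinct.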
I tried this and it worked for me. I created two files containing the numbers 1 to 4 and the letters a to d, and the code creates the dictionary correctly. I didn't need to import itertools, and there is one extra line that is not needed:
lines = [1,2,3,4]
words = ["a","b","c","d"]
diction = dict(zip(words,lines))
# new_dict = {k: v for k, v in zip(words, lines)}
print(diction)
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
If that works but your original code does not, the problem must be in how the lists are loaded. Try loading them like this:
def create_list_from_file(file):
    with open(file, "r") as ins:
        my_list = []
        for line in ins:
            my_list.append(line)
    return my_list
lines = create_list_from_file("/home/vesko_/evnt_classification/bag_of_words")
words = create_list_from_file("/home/vesko_/evnt_classification/sdas")
diction = dict(zip(words,lines))
# new_dict = {k: v for k, v in zip(words, lines)}
print(diction)
Observation:
If your files look like this:
1
2
3
4
and
a
b
c
d
the result will have four keys in the dictionary, one per line:
{'a\n': '1\n', 'b\n': '2\n', 'c\n': '3\n', 'd': '4'}
But if your files look like:
1 2 3 4
and
a b c d
the result will be {'a b c d': '1 2 3 4'}, with only one key.
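To avoid the trailing '\n' in the keys and values, the lines can be stripped while loading. A minimal sketch, where strip_newlines is a hypothetical helper and the two lists stand in for what iterating over the open files would yield:

```python
def strip_newlines(line_iterable):
    """Return the lines with trailing newline characters removed."""
    return [line.rstrip('\n') for line in line_iterable]

# Iterating over an open file yields strings like these (newline included).
raw_words = ['a\n', 'b\n', 'c\n', 'd']
raw_lines = ['1\n', '2\n', '3\n', '4']

diction = dict(zip(strip_newlines(raw_words), strip_newlines(raw_lines)))
print(diction)
```

Because the helper accepts any iterable of strings, an open file object can be passed to it directly inside a with block.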

How to delete the first element of a row so that the whole row deleted from a list?

My list looks as follow:
items = []
a = "apple", 1.23
items.append(a)
b = "google", 2.33
items.append(b)
c = "ibm", 4.35
items.append(c)
Now I want to remove the row for "apple" just by giving the name "apple".
How can I do that?
You can convert items into a dictionary, delete the entry with key apple and return the dictionary items:
>>> items
[('apple', 1.23), ('google', 2.33), ('ibm', 4.35)]
>>> d = dict(items)
>>> del d['apple']
>>> items = d.items()
>>> items
[('ibm', 4.35), ('google', 2.33)]
In Python 3, you should wrap d.items() in list(), as .items() returns a dict_items view, which is iterable but not subscriptable:
>>> items = list(d.items())
I suggest that you use a proper data structure. In your case, a dict will do the trick.
items = {"apple": 1.23, "google": 2.33, "ibm": 4.35}
To delete, use:
items.pop("apple", None)
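A short sketch of what the second argument to pop does; the key "oracle" is a made-up example of a name that is not in the dictionary:

```python
items = {"apple": 1.23, "google": 2.33, "ibm": 4.35}

removed = items.pop("apple", None)    # key exists: returns its value
missing = items.pop("oracle", None)   # key absent: returns the default, no KeyError

print(removed)
print(missing)
print(items)
```

Without the default, items.pop("oracle") would raise a KeyError.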
Since I can only accept one answer and, to be honest, I am not 100% satisfied with either, I haven't accepted any. I hope that's OK with you all.
I do the following, a combination of both of your answers:
d = dict(items)
d.pop("apple", None)
myitem = d.items()
I think the best approach is to use a dictionary, as suggested by @Sricharan Madasi and @Moses Koledoye. However, given that the OP seems to prefer to arrange the data as a list of tuples, he may find this function useful:
def my_func(lst, key):
    return [(name, number) for (name, number) in lst if name != key]
The following interactive session demonstrates its usage:
>>> items = [('apple', 1.23), ('google', 2.33), ('ibm', 4.35)]
>>> my_func(items, 'apple')
[('google', 2.33), ('ibm', 4.35)]
>>> my_func(items, 'ibm')
[('apple', 1.23), ('google', 2.33)]
>>> my_func(items, 'foo')
[('apple', 1.23), ('google', 2.33), ('ibm', 4.35)]
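If the existing list object has to be modified rather than replaced (for example because other code holds a reference to it), the same filter can be combined with slice assignment. This is only a sketch; the name alias is introduced here purely for illustration.

```python
items = [('apple', 1.23), ('google', 2.33), ('ibm', 4.35)]
alias = items  # a second reference to the same list object

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the change.
items[:] = [(name, number) for (name, number) in items if name != 'apple']

print(alias)
```

Unlike the round trip through dict, this keeps the original order of the rows and does not merge rows that share a name.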

How do I extract part of a tuple that's duplicate as key to a dictionary, and have the second part of the tuple as value?

I'm pretty new to Python and QGIS. Right now I'm just running scripts, but my end goal is to create a plugin.
Here's the part of the code I'm having problems with:
import math
layer = qgis.utils.iface.activeLayer()
iter = layer.getFeatures()
dict = {}
#iterate over features
for feature in iter:
    #print feature.id()
    geom = feature.geometry()
    coord = geom.asPolyline()
    points=geom.asPolyline()
    #get Endpoints
    first = points[0]
    last = points[-1]
    #Assemble Features
    dict[feature.id() ]= [first, last]
print dict
This is my result :
{0L: [(355277,6.68901e+06), (355385,6.68906e+06)], 1L: [(355238,6.68909e+06), (355340,6.68915e+06)], 2L: [(355340,6.68915e+06), (355452,6.68921e+06)], 3L: [(355340,6.68915e+06), (355364,6.6891e+06)], 4L: [(355364,6.6891e+06), (355385,6.68906e+06)], 5L: [(355261,6.68905e+06), (355364,6.6891e+06)], 6L: [(355364,6.6891e+06), (355481,6.68916e+06)], 7L: [(355385,6.68906e+06), (355501,6.68912e+06)]}
As you can see, many of the lines have a common endpoint: (355385,6.68906e+06) is shared by 7L, 4L and 0L, for example.
I would like to create a new dictionary, using the shared points as keys and the other endpoints as values,
e.g.: {(355385,6.68906e+06):[(355277,6.68901e+06), (355364,6.6891e+06), (355501,6.68912e+06)]}
I have been looking through list comprehension tutorials, but without much success: most people are looking to delete the duplicates, whereas I would like to use them as keys (with unique IDs). Am I correct in thinking set() would still be useful?
I would be very grateful for any help, thanks in advance.
Maybe this is what you need?
dictionary = {}
for i in dict:
    for j in dict:
        c = set(dict[i]).intersection(set(dict[j]))
        if len(c) == 1:
            # ok, so now we know, that exactly one tuple exists in both
            # sets at the same time, but this one will be the key to new dictionary
            # we need the second tuple from the set to become value for this new key
            # so we can subtract the key-tuple from set to get the other tuple
            d = set(dict[i]).difference(c)
            # Now we need to get tuple back from the set
            # by doing list(c) we get list
            # and our tuple is the first element in the list, thus list(c)[0]
            c = list(c)[0]
            dictionary[c] = list(d)[0]
        else: pass
This code attaches only one tuple to the key in dictionary. If you want multiple values for each key, you can modify it so that each key would have a list of values, this can be done by simply modifying:
# some_value cannot be a set, it can be obtained with c = list(c)[0]
key = some_value
dictionary.setdefault(key, [])
dictionary[key].append(value)
So, the correct answer would be (here a is the dictionary of endpoints that the question calls dict):
dictionary = {}
for i in a:
    for j in a:
        c = set(a[i]).intersection(set(a[j]))
        if len(c) == 1:
            d = set(a[i]).difference(c)
            c = list(c)[0]
            value = list(d)[0]
            if c in dictionary and value not in dictionary[c]:
                dictionary[c].append(value)
            elif c not in dictionary:
                dictionary.setdefault(c, [])
                dictionary[c].append(value)
        else: pass
See this code :
dict={0L: [(355277,6.68901e+06), (355385,6.68906e+06)], 1L: [(355238,6.68909e+06), (355340,6.68915e+06)], 2L: [(355340,6.68915e+06), (355452,6.68921e+06)], 3L: [(355340,6.68915e+06), (355364,6.6891e+06)], 4L: [(355364,6.6891e+06), (355385,6.68906e+06)], 5L: [(355261,6.68905e+06), (355364,6.6891e+06)], 6L: [(355364,6.6891e+06), (355481,6.68916e+06)], 7L: [(355385,6.68906e+06), (355501,6.68912e+06)]}
points = []
for item in dict:
    # collect both endpoints of every feature (not only features 0 and 1)
    points.extend(dict[item])
# unique points, in order of first appearance
b = []
for x in points:
    if x not in b:
        b.append(x)
print b # or set(b)
res={}
for elm in b :
    lst=[]
    for item in dict :
        # store the endpoint at the other end of each line touching elm
        if dict[item][0] == elm :
            lst.append(dict[item][1])
        elif dict[item][1] == elm :
            lst.append(dict[item][0])
    res[elm]=lst
print res
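A single pass with collections.defaultdict is another way to get the mapping the question asks for. The sketch below is not taken from either answer: the names endpoints, neighbours and shared are introduced here, the ids are written as plain ints so it also runs on Python 3, and it assumes that shared points have exactly equal coordinates.

```python
from collections import defaultdict

# Endpoints copied from the question's output (ids written as plain ints).
endpoints = {
    0: [(355277, 6.68901e+06), (355385, 6.68906e+06)],
    1: [(355238, 6.68909e+06), (355340, 6.68915e+06)],
    2: [(355340, 6.68915e+06), (355452, 6.68921e+06)],
    3: [(355340, 6.68915e+06), (355364, 6.6891e+06)],
    4: [(355364, 6.6891e+06), (355385, 6.68906e+06)],
    5: [(355261, 6.68905e+06), (355364, 6.6891e+06)],
    6: [(355364, 6.6891e+06), (355481, 6.68916e+06)],
    7: [(355385, 6.68906e+06), (355501, 6.68912e+06)],
}

# Map every point to the points at the other end of the lines it belongs to.
neighbours = defaultdict(list)
for first, last in endpoints.values():
    neighbours[first].append(last)
    neighbours[last].append(first)

# Keep only the points that are shared by more than one line.
shared = {point: others for point, others in neighbours.items()
          if len(others) > 1}

print(sorted(shared[(355385, 6.68906e+06)]))
```

This visits each line once instead of comparing every pair of lines, and a point shared by more than two lines needs no special handling.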

Get dictionary with lowest key value from a list of dictionaries

From a list of dictionaries I would like to get the dictionary with the lowest value for the 'cost' key and then remove the other key,value pairs from that dictionary
lst = [{'probability': '0.44076116', 'cost': '108.41'} , {'probability': '0.55923884', 'cost': '76.56'}]
You can supply a custom key function to the min() built-in function:
>>> min(lst, key=lambda item: float(item['cost']))
{'cost': '76.56', 'probability': '0.55923884'}
Or, if you just need a minimum cost value itself, you can find a minimum cost value from the list of cost values:
costs = [float(item["cost"]) for item in lst]
print(min(costs))
@alecxe's solution is neat and short, +1 for him. Here's my way to do it:
>>> dict_to_keep = dict()
>>> min_cost = float('inf')  # avoids shadowing the built-in min and an arbitrary upper bound
>>> for d in lst:
...     if float(d["cost"]) < min_cost:
...         min_cost = float(d["cost"])
...         dict_to_keep = d
...
>>> print (dict_to_keep)
{'cost': '76.56', 'probability': '0.55923884'}
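The question also asks to remove the other key/value pairs from the selected dictionary. One possible way, sketched here with the names cheapest and cost_only introduced for illustration, is to build a new dictionary that holds only the 'cost' entry:

```python
lst = [{'probability': '0.44076116', 'cost': '108.41'},
       {'probability': '0.55923884', 'cost': '76.56'}]

# Pick the dictionary with the lowest numeric cost.
cheapest = min(lst, key=lambda item: float(item['cost']))

# Build a new dict with only the 'cost' entry, leaving lst untouched.
cost_only = {'cost': cheapest['cost']}
print(cost_only)
```

Creating a new dictionary avoids mutating the entries of lst; if mutation is acceptable, the other keys could be deleted from cheapest instead.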

Subset a list of tuples by max value in Python

My question arises from this discussion. I apologize, but I was not able to add a comment under the other answer to ask my question because of my reputation level. I have this list of tuples:
my_list = [('Scaffold100019', 98310), ('Scaffold100019', 14807), ('Scaffold100425', 197577), ('Scaffold100636', 326), ('Scaffold10064', 85415), ('Scaffold10064', 94518)]
I would like to make a dictionary which stores only the max value for each key defined as the first element of the tuple:
my_dict = {'Scaffold100019': 98310, 'Scaffold100425': 197577, 'Scaffold100636': 326, 'Scaffold10064': 94518}
Starting from Marcus Müller's answer, I have:
d = {}
#build a dictionary of lists
for x,y in my_list: d.setdefault(x,[]).append(y)
my_dict = {}
#build a dictionary with the max value only
for item in d: my_dict[item] = max(d[item])
This way I reach my goal, but is there a sleeker way to complete this task?
I suggest this solution with only one loop, quite readable:
my_dict = {}
for x, y in my_list:
    # testing membership on the dict itself avoids building a key list in Python 2
    if x in my_dict:
        my_dict[x] = max(y, my_dict[x])
    else:
        my_dict[x] = y
You could use collections.defaultdict.
from collections import defaultdict
d = defaultdict(int)
for key, value in my_list:
    d[key] = max(d[key], value)
The above code works on your example data, but will only work in general if each key has a maximum value that is nonnegative. This is because defaultdict(int) returns zero when no value is set, so if all values for a given key are negative, the resulting max will incorrectly be zero.
If purely negative values are possible for a given key, you can make the following alteration:
d = defaultdict(lambda: -float('inf'))
With this alteration, negative infinity will be returned when a key isn't set, so negative values are no longer a concern.
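The difference between the two defaults shows up as soon as a key has only negative values. In this small sketch the pairs list is made-up data, and wrong and right are illustrative names:

```python
from collections import defaultdict

# Made-up data in which the key 'x' has only negative values.
pairs = [('x', -5), ('x', -2), ('y', 3), ('y', 7)]

wrong = defaultdict(int)                     # missing keys start at 0
right = defaultdict(lambda: -float('inf'))   # missing keys start at -infinity
for key, value in pairs:
    wrong[key] = max(wrong[key], value)
    right[key] = max(right[key], value)

print(dict(wrong))
print(dict(right))
```

With defaultdict(int) the maximum for 'x' comes out as 0, a value that never appears in the data, while the negative-infinity default yields -2.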
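Use the fact that, in Python 2, everything is greater than None, together with the dictionary's get method with None as the fallback return value. (This relies on Python 2 comparison rules; in Python 3, comparing an int with None raises a TypeError.)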
>>> d = {}
>>> for name, value in my_list:
...     if value > d.get(name, None):
...         d[name] = value
...
>>> d
{'Scaffold100425': 197577, 'Scaffold10064': 94518, 'Scaffold100019': 98310, 'Scaffold100636': 326}
This works for all values, including negative ones, and hashes the key at most twice per loop iteration.
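A variant that does not compare against None, and therefore also runs on Python 3, is to test for membership first. This is a sketch of the same idea rather than the answerer's code:

```python
my_list = [('Scaffold100019', 98310), ('Scaffold100019', 14807),
           ('Scaffold100425', 197577), ('Scaffold100636', 326),
           ('Scaffold10064', 85415), ('Scaffold10064', 94518)]

d = {}
for name, value in my_list:
    # Store the value if the key is new or the value beats the current maximum.
    if name not in d or value > d[name]:
        d[name] = value

print(d)
```

Like the defaultdict version with a negative-infinity default, this handles keys whose values are all negative.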