Convert text list into Dictionary?

I have a list as the given one:
l = ['1,a','2,b','3,c']
I want to convert this list into a Dictionary, like this:
l_dict = {1:'a',2:'b',3:'c'}
How can I solve it?

You can pass the dict constructor a generator expression that splits each string on ',':
dict(e.split(',') for e in l)
Output (note that the keys are strings, not integers):
{'1': 'a', '2': 'b', '3': 'c'}
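The question asks for integer keys, while splitting alone yields string keys. A minimal sketch of one way to convert them, assuming every item has the form 'integer,value' (the helper name to_int_keyed_dict is made up for illustration):

```python
def to_int_keyed_dict(items):
    """Build a dict from 'key,value' strings, converting each key to int."""
    result = {}
    for item in items:
        # Split only on the first comma so a value may itself contain commas.
        key, value = item.split(',', 1)
        result[int(key)] = value
    return result

l = ['1,a', '2,b', '3,c']
l_dict = to_int_keyed_dict(l)
print(l_dict)  # {1: 'a', 2: 'b', 3: 'c'}
```

int() raises ValueError if a key is not a valid integer, which is probably preferable to silently keeping a bad key.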

You need to split each string first and then store the pair in a dictionary. There are two options: a plain dict (plain below) if you just need the mapping, or an OrderedDict (od below) if you need to preserve insertion order. On Python 3.7 and later a plain dict preserves insertion order as well.
from collections import OrderedDict
l = ['1,a','2,b','3,c']
plain = {}
od = OrderedDict()
for text in l:
    # each item looks like 'key,value'
    key, value = text.split(",")
    plain[key] = value
    od[key] = value
print(plain)
print(od)

Related

Can't merge two lists into a dictionary

I can't merge two lists into a dictionary. I tried all the solutions from
Map two lists into a dictionary in Python
and I still get an empty dictionary:
from sklearn.feature_extraction import DictVectorizer
from itertools import izip
import itertools
text_file = open("/home/vesko_/evnt_classification/bag_of_words", "r")
text_fiel2 = open("/home/vesko_/evnt_classification/sdas", "r")
lines = text_file.read().split('\n')
words = text_fiel2.read().split('\n')
diction = dict(itertools.izip(words,lines))
new_dict = {k: v for k, v in zip(words, lines)}
print new_dict
I get the following :
{'word': ''}
['word=']
The two lists are not empty.
I'm using python2.7
EDIT :
Output from the two lists (I'm only showing a few because it's a vector with 11k features)
//lines
['change', 'I/O', 'fcnet2', 'ifconfig',....
//words
['word', 'word', 'word', .....
EDIT :
Now at least I have some output, @DamianLattenero:
{'word\n': 'XXAMSDB35:XXAMSDB35_NGCEAC_DAT_L_Drivei\n'}
['word\n=XXAMSDB35:XXAMSDB35_NGCEAC_DAT_L_Drivei\n']
I think the root of a lot of confusion is code in the example that is not relevant.
Try this:
text_file = open("/home/vesko_/evnt_classification/bag_of_words", "r")
text_fiel2 = open("/home/vesko_/evnt_classification/sdas", "r")
lines = text_file.read().split('\n')
words = text_fiel2.read().split('\n')
# to remove any extra newline or whitespace from what was read in
# (map() returns a new sequence and leaves the original untouched, so reassign)
lines = [line.rstrip() for line in lines]
words = [word.rstrip() for word in words]
new_dict = dict(zip(words,lines))
print new_dict
Python's builtin zip() pairs up the items of its arguments into tuples (a list of tuples in Python 2, an iterator in Python 3). Passing these tuples to the dict() constructor creates a dictionary where each item in words is a key and the item at the same position in lines is the corresponding value.
Also note that zip() stops at the shorter of its inputs, so if one file has more items than the other, the extra items are silently dropped. Dictionary keys are unique, so if the same word appears more than once, only the last value paired with it is kept.
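A small sketch of how zip() and dict() behave with inputs shaped like the ones in the question's EDIT (the sample values come from that EDIT; the names pairs, collapsed and swapped are only illustrative). zip() truncates to the shorter input and repeated keys overwrite one another, which would explain a result with a single entry:

```python
words = ['word', 'word', 'word']
lines = ['change', 'I/O', 'fcnet2', 'ifconfig']

# zip() stops at the shorter input, so 'ifconfig' is never paired.
pairs = list(zip(words, lines))

# All three pairs share the key 'word', so each one overwrites the previous.
collapsed = dict(pairs)

# Using the distinct values as keys instead keeps every pair.
swapped = dict(zip(lines, words))

print(pairs)
print(collapsed)
print(swapped)
```

If the repeated strings are meant to be values rather than keys, swapping the arguments to zip() as in the last step may be all that is needed.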
I tried this and it worked for me. I created two files, put the numbers 1 to 4 in one and the letters a to d in the other, and the code creates the dictionary fine. I didn't need to import itertools, and there is actually an extra line that is not needed:
lines = [1,2,3,4]
words = ["a","b","c","d"]
diction = dict(zip(words,lines))
# new_dict = {k: v for k, v in zip(words, lines)}
print(diction)
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
If that works but your original code does not, the problem is probably in how the lists are loaded. Try loading them like this:
def create_list_from_file(file):
    with open(file, "r") as ins:
        my_list = []
        # each line keeps its trailing newline
        for line in ins:
            my_list.append(line)
    return my_list
lines = create_list_from_file("/home/vesko_/evnt_classification/bag_of_words")
words = create_list_from_file("/home/vesko_/evnt_classification/sdas")
diction = dict(zip(words,lines))
# new_dict = {k: v for k, v in zip(words, lines)}
print(diction)
Observation:
If your files look like this:
1
2
3
4
and
a
b
c
d
the result will have four keys in the dictionary, one per line:
{'a\n': '1\n', 'b\n': '2\n', 'c\n': '3\n', 'd': '4'}
But if your files look like:
1 2 3 4
and
a b c d
the result will be {'a b c d': '1 2 3 4'}, with only one key.
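To avoid the trailing '\n' in keys and values, the lines can be stripped while reading. A minimal sketch, assuming one item per line; io.StringIO stands in for the two real files so the example runs anywhere, and read_stripped_lines is a hypothetical helper name:

```python
import io

def read_stripped_lines(handle):
    """Return the non-blank lines of an open text file without trailing newlines."""
    return [line.rstrip('\n') for line in handle if line.strip()]

# In-memory stand-ins for the two files described above.
numbers_file = io.StringIO("1\n2\n3\n4")
letters_file = io.StringIO("a\nb\nc\nd")

lines = read_stripped_lines(numbers_file)
words = read_stripped_lines(letters_file)
diction = dict(zip(words, lines))
print(diction)  # {'a': '1', 'b': '2', 'c': '3', 'd': '4'}
```

With real files, the same helper can be called inside a with open(path) as handle: block.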

Splitting a list into new lists

So I have a list plaintext that contains ['A', 'A', 'R', 'O', 'N'] and I want to end up with a set of lists called letter1, letter2, letter3, and so on, that contain ['A'], ['A'], ['R'], and so on. How do I go about doing this without cloning the list five times and removing the extra parts?
You can iterate over the list:
In [1]: letters = ['A', 'A', 'R', 'O', 'N']
#use list comprehension to iterate over the list and place each element into a list
In [2]: [[l] for l in letters]
Out[2]: [['A'], ['A'], ['R'], ['O'], ['N']]
To give each list a name, we typically use a dictionary. For example:
#create a dictionary
letters_dict = {}
#iterate over original list as above except now saving to a dictionary
for i in range(len(letters)):
    letters_dict['letter'+str(i+1)] = [letters[i]]
This gives you the following:
In [4]: letters_dict
Out[4]:
{'letter1': ['A'],
'letter2': ['A'],
'letter3': ['R'],
'letter4': ['O'],
'letter5': ['N']}
You can now access each of the lists as follows:
In [5]: letters_dict['letter1']
Out[5]: ['A']
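The same dictionary can also be built in a single expression. A short sketch using enumerate() with start=1 so the names begin at letter1 (the key format mirrors the loop above):

```python
letters = ['A', 'A', 'R', 'O', 'N']

# enumerate(..., start=1) yields (1, 'A'), (2, 'A'), (3, 'R'), ...
letters_dict = {'letter{}'.format(i): [ch] for i, ch in enumerate(letters, start=1)}

print(letters_dict['letter3'])  # ['R']
```

This avoids indexing with range(len(...)) and keeps the name and the value together in one place.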
Finally, just for completeness, there's a cool extension of the dictionary method. Namely, using code from this thread, you can do the following:
#create a class
class atdict(dict):
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
#create an instance of the class using our dictionary:
l = atdict(letters_dict)
This way, you can do the following:
In [11]: l.letter1
Out[11]: ['A']
In [12]: l.letter5
Out[12]: ['N']
If you do not want to store the values in an iterable or referenceable object (i.e. a dictionary, list, or class), as you suggest in your question, then you could literally do the following:
letter1 = letters[0]
letter2 = letters[1]
letter3 = letters[2]
#and so forth ...
but as you can see, even with a handful of variables this becomes tedious.

How do I extract part of a tuple that's duplicate as key to a dictionary, and have the second part of the tuple as value?

I'm pretty new to Python and QGIS. Right now I'm just running scripts, but my end goal is to create a plugin.
Here's the part of the code I'm having problems with:
import math
layer = qgis.utils.iface.activeLayer()
iter = layer.getFeatures()
dict = {}
#iterate over features
for feature in iter:
    #print feature.id()
    geom = feature.geometry()
    coord = geom.asPolyline()
    points=geom.asPolyline()
    #get Endpoints
    first = points[0]
    last = points[-1]
    #Assemble Features
    dict[feature.id() ]= [first, last]
print dict
This is my result :
{0L: [(355277,6.68901e+06), (355385,6.68906e+06)], 1L: [(355238,6.68909e+06), (355340,6.68915e+06)], 2L: [(355340,6.68915e+06), (355452,6.68921e+06)], 3L: [(355340,6.68915e+06), (355364,6.6891e+06)], 4L: [(355364,6.6891e+06), (355385,6.68906e+06)], 5L: [(355261,6.68905e+06), (355364,6.6891e+06)], 6L: [(355364,6.6891e+06), (355481,6.68916e+06)], 7L: [(355385,6.68906e+06), (355501,6.68912e+06)]}
As you can see, many of the lines have a common endpoint:(355385,6.68906e+06) is shared by 7L, 4L and 0L for example.
I would like to create a new dictionary, fetching the shared points as a key, and having the second points as value.
eg : {(355385,6.68906e+06):[(355277,6.68901e+06), (355364,6.6891e+06), (355501,6.68912e+06)]}
I have been looking through list comprehension tutorials, but without much success: most people are looking to delete the duplicates, whereas I would like to use them as keys (with unique IDs). Am I correct in thinking set() would still be useful?
I would be very grateful for any help, thanks in advance.
Maybe this is what you need?
dictionary = {}
for i in dict:
    for j in dict:
        c = set(dict[i]).intersection(set(dict[j]))
        if len(c) == 1:
            # ok, so now we know, that exactly one tuple exists in both
            # sets at the same time, but this one will be the key to new dictionary
            # we need the second tuple from the set to become value for this new key
            # so we can subtract the key-tuple from set to get the other tuple
            d = set(dict[i]).difference(c)
            # Now we need to get tuple back from the set
            # by doing list(c) we get list
            # and our tuple is the first element in the list, thus list(c)[0]
            c = list(c)[0]
            dictionary[c] = list(d)[0]
        else: pass
This code attaches only one tuple to each key in dictionary. If you want multiple values for each key, store a list of values under each key instead:
# some_value cannot be a set, it can be obtained with c = list(c)[0]
key = some_value
dictionary.setdefault(key, [])
dictionary[key].append(value)
So, the correct answer would be (here a is the input dictionary, called dict in the question):
dictionary = {}
for i in a:
    for j in a:
        c = set(a[i]).intersection(set(a[j]))
        if len(c) == 1:
            d = set(a[i]).difference(c)
            c = list(c)[0]
            value = list(d)[0]
            if c in dictionary and value not in dictionary[c]:
                dictionary[c].append(value)
            elif c not in dictionary:
                dictionary.setdefault(c, [])
                dictionary[c].append(value)
        else: pass
See this code:
dict={0L: [(355277,6.68901e+06), (355385,6.68906e+06)], 1L: [(355238,6.68909e+06), (355340,6.68915e+06)], 2L: [(355340,6.68915e+06), (355452,6.68921e+06)], 3L: [(355340,6.68915e+06), (355364,6.6891e+06)], 4L: [(355364,6.6891e+06), (355385,6.68906e+06)], 5L: [(355261,6.68905e+06), (355364,6.6891e+06)], 6L: [(355364,6.6891e+06), (355481,6.68916e+06)], 7L: [(355385,6.68906e+06), (355501,6.68912e+06)]}
# collect the [first, last] pair of every feature
pairs = []
for item in dict:
    pairs.append(dict[item])
# collect every distinct endpoint, preserving first-seen order
b = []
for c in pairs:
    for x in c:
        if x not in b:
            b.append(x)
print b # or set(b)
res={}
for elm in b:
    # gather the opposite endpoint of every line that touches elm
    lst=[]
    for item in dict:
        if dict[item][0] == elm:
            lst.append(dict[item][1])
        elif dict[item][1] == elm:
            lst.append(dict[item][0])
    res[elm]=lst
print res
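The grouping can also be done in a single pass with collections.defaultdict. This is a sketch in Python 3 syntax rather than the Python 2 used above, and it assumes that shared endpoints compare exactly equal as tuples; the names segments and neighbours_by_endpoint are invented for the example, and only three of the question's features are included to keep it short:

```python
from collections import defaultdict

# Three of the features from the question: id -> [first point, last point]
segments = {
    0: [(355277, 6.68901e+06), (355385, 6.68906e+06)],
    4: [(355364, 6.6891e+06), (355385, 6.68906e+06)],
    7: [(355385, 6.68906e+06), (355501, 6.68912e+06)],
}

def neighbours_by_endpoint(segments):
    """Map each shared endpoint to the opposite endpoints of the lines touching it."""
    grouped = defaultdict(list)
    for first, last in segments.values():
        # Register each end of the line under the other end.
        grouped[first].append(last)
        grouped[last].append(first)
    # Keep only the points that belong to more than one line.
    return {point: others for point, others in grouped.items() if len(others) > 1}

shared = neighbours_by_endpoint(segments)
print(shared)
```

For these three features the only shared point is (355385, 6.68906e+06), which matches the example the question asks for. If coordinates can differ by rounding error, they would need to be rounded or snapped before being used as keys.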

PYTHON 2.7 - Modifying List of Lists and Re-Assembling Without Mutating

I currently have a list of lists that looks like this:
My_List = [[This, Is, A, Sample, Text, Sentence] [This, too, is, a, sample, text] [finally, so, is, this, one]]
Now what I need to do is "tag" each of these words with one of 3, in this case arbitrary, tags such as "EE", "FF", or "GG" based on which list the word is in and then reassemble them into the same order they came in. My final code would need to look like:
GG_List = [This, Sentence]
FF_List = [Is, A, Text]
EE_List = [Sample]
My_List = [[(This, GG), (Is, FF), (A, FF), (Sample, EE), (Text, FF), (Sentence, GG)] [*same with this sentence*] [*and this one*]]
I tried this by using for loops to turn each item into a dict but the dicts then got rearranged by their tags which sadly can't happen because of the nature of this thing... the experiment needs everything to stay in the same order because eventually I need to measure the proximity of tags relative to others but only in the same sentence (list).
I thought about doing this with NLTK (which I have little experience with), but it looks much more sophisticated than what I need, and the tags aren't easily customized by a novice like myself.
I think this could be done by iterating through each of these items, using an if statement as I have to determine what tag they should have, and then making a tuple out of the word and its associated tag so it doesn't shift around within its list.
I've devised this.. but I can't figure out how to rebuild my list-of-lists and keep them in order :(.
for i in My_List: #For each list in the list of lists
    for h in i: #For each item in each list
        if h in GG_List: # Check for the tag
            MyDicts = {"GG":h for h in i} #Make Dict from tag + word
Thank you so much for your help!
Putting the tags in a dictionary would work:
My_List = [['This', 'Is', 'A', 'Sample', 'Text', 'Sentence'],
['This', 'too', 'is', 'a', 'sample', 'text'],
['finally', 'so', 'is', 'this', 'one']]
GG_List = ['This', 'Sentence']
FF_List = ['Is', 'A', 'Text']
EE_List = ['Sample']
zipped = zip((GG_List, FF_List, EE_List), ('GG', 'FF', 'EE'))
tags = {item: tag for tag_list, tag in zipped for item in tag_list}
res = [[(word, tags[word]) for word in entry if word in tags] for entry in My_List]
Now:
>>> res
[[('This', 'GG'),
('Is', 'FF'),
('A', 'FF'),
('Sample', 'EE'),
('Text', 'FF'),
('Sentence', 'GG')],
[('This', 'GG')],
[]]
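Note that the result above drops every word that has no tag, and that the lookup is case-sensitive ('is' is not 'Is'). If every word has to stay in its position, one option is dict.get with a placeholder tag. A sketch, where the 'NONE' placeholder and the function name tag_sentences are assumptions made for illustration:

```python
def tag_sentences(sentences, tags, default='NONE'):
    """Pair every word with its tag, keeping the word order within each sentence."""
    return [[(word, tags.get(word, default)) for word in sentence]
            for sentence in sentences]

tags = {'This': 'GG', 'Sentence': 'GG',
        'Is': 'FF', 'A': 'FF', 'Text': 'FF',
        'Sample': 'EE'}
My_List = [['This', 'Is', 'A', 'Sample', 'Text', 'Sentence'],
           ['This', 'too', 'is', 'a', 'sample', 'text']]

res = tag_sentences(My_List, tags)
print(res[1])
```

Because each sentence remains a list of (word, tag) tuples of the original length, distances between tags within a sentence can still be measured by position.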
A dictionary works by key-value pairs. Each key is assigned a value. To look something up, you index the dictionary by its key, e.g.
>>> d = {1:'a', 2:'b', 3:'c'}
>>> d[1]
'a'
In the above case, we always search the dictionary by its keys, i.e. the integers.
In the case that you want to assign the tag/label to each word, you are searching by the key word and finding the "value", i.e. the tag/label, so your dictionary would have to look something like this (assuming that the strings are words and numbers as tag/label):
>>> d = {'a':1, 'b':1, 'c':3}
>>> d['a']
1
>>> sent = 'a b c a b'.split()
>>> sent
['a', 'b', 'c', 'a', 'b']
>>> [d[word] for word in sent]
[1, 1, 3, 1, 1]
This way the order of the tags follows the order of the words when you use a list comprehension to iterate through the words and find the appropriate tags.
So the problem comes when the initial dictionary is indexed the wrong way round, i.e. key -> labels, value -> words, e.g.:
>>> d = {1:['a', 'd'], 2:['b', 'h'], 3:['c', 'x']}
>>> [d[word] for word in sent]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'a'
Then you would have to invert your dictionary. Assuming that all elements in your value lists are unique, you can do this:
>>> from collections import ChainMap
>>> d = {1:['a', 'd'], 2:['b', 'h'], 3:['c', 'x']}
>>> d_inv = dict(ChainMap(*[{value:key for value in values} for key, values in d.items()]))
>>> d_inv
{'h': 2, 'c': 3, 'a': 1, 'x': 3, 'b': 2, 'd': 1}
But the caveat is that ChainMap is only available in Python 3.3 and later (yet another reason to upgrade your Python ;P). For solutions on older versions, see How do I merge a list of dicts into a single dict?.
So going back to the problem of assigning labels/tags to words, let's say we have these input:
>>> d = {1:['a', 'd'], 2:['b', 'h'], 3:['c', 'x']}
>>> sent = 'a b c a b'.split()
First, we invert the dictionary (assuming that each word maps to exactly one tag/label):
>>> d_inv = dict(ChainMap(*[{value:key for value in values} for key, values in d.items()]))
Then, we apply the tags to the words through a list comprehension:
>>> [d_inv[word] for word in sent]
[1, 2, 3, 1, 2]
And for multiple sentences:
>>> sentences = ['a b c'.split(), 'h a x'.split()]
>>> [[d_inv[word] for word in sent] for sent in sentences]
[[1, 2, 3], [2, 1, 3]]
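The inversion does not strictly need ChainMap; a nested dict comprehension gives the same mapping and also works on Python 2.7. A sketch under the same assumption that every word appears under exactly one tag (the name tagged is illustrative):

```python
d = {1: ['a', 'd'], 2: ['b', 'h'], 3: ['c', 'x']}

# For every (tag, words) pair, emit one word -> tag entry per word.
d_inv = {word: tag for tag, words in d.items() for word in words}

sentences = ['a b c'.split(), 'h a x'.split()]
tagged = [[d_inv[word] for word in sent] for sent in sentences]
print(tagged)  # [[1, 2, 3], [2, 1, 3]]
```

If a word were listed under several tags, the entry visited last would win, so the uniqueness assumption matters here as well.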

How to split and extract each tagname from the list of tags which is in the json data?

"tags" : "['x', 'y', 'z']"
I want to extract each element and add each element to a tag table like
tag1 = x
tag2 = y
tag3 = z
I need to store each tag in the tags table in a separate row for an event.
table: Event
id, title, ...
table: Tag
Tagid, eventid, tagname
Tags can vary for each event.
Or without eval:
t = {"tags" : "['x', 'y', 'z']"}
tags = [el.replace("'","").strip() for el in t['tags'][1:-1].split(',')]
# Basic string splitting:
tags = t['tags'].split(',')
# To replace a character in a string:
"a123".replace("a", "b")  # => "b123"
# To strip whitespace:
" Wow ".strip()  # => "Wow"
# Then, a list comprehension to loop through the elements of a list and put them in a new list:
x = [1, 2, 3]
y = [i+1 for i in x]  # => [2, 3, 4]
# All together, this becomes
tags = [el.replace("'","").strip() for el in t['tags'][1:-1].split(',')]
Some say eval is evil because it is open to code injection and therefore unpredictable. As long as you trust the input, it should be okay. Using ast.literal_eval is much better than eval, as it only evaluates Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, and None), so it cannot execute arbitrary code.
>>> t = {"tags" : "['x', 'y', 'z']"}
>>> import ast
>>> ast.literal_eval(t['tags'])
['x', 'y', 'z']
And now it's a list.
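Once the string has been parsed, each tag can be paired with the event id to give one row per tag. A database-free sketch: the (event_id, tagname) row shape follows the Tag table in the question, while the function name tag_rows and the sample event dict are assumptions:

```python
import ast

def tag_rows(event_id, raw_tags):
    """Parse a string such as "['x', 'y', 'z']" into (event_id, tagname) rows."""
    tags = ast.literal_eval(raw_tags)
    # literal_eval accepts any literal, so check that we really got a list.
    if not isinstance(tags, list):
        raise ValueError("expected a list of tag names")
    return [(event_id, str(name)) for name in tags]

event = {"id": 1, "tags": "['x', 'y', 'z']"}
rows = tag_rows(event["id"], event["tags"])
print(rows)  # [(1, 'x'), (1, 'y'), (1, 'z')]
```

Each tuple can then be inserted as one row, however many tags an event has.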
From the answer given by Ignacio Vazquez-Abrams, I was able to convert it into a list as below:
import ast
tags = ast.literal_eval(tags) # converted to a list
# Store each tag with the event id in the tags table.
eventobj = Event.objects.get(pk=1)
for tag in tags:
    # objects.create() already saves the new row, so no separate save() is needed
    tagsobj = Tags.objects.create(name=tag, event_id=eventobj.pk)