I want to create a multi-dimensional (nested) dictionary in Python.
Currently I am doing it like this:
multidict = {}
# inside a loop
multidict[i] = data
If the loop runs ten times, I get the same value at every index.
For example, I want this:
multidict = {0: {'name': name1, 'age': age1}, 1: {'name': name2, 'age': age2}}
but I am getting this:
multidict = {0: {'name': name1, 'age': age1}, 1: {'name': name1, 'age': age1}}
I also tried defaultdict, but every time I get the same value at every index. What is the problem?
Code I tried:
csv_parsed_data2 = {}
with open('1112.txt') as infile:
    i = 0
    for lineraw in infile:
        line = lineraw.strip()
        if 'sample1 ' in line:
            ### TO GET SOURCE ROUTER NAME ###
            data['sample1'] = line[8:]
        elif 'sample2- ' in line:
            ### TO GET DESTINATION ROUTER NAME ###
            data['sample2'] = line[13:]
        elif 'sample3' in line:
            ### TO GET MIN,MAX,MEAN AND STD VALUES ###
            min_value = line.replace("ms"," ")
            min_data = min_value.split(" ")
            data['sample3'] = min_data[1]
            csv_parsed_data2[i] = data
            i = i + 1
            print i,'::',csv_parsed_data2,'--------------'
print csv_parsed_data2,' all index has same value'
Is there an efficient way to do this?
It sounds like you are assigning the same data dict to every value of your outer multidict and just modifying its contents on each pass through the loop. As a result, all the values appear the same and hold the data from the last pass through the loop.
You need to create a separate dictionary object to hold the data for each value. A crude fix is to replace multidict[i] = data with multidict[i] = dict(data), which stores a shallow copy, but if you know how data is created, you can probably do something more elegant.
Edit: Seeing your code, here's a way to fix the issue:
csv_parsed_data2 = {}
with open('1112.txt') as infile:
    i = 0
    data = {} # start with empty data dict
    for lineraw in infile:
        line = lineraw.strip()
        if 'sample1 ' in line:
            ### TO GET SOURCE ROUTER NAME ###
            data['sample1'] = line[8:]
        elif 'sample2- ' in line:
            ### TO GET DESTINATION ROUTER NAME ###
            data['sample2'] = line[13:]
        elif 'sample3' in line:
            ### TO GET MIN,MAX,MEAN AND STD VALUES ###
            min_value = line.replace("ms"," ")
            min_data = min_value.split(" ")
            data['sample3'] = min_data[1]
            csv_parsed_data2[i] = data
            data = {} # after saving a reference to the dict, reinitialize it
            i = i + 1
            print i,'::',csv_parsed_data2,'--------------'
print csv_parsed_data2,' all index has same value'
To understand what was going on, consider this simpler situation, where I change a value in a dictionary after saving a reference to it while it still held an older value:
my_dict = { "foo": "bar" }
some_ref = my_dict
print some_ref["foo"] # prints "bar"
my_dict["foo"] = "baz"
print some_ref["foo"] # prints "baz", since my_dict and some_ref refer to the same object
print some_ref is my_dict # prints "True", confirming that fact
In your code, my_dict was data and some_ref was each of the values of csv_parsed_data2. They all ended up being references to the same object, which held whatever values were last assigned to data.
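As a minimal sketch of the difference (written for Python 3, with a made-up records list standing in for the parsed file lines), the first loop below reuses one dict object and the second creates a new one per record:

```python
# Hypothetical records standing in for the values parsed from the file.
records = [("r1", "r2", "10"), ("r3", "r4", "20")]

# Buggy pattern: one dict object is reused, so every stored value aliases it.
shared = {}
buggy = {}
for i, (src, dst, rtt) in enumerate(records):
    shared["sample1"] = src
    shared["sample2"] = dst
    shared["sample3"] = rtt
    buggy[i] = shared

# Fixed pattern: a new dict is created for every record.
fixed = {}
for i, (src, dst, rtt) in enumerate(records):
    fixed[i] = {"sample1": src, "sample2": dst, "sample3": rtt}

print(buggy[0] is buggy[1])  # same object stored twice
print(fixed[0] is fixed[1])  # two independent objects
```

In the buggy version both entries show the data of the last record, because there is only one dictionary behind them.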
Try this:
multidict = {}
for j in range(10):
    s = {}
    s['name'] = raw_input()
    s['age'] = input()
    multidict[j] = s
This will have the desired result.
I have a list of dictionaries with bacterial names as keys and, as values, a set of numbers identifying a DNA sequence. Unfortunately, some dictionaries are missing a value, and the script fails to produce the CSV. Can anyone give me an idea of how to get around it? This is my script:
import glob, subprocess, sys, os, csv
from Bio import SeqIO, SearchIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
def allele():
    folders=sorted(glob.glob('path_to_files'))
    dict_list=[]
    for folder in folders:
        fasta_file=glob.glob(folder +'/file.fa')[0]
        input_handle=open(fasta_file ,'r')
        records=list(SeqIO.parse(input_handle, 'fasta'))
        namelist=[]
        record_dict={}
        sampleID = os.path.basename(folder)
        record_dict['sampleid']=sampleID
        for record in records:
            name=record.description.split('\t')
            gene=record.id.split('_')
            geneID=gene[0] + '_' +gene[1]
            allele=gene[2]
            record_dict[geneID]=allele
        dict_list.append(record_dict)
    header = dict_list[0].keys()
    with open('path_to_files/mycsv.csv', 'w') as csv_output:
        writer=csv.DictWriter(csv_output,header,delimiter='\t')
        writer.writeheader()
        for samp in dict_list:
            writer.writerow(samp)
print 'start'
allele()
Also, can I get any suggestions on how to identify the dictionaries whose value sequences are the same?
Thanks
Concerning your first question, I'd take the shorter dicts and fill each missing entry with something your DictWriter can handle (I have never used it); I guess NaN may work.
A simple version would look like this:
testDict1 = { "sampleid" : 0,
"bac_s1" : [1,2,4],
"bac_s2" : [1,2,12],
"bac_s3" : [1,3,12],
"bac_s4" : [1,6,12],
"bac_s5" : [1,9,14]
}
testDict2 = { "sampleid" : 1,
"bac_s1" : [1,2,4],
"bac_s2" : [1,3,12],
"bac_s3" : [1,3,12],
"bac_s5" : [2,9,14],
}
testDict3 = { "sampleid" : 2,
"bac_s1" : [3,2,4],
"bac_s2" : [4,2,12],
"bac_s3" : [5,3,12],
"bac_s4" : [1,6,12],
"bac_s5" : [1,9,14]
}
dictList = [ testDict1, testDict2, testDict3 ]
### modified from https://stackoverflow.com/a/16974075/803359
### note, does not tell you elements that are present in shortdict but missing in long dict
### i.e. for your purpose you have to assume that keys in short dict are really present in longdict
def missing_elements( longdict, shortdict ):
    list1 = longdict.keys()
    list2 = shortdict.keys()
    assert len( list1 ) >= len( list2 )
    return sorted( set( list1 ).difference( list2 ) )
### make the first dict the longest
dictList = sorted( dictList, key=len, reverse=True )
for myDict in dictList[1:]:
    ### compare all others to the first
    missing = missing_elements( dictList[0], myDict )
    print missing
    ### then you fill with something empty or NaN that works with your save-function
    for m in missing:
        myDict[m] = [ float( 'nan' ) ]
print " "
for myDict in dictList:
    print myDict
print " "
which you can incorporate in your code easily.
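An alternative that may be simpler, sketched here for Python 3 under the assumption that the rows are plain dicts like the ones above: csv.DictWriter accepts a restval argument that is written for any key a row lacks, so it can be enough to build the header from the union of all keys. The sample rows, the "NA" placeholder and the in-memory buffer are illustrative choices, not part of the original script.

```python
import csv
import io

# Hypothetical sample rows; the second one lacks "bac_s4".
rows = [
    {"sampleid": "s0", "bac_s1": "1", "bac_s4": "6"},
    {"sampleid": "s1", "bac_s1": "2"},
]

# Header = union of all keys, with "sampleid" kept as the first column.
fieldnames = ["sampleid"] + sorted(
    {key for row in rows for key in row} - {"sampleid"}
)

buffer = io.StringIO()  # stands in for the real output file
# restval is the value written for keys missing from a row.
writer = csv.DictWriter(buffer, fieldnames, delimiter="\t", restval="NA")
writer.writeheader()
for row in rows:
    writer.writerow(row)

output = buffer.getvalue()
print(output)
```

Building the header from every dict also avoids depending on the first dict happening to be the most complete one.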
My code seems to be working, but I am having trouble with the print statement, whose output I will eventually write to a CSV. I can get the print to work for the first two items, but when I add the len part as the third item to print, I get the error "'str' object is not callable". When I print the len part by itself, it seems to work fine. Any insight into what I am doing wrong when printing them all together?
import csv
from collections import OrderedDict, defaultdict
inFile = open('file.txt','r')
reader = csv.reader(inFile)
allrows = list(reader)
dd = defaultdict(OrderedDict)
ids = OrderedDict()
output = {}
iterallrows = iter(allrows)
next(iterallrows)
for row in iterallrows:
    id_ = row[2]
    name = row[3]
    dd[id_][name] = None
    ids[id_] = None
    print('{} {} {}'.format(id_,','.join(dd[id_],','(len(dd[id_])))))
You have this:
[...],','(...)[...]
This attempts to treat ',' as a function, which it is not. Put a comma between all arguments to a function.
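A corrected call might look like the sketch below (Python 3). The rows list is made up to stand in for the CSV contents, and the print is moved into a second loop so that each id appears once; both are assumptions for illustration.

```python
from collections import OrderedDict, defaultdict

# Hypothetical rows standing in for the CSV: id in column 2, name in column 3.
rows = [
    ["x", "y", "42", "alice"],
    ["x", "y", "42", "bob"],
    ["x", "y", "7", "carol"],
]

dd = defaultdict(OrderedDict)
for row in rows:
    id_, name = row[2], row[3]
    dd[id_][name] = None

lines = []
for id_, names in dd.items():
    # Three separate arguments: the id, the joined names, and the count.
    lines.append('{} {} {}'.format(id_, ','.join(names), len(names)))

print('\n'.join(lines))
```

The closing parenthesis of join() now comes right after its single argument, and len(...) is passed to format() as its own argument.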
I am working with Python 2.7 and trying to store a float value under a dictionary key. However, every value ends up as 0.0: the polarity is stored as 0.0 instead of the actual value.
Code snippet:
from textblob import TextBlob
import json
with open('new-webmd-answer.json') as data_file:
    data = json.load(data_file, strict=False)
data_new = {}
lst = []
for d in data:
    string = d["answerContent"]
    blob = TextBlob(string)
    #print blob
    #print blob.sentiment
    #print d["questionId"]
    data_new['questionId'] = d["questionId"]
    data_new['answerMemberId'] = d["answerMemberId"]
    string1 = str(blob.sentiment.polarity)
    print string1
    data_new['polarity'] = string1
    #print blob.sentiment.polarity
    lst.append((data_new))
json_data = json.dumps(lst)
#print json_data
with open('polarity.json', 'w') as outfile:
    json.dump(json_data, outfile)
The way your code is currently written, you overwrite the same dictionary on each iteration and then append that one dictionary to the list multiple times.
Let's say your dictionary was dict = {"a" : 1} and you append it to a list:
alist.append(dict)
alist
[{'a' : 1}]
Then you change the value in dict with dict["a"] = 0 and append it to the list again with alist.append(dict):
alist
[{'a' : 0}, {'a' : 0}]
This occurs because dictionaries are mutable and the list stores references to the same object. For a more complete overview of mutable vs. immutable objects, see the Python documentation.
To achieve your expected output, make a new dictionary on each iteration over data:
lst = []
for d in data:
    data_new = {} # makes a new dictionary with each iteration
    string = d["answerContent"]
    blob = TextBlob(string)
    # print blob
    # print blob.sentiment
    # print d["questionId"]
    data_new['questionId'] = d["questionId"]
    data_new['answerMemberId'] = d["answerMemberId"]
    string1 = str(blob.sentiment.polarity)
    print string1
    data_new['polarity'] = string1
    # print blob.sentiment.polarity
    lst.append((data_new))
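The same pattern can be sketched without TextBlob (Python 3). The input records and their "polarity" field are invented stand-ins for blob.sentiment.polarity. The sketch also serializes the list only once, since passing the result of json.dumps to json.dump, as the question's code does, encodes the data twice and writes a quoted string rather than a list.

```python
import json

# Hypothetical input; "polarity" stands in for blob.sentiment.polarity.
data = [
    {"questionId": 1, "answerMemberId": 10, "polarity": 0.25},
    {"questionId": 2, "answerMemberId": 20, "polarity": -0.5},
]

lst = []
for d in data:
    # A new dict per record, so earlier entries are not overwritten.
    data_new = {
        "questionId": d["questionId"],
        "answerMemberId": d["answerMemberId"],
        "polarity": d["polarity"],  # kept as a float rather than str(...)
    }
    lst.append(data_new)

# Serialize exactly once.
json_text = json.dumps(lst)
round_trip = json.loads(json_text)
print(round_trip)
```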
I'm pretty new to Python and QGIS. Right now I'm just running scripts, but my end goal is to create a plugin.
Here's the part of the code I'm having problems with:
import math
layer = qgis.utils.iface.activeLayer()
iter = layer.getFeatures()
dict = {}
#iterate over features
for feature in iter:
    #print feature.id()
    geom = feature.geometry()
    coord = geom.asPolyline()
    points=geom.asPolyline()
    #get Endpoints
    first = points[0]
    last = points[-1]
    #Assemble Features
    dict[feature.id() ]= [first, last]
print dict
This is my result :
{0L: [(355277,6.68901e+06), (355385,6.68906e+06)], 1L: [(355238,6.68909e+06), (355340,6.68915e+06)], 2L: [(355340,6.68915e+06), (355452,6.68921e+06)], 3L: [(355340,6.68915e+06), (355364,6.6891e+06)], 4L: [(355364,6.6891e+06), (355385,6.68906e+06)], 5L: [(355261,6.68905e+06), (355364,6.6891e+06)], 6L: [(355364,6.6891e+06), (355481,6.68916e+06)], 7L: [(355385,6.68906e+06), (355501,6.68912e+06)]}
As you can see, many of the lines have a common endpoint: (355385,6.68906e+06) is shared by 7L, 4L and 0L, for example.
I would like to create a new dictionary that uses each shared point as a key and the other endpoints of those lines as its value.
e.g.: {(355385,6.68906e+06):[(355277,6.68901e+06), (355364,6.6891e+06), (355501,6.68912e+06)]}
I have been looking through list comprehension tutorials, but without much success: most people want to delete the duplicates, whereas I would like to use them as keys (with unique IDs). Am I correct in thinking set() would still be useful?
I would be very grateful for any help. Thanks in advance.
Maybe this is what you need?
dictionary = {}
for i in dict:
    for j in dict:
        c = set(dict[i]).intersection(set(dict[j]))
        if len(c) == 1:
            # ok, so now we know, that exactly one tuple exists in both
            # sets at the same time, but this one will be the key to new dictionary
            # we need the second tuple from the set to become value for this new key
            # so we can subtract the key-tuple from set to get the other tuple
            d = set(dict[i]).difference(c)
            # Now we need to get tuple back from the set
            # by doing list(c) we get list
            # and our tuple is the first element in the list, thus list(c)[0]
            c = list(c)[0]
            dictionary[c] = list(d)[0]
        else: pass
This code attaches only one tuple to each key in dictionary. If you want multiple values per key, modify it so that each key holds a list of values:
# some_value cannot be a set, it can be obtained with c = list(c)[0]
key = some_value
dictionary.setdefault(key, [])
dictionary[key].append(value)
So, the correct answer would be:
dictionary = {}
for i in dict:
    for j in dict:
        c = set(dict[i]).intersection(set(dict[j]))
        if len(c) == 1:
            d = set(dict[i]).difference(c)
            c = list(c)[0]
            value = list(d)[0]
            if c in dictionary and value not in dictionary[c]:
                dictionary[c].append(value)
            elif c not in dictionary:
                dictionary.setdefault(c, [])
                dictionary[c].append(value)
        else: pass
See this code :
dict={0L: [(355277,6.68901e+06), (355385,6.68906e+06)], 1L: [(355238,6.68909e+06), (355340,6.68915e+06)], 2L: [(355340,6.68915e+06), (355452,6.68921e+06)], 3L: [(355340,6.68915e+06), (355364,6.6891e+06)], 4L: [(355364,6.6891e+06), (355385,6.68906e+06)], 5L: [(355261,6.68905e+06), (355364,6.6891e+06)], 6L: [(355364,6.6891e+06), (355481,6.68916e+06)], 7L: [(355385,6.68906e+06), (355501,6.68912e+06)]}
dictionary = {}
list=[]
for item in dict :
    # collect every feature's [first, last] pair, not only those of keys 0 and 1
    list.append(dict[item])
b = []
[b.append(x) for c in list for x in c if x not in b]
print b # or set(b)
res={}
for elm in b :
    lst=[]
    for item in dict :
        if dict[item][0] == elm :
            lst.append(dict[item][1])
        elif dict[item][1] == elm :
            lst.append(dict[item][0])
    res[elm]=lst
print res
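A single pass with collections.defaultdict may also do the job. The sketch below (Python 3) assumes the endpoints are hashable, for example plain tuples; the lines dict and its rounded integer coordinates are invented for illustration, and point objects returned by QGIS might first need converting to tuples.

```python
from collections import defaultdict

# Hypothetical line features: id -> [first endpoint, last endpoint].
lines = {
    0: [(355277, 6689010), (355385, 6689060)],
    4: [(355364, 6689100), (355385, 6689060)],
    7: [(355385, 6689060), (355501, 6689120)],
}

neighbours = defaultdict(list)
for first, last in lines.values():
    # Each endpoint maps to the point at the other end of the same line.
    neighbours[first].append(last)
    neighbours[last].append(first)

# Keep only the points that are shared by more than one line.
shared = {point: others for point, others in neighbours.items() if len(others) > 1}
print(shared)
```

This visits each line once instead of comparing every pair of lines, and no set() is needed because the grouping itself collects the duplicates.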
So this is my code. It is trying to measure car registration plates, start times and end times (in the complete code these would be printed at the bottom).
data = str(list)
sdata = str(list)
edata = str(list)
current = 0
repeats = input ('How many cars do you want to measure?')
def main():
    global current
    print (current)
    print ''
    print ''
    print '---------------------------------------'
    print '---------------------------------------'
    print 'Enter the registration number.'
    data[current] = raw_input(' ')
    print 'Enter the time it passed Camera 1. In this form HH:MM:SS'
    sdata[current] = raw_input(' ')
    print 'Enter the time it passed Camera 2. In this form HH:MM:SS'
    edata[current] = raw_input (' ')
    print '---------------------------------------'
    print''
    print''
    print''
    print 'The Registration Number is :'
    print data[current]
    print''
    print 'The Start Time Is:'
    print sdata[current]
    print''
    print 'The End Time Is:'
    print edata[current]
    print''
    print''
    raw_input('Press enter to confirm.')
    print'---------------------------------------'
    d = d + 1
    s = s + 1
    a = a + 1
    current = current = 1
while current < repeats:
    main()
When I run it and it gets to:
data[current] = raw_input(' ')
I get the error message 'TypeError: 'str' object does not support item assignment'
Thank you in advance for the help. :D
The error is clear: a str object does not support item assignment.
Strings in Python are immutable. You converted data to a string when you did
data = str(list)
So, with
data[current] = raw_input()
you are trying to assign a value to a position in a string, which is not supported in Python.
If you want data to be a list,
data = list()
or
data = []
will prevent this TypeError. Note that an empty list has no index current yet, so add each entry with data.append(raw_input()) rather than assigning to data[current].
Don't use str in the assignment:
data = str(list)
sdata = str(list)
edata = str(list)
Instead use
data = []
sdata = []
edata = []
and later, when printing, use str if you want:
print str(data[current])
As Aswin said, strings are immutable, so don't overcomplicate it.
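Putting both answers together, a minimal sketch (Python 3) could look like this. The record_cars helper and its entries argument are hypothetical stand-ins for the interactive raw_input() prompts, used so that the example runs without user input.

```python
def record_cars(entries):
    """Collect registration numbers and camera times into three lists.

    `entries` is a hypothetical iterable of (registration, start, end)
    tuples standing in for the interactive prompts.
    """
    data, sdata, edata = [], [], []
    for registration, start, end in entries:
        # append() grows the list; data[i] = ... on an empty list
        # would raise IndexError.
        data.append(registration)
        sdata.append(start)
        edata.append(end)
    return data, sdata, edata


data, sdata, edata = record_cars([
    ("AB12 CDE", "10:00:00", "10:00:45"),
    ("XY34 ZZZ", "10:01:10", "10:02:00"),
])
for current in range(len(data)):
    print(data[current], sdata[current], edata[current])
```

Looping over the entries directly also removes the need for the global current counter.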