How to store multiple values per key in a Python dictionary (list of lists)

I'm trying to use a dictionary called
theInventory = {}
to store the following items, so that I can change prices, view the books by author, and so on.
This is a homework problem, so I need to use a dictionary with multiple values per key, stored as a 2D list.
Suppose a text file called database.txt contains lines of the form
last name,first name,title,quantity,price
Shakespeare,William,Romeo And Juliet,5,5.99
Shakespeare,William,Macbeth,3,7.99
Dickens,Charles,Hard Times,7,27.00
Austin,Jane,Sense And Sensibility,2,4.95
Is it possible to do the following?
inFile = open("database.txt", "r")
for line in inFile:
    aLine = line.strip().split(",")  # not sure this is how to split on ","
    theInventory[aLine[0] + ", " + aLine[1]] = [[aLine[2], int(aLine[3]), float(aLine[4])]]
inFile.close()
The result should look like this:
>print (theInventory)
>> {"Shakespeare, William": [["Romeo And Juliet", 5, 5.99], ["Macbeth", 3, 7.99]], "Dickens, Charles": [["Hard Times", 7, 27.00]], "Austin, Jane": [["Sense And Sensibility", 2, 4.95]]}
so that I can modify the quantity and price of a given book,
or even add books to the dictionary.

The file looks like a CSV file. Python has a module named csv that lets you read and write CSV files.
Below is example code adapted from the Python documentation.
import csv
with open('database.txt', 'rb') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
    for row in reader:
        print row
output:
['Shakespeare', 'William', 'Romeo And Juliet', '5', '5.99']
['Shakespeare', 'William', 'Macbeth', '3', '7.99']
['Dickens', 'Charles', 'Hard Times', '7', '27.00']
['Austin', 'Jane', 'Sense And Sensibility', '2', '4.95']
>>>
More information about the csv module can be found here: https://docs.python.org/2/library/csv.html
Edit: this is very basic, but it might give you an idea:
import csv
dicto = {}
with open('database.txt', 'rb') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
    for row in reader:
        name = row[0] + ',' + row[1]  # build the "Last,First" key
        if name in dicto:  # author already present: append the book
            books = dicto.get(name)
            books.append(row[2:])
        else:  # first book for this author: start a new list
            books = []
            books.append(row[2:])
        dicto[name] = books  # store the (updated) list of books
print dicto
output:
{'Dickens,Charles': [['Hard Times', '7', '27.00']], 'Shakespeare,William': [['Romeo And Juliet', '5', '5.99'], ['Macbeth', '3', '7.99']], 'Austin,Jane': [['Sense And Sensibility', '2', '4.95']]}
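For the shape asked for in the question (numeric quantity and price, grouped per author), a minimal Python 3 sketch might look like the following. It assumes every line has exactly five fields. The names load_inventory and SAMPLE are made up for illustration; SAMPLE stands in for database.txt, and the "Oliver Twist" entry is invented sample data.

```python
import csv
import io

# Hypothetical in-memory stand-in for database.txt.
SAMPLE = """Shakespeare,William,Romeo And Juliet,5,5.99
Shakespeare,William,Macbeth,3,7.99
Dickens,Charles,Hard Times,7,27.00
Austin,Jane,Sense And Sensibility,2,4.95
"""

def load_inventory(lines):
    """Group [title, quantity, price] lists under a "Last, First" key."""
    inventory = {}
    for last, first, title, qty, price in csv.reader(lines):
        key = last + ", " + first
        # setdefault creates the empty list the first time an author is seen
        inventory.setdefault(key, []).append([title, int(qty), float(price)])
    return inventory

theInventory = load_inventory(io.StringIO(SAMPLE))

# Change the quantity and price of one book.
for book in theInventory["Shakespeare, William"]:
    if book[0] == "Macbeth":
        book[1] += 2      # quantity 3 -> 5
        book[2] = 6.99    # new price

# Add a book (invented entry, for illustration only).
theInventory.setdefault("Dickens, Charles", []).append(["Oliver Twist", 1, 9.50])
```

With a real file, something like load_inventory(open('database.txt', newline='')) should work the same way, since csv.reader accepts any iterable of lines.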

Related

Dictionary update overwrites duplicate keys

I have a table with 6982 records that I am reading through to build a dictionary. I create the dictionary like this:
fld_zone_dict = dict()
fields = ['uniqueid', 'FLD_ZONE', 'FLD_ZONE_1']
...
for row in cursor:
    uid = row[0]
    old_zone_value = row[1]
    new_zone_value = row[2]
    fld_zone_dict[uid] = [old_zone_value, new_zone_value]
However, I noticed that with this method, if a uid has the same value as a previous uid (in theory there could be duplicates), the earlier entry gets overwritten. So, if I had 2 entries I wanted to add: 'CA10376036': ['AE', 'X'] and 'CA10376036': ['V', 'D'], the first one gets overwritten and I only get 'CA10376036': ['V', 'D']. How can I add to my dictionary without overwriting the duplicate keys so that I get something like this?
fld_zone_dict = {'CA10376036': ['AE', 'X'], 'CA9194089':['D', 'X'],'CA10376036': ['V', 'D']....}
Short answer: There is no way to have duplicate keys in a dictionary object in Python.
However, if you restructure your data so that the key is stored inside a dictionary nested in a list, you can have duplicate IDs. For example:
[
    {
        "id": "CA10376036",
        "data": ['AE', 'X']
    },
    {
        "id": "CA10376036",
        "data": ['V', 'D']
    },
]
Doing this, though, will negate any benefits of lookup speed and ease.
Edit: blhsing also has a good example of how to restructure the data with a reduced initial lookup time, though you would still have to iterate through the data to get the record you wanted.
Dicts are not allowed to have duplicate keys in Python. You can use the dict.setdefault method to store a list of values under each key instead:
for row in cursor:
    uid = row[0]
    old_zone_value = row[1]
    new_zone_value = row[2]
    fld_zone_dict.setdefault(uid, []).append([old_zone_value, new_zone_value])
If fld_zone_dict starts out empty, every key then maps to a list of pairs:
{'CA10376036': [['AE', 'X'], ['V', 'D']], 'CA9194089': [['D', 'X']], ...}
If the dictionary was already populated with a single pair per key, those existing values will not be lists of lists, so you should probably wrap them all first:
for k, v in fld_zone_dict.items():
    fld_zone_dict[k] = [v]
for row in cursor:
    uid = row[0]
    old_zone_value = row[1]
    new_zone_value = row[2]
    fld_zone_dict[uid].append([old_zone_value, new_zone_value])
so that fld_zone_dict will become like:
{'CA10376036': [['AE', 'X'], ['V', 'D']], 'CA9194089': [['D', 'X']], ...}
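The same grouping can also be written with collections.defaultdict, which creates the empty list for a new key automatically. This is only a sketch: rows is a made-up stand-in for the cursor, using the sample values from the question.

```python
from collections import defaultdict

# Hypothetical stand-in for the cursor rows: (uniqueid, FLD_ZONE, FLD_ZONE_1).
rows = [
    ("CA10376036", "AE", "X"),
    ("CA9194089", "D", "X"),
    ("CA10376036", "V", "D"),
]

fld_zone_dict = defaultdict(list)
for uid, old_zone_value, new_zone_value in rows:
    # A missing uid gets a fresh [] before the append happens.
    fld_zone_dict[uid].append([old_zone_value, new_zone_value])

print(dict(fld_zone_dict))
```

One thing to keep in mind with defaultdict: merely reading a missing key with fld_zone_dict[key] inserts an empty list, so use key in fld_zone_dict or .get(key) for lookups that should not modify the dictionary.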

Summing up a column in a CSV file based on user search

I have the following csv file:
data.csv
school,students,teachers,subs
us-school1,10,2,0
us-school2,20,4,2
uk-school1,10,2,0
de-school1,10,3,1
de-school1,15,3,3
I am trying to let a user search by school country (us, uk, or de)
and then sum up the corresponding columns (e.g. sum all students in us-*, etc.).
So far I am able to search using raw_input and display the column contents corresponding to the country. I would appreciate some pointers on how I can achieve this.
desired output:
Country: us
Total students: 30
Total teachers: 6
Total subs: 2
--
import csv
import re
search = raw_input('Enter school (e.g. us): ')
with open('data.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        school = row['school']
        students = row['students']
        teachers = row['teachers']
        sub = row['subs']
        if re.match(search, school) is not None:
            print students
That's relatively easy to do: all you need is a dict to group your countries, and then you just add together all of the values:
import collections
import csv
result = {}  # store the results
with open("data.csv", "rb") as f:  # open our file
    reader = csv.DictReader(f)  # use csv.DictReader for convenience
    for row in reader:
        country = row.pop("school")[:2]  # get our country
        result[country] = result.get(country, collections.defaultdict(int))  # country group
        for column in row:  # loop through all other columns
            result[country][column] += int(row[column])  # add them together
# Now you can use or print your result by country:
for country in result:
    print("Country: {}".format(country))
    print("Total students: {}".format(result[country].get("students", 0)))
    print("Total teachers: {}".format(result[country].get("teachers", 0)))
    print("Total subs: {}\n".format(result[country].get("subs", 0)))
This is also universal as you can add additional number columns (e.g. janitors :D) and it will happily sum them together, but keep in mind that it works only with integers (if you want floats, replace the references to int with float) and it expects that every field except school is a number.
Your problem could be solved with something like this:
import csv
search = raw_input('Enter school (e.g. us): ')
with open('data.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    result_countrys = {}
    for row in reader:
        school = row['school']
        students = int(row['students'])
        teachers = int(row['teachers'])
        subs = int(row['subs'])
        country = school[:2]  # first two characters are the country code
        if country in result_countrys:
            count = result_countrys[country]
            count['students'] = count['students'] + students
            count['teachers'] = count['teachers'] + teachers
            count['subs'] = count['subs'] + subs
        else:
            dic = {}
            dic['students'] = students
            dic['teachers'] = teachers
            dic['subs'] = subs
            result_countrys[country] = dic
for k, v in result_countrys[search].iteritems():
    print("country " + str(search) + " has " + str(v) + " " + str(k))
I tried it out with this set of values:
reader = [{'school': 'us-school1', 'students': 10, 'teachers': 2, 'subs': 0}, {'school': 'us-school2', 'students': 20, 'teachers': 4, 'subs': 2}, {'school': 'uk-school1', 'students': 10, 'teachers': 2, 'subs': 0}]
and the result is:
Enter school (e.g. us): us
country us has 30 students
country us has 6 teachers
country us has 2 subs
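For reference, here is a rough Python 3 take on the same idea that only sums the rows matching the searched country. It is a sketch under some assumptions: totals_for is a made-up helper name, DATA is an in-memory copy of the data.csv from the question, and every column other than school is assumed to hold an integer.

```python
import csv
import io
from collections import Counter

# In-memory copy of the data.csv shown in the question.
DATA = """school,students,teachers,subs
us-school1,10,2,0
us-school2,20,4,2
uk-school1,10,2,0
de-school1,10,3,1
de-school1,15,3,3
"""

def totals_for(country, csvfile):
    """Sum every numeric column for schools whose name starts with country + '-'."""
    totals = Counter()
    for row in csv.DictReader(csvfile):
        school = row.pop("school")
        if school.startswith(country + "-"):
            # Counter.update adds the given values to the running totals.
            totals.update({column: int(value) for column, value in row.items()})
    return totals

us = totals_for("us", io.StringIO(DATA))
print("Country: us")
for column in ("students", "teachers", "subs"):
    print("Total {}: {}".format(column, us[column]))
```

In an interactive script, the hard-coded "us" would be replaced by input('Enter school (e.g. us): ') and the StringIO by open('data.csv', newline='').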

List to Dictionary - multiple values to key

I am very new to coding and seeking guidance on below...
I have a csv output currently like this:
'Age, First Name, Last Name, Mark'
'21, John, Smith, 68'
'16, Alex, Jones, 52'
'42, Michael, Carpenter, 92 '
How do I create a dictionary that will end up looking like this:
dictionary = {('age' : 'First Name', 'Mark'), ('21' : 'John', '68'), etc}
I would like the first value to be the key - and only want two other values, and I'm having difficulty finding ways to approach this.
So far I've got
data = open('test.csv', 'r').read().split('\n')
I've tried to split each part into a string
for row in data:
    x = row.split(',')
EDIT:
Thank you to those who have given input on my problem.
So after using
myDic = {}
for row in data:
    tmpLst = row.split(",")
    key = tmpLst[0]
    value = (tmpLst[1], tmpLst[-1])
    myDic[key] = value
my data came out as
['Age', 'First Name', 'Last Name', 'Mark']
['21', 'John', 'Smith', '68']
['16', 'Alex', 'Jones', '52']
['42', 'Michael', 'Carpenter', '92']
But get an IndexError: list index out of range at the line
value = (tmpLst[1], tmpLst[-1])
even though I can see that it should be within the range of the index.
Does anyone know why this error is coming up or what needs to be changed?
Assuming an actual valid CSV file that looks like this:
Age,First Name,Last Name,Mark
21,John,Smith,68
16,Alex,Jones,52
42,Michael,Carpenter,92
the following code should do what you want:
from __future__ import print_function
import csv
with open('test.csv') as csv_file:
    reader = csv.reader(csv_file)
    d = { row[0]: (row[1], row[3]) for row in reader }
    print(d)
# Output:
# {'Age': ('First Name', 'Mark'), '16': ('Alex', '52'), '21': ('John', '68'), '42': ('Michael', '92')}
If d = { row[0]: (row[1], row[3]) for row in reader } is confusing, consider this alternative:
d = {}
for row in reader:
    d[row[0]] = (row[1], row[3])
I guess you want output like this:
dictionary = {'age' : ('First Name', 'Mark')}
Then you can use the following code:
myDic = {}
for row in data:
    tmpLst = row.split(",")
    key = tmpLst[0]
    value = (tmpLst[1], tmpLst[-1])
    myDic[key] = value
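As for the IndexError mentioned in the question's edit, one likely cause (an assumption, since the actual file is not shown) is a trailing newline: splitting the file contents on '\n' then leaves an empty string as the last element, ''.split(',') returns [''], and tmpLst[1] does not exist. A small sketch that skips blank lines and also strips the spaces around each field:

```python
# Assumed file contents, including the trailing newline that yields an empty last "row".
text = (
    "Age, First Name, Last Name, Mark\n"
    "21, John, Smith, 68\n"
    "16, Alex, Jones, 52\n"
    "42, Michael, Carpenter, 92 \n"
)

myDic = {}
for row in text.split("\n"):
    if not row.strip():
        continue  # skip blank lines instead of indexing into ['']
    tmpLst = [field.strip() for field in row.split(",")]
    myDic[tmpLst[0]] = (tmpLst[1], tmpLst[-1])

print(myDic)
```

Using the csv module as in the answer above does not sidestep this entirely: csv.reader returns an empty list for a blank line, so a blank line in the middle of the file would still need a similar check.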

How do I append a dictionary item to a pre-existing dictionary in Python?

I have a .csv file containing tweets and their sentiment polarity. Using DictReader, I see each row in the file in the format:
Sentiment: '0', SentimentText: 'xyz'
Now I want to add each row of the file to a pre-existing dictionary such that the structure of the dictionary at the end is:
{{Sentiment: '0', SentimentText: 'xyz'},
{Sentiment: '1', SentimentText: 'abc'}...#so on until eof}
Is there any way that this is possible?
EDIT: So far, this is what I have achieved. This basically makes a list of dictionaries:
import csv
dataset = []
with open('SentimentAnalysisDataset.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    count = 1
    for row in reader:
        data = [{'Text': row['SentimentText'], 'Polarity': row['Sentiment']}]
        entry = {str(count): data}
        count = count + 1
        dataset.append(entry)
This:
{{Sentiment: '0', SentimentText: 'xyz'},
{Sentiment: '1', SentimentText: 'abc'}...#so on until eof}
is not a valid dictionary structure. If you wish to use a list, this will work:
[{Sentiment: '0', SentimentText: 'xyz'},
{Sentiment: '1', SentimentText: 'abc'}...#so on until eof]
Otherwise, you're probably looking for a structure like this:
{'0': 'xyz',
'1': 'abc',
...}
In order to do that, you should update the existing dictionary like so:
existing_dict = {'0': 'xyz',
                 '1': 'abc'}
existing_dict[row['Sentiment']] = row['SentimentText']
# OR
new_dict = {row['Sentiment']: row['SentimentText'],
            ... # For all sentiments in file
            }
existing_dict.update(new_dict)
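Note that keying on the Sentiment value keeps only one text per polarity, because later rows with the same sentiment replace earlier ones. If every row should survive, one option is to key each row by its position instead. A minimal Python 3 sketch, where SAMPLE is a made-up two-row stand-in for SentimentAnalysisDataset.csv and the 'Text'/'Polarity' keys follow the question's edit:

```python
import csv
import io

# Hypothetical two-row stand-in for SentimentAnalysisDataset.csv.
SAMPLE = "Sentiment,SentimentText\n0,xyz\n1,abc\n"

reader = csv.DictReader(io.StringIO(SAMPLE))

# Key each row by its 1-based position so rows with the same sentiment don't collide.
dataset = {
    str(count): {"Text": row["SentimentText"], "Polarity": row["Sentiment"]}
    for count, row in enumerate(reader, start=1)
}

print(dataset)
```

If the row number carries no meaning, a plain list of dictionaries (as suggested above) is probably the simpler structure.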

Converting certain index values in list to int

I have searched for a solution to this, but I have not been able to find one, weirdly.
I am opening a file with the following contents in it.
Alex,10,0,6,3,7,4
Bob, 6,3,7,2,1,8
I want to convert the values at index 1-4 of each row in score_list to integers. I have tried the following:
import csv
score_list = []
def opening_file():
    counter = 0
    with open('scores.txt', newline='') as infile:
        reader = csv.reader(infile)
        for row in reader:
            score_list.append(row[0:5])
        counter = 0
        while counter != 5:
            counter +=1
            row[counter] = int(row[counter])
        print (score_list)
opening_file()
but it doesn't work and just produces
[['Alex', '10', '0', '6', '3'], ['Bob', ' 6', '3', '7', '2']]
instead of [['Alex', 10, 0, 6, 3], ['Bob', 6, 3, 7, 2]]
You are converting the items within row, which is just a throwaway variable. You also don't need that redundant work: you can simply unpack each row into name and scores parts and use a list comprehension to convert the digits to integers.
with open('scores.txt', newline='') as infile:
    reader = csv.reader(infile)
    for row in reader:
        name, *scores = row
        score_list.append([name] + [int(i) for i in scores])
Your while loop transforming the values in row happens too late. Each row's values have already been copied (by a slice operation) into a new list which has been appended to score_list. And you only run the loop on the last row anyway (assuming your indentation in the question is correct).
Try something like this:
with open('scores.txt', newline='') as infile:
    reader = csv.reader(infile)
    for row in reader:
        for i in range(1,5):
            row[i] = int(row[i])
        score_list.append(row[0:5])
I'm using a for loop on a range, rather than a while loop, just because it's more convenient (a while loop version could work just fine, it just requires more lines). The key thing is to change row inside the loop on reader and before we slice the row to append to score_list.
First of all, the code converts items in the row array, but you print the score_list array. Second, as it alters the row variable outside the reader for loop, it only alters the last row. You could do something like this:
import csv
def opening_file():
    with open('scores.txt', newline='') as infile:
        return [[row[0]] + [int(x) for x in row[1:]] for row in csv.reader(infile)]
score_list = opening_file()
print(str(score_list))
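If only the first four scores are wanted, as in the expected output in the question, the slice can be applied before converting. A short sketch, where read_scores and its n_scores parameter are made-up names and SAMPLE stands in for scores.txt:

```python
import csv
import io

# In-memory stand-in for scores.txt from the question.
SAMPLE = "Alex,10,0,6,3,7,4\nBob, 6,3,7,2,1,8\n"

def read_scores(infile, n_scores=4):
    """Return [name, score1, ..., scoreN] per row, with the scores as ints."""
    return [
        # int() ignores surrounding whitespace, so ' 6' converts fine.
        [row[0]] + [int(value) for value in row[1:1 + n_scores]]
        for row in csv.reader(infile)
    ]

score_list = read_scores(io.StringIO(SAMPLE))
print(score_list)
```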