This is my code; it's exactly the same as the code in the PDF:
states = [
"Oregon":"OR",
"Florida": "FL",
"California": "CA",
"New York": "NY",
"Michigan": "MI"
]
cities = [
"CA": "San Francisco",
"MI": "Detroit",
"FL": "Jacksonville"
]
cities["NY"] = "New York"
cities["OR"] = "Portland"
print "-" * 10
print "NY state has: ", cities["NY"]
print "OR state has: ", cities["OR"]
print "-" * 10
print "Michigan's abbreviation is: ", states["Michigan"]
print "Florida's abbreviation is: ", states["Florida"]
print "-" * 10
print "Michigan has: ", cities[states["Michigan"]]
print "Florida has: ", cities[states["Florida"]]
print "-" * 10
for state, abbrev in states.items():
    print "%s is abbreviated %s", % (state, abbrev)
print "-" * 10
for abbrev, city in cities.items():
    print "%s has the city %s" % (abbrev, city)
print "-" * 10
for state, abbrev in states.items():
    print "%s state is abbreviated %s and has city %s" % (
        state, abbrev, cities[abbrev])
print "-" * 10
state = states.get("Texas", None)
if not state:
    print "Sorry, no Texas."
city = cities.get("TX", "Does Not Exist")
print "The city for the state 'TX' is: %s" % city
This is the error I get when I run python ex39.py in my terminal:
File "ex39.py", line 3
"Oregon":"OR",
^
SyntaxError: invalid syntax
I'm running macOS 10.13.6 Beta (17G47b)
MacBook (13-inch, Mid 2010)
Processor 2.4 GHz Intel Core 2 Duo
Memory 8 GB 1067 MHz DDR3
Graphics NVIDIA GeForce 320M 256 MB
The issue here is that square brackets [] create a list, like [1,2,3,4,5].
A list only holds a sequence of separate values; it has no notion of key: value pairs, so Python reports a syntax error at the first colon.
You're looking for a dictionary, which uses curly brackets {}. It also holds a collection of items, but each item is a pair made of a key and its value.
So you need this
states = {
"Oregon":"OR",
"Florida": "FL",
"California": "CA",
"New York": "NY",
"Michigan": "MI"
}
cities = {
"CA": "San Francisco",
"MI": "Detroit",
"FL": "Jacksonville"
}
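The first item of each pair is the key; the second is the value. Once the brackets are fixed, you will also need to remove the stray comma before the % in print "%s is abbreviated %s", % (state, abbrev), which is a second syntax error.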
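As a small illustration of the difference, here is a sketch with shortened data, written so that it runs under both Python 2.7 and Python 3. The names numbers, third, abbrev, city and missing exist only for this example:

```python
# A list stores values by position.
numbers = [1, 2, 3, 4, 5]
third = numbers[2]  # index 2 is the third item

# A dict stores key: value pairs and is indexed by key.
states = {"Oregon": "OR", "Florida": "FL"}
cities = {"FL": "Jacksonville"}
cities["OR"] = "Portland"  # add a new pair after creation

abbrev = states["Oregon"]
city = cities[abbrev]  # chain the two lookups
missing = states.get("Texas")  # get() returns None for an absent key

print("third item: %s" % third)
print("%s is abbreviated %s and has city %s" % ("Oregon", abbrev, city))
print("Texas lookup: %s" % missing)
```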
Hope this helps! Happy coding!
I'm trying to read in a large file (~8 GB) using pandas read_csv. One of the columns sometimes contains a list that includes commas but is enclosed in curly brackets, e.g.
"label1","label2","label3","label4","label5"
"{A1}","2","","False","{ "apple" : false, "pear" : false, "banana" : null}"
Therefore, when these particular lines were read in, I got the error "Error tokenizing data. C error: Expected 37 fields in line 35, saw 42". I found a solution which said to add
sep=",(?![^{]*})" to the read_csv arguments, and that split the data correctly. However, the data now includes the quotation marks around every entry (this didn't happen before I added the sep argument).
The data looks something like this now:
"label1" "label2" "label3" "label4" "label5"
"{A1}" "2" "" "False" "{ "apple" : false, "pear" : false, "banana" : null}"
meaning I can't use, for example, .describe(), etc on the numerical data because they're still strings.
Does anyone know of a way of reading it in without the quotation marks but still splitting the data where it is?
Very new to Python so apologies if there is an obvious solution.
serialdev found a solution to removing the "s but the data columns are objects and not what I would expect/want, e.g. the integer values aren't seen as integers.
The data needs to be split at "," explicitly (including the "s), is there a way of stating that in the read_csv arguments?
Thanks!
To read in the data structure you specified, where the last element has an unknown length:
"{A1}","2","","False","{ "apple" : false, "pear" : false, "banana" : null}"
"{A1}","2","","False","{ "apple" : false, "pear" : false, "banana" : null, "orange": "true"}"
Change the separator to a regular expression that uses a negative lookahead assertion. This makes pandas split on a ',' only when it is not immediately followed by whitespace.
import pandas as pd
# header=None because the sample rows above have no header row
df = pd.read_csv('my_file.csv', sep=r'[,](?!\s)', engine='python', header=None, thousands='"')
print(df)
        0  1   2        3                                                  4
0  "{A1}"  2 NaN  "False"  "{ "apple" : false, "pear" : false, "banana" :...
1  "{A1}"  2 NaN  "False"  "{ "apple" : false, "pear" : false, "banana" :...
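To see what that separator does in isolation, here is a minimal sketch that applies the same lookahead with the standard re module instead of pandas. The function name split_fields is made up for this illustration:

```python
import re

# Split on a comma only when it is NOT followed by whitespace.
SEPARATOR = re.compile(r',(?!\s)')


def split_fields(line):
    """Split one raw line into fields, keeping the surrounding quotes."""
    return SEPARATOR.split(line.strip())


line = '"{A1}","2","","False","{ "apple" : false, "pear" : false, "banana" : null}"'
fields = split_fields(line)
for field in fields:
    print(field)
```

The commas inside the braces are each followed by a space, so they are left alone. The same property means this approach would break on a file where a separating comma is followed by a space.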
Specifying the thousands separator as the quote character is a somewhat hacky way to parse fields containing a quoted integer into the correct data type. You can achieve the same result with converters, which can also strip the quotes from the strings, should you need that, and cast "True" or "False" to a boolean.
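A minimal sketch of the converters approach, assuming rows shaped like the sample above and no header row. The helper names strip_quotes, to_int and to_bool are invented for this example, the data is read from an in-memory string rather than my_file.csv, and the second row's values were changed slightly so the conversions are visible:

```python
import io

import pandas as pd

# Two sample rows held in memory (illustrative data, not the real file).
raw = (
    u'"{A1}","2","","False","{ "apple" : false, "pear" : false, "banana" : null}"\n'
    u'"{A1}","3","","True","{ "apple" : false, "pear" : false, "orange": "true"}"\n'
)


def strip_quotes(value):
    """Remove the surrounding double quotes from a raw field."""
    return value.strip('"')


def to_int(value):
    """Convert a quoted integer such as '"2"' to an int."""
    return int(value.strip('"'))


def to_bool(value):
    """Convert '"True"' / '"False"' to a real boolean."""
    return value.strip('"') == "True"


df = pd.read_csv(
    io.StringIO(raw),
    sep=r',(?!\s)',      # same lookahead separator as above
    engine='python',     # regex separators need the python engine
    header=None,
    converters={0: strip_quotes, 1: to_int, 3: to_bool, 4: strip_quotes},
)
print(df[[0, 1, 3]])
print(df.dtypes)
```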
If you need to remove " from a column, use the vectorized function str.strip:
import pandas as pd
mydata = [{'"first_name"': '"Bill"', '"age"': '"7"'},
{'"first_name"': '"Bob"', '"age"': '"8"'},
{'"first_name"': '"Ben"', '"age"': '"9"'}]
df = pd.DataFrame(mydata)
print (df)
   "age" "first_name"
0    "7"       "Bill"
1    "8"        "Bob"
2    "9"        "Ben"
df['"first_name"'] = df['"first_name"'].str.strip('"')
print (df)
   "age" "first_name"
0    "7"         Bill
1    "8"          Bob
2    "9"          Ben
If you need to apply str.strip() to all columns, use:
df = pd.concat([df[col].str.strip('"') for col in df], axis=1)
df.columns = df.columns.str.strip('"')
print (df)
  age first_name
0   7       Bill
1   8        Bob
2   9        Ben
Timings:
mydata = [{'"first_name"': '"Bill"', '"age"': '"7"'},
{'"first_name"': '"Bob"', '"age"': '"8"'},
{'"first_name"': '"Ben"', '"age"': '"9"'}]
df = pd.DataFrame(mydata)
df = pd.concat([df]*3, axis=1)
df.columns = ['"first_name1"','"age1"','"first_name2"','"age2"','"first_name3"','"age3"']
#create sample [300000 rows x 6 columns]
df = pd.concat([df]*100000).reset_index(drop=True)
df1,df2 = df.copy(),df.copy()
def a(df):
    df.columns = df.columns.str.strip('"')
    df['age1'] = df['age1'].str.strip('"')
    df['first_name1'] = df['first_name1'].str.strip('"')
    df['age2'] = df['age2'].str.strip('"')
    df['first_name2'] = df['first_name2'].str.strip('"')
    df['age3'] = df['age3'].str.strip('"')
    df['first_name3'] = df['first_name3'].str.strip('"')
    return df
def b(df):
    #apply str function to all columns in dataframe
    df = pd.concat([df[col].str.strip('"') for col in df], axis=1)
    df.columns = df.columns.str.strip('"')
    return df
def c(df):
    #apply str function to all columns in dataframe
    df = df.applymap(lambda x: x.lstrip('\"').rstrip('\"'))
    df.columns = df.columns.str.strip('"')
    return df
print (a(df))
print (b(df1))
print (c(df2))
In [135]: %timeit (a(df))
1 loop, best of 3: 635 ms per loop
In [136]: %timeit (b(df1))
1 loop, best of 3: 728 ms per loop
In [137]: %timeit (c(df2))
1 loop, best of 3: 1.21 s per loop
Would this work, since you already have all the data that you need?
.map(lambda x: x.lstrip('\"').rstrip('\"'))
It simply cleans up all the occurrences of " afterwards.
EDIT with example:
mydata = [{'"first_name"' : '"bill', 'age': '"75"'},
{'"first_name"' : '"bob', 'age': '"7"'},
{'"first_name"' : '"ben', 'age': '"77"'}]
IN: df = pd.DataFrame(mydata)
OUT:
"first_name" age
0 "bill "75"
1 "bob "7"
2 "ben "77"
IN: df['"first_name"'] = df['"first_name"'].map(lambda x: x.lstrip('\"').rstrip('\"'))
OUT:
0 bill
1 bob
2 ben
Name: "first_name", dtype: object
Use this sequence after selecting the column; it is not ideal, but it will get the job done:
.map(lambda x: x.lstrip('\"').rstrip('\"'))
You can change the dtypes afterwards using this pattern:
df['col'].apply(lambda x: pd.to_numeric(x, errors='ignore'))
or simply:
df[['col2','col3']] = df[['col2','col3']].apply(pd.to_numeric)
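Putting the two steps together, here is a hedged sketch that strips the quotes and then converts one column to numbers, so that methods such as .describe() treat it as numeric. The sample data reuses the Bill/Bob/Ben rows from the other answer and is only an illustration:

```python
import pandas as pd

# Every value, and every column name, still carries its double quotes.
df = pd.DataFrame({
    '"first_name"': ['"Bill"', '"Bob"', '"Ben"'],
    '"age"': ['"7"', '"8"', '"9"'],
})

# Step 1: strip the quotes from the column names and from every cell.
df.columns = df.columns.str.strip('"')
df = df.apply(lambda col: col.str.strip('"'))

# Step 2: convert the numeric column; it is no longer a column of strings.
df['age'] = pd.to_numeric(df['age'])

print(df.dtypes)
print(df['age'].describe())
```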
It depends on your file. Did you check whether any cell in your data contains a comma? If a single cell holds something like Banana : Fruit, Tropical, Eatable, etc., you are going to get this kind of error. One basic solution is to remove all commas from the file. Or, if you can read it in, you can remove the special characters afterwards:
>>>df
Banana
0 Hello, Salut, Salom
1 Bonjour
>>>df['Banana'] = df['Banana'].str.replace(',','')
>>>df
Banana
0 Hello Salut Salom
1 Bonjour
I keep getting the "too many values to unpack" error. Can anybody help me get this working, and possibly give me an explanation?
wings_quantity = {
'small' : 8,
'medium' : 14,
'large' : 20,
'half bucket' : 30,
'bucket' : 65,
}
wings_price = {
'small' : 5.99,
'medium' :8.50,
'large' : 14.00,
'half bucket' :20.00,
'bucket' : 55.00
}
for number, key in wings_quantity:
    print " "
    print "There are "+(str(wings_quantity[number]))+ " wings in a "+(wings_quantity[key])+" size."
    print " "
for number, key in wings_quantity:
    ppw = wings_quantity[number] / wings_price[number]
    print ('The cost per wing in a %s size is $') + ppw %wing_quantity[key]
You are close, but you forgot to call iteritems() at the end of your for statements. Iterating over a dictionary directly yields only its keys, so Python tries to unpack each key string into two names and fails.
Change
for number, key in wings_quantity:
to
for size, count in wings_quantity.iteritems():
After that, you need to rewrite your print statements, because they try to look the values up in the dictionary a second time. Since the loop already gives you both the key and the value, you can just print them like so:
print "There are " + str(count) + " wings in a " + size + " size."
This works in Python 2. In 3.x, iteritems() no longer exists (and print becomes a function), so you need to change the loop to
for size, count in wings_quantity.items():
This produces the following output for the first loop:
There are 65 wings in a bucket size.
There are 8 wings in a small size.
There are 14 wings in a medium size.
There are 30 wings in a half bucket size.
There are 20 wings in a large size.
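For the second loop, a sketch along the same lines is shown here. It assumes that "cost per wing" means price divided by quantity (the question divides the other way round), and the helper name price_per_wing is made up for this example. It uses items() and print() so it runs under both Python 2.7 and 3:

```python
wings_quantity = {
    'small': 8,
    'medium': 14,
    'large': 20,
    'half bucket': 30,
    'bucket': 65,
}
wings_price = {
    'small': 5.99,
    'medium': 8.50,
    'large': 14.00,
    'half bucket': 20.00,
    'bucket': 55.00,
}


def price_per_wing(size):
    """Price of one wing for the given size (assumed: price / quantity)."""
    return wings_price[size] / wings_quantity[size]


# items() yields (key, value) pairs; sort by quantity for a stable order.
for size, count in sorted(wings_quantity.items(), key=lambda pair: pair[1]):
    print("There are %d wings in a %s size." % (count, size))
    print("The cost per wing in a %s size is $%.2f" % (size, price_per_wing(size)))
```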
I am new (about 2 weeks in) and trying to learn Python 2.7.x.
I am trying to write a basic program in which the user inputs the cost of a meal and it outputs how much it would be with a 15% tip. I want the output to look like 23.44 (showing 2 decimals).
My code:
MealPrice = float(raw_input("Please type in your bill amount: "))
tip = float(MealPrice * 0.15,)
totalPrice = MealPrice+tip
int(totalPrice)
print "Your tip would be: ",tip
print "Yout total bill would be: ",totalPrice
my output:
Please type in your bill amount: 22.22
Your tip would be: 3.333
Yout total bill would be: 25.553
You want to format your float value for printing only; use formatting:
print "Your tip would be: {:.2f}".format(tip)
print "Your total bill would be: {:.2f}".format(totalPrice)
The .2f is a format specification mini-language code for a floating point value with 2 digits after the decimal point.
You should also remove the int() call: its result is never stored, so it does nothing here, and if you did store it you would lose the digits after the decimal point. You don't need to call float() so much either:
MealPrice = float(raw_input("Please type in your bill amount: "))
tip = MealPrice * 0.15
totalPrice = MealPrice + tip
print "Your tip would be: {:.2f}".format(tip)
print "Your total bill would be: {:.2f}".format(totalPrice)
Demo:
Please type in your bill amount: 42.50
Your tip would be: 6.38
Your total bill would be: 48.88
You can further tweak the formatting to align those numbers up along the decimal point too.
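One possible way to do that alignment is to add a field width in front of the precision. The sketch below assumes a width of 8 characters and a hypothetical helper named format_bill, and pads the labels by hand so that both numbers start in the same column:

```python
def format_bill(meal_price, tip_rate=0.15):
    """Return the two output lines with the amounts right-aligned."""
    tip = meal_price * tip_rate
    total_price = meal_price + tip
    return [
        # 8.2f = pad to 8 characters wide, 2 digits after the decimal point
        "Your tip would be:        {:8.2f}".format(tip),
        "Your total bill would be: {:8.2f}".format(total_price),
    ]


for line in format_bill(22.22):
    print(line)
```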
I created this program in Python 2.7.3
I did this in my Computer Science class. My teacher assigned it in two parts. For the first part we had to create a program to calculate a monthly cell phone bill for five customers. The user inputs the number of texts, minutes, and data used. Additionally, there are overage fees: $10 for every GB of data over the limit, $0.40 per minute over the limit, and $0.20 per text sent over the limit. The plan's limits are 500 text messages, 750 minutes, and 2 GB of data.
For part 2 of the assignment, I have to calculate the total tax collected, total charges (each customer bill added together), total government fees collected, total customers who had overages, etc.
Right now all I want help with is adding the customer bills together. As I said earlier, when you run the program it prints the total bill for each of the 5 customers. I don't know how to assign those separate totals to a variable, add them together, and then eventually print them as one grand total.
TotalBill = 0
monthly_charge = 69.99
data_plan = 30
minute = 0
tax = 1.08
govfees = 9.12
Finaltext = 0
Finalminute = 0
Finaldata = 0
Finaltax = 0
TotalCust_ovrtext = 0
TotalCust_ovrminute = 0
TotalCust_ovrdata = 0
TotalCharges = 0
for i in range (1,6):
    print "Calculate your cell phone bill for this month"
    text = input ("Enter texts sent/received ")
    minute = input ("Enter minute's used ")
    data = input ("Enter Data used ")
    if data > 2:
        data = (data-2)*10
        TotalCust_ovrdata = TotalCust_ovrdata + 1
    elif data <=2:
        data = 0
    if minute > 750:
        minute = (minute-750)*.4
        TotalCust_ovrminute = TotalCust_ovrminute + 1
    elif minute <=750:
        minute = 0
    if text > 500:
        text = (text-500)*.2
        TotalCust_ovrtext = TotalCust_ovrtext + 1
    elif text <=500:
        text = 0
    TotalBill = ((monthly_charge + data_plan + text + minute + data) * (tax)) + govfees
    print ("Your Total Bill is... " + str(round(TotalBill,2)))
print "The toatal number of Customer's who went over their minute's usage limit is... " ,TotalCust_ovrminute
print "The total number of Customer's who went over their texting limit is... " ,TotalCust_ovrtext
print "The total number of Customer's who went over their data limit is... " ,TotalCust_ovrdata
Some of the variables created are not used in the program. Please overlook them.
As Preet suggested, create another variable alongside TotalBill before the loop, i.e.
AccumulatedBill = 0
Then, at the end of the loop body, put
AccumulatedBill += TotalBill
This adds each customer's TotalBill to AccumulatedBill. Then simply print the result after the loop:
print "Total for all customers is: %s" %(AccumulatedBill)
Note: in Python you don't normally capitalize the first letter of a variable name. PEP 8 recommends lowercase underscore_separated names; camelCase is also seen in some codebases.
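Here is a compact sketch of the accumulator pattern, using the rates and limits from the question. To keep it self-contained it reads usage from a hard-coded, made-up list of five customers instead of calling input(), and the helper names overage and total_bill are assumptions for this example:

```python
MONTHLY_CHARGE = 69.99
DATA_PLAN = 30
TAX = 1.08
GOV_FEES = 9.12


def overage(used, limit, fee):
    """Charge `fee` per unit above `limit`; nothing at or below it."""
    return (used - limit) * fee if used > limit else 0


def total_bill(texts, minutes, data):
    """Bill for one customer, using the same formula as the question."""
    extras = (overage(texts, 500, 0.2)
              + overage(minutes, 750, 0.4)
              + overage(data, 2, 10))
    return (MONTHLY_CHARGE + DATA_PLAN + extras) * TAX + GOV_FEES


# Hypothetical usage for five customers: (texts, minutes, GB of data).
customers = [(100, 200, 1), (510, 760, 3), (500, 750, 2), (900, 100, 1), (50, 50, 5)]

accumulated_bill = 0
for texts, minutes, data in customers:
    bill = total_bill(texts, minutes, data)
    print("Your Total Bill is... %.2f" % bill)
    accumulated_bill += bill  # running total across all customers

print("Total for all customers is: %.2f" % accumulated_bill)
```

The same running-total idea carries over to the other part 2 figures (tax collected, government fees): keep one accumulator per figure and add to it at the end of each loop iteration.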