How to extract a value in a nested json source? - python-2.7

I'm trying to get the temperature from a JSON source. It's nested, and I can't figure out how to get a nested value from a JSON file or URL.
Here is my code so far:
#! /usr/bin/python
import urllib2
import json
f = urllib2.urlopen('http://api.openweathermap.org/data/2.5/find?q=London&units=metric')
json_string = f.read()
parsed_json = json.loads(json_string)
temp = parsed_json['list']
print "Current temperature is: %s" % (temp)
f.close()
Right now I can get all the values at once, but not just one particular value (temp in my case).
I'd prefer to get the clean value, without the u'temp': prefix, if possible.

u'temp' is how Python 2 represents unicode objects, which is what JSON strings are parsed into. Is this what you're looking for?
print temp[0]['main']['temp']
I don't know the structure of the API you're calling, so this may be making quite a few assumptions, but it will get you the raw temperature.
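To make the indexing above concrete, here is a minimal sketch that runs without a network call. The sample payload is an assumption modeled on the keys used in this thread ('list', 'main', 'temp'); the real API response may look different. The helper get_nested is a hypothetical name, and the code is written for Python 3, so print is a function.

```python
import json

# Hypothetical sample payload; the real API response may contain other fields.
sample = '{"list": [{"name": "London", "main": {"temp": 11.8}, "sys": {"country": "GB"}}]}'

parsed = json.loads(sample)

# 'list' is a JSON array, so pick an element first, then walk the nested objects.
first_temp = parsed['list'][0]['main']['temp']
print("Current temperature is: %s" % first_temp)


def get_nested(data, *keys, default=None):
    """Walk nested dicts/lists, returning default if any step is missing."""
    for key in keys:
        try:
            data = data[key]
        except (KeyError, IndexError, TypeError):
            return default
    return data


safe_temp = get_nested(parsed, 'list', 0, 'main', 'temp')
missing = get_nested(parsed, 'list', 0, 'wind', 'speed', default='n/a')
```

The helper is only worth having when some keys may be absent; for a fixed, known structure the chained indexing is enough.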

You are getting several values back. To list them all do:
import urllib2
import json
f = urllib2.urlopen('http://api.openweathermap.org/data/2.5/find?q=London&units=metric')
json_string = f.read()
parsed_json = json.loads(json_string)
for each in parsed_json['list']:
    country = each['sys']['country']
    temperature = each['main']['temp']
    print "Current Temperature in {} is {}".format(country, temperature)
Output
Current Temperature in CA is 11.73
Current Temperature in GB is 11.8

Related

Python read text file based on partial name and file timestamp

I'm trying to pull two of the same files into python in different dataframes, with the end goal of comparing what was added in the new file and removed from the old. So far, I've got code that looks like this:
In[1] path = r'\\Documents\FileList'
files = os.listdir(path)
In[2] files_txt = [f for f in files if f[-3:] == 'txt']
In[3] for f in files_txt:
          data = pd.read_excel(path + r'\\' + f)
          df = df.append(data)
I've also set a variable to equal the current date minus a certain number of days, which I want to use to pull the file that has a date equal to that variable:
d7 = dt.datetime.today() - timedelta(7)
As of now, I'm unsure of how to do this, as the first part of the filename always remains the same but they add numbers at the end (eg. file_03232016 then file_03302016). I want to parse through the directory for the beginning part of the filename and add it to a dataframe if it matches the date parameter I set.
EDIT: I forgot to add that sometimes I also need to look at the system date created timestamp, as the text date in the file name isn't always there.
Here are some modifications to your original code to get a list of files containing your target date. You need to use strftime.
import os
import datetime as dt
import pandas as pd
d7 = dt.datetime.today() - dt.timedelta(7)
target_date_str = d7.strftime('_%m%d%Y')
files_txt = [f for f in files if f[-13:] == target_date_str + '.txt']
>>> target_date_str + '.txt'
'_03232016.txt'
data = []
for f in files_txt:
    data.append(pd.read_excel(os.path.join(path, f)))
df = pd.concat(data, ignore_index=True)
Use strftime in order to represent your datetime variable as a string with desired format and glob for searching files by file mask in the directory:
import datetime as dt
import glob
import pandas as pd
fmask = r'\\Documents\FileList\*' + (dt.datetime.today() - dt.timedelta(7)).strftime('%m%d%Y') + '*.txt'
files_txt = glob.glob(fmask)
# concatenate all CSV/txt files into one data frame
df = pd.concat([pd.read_csv(f) for f in files_txt], ignore_index=True)
PS: I assume you want read_csv rather than read_excel when working with txt files, unless you really have Excel files with a txt extension.
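Neither snippet above covers the EDIT in the question, where the date is sometimes missing from the file name. A rough sketch of one way to handle that: match on the name when it carries an eight-digit date, otherwise fall back to the file's OS timestamp. The function names and the 'file_' prefix are assumptions for illustration. Note that os.path.getctime returns the creation time on Windows but the last metadata change time on most Unix systems.

```python
import datetime as dt
import os


def name_matches(filename, target_date, prefix='file_'):
    """True if filename is exactly <prefix><MMDDYYYY>.txt for target_date."""
    return filename == prefix + target_date.strftime('%m%d%Y') + '.txt'


def timestamp_matches(path, target_date):
    """True if the file's OS timestamp falls on target_date."""
    stamp = dt.datetime.fromtimestamp(os.path.getctime(path))
    return stamp.date() == target_date


def select_files(folder, target_date, prefix='file_'):
    """Return paths matching target_date by name, or by timestamp if undated."""
    selected = []
    for name in sorted(os.listdir(folder)):
        if not (name.startswith(prefix) and name.endswith('.txt')):
            continue
        full = os.path.join(folder, name)
        suffix = name[len(prefix):-len('.txt')]
        if len(suffix) == 8 and suffix.isdigit():
            # The name carries a date, so trust it.
            keep = name_matches(name, target_date, prefix)
        else:
            # No date in the name: fall back to the file system timestamp.
            keep = timestamp_matches(full, target_date)
        if keep:
            selected.append(full)
    return selected
```

Called as select_files(path, (dt.datetime.today() - dt.timedelta(7)).date()), the result can be passed to pd.concat in the same way as files_txt above.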

How to remove unwanted items from a parse file

from googlefinance import getQuotes
import json
import time as t
import re
List = ["A","AA","AAB"]
Time=t.localtime() # Sets variable Time to retrieve date/time info
Date2= ('%d-%d-%d %dh:%dm:%dsec'%(Time[0],Time[1],Time[2],Time[3],Time[4],Time[5])) #formats time stamp
while True:
    for i in List:
        try: #allows elements to be called and if an error does the next step
            Data = json.dumps(getQuotes(i.lower()),indent=1) #retrieves Data from google finance
            regex = ('"LastTradePrice": "(.+?)",') #sets parse
            pattern = re.compile(regex) #compiles parse
            price = re.findall(pattern,Data) #retrieves parse
            print(i)
            print(price)
        except: #sets Error coding
            Error = (i + ' Failed to load on: ' + Date2)
            print (Error)
It will display the quote as: ['(number)'].
I would like it to only display the number, which means removing the brackets and quotes.
Any help would be great.
Changing:
print(price)
into:
print(price[0])
prints this:
A
42.14
AA
10.13
AAB
0.110
Use the type() function to check the data type, in your case type(price).
If the data type is list, use print(price[0])
and you will get just the number. As for the brackets, they come from printing the list that re.findall returns, so check the Google data and your regex.
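The brackets and quotes appear because re.findall always returns a list of strings, and printing a list shows its repr. A small sketch of that, plus an alternative that reads the value from the parsed data instead of a regex. The sample string is an assumption shaped like the quote data in this thread, not real output from the library.

```python
import json
import re

# Hypothetical sample shaped like the quote data discussed in this thread.
raw = '[{"StockSymbol": "A", "LastTradePrice": "42.14"}]'

# re.findall returns a list, which is why print() shows ['42.14'].
matches = re.findall(r'"LastTradePrice": "(.+?)"', raw)
print(matches)      # a list containing one string
print(matches[0])   # the bare string

# Alternative: parse the JSON and index into it, no regex needed.
quotes = json.loads(raw)                    # list of dicts, one per symbol
price_text = quotes[0]['LastTradePrice']    # plain string
price = float(price_text)                   # convert if a number is needed
print(price_text)
```

If the library already returns a list of dicts, as the json.dumps call in the question suggests, the json round trip can be skipped and the dict indexed directly.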

Use Python with lxml to parse a xml document and write elements into a text file

With the following Python code I want to parse an XML file. An extract of the XML file is shown below the code. I need to extract everything that comes after "inv: name =", which in this case is "'datasource roof height' and (value = 1000 or value = 2000 or value = 3000 or value = 4000 or value = 5000 or value = 6000)". Any ideas?
My Python code (so far):
from lxml import etree
doc = etree.parse("data.xml")
for con in doc.xpath("//specification"):
    for cons in con.xpath("./@body"):
        with open("output.txt", "w") as cons_out:
            cons_out.write(cons)
            cons_out.close()
Part of the xml file:
<ownedRule xmi:type="uml:Constraint" xmi:id="EAID_OR000004_EE68_4efa_8E1B_8DDFA8F95FB8" name="datasource roof height">
<constrainedElement xmi:idref="EAID_94F3B0A6_EE68_4efa_8E1B_8DDFA8F95FB8"/>
<specification xmi:type="uml:OpaqueExpression" xmi:id="EAID_COE000004_EE68_4efa_8E1B_8DDFA8F95FB8" body="inv: name = 'datasource roof height' and (value = 1000 or value = 2000 or value = 3000 or value = 4000 or value = 5000 or value = 6000)"/>
</ownedRule>
XML Parsers understand attributes and elements. What is present within these attributes or elements (the textual content) is of no concern to the XML parser.
In order to solve your problem you would need to split the string retrieved from the body attribute. Of course, I am assuming that the body attribute for all elements would have the same format content i.e. "inv : name = some content"
from lxml import etree
doc = etree.parse("data.xml")
# open the output file once so earlier matches are not overwritten
with open("output.txt", "w") as cons_out:
    for con in doc.xpath("//specification"):
        for cons in con.xpath("./@body"):
            # keep everything after the "inv: name =" prefix
            content = cons.split("inv: name =")[1]
            cons_out.write(content + "\n")
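For a version that can be tried without data.xml, here is a sketch using the standard library's xml.etree.ElementTree in place of lxml (a deliberate swap so it runs with no extra install). The inline XML is modeled on the extract in the question; the wrapping root element and the xmlns:xmi URI are assumptions. It uses str.partition rather than split, so a body attribute without the prefix is skipped instead of raising an IndexError.

```python
import xml.etree.ElementTree as ET

# Sample modeled on the question's extract; root element and namespace URI are assumed.
xml_text = """<root xmlns:xmi="http://schema.omg.org/spec/XMI/2.1">
  <ownedRule xmi:type="uml:Constraint" name="datasource roof height">
    <specification xmi:type="uml:OpaqueExpression" body="inv: name = 'datasource roof height' and (value = 1000 or value = 2000)"/>
  </ownedRule>
  <ownedRule xmi:type="uml:Constraint" name="no prefix">
    <specification xmi:type="uml:OpaqueExpression" body="something else"/>
  </ownedRule>
</root>"""

root = ET.fromstring(xml_text)

PREFIX = "inv: name ="
extracted = []
for spec in root.iter("specification"):
    body = spec.get("body", "")
    # partition never raises; 'found' is empty when the prefix is absent
    _, found, rest = body.partition(PREFIX)
    if found:
        extracted.append(rest.strip())

print("\n".join(extracted))
```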

How to read dates using xlrd?

This is the code where the "rec" variable is used to read the dates in an Excel sheet, but it prints float values. How can I print them in date format, for example '2015:09:02'?
for rec in sorted(out.keys()):
    print rec #printing float values
    print str(out[rec])
I got output:
42240.0
24
Excel internally stores date values as floats. So in xlrd, if you want to read Excel date values as Python date values, you have to use the xldate_as_tuple function to get the date.
Documentation: http://www.lexicon.net/sjmachin/xlrd.html#xlrd.xldate_as_tuple-function
Here's a generic example:
import datetime, xlrd
book = xlrd.open_workbook("myexcelfile.xls")
sh = book.sheet_by_index(0)
a1 = sh.cell_value(rowx=0, colx=0)
a1_as_datetime = datetime.datetime(*xlrd.xldate_as_tuple(a1, book.datemode))
print 'datetime: %s' % a1_as_datetime
If you create the file myexcelfile.xls and enter a date in cell A1 and run the above code, you should be able to see the correct datetime value in the a1_as_datetime variable.
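To see what the float means without a workbook at hand, here is a sketch of the same conversion done by hand, applied to the 42240.0 from the question. It assumes Excel's default 1900 date system (xlrd datemode 0) and a serial of 61 or more (1900-03-01 onward), where a fixed offset from 1899-12-30 is exact. The function name is hypothetical, and the code is written for Python 3. For real workbooks, xldate_as_tuple with book.datemode remains the safer route.

```python
import datetime

# Assumption: 1900 date system (datemode 0) and serial >= 61.
EXCEL_EPOCH_1900 = datetime.datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial (whole days, fraction = time of day) to datetime."""
    return EXCEL_EPOCH_1900 + datetime.timedelta(days=serial)


as_date = excel_serial_to_datetime(42240.0)
formatted = as_date.strftime('%Y:%m:%d')
print(formatted)  # 2015:08:24
```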

Converting a list from a .txt file into a dictionary

Ok, I've tried all the methods in Convert a list to a dictionary in Python, but I can't seem to get this to work right. I'm trying to convert a list that I've made from a .txt file into a dictionary. So far my code is:
import os.path
from tkinter import *
from tkinter.filedialog import askopenfilename
import csv
window = Tk()
window.title("Please Choose a .txt File")
fileName = askopenfilename()
classInfoList = []
classRoster = {}
with open(fileName, newline = '') as listClasses:
    for line in csv.reader(listClasses):
        classInfoList.append(line)
The .txt file is in the format:
professor
class
students
An example would be:
Professor White
Chem 101
Jesse Pinkman, Brandon Walsh, Skinny Pete
The output I desire would be a dictionary with professors as the keys, and then the class and list of students for the values.
OUTPUT:
{"Professor White": ["Chem 101", [Jesse Pinkman, Brandon Walsh, Skinny Pete]]}
However, when I tried the things in the above post, I kept getting errors.
What can I do here?
Thanks
Since the data making up your dictionary is on consecutive lines, you will have to process three lines at once. You can call the built-in next() function on the file handle like this:
output = {}
with open('file1') as input_file:
    for line in input_file:
        key = line.strip()
        # the next two lines hold the class name and the student list
        value = [next(input_file).strip()]
        value.append([name.strip() for name in next(input_file).split(',')])
        output[key] = value
This would give you:
{'Professor White': ['Chem 101',
                     ['Jesse Pinkman', 'Brandon Walsh', 'Skinny Pete']]}
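An alternative sketch of the same three-lines-per-record idea, using list slicing and zip instead of next(). It reads from an io.StringIO stand-in so it can be run as is; the second record is made up for illustration. It also assumes the file holds complete groups of three non-blank lines.

```python
import io

# Stand-in for the .txt file; the second record is invented for illustration.
sample = io.StringIO(
    "Professor White\n"
    "Chem 101\n"
    "Jesse Pinkman, Brandon Walsh, Skinny Pete\n"
    "Professor Black\n"
    "Physics 201\n"
    "Ann Lee, Bob Ray\n"
)

# Drop blank lines, then walk the remaining lines three at a time.
lines = [line.strip() for line in sample if line.strip()]
roster = {}
for professor, course, students in zip(lines[0::3], lines[1::3], lines[2::3]):
    roster[professor] = [course, [name.strip() for name in students.split(',')]]

print(roster)
```

With a real file, replace the StringIO object with open(fileName). Note that zip silently drops a trailing incomplete group.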