This is what I intend to do:
1. Read a JSON file with codecs and UTF-8 encoding.
2. Load each JSON record into Python as a dictionary.
3. Iterate through the records; if the 'categories' key contains the value 'Restaurant', add the business ID to a set; otherwise continue to the next iteration.
Issue: the 'categories' key may contain values like 'Restaurant', 'Restaurant and Bistro', 'Restaurant and Bar'.
My if condition should select all three of these values, not only 'Restaurant'.
Sample code follows:
import codecs
import json

restaurant_ids = set()
# open the json file
with codecs.open('json_file.json', encoding='utf_8') as f:
    # iterate through each line (json record) in the file
    for b_json in f:
        # convert the json record to a Python dict
        business = json.loads(b_json)
        # if this business is not a restaurant, skip to the next one
        if u'Restaurants' not in business[u'categories']:
            continue
        # add the restaurant business id to our restaurant_ids set
        restaurant_ids.add(business[u'business_id'])
print(len(restaurant_ids))
I am getting an error at the if condition. business[u'categories'] seems to be a unicode object, yet I get the following error:
TypeError: argument of type 'NoneType' is not iterable
Any help would be highly appreciated.
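At least one of the JSON objects has no usable categories value. The error means business[u'categories'] is None for that record, which happens when the JSON contains "categories": null. (A key that is missing altogether would raise KeyError instead.) Skip such records before doing the membership test.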
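A minimal sketch of such a filter follows. It assumes each record stores categories either as a list of strings or as a single string, and it treats a missing key and a null value the same way. The helper names is_restaurant and collect_restaurant_ids and the sample records are made up for illustration.

```python
import json


def is_restaurant(business):
    """Return True if any category contains the substring 'Restaurant'."""
    # .get() returns None both for a missing key and for a JSON null
    categories = business.get('categories')
    if not categories:
        return False
    # Assumption: categories is a list of strings or one plain string
    if isinstance(categories, str):
        categories = [categories]
    return any('Restaurant' in category for category in categories)


def collect_restaurant_ids(lines):
    """Collect business ids from an iterable of JSON lines."""
    restaurant_ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        business = json.loads(line)
        if is_restaurant(business):
            restaurant_ids.add(business['business_id'])
    return restaurant_ids


# Hypothetical sample records, one JSON object per line
sample_lines = [
    '{"business_id": "a", "categories": ["Restaurant"]}',
    '{"business_id": "b", "categories": ["Restaurant and Bistro", "Nightlife"]}',
    '{"business_id": "c", "categories": "Restaurant and Bar"}',
    '{"business_id": "d", "categories": null}',
    '{"business_id": "e"}',
    '{"business_id": "f", "categories": ["Dentists"]}',
]
restaurant_ids = collect_restaurant_ids(sample_lines)
```

Because the test is a substring match, it also selects 'Restaurants' and the longer variants listed in the question.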
There is a text file containing data in this form:
[sec1]
"ab": "s"
"sd" : "d"
[sec2]
"rt" : "ty"
"gh" : "rr"
"kk":"op"
We are supposed to return the data of the matching section in JSON format. For example, if the user asks for sec1, we should send the contents of sec1.
The format you specified is very similar to the TOML format. However, TOML uses equals signs to assign key-value pairs.
If your format really uses colons for the assignment, the following example may help you.
It uses regular expressions together with a defaultdict to read the data from the file. The section to be queried is extracted from the URL using a variable rule.
If the requested section is not in the loaded data, the server responds with a 404 error (NOT FOUND).
import re
from collections import defaultdict
from flask import (
    Flask,
    abort,
    jsonify
)

def parse(f):
    data = defaultdict(dict)
    section = None
    for line in f:
        line = line.strip()
        # a section header such as [sec1]
        if re.match(r'^\[[^\]]+\]$', line):
            section = line[1:-1]
            data[section] = dict()
            continue
        # a key-value pair such as "ab": "s"
        m = re.match(r'^"(?P<key>[^"]+)"\s*:\s*"(?P<val>[^"]+)"$', line)
        if m:
            key, val = m.groups()
            if not section:
                raise OSError('illegal format')
            data[section][key] = val
            continue
    return dict(data)

app = Flask(__name__)

@app.route('/<string:section>')
def data(section):
    path = 'path/to/file'
    with open(path) as f:
        data = parse(f)
    if section in data:
        return jsonify(data[section])
    abort(404)
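To check the parsing logic without running a server, the same idea can be sketched with the standard library only. The function name parse_sections is made up for this illustration, and the sample text is the one from the question. It is a compact variant of the approach above, not a drop-in replacement.

```python
import json
import re

SECTION_RE = re.compile(r'^\[([^\]]+)\]$')
PAIR_RE = re.compile(r'^"([^"]+)"\s*:\s*"([^"]+)"$')


def parse_sections(lines):
    """Group "key": "value" lines under the [section] header before them."""
    data = {}
    section = None
    for raw in lines:
        line = raw.strip()
        header = SECTION_RE.match(line)
        if header:
            section = header.group(1)
            data[section] = {}
            continue
        pair = PAIR_RE.match(line)
        if pair:
            if section is None:
                raise ValueError('key-value pair before any section header')
            key, val = pair.groups()
            data[section][key] = val
    return data


sample = '''[sec1]
"ab": "s"
"sd" : "d"
[sec2]
"rt" : "ty"
"gh" : "rr"
"kk":"op"
'''

sections = parse_sections(sample.splitlines())
# Serialize only the requested section, as the web handler would
sec1_json = json.dumps(sections['sec1'], sort_keys=True)
```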
I'm new to Python but need to write a script for an API, so I'm trying to read the response from the API and write it to a file. The Python version is 2.7.
The code is as follows:
import requests
import json
#URL = "someurl"
# sending get request and saving the response as response object
#response = requests.get(url = URL)
#print(response.status_code)
#print(response.content)
items = json.loads('{"batch_id":"5d83a2d317cb4","names":{"19202":"text1","19203":"text2"}}')
print(items['names'])
for item in items['names']:
    print(item)
Current output is
19202
19203
But I would like to pick text1 and text2 and write them to a file. Can anyone help me get those values?
items is a dictionary, and items['names'] is also a dictionary. for item in items['names']: iterates over the keys, not the values, so item holds a key of the dictionary.
To access the value of that key-value pair, use print(items['names'][item]) instead of print(item). Your code should look something like this:
import requests
import json
#URL = "someurl"
# sending get request and saving the response as response object
#response = requests.get(url = URL)
#print(response.status_code)
#print(response.content)
items = json.loads('{"batch_id":"5d83a2d317cb4","names":{"19202":"text1","19203":"text2"}}')
print(items['names'])
for item in items['names']:
    print(items['names'][item])
>>> print(list(items['names'].values()))
['text1', 'text2']
So you can do it like this:
for item in items['names'].values():
    print(item)
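The question also asks how to write the values to a file. A minimal sketch, assuming one value per line is the desired layout, is shown below. The output path is a temporary location chosen for illustration only.

```python
import io
import json
import os
import tempfile

items = json.loads('{"batch_id":"5d83a2d317cb4","names":{"19202":"text1","19203":"text2"}}')

# Hypothetical output location; replace with the real target path
out_path = os.path.join(tempfile.mkdtemp(), 'names.txt')

# io.open accepts an encoding on both Python 2.7 and Python 3
with io.open(out_path, 'w', encoding='utf-8') as out_file:
    for value in items['names'].values():
        out_file.write(value + u'\n')

# Read the file back to confirm what was written
with io.open(out_path, encoding='utf-8') as in_file:
    written = in_file.read().splitlines()
```

On Python 2.7 the order of the values is not guaranteed, so sort them first if the order matters.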
I get the following error when trying to make an online prediction on my ML Engine model.
The key "values" is not correct. (See error on image.)
I already tested with raw image data, {"image_bytes":{"b64": base64.b64encode(jpeg_data)}}, and with the data converted to a numpy array.
Currently I have the following code:
from googleapiclient import discovery
import base64
import os
from PIL import Image
import json
import numpy as np

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/Users/jacob/Desktop/******"

def predict_json(project, model, instances, version=None):
    """Send json data to a deployed model for prediction.
    Args:
        project (str): project where the Cloud ML Engine Model is deployed.
        model (str): model name.
        instances ([Mapping[str: Any]]): Keys should be the names of Tensors
            your deployed model expects as inputs. Values should be datatypes
            convertible to Tensors, or (potentially nested) lists of datatypes
            convertible to tensors.
        version: str, version of the model to target.
    Returns:
        Mapping[str: any]: dictionary of prediction results defined by the
            model.
    """
    # Create the ML Engine service object.
    # To authenticate set the environment variable
    # GOOGLE_APPLICATION_CREDENTIALS=<path_to_service_account_file>
    service = discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)
    if version is not None:
        name += '/versions/{}'.format(version)
    response = service.projects().predict(
        name=name,
        body={'instances': instances}
    ).execute()
    if 'error' in response:
        raise RuntimeError(response['error'])
    return response['predictions']

savepath = 'upload/11277229_F.jpg'
img = Image.open('test/01011000/11277229_F.jpg')
test = img.resize((299, 299))
test.save(savepath)
img1 = open(savepath, "rb").read()

def load_image(filename):
    with open(filename) as f:
        return np.array(f.read())

predict_json('image-recognition-25***08', 'm500_200_waug', [{"values": str(base64.b64encode(img1).decode("utf-8")), "key": '87'}], 'v1')
The error message itself indicates (as you point out in the question), that the key "values" is not one of the inputs specified in the model. To inspect the model's input, use saved_model_cli show --all --dir=/path/to/model. That will show you a list of the names of the inputs. You'll need to use the correct name.
That said, there appears to be another issue. It's not clear from the question what type of input your model expects, though it's likely one of two things:
(1) A matrix of integers or floats
(2) A byte string with the raw image file contents
The exact solution depends on which of the above your exported model uses. saved_model_cli will help here, based on the dtype and shape of the input: it will be either DT_FLOAT (or some other int/float type) with a shape like (-1, 299, 299, CHANNELS), or DT_STRING with shape (-1), respectively.
If your model is type (1), then you will need to send a matrix of ints/floats (which does not use base64 encoding):
predict_json('image-recognition-25***08', 'm500_200_waug', [{CORRECT_INPUT_NAME: load_image(savepath).tolist(), "key": '87'}], 'v1')
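Note the use of tolist to convert the numpy array to a list of lists. This assumes load_image returns the decoded pixel values (for example np.array(Image.open(filename))) rather than the raw file contents.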
In the case of type (2), you need to tell the service you have some base64 data by adding in {"b64": ...}:
predict_json('image-recognition-25***08', 'm500_200_waug', [{CORRECT_INPUT_NAME: {"b64": str(base64.b64encode(img1).decode("utf-8"))}, "key": '87'}], 'v1')
All of this, of course, depends on using the correct name for CORRECT_INPUT_NAME.
One final note: I'm assuming your model actually does have key as an additional input, since you included it in your request. Again, that can be verified against the output of saved_model_cli show.
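The request body for the type (2) case can be sketched with the standard library alone. The helper name make_b64_instance is made up for this illustration, and 'image_bytes' is only a placeholder for whatever input name saved_model_cli reports for your model. The bytes used here are not a real image.

```python
import base64
import json


def make_b64_instance(image_bytes, input_name, key):
    """Build one prediction instance that carries binary data as base64."""
    encoded = base64.b64encode(image_bytes).decode('utf-8')
    # The {"b64": ...} wrapper marks the value as base64-encoded binary data
    return {input_name: {'b64': encoded}, 'key': key}


# Placeholder bytes standing in for the contents of a JPEG file
fake_jpeg = b'\xff\xd8\xff\xe0 not a real image'

# 'image_bytes' is an assumed input name; use the one your model exports
instance = make_b64_instance(fake_jpeg, 'image_bytes', '87')
body = json.dumps({'instances': [instance]})
```

The resulting list [instance] is what would be passed as the instances argument of predict_json.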
I used to get this error too, in case anyone else comes across it while using gcloud.
Tensors are automatically called csv_rows. For example, this works for me now:
"instances": [{
    "csv_row": "STRING,7,4.02611534,9,14,0.66700000,0.17600000,0.00000000,0.00000000,1299.76500000,57",
    "key": "0"
}]
I am getting the following POST data in my Django app:
POST
Variable Value
csrfmiddlewaretoken u'LHM3nkrrrrrrrrrrrrrrrrrrrrrrrrrdd'
id u'{"docs":[],"dr":1, "id":4, "name":"Group", "proj":"/al/p1/proj/2/", "resource":"/al/p1/dgroup/4/","route":"group", "parent":null'
I am trying to get the id value from the variable id, i.e. "id":4 (the value 4). When I do request.POST.get('id') I get the whole JSON string: u'{"docs":[],"dr":1, "id":4, "name":"Group", "proj":"/al/p1/proj/2/", "resource":"/al/p1/dgroup/4/","route":"group", "parent":null' How can I get the "id" from the string?
The data you are sending is simply a JSON string.
You have to parse that string before you can access the data within it. For this you can use Python's json module (part of the standard library since Python 2.6).
import json
data = json.loads(request.POST.get('id'))
id = data["id"]
If you somehow don't have the json module, you can get the simplejson module.
For more details, refer to this question: best way to deal with JSON in django
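Outside of a Django view, the parsing step can be tried on a plain string. The sketch below assumes the posted value is a complete JSON object, so the sample string ends with a closing brace. The helper name extract_id is made up for illustration, and it returns None instead of raising when the value is missing or malformed.

```python
import json

# Sample value modelled on the question, assumed to be complete JSON
raw_id = (u'{"docs":[],"dr":1, "id":4, "name":"Group", '
          u'"proj":"/al/p1/proj/2/", "resource":"/al/p1/dgroup/4/",'
          u'"route":"group", "parent":null}')


def extract_id(raw_value):
    """Return the "id" field of a JSON object string, or None on failure."""
    try:
        data = json.loads(raw_value)
    except (TypeError, ValueError):
        # TypeError: the value was None; ValueError: the text is not valid JSON
        return None
    if not isinstance(data, dict):
        return None
    return data.get('id')


group_id = extract_id(raw_id)
```

In a view, raw_value would be the result of request.POST.get('id').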
That's happening because id is a string, not a dict as it should be. Please provide your template and view code so we can find the source of the problem.
I'm using Django to manage a Postgres database. I have a value stored in the database representing a city in Spain (Málaga). My Django project uses unicode strings for everything by putting from __future__ import unicode_literals at the beginning of each of the files I created.
I need to pull the city information from the database and send it to another server using an XML request. There is logging in place along the way so that I can observe the flow of data. When I try and log the value for the city I get the following traceback:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 1: ordinal not in range(128)
Here is the code I use to log the values I'm passing.
def createXML(self, dict):
    """
    .. method:: createXML()
    Create a single-depth XML string based on a set of tuples
    :param dict: Set of tuples (simple dictionary)
    """
    xml_string = ''
    for key in dict:
        self.logfile.write('\nkey = {0}\n'.format(key))
        if (isinstance(dict[key], basestring)):
            self.logfile.write('basestring\n')
            self.logfile.write('value = {0}\n\n'.format(dict[key].decode('utf-8')))
        else:
            self.logfile.write('value = {0}\n\n'.format(dict[key]))
        xml_string += '<{0}>{1}</{0}>'.format(key, dict[key])
    return xml_string
I'm basically saving all the information I have in a simple dictionary and using this function to generate an XML formatted string - this is beyond the scope of this question.
The error I am getting had me wondering what was actually being saved in the database. I have verified the value is utf-8 encoded. I created a simple script to extract the value from the database, decode it and print it to the screen.
from __future__ import unicode_literals
import psycopg2
# Establish the database connection
try:
    db = psycopg2.connect("dbname = 'dbname' \
                           user = 'user' \
                           host = 'IP Address' \
                           password = 'password'")
    cur = db.cursor()
except:
    print "Unable to connect to the database."
# Get database info if any is available
command = "SELECT state FROM table WHERE id = 'my_id'"
cur.execute(command)
results = cur.fetchall()
state = results[0][0]
print "my state is {0}".format(state.decode('utf-8'))
Result: my state is Málaga
In Django I'm doing the following to create the HTTP request:
## Create the header
http_header = "POST {0} HTTP/1.0\nHost: {1}\nContent-Type: text/xml\nAuthorization: Basic {2}\nContent-Length: {3}\n\n"
req = http_header.format(service, host, auth, len(self.xml_string)) + self.xml_string
Can anyone help me correct the problem so that I can write this information to the database and be able to create the req string to send to the other server?
Am I getting this error as a result of how Django is handling this? If so, what is Django doing? Or, what am I telling Django to do that is causing this?
EDIT1:
I've tried to use Django's django.utils.encoding on this state value as well. I read a little from saltycrane about a possible hiccup Django might have with unicode/utf-8 stuff.
I tried to modify my logging to use the smart_str functionality.
def createXML(self, dict):
    """
    .. method:: createXML()
    Create a single-depth XML string based on a set of tuples
    :param dict: Set of tuples (simple dictionary)
    """
    xml_string = ''
    for key in dict:
        if (isinstance(dict[key], basestring)):
            if (key == 'v1:State'):
                var_str = smart_str(dict[key])
                for index in range(0, len(var_str)):
                    var = bin(ord(var_str[index]))
                    self.logfile.write(var)
                    self.logfile.write('\n')
                self.logfile.write('{0}\n'.format(var_str))
        xml_string += '<{0}>{1}</{0}>'.format(key, dict[key])
    return xml_string
I'm able to write the correct value to the log doing this but I narrowed down another possible problem with the .format() string functionality in Python. Of course my Google search of python format unicode had the first result as Issue 7300, which states that this is a known "issue" with Python 2.7.
Now, from another stackoverflow post I found a "solution" that does not work in Django with the smart_str functionality (or at least I've been unable to get them to work together).
I'm going to continue digging around and see if I can't find the underlying problem - or at least a work-around.
EDIT2:
I found a work-around by simply concatenating strings rather than using the .format() functionality. I don't like this "solution" - it's ugly, but it got the job done.
def createXML(self, dict):
    """
    .. method:: createXML()
    Create a single-depth XML string based on a set of tuples
    :param dict: Set of tuples (simple dictionary)
    """
    xml_string = ''
    for key in dict:
        xml_string += '<{0}>'.format(key)
        if (isinstance(dict[key], basestring)):
            xml_string += smart_str(dict[key])
        else:
            xml_string += str(dict[key])
        # close the element that was opened above
        xml_string += '</{0}>'.format(key)
    return xml_string
I'm going to leave this question unanswered as I'd love to find a solution that lets me use .format() the way it was intended.
This is the correct approach. The problem was with how the file was opened: to write unicode text as UTF-8 in Python 2, open the file with an encoding-aware function such as codecs.open():
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import codecs

class Writer(object):
    logfile = codecs.open("test.log", "w", 'utf-8')

    def createXML(self, dict):
        xml_string = ''
        for key, value in dict.iteritems():
            self.logfile.write(u'\nkey = {0}\n'.format(key))
            if (isinstance(value, basestring)):
                self.logfile.write(u'basestring\n')
                self.logfile.write(u'value = {0}\n\n'.format(value))
            else:
                self.logfile.write(u'value = {0}\n\n'.format(value))
            xml_string += u'<{0}>{1}</{0}>'.format(key, value)
        return xml_string
And this is from the Python console:
In [1]: from test import Writer
In [2]: d = { 'a' : u'Zażółć gęślą jaźń', 'b' : u'Och ja Ci zażółcę' }
In [3]: w = Writer()
In [4]: w.createXML(d)
Out[4]: u'<a>Za\u017c\xf3\u0142\u0107 g\u0119\u015bl\u0105 ja\u017a\u0144</a><b>Och ja Ci za\u017c\xf3\u0142c\u0119</b>'
And this is the test.log file:
key = a
basestring
value = Zażółć gęślą jaźń
key = b
basestring
value = Och ja Ci zażółcę
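The same idea can be sketched in a form that should run on both Python 2.7 and Python 3, using io.open and unicode format strings throughout. The function name create_xml, the temporary log path, and the 'v1:Code' key are made up for illustration. Like the original, it does no XML escaping of the values.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def create_xml(values, logfile):
    """Build a single-depth XML string, logging each key and value."""
    xml_string = u''
    for key in sorted(values):
        value = values[key]
        # The format string is unicode, so non-ASCII values are not
        # pushed through the ascii codec
        logfile.write(u'key = {0}, value = {1}\n'.format(key, value))
        xml_string += u'<{0}>{1}</{0}>'.format(key, value)
    return xml_string


# Hypothetical log location for this example
log_path = os.path.join(tempfile.mkdtemp(), 'test.log')

# io.open encodes the unicode text as UTF-8 when writing
with io.open(log_path, 'w', encoding='utf-8') as logfile:
    xml = create_xml({u'v1:State': u'M\xe1laga', u'v1:Code': 29}, logfile)

with io.open(log_path, encoding='utf-8') as logfile:
    logged = logfile.read()
```

Encoding to bytes then happens in exactly one place, when the text is written to the file or sent over the network.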