Google Vision API 'TypeError: invalid file' - google-cloud-platform

The following piece of code comes from Google's Vision API documentation; the only modification I've made is adding the argument parser for the function at the bottom.
import argparse
import os
from google.cloud import vision
import io
def detect_text(path):
    """Detects text in the file."""
    client = vision.ImageAnnotatorClient()
    with io.open(path, 'rb') as image_file:
        content = image_file.read()
    image = vision.types.Image(content=content)
    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')
    for text in texts:
        print('\n"{}"'.format(text.description))
        vertices = (['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices])
        print('bounds: {}'.format(','.join(vertices)))
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str,
    help="path to input image")
args = vars(ap.parse_args())
detect_text(args)
If I run it from a terminal like below, I get this invalid file error:
PS C:\VisionTest> python visionTest.py --image C:\VisionTest\test.png
Traceback (most recent call last):
File "visionTest.py", line 31, in <module>
detect_text(args)
File "visionTest.py", line 10, in detect_text
with io.open(path, 'rb') as image_file:
TypeError: invalid file: {'image': 'C:\\VisionTest\\test.png'}
I've tried with various images and image types as well as running the code from different locations with no success.

Seems like either the file doesn't exist or is corrupt since it isn't even read. Can you try another image and validate it is in the location you expect?
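The traceback points at a different cause, though: the value reported as the invalid file is {'image': 'C:\\VisionTest\\test.png'}, which is the whole dict returned by vars(ap.parse_args()), not a path string. A minimal sketch of pulling the string out before opening the file follows; the helper name parse_image_path and the required=True flag are illustrative additions, not part of the original script.

```python
import argparse

def parse_image_path(argv):
    """Return the value of --image as a plain string (hypothetical helper)."""
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", type=str, required=True,
                    help="path to input image")
    args = vars(ap.parse_args(argv))  # a dict such as {'image': '...'}
    return args["image"]              # the string that io.open() expects

# Explicit argument list for demonstration; a script would pass sys.argv[1:].
path = parse_image_path(["--image", r"C:\VisionTest\test.png"])
print(type(path).__name__, path)
```

In the original script the equivalent change would be calling detect_text(args["image"]) instead of detect_text(args).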

Related

scheduler produces empty files

I'm using pythonanywhere for a simple scheduled task.
I want to download data from a link once a day and save csv files. Later, once I have a decent time series, I'll figure out how I actually want to manage the data. It's not much data, so I don't need anything fancy like a database.
My script takes the data from the google sheets link, adds a log column and a time column, then writes a csv with the date in the filename.
It works exactly as I want it to when I run it manually in pythonanywhere, but the scheduler is just creating empty csv files albeit with the correct name.
Any ideas what's up? I don't understand the log file. Surely the error should happen when it is run manually?
script:
import pandas as pd
import time
import datetime
def write_today(df):
    date = time.strftime("%Y-%m-%d")
    df.to_csv('Properties_'+date+'.csv')
url = 'https://docs.google.com/spreadsheets/d/19h2GmLN-2CLgk79gVxcazxtKqS6rwW36YA-qvuzEpG4/export?format=xlsx'
df = pd.read_excel(url, header=1).rename(columns={'Unnamed: 1':'code'})
source = pd.read_excel(url).columns[0]
df['source'] = source
df['time'] = datetime.datetime.now()
write_today(df)
the scheduler is set up as so:
log file:
Traceback (most recent call last):
File "/home/abmoore/load_data.py", line 24, in <module>
write_today(df)
File "/home/abmoore/load_data.py", line 16, in write_today
df.to_csv('Properties_'+date+'.csv')
File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1344, in to_csv
formatter.save()
File "/usr/local/lib/python2.7/dist-packages/pandas/formats/format.py", line 1551, in save
self._save()
File "/usr/local/lib/python2.7/dist-packages/pandas/formats/format.py", line 1638, in _save
self._save_header()
File "/usr/local/lib/python2.7/dist-packages/pandas/formats/format.py", line 1634, in _save_header
writer.writerow(encoded_labels)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 0: ordinal not in range(128)
Your problem there is the UnicodeEncodeError -- you have some non-ASCII data in your spreadsheet (the u'\xa3' in the traceback is a pound sign), and the pandas to_csv function defaults to ASCII encoding under Python 2. Try specifying utf8 instead:
def write_today(df):
    filename = 'Properties_{date}.csv'.format(date=time.strftime("%Y-%m-%d"))
    df.to_csv(filename, encoding='utf8')
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html
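To see why the encoding matters, here is a small standard-library sketch (no pandas needed) that encodes a label containing the pound sign from the traceback. The label text and the helper name try_encode are made up for illustration. The ASCII codec rejects the character, while UTF-8 encodes it as two bytes.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

label = u"\u00a3 price"  # hypothetical column label starting with a pound sign

print(try_encode(label, "ascii"))  # the codec named in the traceback
print(try_encode(label, "utf-8"))  # the codec suggested in the answer
```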

How to load retrained_graph.pb and retrained_label.txt using pycharm editor

Using Pete Warden's tutorials I trained the Inception network, and the training produced two files:
1. retrained_graph.pb
2. retrained_label.txt
I want to use these to classify a flower image.
I installed PyCharm and linked all the TensorFlow libraries; I also tested the sample TensorFlow code, and it works fine.
Now when I run the label_image.py program, which is
import tensorflow as tf, sys
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
               in tf.gfile.GFile("/tf_files/retrained_labels.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/tf_files/retrained_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
    # Feed the image_data as input to the graph and get first prediction
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor, \
             {'DecodeJpeg/contents:0': image_data})
    # Sort to show labels of first prediction in order of confidence
    top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
    for node_id in top_k:
        human_string = label_lines[node_id]
        score = predictions[0][node_id]
        print('%s (score = %.5f)' % (human_string, score))
I am getting this error message:
/home/chandan/Tensorflow/bin/python /home/chandan/PycharmProjects/tf/tf_folder/tf_files/label_image.py
Traceback (most recent call last):
File "/home/chandan/PycharmProjects/tf/tf_folder/tf_files/label_image.py", line 7, in <module>
image_path = sys.argv[1]
IndexError: list index out of range
Could anyone please help me with this issue?
You are getting this error because the script expects the image name (with path) as an argument.
In PyCharm, go to View -> Tool Windows -> Terminal.
This is the same as opening a separate terminal. Then run:
python label_image.py /image_path/image_name.jpg
You are trying to read a command-line argument with sys.argv[1], so you need to supply one. It looks like the required argument is a test image; pass its location as a parameter.
PyCharm should have a script parameters and interpreter options dialog that you can use to enter the required parameters.
Or you can call the script from a command line and pass the parameter:
>python my_python_script.py my_python_parameter.jpg
EDIT:
According to the documentation (I don't have PyCharm installed on this computer), you should go to the Run/Debug Configurations menu and edit the configuration for your script. Add the absolute path of your file to the Script Parameters box, in quotes.
Alternatively, if you want to skip the parameter entirely, read the path with raw_input (input in Python 3) or simply hard-code it: image_path = r"absolute_image_path.jpg"
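A further option, sketched here as a suggestion rather than anything from the tutorial, is to check the argument list before indexing it, so that a missing argument produces a usage message instead of an IndexError. The function name get_image_path and the wording of the message are illustrative.

```python
def get_image_path(argv):
    """Return the image path from argv, or exit with a usage message."""
    if len(argv) < 2:
        # SystemExit with a string prints the message and exits with status 1.
        raise SystemExit("usage: python label_image.py /path/to/image.jpg")
    return argv[1]

# Demonstration with an explicit list; a script would pass sys.argv instead.
print(get_image_path(["label_image.py", "flower.jpg"]))
```

In label_image.py this would replace the bare image_path = sys.argv[1] line with image_path = get_image_path(sys.argv).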

telnetlib and python subprocess in telnet

I am telnetting into a Keysight N1914A power meter, and Python's subprocess.check_output(["Measure:Power?"]) is not working, so I am trying to use Python's telnetlib. I do not need a username or password to log in; the IP address and port number are all it needs.
There are lots of examples showing how to log in to a device. My question is how to obtain the results from the device after sending commands.
For example, on the device, typing *IDN? returns its device information, and typing Measure:Power? returns the power in decibel format.
import time
import csv
from string import split
import getpass
import sys
import telnetlib
import subprocess
Host = "192.168.1.10"
PORT = 5024
TIMEOUT =10
i = open('practice1.csv', 'wb')
tn = telnetlib.Telnet(Host,PORT)
print "You log in"
time.sleep(5)
while True:
    #Powertemp1 = subprocess.check_output(["Measure:Power?"])
    #tn.write("Measure:Power?")
    tn.write("*IDN?")
    Powertemp1 = tn.read_all()
    print type(Powertemp1)
    print Powertemp1
    #Powertemp = float(Powertemp1)
    #print '&s' % Powertemp
    #wr = csv.writer(i, dialet = 'excel')
    #wr.writerow([Powertemp])
    time.sleep(5)
type(tn.read_all()) is str, but on the actual screen the output is around 40 empty lines, and nothing is stored in the output text file.
Here is the result:
You log in
Traceback (most recent call last):
File "sunlook.py", line 25, in <module>
tn.write("*IDN?")
File "/usr/lib/python2.7/telnetlib.py", line 282, in write
self.sock.sendall(buffer)
File "/usr/lib/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
socket.error: [Errno 32] Broken pipe
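Two details in the script may be worth checking, offered as a guess rather than a confirmed diagnosis. First, tn.write("*IDN?") sends no line terminator, and instruments of this kind usually wait for a newline before answering. Second, read_all() reads until the connection is closed, so a second write on the next loop pass can hit a socket the peer has already shut, which would match the broken pipe. The sketch below uses the plain socket module instead of telnetlib and talks to a stand-in "instrument" thread so it can run without hardware. The query and fake_instrument names and the reply string are hypothetical, and a real telnet port may also send a banner or prompt that this sketch does not handle.

```python
import socket
import threading

def query(sock, command, timeout=5.0):
    """Send one newline-terminated command and return the reply line."""
    sock.settimeout(timeout)
    sock.sendall(command.encode("ascii") + b"\n")
    reply = b""
    while not reply.endswith(b"\n"):
        chunk = sock.recv(1024)
        if not chunk:  # peer closed the connection
            break
        reply += chunk
    return reply.decode("ascii").strip()

def fake_instrument(conn):
    """Stand-in for the meter: answers a single *IDN? query, then closes."""
    data = b""
    while not data.endswith(b"\n"):
        chunk = conn.recv(1024)
        if not chunk:
            break
        data += chunk
    if data.strip() == b"*IDN?":
        conn.sendall(b"HYPOTHETICAL,MODEL,0,0.0\n")  # made-up identity string
    conn.close()

# A connected socket pair replaces the network link for this demonstration.
client, server = socket.socketpair()
worker = threading.Thread(target=fake_instrument, args=(server,))
worker.start()
print(query(client, "*IDN?"))
worker.join()
client.close()
```

With telnetlib itself, the corresponding idea would be to append "\n" to each command and to read with read_until and a timeout rather than read_all.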

Error using ElementTree to parse data from .config file

I'm trying to use ElementTree to get data from a .config file. The structure of this file looks like this, for example:
<userSettings>
  <AutotaskUpdateTicketEstimatedHours.My.MySettings>
    <setting name="Username" serializeAs="String">
      <value>AAA</value>
    </setting>
My code is this:
import os, sys
import xml.etree.ElementTree as ET
class Init():
    script_dir = os.path.dirname(__file__)
    rel_path = "app.config"
    abs_file_path = os.path.join(script_dir, rel_path)
    tree = ET.parse(abs_file_path)
    root = tree.getroot()
    sites = root.iter('userSettings')
    for site in sites:
        apps = site.findall('AutotaskUpdateTicketEstimatedHours.My.MySettings')
        for app in apps:
            print(''.join([site.get('Username'), app.get('value')]))
if __name__ == '__main__':
    handler = Init()
However, when I run this code I get:
Traceback (most recent call last):
File "/Users/AAAA/Documents/Aptana/AutotaskUpdateTicketEstimatedHours/Main.py", line 5, in <module>
class Init():
File "/Users/AAA/Documents/Aptana/AutotaskUpdateTicketEstimatedHours/Main.py", line 16, in Init
print(''.join([site.get('Username'), app.get('value')]))
TypeError: sequence item 0: expected string, NoneType found
What am I doing wrong that causes this error?
(My problem seems to be accessing the tree structure of my config.file correctly)
You may change your code to:
print(''.join([app.get('name'), app.find('value').text]))
app is an Element object, in this case <setting>. The get function returns an attribute value by name (e.g. name, serializeAs); the find function returns a subelement (e.g. <value>).
Once you have <value>, you can get the data inside it with text.
Note that site (<AutotaskUpdateTicketEstimatedHours.My.MySettings>) doesn't have any attributes, therefore you get None.
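For the one-line change above to work, the inner loop has to reach the <setting> elements themselves. A minimal self-contained sketch follows. It parses an inline string instead of app.config, and the enclosing <configuration> root plus the closing tags are assumed here, since the question shows only an excerpt of the file. The function name read_settings is illustrative.

```python
import xml.etree.ElementTree as ET

# Assumed complete version of the excerpt shown in the question.
CONFIG = """
<configuration>
  <userSettings>
    <AutotaskUpdateTicketEstimatedHours.My.MySettings>
      <setting name="Username" serializeAs="String">
        <value>AAA</value>
      </setting>
    </AutotaskUpdateTicketEstimatedHours.My.MySettings>
  </userSettings>
</configuration>
"""

def read_settings(xml_text):
    """Map each <setting> name attribute to the text of its <value> child."""
    root = ET.fromstring(xml_text)
    return {setting.get("name"): setting.find("value").text
            for setting in root.iter("setting")}

print(read_settings(CONFIG))
```

Iterating with root.iter("setting") finds the elements at any depth, so the two outer loops from the question are not needed in this variant.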

PYPDF watermarking returns error

Hi, I'm trying to watermark a PDF file using PyPDF2, but I get an error and can't figure out what is going wrong.
I get the following error:
Traceback (most recent call last):
File "test.py", line 13, in <module>
page.mergePage(watermark.getPage(0))
File "C:\Python27\site-packages\PyPDF2\pdf.py", line 1594, in mergePage
self._mergePage(page2)
File "C:\Python27\site-packages\PyPDF2\pdf.py", line 1651, in _mergePage
page2Content, rename, self.pdf)
File "C:\Python27\site-packages\PyPDF2\pdf.py", line 1547, in _contentStreamRename
op = operands[i]
KeyError: 0
I'm using Python 2.7.6 with PyPDF2 1.19 on 32-bit Windows.
Hopefully someone can tell me what I'm doing wrong.
My Python file:
from PyPDF2 import PdfFileWriter, PdfFileReader
output = PdfFileWriter()
input = PdfFileReader(open("test.pdf", "rb"))
watermark = PdfFileReader(open("watermark.pdf", "rb"))
# print how many pages input1 has:
print("test.pdf has %d pages." % input.getNumPages())
print("watermark.pdf has %d pages." % watermark.getNumPages())
# add page 0 from input, but first add a watermark from another PDF:
page = input.getPage(0)
page.mergePage(watermark.getPage(0))
output.addPage(page)
# finally, write "output" to document-output.pdf
outputStream = file("outputs.pdf", "wb")
output.write(outputStream)
outputStream.close()
Try writing to a StringIO object instead of a disk file. So, replace this:
outputStream = file("outputs.pdf", "wb")
output.write(outputStream)
outputStream.close()
with this:
import StringIO
outputStream = StringIO.StringIO()
output.write(outputStream) # write merged output to the StringIO object
outputStream.close()
If the above code works, then you might be having file-writing permission issues. For reference, look at the PyPDF working example in my article.
I encountered this error when attempting to use PyPDF2 to merge in a page which had been generated by reportlab, which used an inline image canvas.drawInlineImage(...), which stores the image in the object stream of the PDF. Other PDFs that use a similar technique for images might be affected in the same way -- effectively, the content stream of the PDF has a data object thrown into it where PyPDF2 doesn't expect it.
If you're able to, a solution can be to re-generate the source pdf, but to not use inline content-stream-stored images -- e.g. generate with canvas.drawImage(...) in reportlab.
Here's an issue about this on PyPDF2.