Pylint: Module/Instance of has no member for google.cloud.vision API - google-cloud-platform

When I run this code (which will later be used to detect and extract text using the Google Vision API in Python), I get the following errors:
Module 'google.cloud.vision_v1.types' has no 'Image' member pylint(no-member)
Instance of 'ImageAnnotatorClient' has no 'text_detection' member pylint(no-member)
from google.cloud import vision
from google.cloud.vision import types
import os, io

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'C:\Users\paul\VisionAPI\key.json'
client = vision.ImageAnnotatorClient()

FILE_NAME = 'im3.jpg'
FOLDER_PATH = r'C:\Users\paul\VisionAPI\images'

with io.open(os.path.join(FOLDER_PATH, FILE_NAME), 'rb') as image_file:
    content = image_file.read()

image = vision.types.Image(content=content)
response = client.text_detection(image=image)
What does "Module/Instance of ___ has no members" mean?

I was able to reproduce the pylint error, though the script executes successfully when run (with minor modifications for my environment to change the filename being processed).
Therefore, I am assuming that by "run this code" you mean "run this code through pylint". If not, please update the question with how you are executing the code in a way that generates pylint errors.
This page describes the specific error you are seeing, and the case that causes a false positive for it. This is likely exactly the false positive you are hitting.
The Google Cloud Vision module appears to dynamically create these members, and pylint doesn't have a way to detect that they actually exist at runtime, so it raises the error.
Two options:
Tag the affected lines with a # pylint: disable=no-member annotation, as suggested in the page linked above (see the sketch after this list).
Run pylint with the --ignored-modules=google.cloud.vision_v1 flag (or put the equivalent in your .pylintrc). You'll notice that even the actual module name is different from the one you imported :)
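For example, here is option 1 applied to the question's code, with the .pylintrc equivalent of option 2 shown as a comment ([TYPECHECK] and ignored-modules are standard pylint configuration names):
# Option 1: inline suppression on the lines pylint flags
image = vision.types.Image(content=content)  # pylint: disable=no-member
response = client.text_detection(image=image)  # pylint: disable=no-member

# Option 2: equivalent of the command-line flag, in .pylintrc:
#   [TYPECHECK]
#   ignored-modules=google.cloud.vision_v1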
This is a similar question with more detail about workarounds for the E1101 error.

Related

Dataflow breaks using TaggedOutputs, "can't pickle WeakDictionary"

We are trying to deploy a streaming pipeline to Dataflow in which we split the data into a few different "routes" that we each process differently.
We did the complete development with the DirectRunner and it works smoothly as we tested, but now that we have deployed it to Dataflow, it does not work.
The code fails when yielding in the following DoFn:
import logging

import apache_beam as beam
from apache_beam.pvalue import TaggedOutput


class SplitByRoute(beam.DoFn):

    OUTPUT_TAG_ROUTE_ONE = "route_one"
    OUTPUT_TAG_ROUTE_TWO = "route_two"
    OUTPUT_NOT_SUPPORTED = "not_supported"

    def __init__(self):
        beam.DoFn.__init__(self)

    def process(self, elem):
        try:
            route = self.define_route(elem["param"])  # Just tag it depending on param
        except Exception:
            route = None

        logging.info(f"Routed to {route}")

        if route == self.OUTPUT_TAG_ROUTE_ONE:
            yield TaggedOutput(self.OUTPUT_TAG_ROUTE_ONE, elem)
        elif route == self.OUTPUT_TAG_ROUTE_TWO:
            logging.info(f"Element: {elem}")
            yield TaggedOutput(self.OUTPUT_TAG_ROUTE_TWO, elem)
        else:
            yield TaggedOutput(self.OUTPUT_NOT_SUPPORTED, elem)
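For reference, the tagged outputs of this DoFn are consumed downstream roughly like this (a sketch with illustrative names, not our exact pipeline code):
results = (
    input_pcoll
    | "SplitByRoute" >> beam.ParDo(SplitByRoute()).with_outputs(
        SplitByRoute.OUTPUT_TAG_ROUTE_ONE,
        SplitByRoute.OUTPUT_TAG_ROUTE_TWO,
        SplitByRoute.OUTPUT_NOT_SUPPORTED,
    )
)
# Each tagged output is then fetched by name and processed separately
route_one = results[SplitByRoute.OUTPUT_TAG_ROUTE_ONE]
route_two = results[SplitByRoute.OUTPUT_TAG_ROUTE_TWO]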
The DoFn does log the element and yield the output, but then fails with the following error:
AttributeError: Can't pickle local object 'WeakValueDictionary.__init__.<locals>.remove' [while running 'generatedPtransform-3196']
Another consideration: we use TaggedOutputs earlier in the pipeline before this DoFn, and those work on Dataflow, but this one in particular fails with the error mentioned. Could it be the memory cache, or something related to it where weakrefs are used?
As far as I know, this error happens when you have a class defined inside another one. Maybe not(?)
Any suggestions on how we could manage this? It's been a very frustrating error.
Thank you!!! :)
We found the error
As you might know, apache-beam uses the dill package to serialize the data between the modules. This lets us pickle an instance of an object and send it through the pipeline.
The problem was that in self.define_route(elem["param"]) we used that instance of the class and modified one of its attributes. As the answer from Samuel Romero says, you can pickle a class, but I didn't really know (and probably others don't either) that if you modify the class instance it cannot be pickled again. That's strange behaviour, I know, so I opened an issue on BEAM https://issues.apache.org/jira/browse/BEAM-10384 if you want to check it out.
I will probably get into it (to understand the problem better) sooner or later, but if someone hits the same error, the workaround, as I mentioned, is to not modify the instance of a class being serialized.
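A minimal sketch of that workaround (define_route shown in isolation; lookup_route is an illustrative helper, not real code from our pipeline):
class SplitByRoute(beam.DoFn):
    ...
    def define_route(self, param):
        # Before: mutating instance state here is what broke re-serialization
        # self.last_route = lookup_route(param)
        # return self.last_route

        # After: compute a local value and return it; leave `self` untouched
        return lookup_route(param)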
Thanks to anyone who tried to help!
As you can read here, Python uses the pickle library for data serialization, and it is subject to its limitations. Data serialization is how processes transfer data between one another, since they do not share memory space.
Here I found a suggestion about using a fork of the multiprocessing module that uses the dill package instead of pickle. This fork is part of the pathos framework (as is the dill package), and is now called pathos.multiprocess rather than pathos.multiprocessing as seen in the reference I mentioned previously.
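The difference is easy to demonstrate: the "Can't pickle local object" error above is pickle refusing to serialize a locally defined object, which dill handles by serializing it by value. A minimal standalone example:
import pickle
import dill

def outer():
    def inner():  # a local object; pickle only serializes functions by name
        return 42
    return inner

f = outer()
# pickle.dumps(f)  # AttributeError: Can't pickle local object 'outer.<locals>.inner'
g = dill.loads(dill.dumps(f))  # dill serializes the local function by value
assert g() == 42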

Use TensorBoard with Keras Tuner

I ran into an apparent circular dependency trying to use log data for TensorBoard during a hyper-parameter search done with Keras Tuner, for a model built with TF2. The typical setup for the latter needs to set up the TensorBoard callback in the tuner's search() method, which wraps the model's fit() method.
from kerastuner.tuners import RandomSearch

tuner = RandomSearch(build_model,  # this method builds the model
                     hyperparameters=hp,
                     objective='val_accuracy')
tuner.search(x=train_x, y=train_y,
             validation_data=(val_x, val_y),
             callbacks=[tensorboard_cb])
In practice, the tensorboard_cb callback needs to set up the directory where data will be logged, and this directory has to be unique to each trial. A common way to do this is to name the directory based on the current timestamp, with code like below.
log_dir = time.strftime('trial_%Y_%m_%d-%H_%M_%S')
tensorboard_cb = TensorBoard(log_dir)
This works when training a model with known hyper-parameters. However, when doing a hyper-parameter search, I have to define and specify the TensorBoard callback before invoking tuner.search(). This is the problem: tuner.search() will invoke build_model() multiple times, and each of these trials should have its own TensorBoard directory. Ideally, defining log_dir would be done inside build_model(), but the Keras Tuner search API forces the TensorBoard callback to be defined outside of that function.
TL;DR: TensorBoard gets data through a callback and requires one log directory per trial, but Keras Tuner requires defining the callback once for the entire search, before performing it, not per trial. How can unique directories per trial be defined in this case?
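One workaround I can think of (an untested sketch, not a documented Keras Tuner feature) is a TensorBoard subclass that re-points its log directory each time a trial's model is attached, relying on the fact that set_model() runs at the start of every fit():
import time
import tensorflow as tf

class PerTrialTensorBoard(tf.keras.callbacks.TensorBoard):
    """Give every fit() call (i.e. every tuner trial) its own log directory."""

    def __init__(self, base_log_dir, **kwargs):
        self._base_log_dir = base_log_dir
        super().__init__(log_dir=base_log_dir, **kwargs)

    def set_model(self, model):
        # Re-point log_dir before the parent class creates its writers
        self.log_dir = self._base_log_dir + time.strftime('/trial_%Y_%m_%d-%H_%M_%S')
        super().set_model(model)

tensorboard_cb = PerTrialTensorBoard('logs')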
The Keras Tuner creates a subdir for each run (this statement is probably version dependent).
I guess finding the right version mix is important.
Here is how it works for me, in JupyterLab.
Prerequisites:
pip requirements:
keras-tuner==1.0.1
tensorboard==2.1.1
tensorflow==2.1.0
Keras==2.2.4
jupyterlab==1.1.4
JupyterLab installed, built and running [standard compile arguments: production:minimize]
Here is the actual code. First I define the log folder and the callbacks:
import datetime

import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping  # import added for completeness

# run parameter
log_dir = "logs/" + datetime.datetime.now().strftime("%m%d-%H%M")

# training meta
stop_callback = EarlyStopping(
    monitor='loss', patience=1, verbose=0, mode='auto')

hist_callback = tf.keras.callbacks.TensorBoard(
    log_dir=log_dir,
    histogram_freq=1,
    embeddings_freq=1,
    write_graph=True,
    update_freq='batch')

print("log_dir", log_dir)
Then I define my hypermodel, which I do not want to disclose. Afterwards I set up the hyper-parameter search:
from kerastuner.tuners import Hyperband

hypermodel = get_my_hypermodel()

tuner = Hyperband(
    hypermodel,
    max_epochs=40,
    objective='loss',
    executions_per_trial=5,
    directory=log_dir,
    project_name='test'
)
which I then execute:
tuner.search(
    train_data,
    labels,
    epochs=10,
    validation_data=(val_data, val_labels),
    callbacks=[hist_callback],
    use_multiprocessing=True)

tuner.search_space_summary()
While the notebook with this code searches for adequate hyper-parameters, I monitor the loss in another notebook. Since TF v2, TensorBoard can be called via a magic function:
Cell 1
import tensorboard
Cell 2
%load_ext tensorboard
Cell 3
%tensorboard --logdir 'logs/'
Side note: since I run JupyterLab in a Docker container, I have to specify the appropriate address and port for TensorBoard and also forward them in the Dockerfile.
The result is not really predictable for me... I do not yet understand when I can expect histograms and distributions in TensorBoard.
On some runs the loading time seems really excessive... so have patience.
Under scalars I find a list of the runs as follows:
"logdir"/"model_hash"/execution[iter]/[train/validation]
E.g.
0101-1010/bb7981e03d05b05106d8a35923353ec46570e4b6/execution0/train
0101-1010/bb7981e03d05b05106d8a35923353ec46570e4b6/execution0/validation

Problems with MSF4J and #MatrixParam

Folks, I have found what seems to be a problem with (a bug in?) MSF4J, as including a @MatrixParam-annotated variable in a URI causes the affected (micro)service to either 'hang' indefinitely or, if accessed via a browser, to give a "404 Not Found" message for the path/endpoint, even when correct.
Here is a code fragment that illustrates the problem - it compiles OK (Eclipse/Maven) and deploys without errors using MicroservicesRunner in the usual way.
package org.test.service;

import java.util.List;

import javax.ws.rs.GET;
import javax.ws.rs.MatrixParam;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("books")
public class MPTest { // MatrixParam Test

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    @Path("/query")
    // method to respond to 'GET' requests
    public Response getListOfBooks(@MatrixParam("Author") String author) {
        // do something in here to get book data from DB and sort by titles
        List<String> titles = .......;
        return Response.status(200).entity("List of Books by " + author + " ordered by title " + titles).build();
    }
}
With this code fragment, accessing the URL "(host:8080)/books/query;Author=MickeyMouse" should cause a list of books by that author to be retrieved from the DB (I have omitted the actual code that does so for clarity, as it is not relevant to this post).
However, it does not get there, so that code isn't executed. As far as I can tell with a debugger, no @MatrixParam value is retrieved - it remains null until the process times out. Things like curl and wget just hang until they time out, and from a browser, the best I can get is a 404 Not Found error for the URI, even though it is valid.
However, if I replace the @MatrixParam with a @PathParam it works perfectly, and I can get the URL string retrieved in its entirety. The URI that I get is as expected - no odd hex characters, no typos, and so forth. The URI entered is what you get back. So, no problem there.
Behaviour is also consistent across platforms (a couple of flavours of Linux, and three versions of Windoze), so it is not anything to do with the OS itself. Similarly, I get the same behaviour with multiple clients and tools, so it isn't a problem there either.
So, it appears to be a problem within the MSF4J framework/domain, and I could use some support/help/suggestions here, as I've reached the point of tearing my hair out..... Any ideas, folks?
The only reference I can find to a similar problem was closed as 'off topic' without a reply (see Rest API Matrix param annotation), so I think this needs re-opening as it seems to be a genuine problem....
Regards, and thanks in advance for any help,
Rick
@MatrixParam is not supported in MSF4J at the moment. You can create a GitHub issue so we can implement that support in a future release.

How do I embed an IPython Interpreter into an application running in an IPython Qt Console

There are a few topics on this, but none with a satisfactory answer.
I have a python application running in an IPython qt console
http://ipython.org/ipython-doc/dev/interactive/qtconsole.html
When I encounter an error, I'd like to be able to interact with the code at that point.
import sys

try:
    raise Exception()
except Exception as e:
    try:  # use exception trick to pick up the current frame
        raise None
    except:
        frame = sys.exc_info()[2].tb_frame.f_back
    namespace = frame.f_globals.copy()
    namespace.update(frame.f_locals)
    import IPython
    IPython.embed_kernel(local_ns=namespace)
I would think this would work, but I get an error:
RuntimeError: threads can only be started once
I just use this:
from IPython import embed; embed()
works better than anything else for me :)
Update:
In celebration of this answer receiving 50 upvotes, here are the updates I've made to this snippet in the intervening six years since it was posted.
First, I now like to import and execute in a single statement, as I use black for all my python code these days and it reformats the original snippet in a way that doesn't make sense in this specific and unusual context. So:
__import__("IPython").embed()
Given that I often use this inside a loop or a thread, it can be helpful to include a snippet that allows terminating the parent process (partly for convenience, and partly to remind myself of the best way to do it). os._exit is the best choice here, so my snippet includes this (same logic w/r/t using a single statement):
q = __import__("functools").partial(__import__("os")._exit, 0)
Then I can simply use q() if/when I want to exit the master process.
My full snippet (with # FIXME in case I would ever be likely to forget to remove it!) looks like this:
q = __import__("functools").partial(__import__("os")._exit, 0) # FIXME
__import__("IPython").embed() # FIXME
You can follow this recipe to embed an IPython session into your program:
try:
    get_ipython
except NameError:
    banner = exit_msg = ''
else:
    banner = '*** Nested interpreter ***'
    exit_msg = '*** Back in main IPython ***'

# First import the embed function
from IPython.frontend.terminal.embed import InteractiveShellEmbed

# Now create the IPython shell instance. Put ipshell() anywhere in your code
# where you want it to open.
ipshell = InteractiveShellEmbed(banner1=banner, exit_msg=exit_msg)
Then use ipshell() whenever you want to be dropped into an IPython shell. This will allow you to embed (and even nest) IPython interpreters in your code and inspect objects or the state of the program.
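For example (an illustrative function, not part of the recipe itself):
def compute(x):
    y = x * 2
    ipshell()  # drops into a nested IPython shell; inspect x and y, then exit to resume
    return y

compute(21)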

django signals: fail silently? any better way of debugging mistakes?

It seems that Django signals have a "fail silently" paradigm.
When I make a small spelling error in my signals function, for example:-
def new_users_handler(send, user, response, details, **kwargs):
    print "new_users_handler function executes"
    user.is_new = True
    if user.is_new:
        if "id" in response:
            from urllib2 import urlopen, HTTPError
            from django.template.defaultfilters import slugify
            from django.core.files.base import ContentFile
            try:
                url = None
                if sender == FacebookBackend:
                    url = "http://graph.facebook.com/%s/picture?type=large" \
                        % response["id"]
                elif sender == google.GoogleOAuth2Backend and "picture" in response:
                    url = response["picture"]
                ......

socialauth_registered.connect(new_users_handler, sender=None)
I use "send" as my argument instead of "sender", I don't really get any useful error message or debug information in my devserver stdout.
Is there a good way of making sure that signal failures/error messages get shown loud and clear?
In my example above, this could have been a "5-minute" fix if there had been a proper error message telling me that
name "sender" is not defined
but instead there weren't any error messages, and I was looking everywhere in my code base trying to figure out why my signal wasn't getting called... not cool.
Any advice welcome!
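One detail worth knowing: Django's Signal.send_robust() catches exceptions raised by receivers and returns them instead of propagating them, which can produce exactly this kind of silence. If you can dispatch the signal yourself, a sketch of surfacing those failures (argument names follow the example above):
import logging

responses = socialauth_registered.send_robust(
    sender=None, user=user, response=response, details=details)
for receiver, result in responses:
    # send_robust() returns (receiver, result) pairs; an exception raised by
    # a receiver comes back as the result instead of being re-raised
    if isinstance(result, Exception):
        logging.error("Signal receiver %r raised: %r", receiver, result)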
Not quite what you are asking for, but your problem could also have been solved with a static analyser like pyflakes.
From pypi:
Pyflakes is a program to analyze Python programs and detect various errors. It works by parsing the source file, not importing it, so it is safe to use on modules with side effects. It's also much faster.
sample output:
tmp.py:9: 'ContentFile' imported but unused
tmp.py:13: undefined name 'sender'
I have it integrated into my editor (vim, but I've also seen it in several others), highlighting my typos as I go, or on save.