Problem with using custom models for online prediction in Cloud ML Engine - google-cloud-ml

I am trying to deploy an online prediction model using Cloud ML Engine. The model is a custom model created by the Python package annoy.
I have created the model using the code example provided at https://github.com/spotify/annoy:
from annoy import AnnoyIndex
import random

f = 40
t = AnnoyIndex(f, 'angular')  # Length of item vector that will be indexed
for i in range(1000):
    v = [random.gauss(0, 1) for z in range(f)]
    t.add_item(i, v)
t.build(10)  # 10 trees
t.save('test.ann')
I have also created a predictor class for the model in a predictor.py file:
import os
from annoy import AnnoyIndex


class MyPredictor(object):
    """An example Predictor for an AI Platform custom prediction routine."""

    def __init__(self, model):
        """Stores artifacts for prediction. Only initialized via `from_path`."""
        self._model = model

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Preprocesses inputs, then performs prediction using the trained
        scikit-learn model.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results.
        """
        return [1]

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of MyPredictor using the given path.

        This loads artifacts that have been copied from your model directory in
        Cloud Storage. MyPredictor uses them during prediction.

        Args:
            model_dir: The local directory that contains the trained
                scikit-learn model and the pickled preprocessor instance. These
                are copied from the Cloud Storage model directory you provide
                when you deploy a version resource.

        Returns:
            An instance of `MyPredictor`.
        """
        model_path = os.path.join(model_dir, 'test.ann')
        model = AnnoyIndex(40)
        model.load(model_path)
        return cls(model)
When I try to run the following command to deploy the model, I get an error.
gcloud beta ai-platform versions create lsh_v11 \
  --model test_lsh_v1 \
  --runtime-version 1.14 \
  --python-version 3.5 \
  --origin gs://bucket/folder/test.ann \
  --package-uris gs://bucket/folder/my_custom_code-0.1.tar.gz \
  --prediction-class predictor.MyPredictor
The error is
(gcloud.beta.ai-platform.versions.create) Create Version failed. Bad model detected with error: "Failed to load model: User-provided package my_custom_code-0.1.tar.gz failed to install: Command '['python-default', '-m', 'pip', 'install', '--target=/tmp/custom_lib', '--no-cache-dir', '-b', '/tmp/pip_builds', '/tmp/custom_code/my_custom_code-0.1.tar.gz']' returned non-zero exit status 1 (Error code: 0)"
I wonder what I am doing wrong and if it's even possible to deploy a model like this in Cloud ML Engine.
By the way, this is the content of the setup.py file I am using to create the tar file:
from setuptools.command.install import install
from setuptools import setup

setup(
    name='my_custom_code',
    version='0.1',
    scripts=['predictor.py'],
    install_requires=['annoy']
)

Related

Keras model custom signatures are not saved when learning is done on GCP

After reading the blog post "An Object Oriented Approach To Train an Image Classifier with Tensorflow", I tried to set custom signatures for my Keras model (TF2) like this:
class ImageClassifierModel(tf.keras.Model):
    def __init__(self, ...):
        ...

    def call(self, ...):
        ...

    def save(self, filepath, overwrite=True, include_optimizer=True, save_format=None,
             signatures=None, options=None, save_traces=True):
        if signatures is None:
            signatures = dict()
        signatures["serving_default"] = self.predict_b64_image
        signatures["serving_array"] = self.predict_numpy_image
        super().save(filepath, overwrite, include_optimizer, save_format,
                     signatures, options, save_traces)

    @tf.function(input_signature=[tf.TensorSpec(name="image_b64_string", shape=None, dtype=tf.string)])
    def predict_b64_image(self, b64_image):
        # ...

    @tf.function(input_signature=[tf.TensorSpec(name="image_tensor", shape=(None, None, None, 3), dtype=tf.float32)])
    def predict_numpy_image(self, image):
        # ...
I have a ModelCheckpoint callback that saves the model when learning is done.
Once learning is finished, if I look at the model signatures with the saved_model_cli tool, I see the following:
When learning is done locally (Docker container): my signatures are there.
When learning is done on Google Cloud Platform (same custom container): after downloading the model locally (from Google Storage) and looking at the signatures, my custom signatures are not there anymore. Only the "image" signature (probably inferred by TF) that expects an image of rank 4 remains.
How do you explain this behavior?
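For reference, the exported signatures can also be inspected from Python instead of with saved_model_cli; a minimal sketch, where the model directory path is a placeholder:
import tensorflow as tf

# Load the exported SavedModel and list the signatures it actually contains.
loaded = tf.saved_model.load("/path/to/exported_model")  # placeholder path
print(list(loaded.signatures.keys()))
# Locally this shows 'serving_default' and 'serving_array'; after training on
# GCP only the automatically inferred signature appears.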
EDIT: the only workaround I've found is:
model = tf.keras.models.load_model(my_model_path)
model.save_weights('/tmp/my_model_weights')
# redefine my model
model = ResnetModel(...)
model.load_weights('/tmp/my_model_weights')
model.save(my_model_path)

Can't db.drop_all() when creating tables with SQLAlchemy op.create_table

I'm building a Flask service that uses SQLAlchemy Core for database operations, but I'm not using the ORM, just dispatching raw SQL to the PostgreSQL db. To track database migrations I'm using Alembic.
My migration looks roughly like this:
def upgrade():
    # Add the ossp extension to enable use of UUIDs
    add_extension_command = 'create EXTENSION if not EXISTS "uuid-ossp";'
    bind = op.get_bind()
    session = Session(bind=bind)
    session.execute(add_extension_command)

    # Create tables
    op.create_table(
        "account",
        Column(
            "id", UUID(as_uuid=True), primary_key=True, server_default=text("uuid_generate_v4()"),
        ),
        Column("created_at", DateTime, server_default=sql.func.now()),
        Column("deleted_at", DateTime, default=None),
        Column("modified_at", DateTime, server_default=sql.func.now()),
    )
This works great in general; the main issue I'm having is with testing. After each test, I want to be able to drop and rebuild the DB to clean out the data. To do this, I'm using pytest and created the following app fixture:
@pytest.fixture
def app():
    app = create_app("testing")
    with app.app_context():
        db.init_app(app)
        Migrate(app, db)
        upgrade()
        yield app
        db.drop_all()
The general idea here was that each time we need the app context, we apply the database migrations, yield the app, then when the test is done we drop all the tables.
The issue is, db.drop_all() does nothing. I believe this is because the db object is not bound to any MetaData. The research I did led here, which mentions that the create_table command does not populate MetaData, which I assume is why the app is not aware of which tables are available to drop.
I'm a bit stuck here as to what the right path forward is. Should I change how I'm building these migrations? Is this not the right pattern to make sure I remove test data from the DB between tests?
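One possible direction, not from the original post: since the tables were created with op.create_table, db.metadata is empty, but the schema can be reflected from the live database into a fresh MetaData object and dropped from there. A minimal sketch; drop_everything is a name I'm assuming, and db.engine is the Flask-SQLAlchemy engine:
from sqlalchemy import MetaData

def drop_everything(engine):
    # Discover the tables that actually exist in the database, independently
    # of db.metadata, and drop them in dependency order.
    meta = MetaData()
    meta.reflect(bind=engine)
    meta.drop_all(bind=engine)
In the fixture above this could replace db.drop_all(), e.g. drop_everything(db.engine).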

Django reversion does not save revisions made in shell

I did the initial installation steps and created the initial revisions, but then when I save a model in the Django shell, a new revision is not created:
In [1]: s = Shop.objects.all()[0]
In [2]: import reversion
In [3]: s.name = 'a'
In [4]: s.save()
In [5]: s.name = 'b'
In [6]: s.save()
In [7]: reversion.get_for_object(s)
Out[7]: [<Version: <1> "X">]
This is the initial revision.
When I update the model from a view, a revision is created successfully.
What am I missing?
The models.py file is:
...

class Shop(Model):
    ...

import reversion
reversion.register(Shop)
<EOF>
I see a reversion method among the post_save receivers, although it isn't called when I debug it.
I have Django v1.4.1, reversion v1.6.2.
I wrote django-reversion, so I think I can shed some light on this issue.
A Version of a model is automatically saved when a model is saved, provided the following are true:
The model is registered with django-reversion.
The code block is marked up as being within a revision.
Point 1 can be achieved by either registering a model with VersionAdmin, or explicitly calling reversion.register() in your models.py file.
Point 2 can be achieved by using RevisionMiddleware, or the reversion.create_revision() decorator or context manager. Any admin views in VersionAdmin also save a revision.
So, if your shell is not creating Versions, then either point 1 or point 2 is not being met. Here's how to fix it:
If you're using VersionAdmin, import the relevant admin module in your shell code to kick in the auto-registration. Alternatively, call reversion.register() in your models.py file.
In your shell code, use the reversion.create_revision() context manager around your call to save:
with reversion.create_revision():
    s.save()
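The same helper also works as the decorator mentioned in point 2; a small sketch, where the function name is just an example:
import reversion

@reversion.create_revision()
def rename_shop(shop, new_name):
    # Every save inside this function body is grouped into one revision.
    shop.name = new_name
    shop.save()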
More about this sort of thing on the Low Level API wiki page:
http://django-reversion.readthedocs.org/en/latest/api.html

django 1.3 delete model instance and delete file

I am trying to override a Django model's delete method like this:
class Picture(models.Model):
    image = models.ImageField(upload_to='photos/')
    gallery = models.ForeignKey(Gallery)

    def __unicode__(self):
        return u'%s' % (self.image)

    def delete(self, *args, **kwargs):
        self.image.delete()
        super(Picture, self).delete(*args, **kwargs)
but nothing happens. Why? The picture file is still in the photos folder. I am using Django 1.3.
Django doesn't delete files any more as of version 1.3:
In earlier Django versions, when a model instance containing a FileField was deleted, FileField took it upon itself to also delete the file from the backend storage. This opened the door to several data-loss scenarios, including rolled-back transactions and fields on different models referencing the same file. In Django 1.3, when a model is deleted the FileField's delete() method won't be called. If you need cleanup of orphaned files, you'll need to handle it yourself (for instance, with a custom management command that can be run manually or scheduled to run periodically via e.g. cron).
You can use either a cronjob or a manual command to delete orphan files. Alternatively, you could use a post_delete handler to remove the file. This would have the advantage that the code is only executed when the transaction that deleted the model instance succeeded.
Note, however, that post_delete handlers are only run on Model.delete(), not on QuerySet.delete(). This was fixed in the current dev version of Django, so if you regularly use QuerySet.delete() I would recommend using the dev version.
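A minimal sketch of the post_delete approach described above; the handler name is mine, and the sender matches the Picture model from the question:
from django.db.models.signals import post_delete
from django.dispatch import receiver

@receiver(post_delete, sender=Picture)
def delete_picture_file(sender, instance, **kwargs):
    # Remove the file from storage; save=False because the database row is
    # already gone and must not be written back.
    if instance.image:
        instance.image.delete(save=False)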
There is a simple solution: django-cleanup.
pip install django-cleanup
settings.py
INSTALLED_APPS = (
    ...
    'django_cleanup.apps.CleanupConfig',
)

Enable export to XML via HTTP on a large number of models with child relations

I have a large number of models (120+) and I would like to let users of my application export all of the data from them in XML format.
I looked at django-piston, but I would like to do this with minimal code. Basically I'd like to have something like this:
GET /export/applabel/ModelName/
This would stream all instances of ModelName in applabel, together with its tree of related objects.
I'd like to do this without writing code for each model.
What would be the best way to do this?
The standard Django dumpdata command is not flexible enough to export single models. You can use the makefixture command to do that: http://github.com/ericholscher/django-test-utils/blob/master/test_utils/management/commands/makefixture.py
If I had to do this, as a basic starting point I'd begin with something like:
from django.core.management import call_command

def export_view(request, app_label, model_slug):
    # You can do some stuff here with the specified model if needed
    # ct = ContentType.objects.get(app_label=app_label, model=model_slug)
    # model_class = ct.model_class()
    # I don't know if this is a correct specification of params;
    # command line example: python manage.py makefixture --format=xml --indent=4 YourModel[3] auth.User[:5]
    # You'll have to test it out and tweak it
    call_command("makefixture", "file.xml", '%s.%s[:]' % (app_label, model_slug), format='xml')
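As an alternative sketch (not from the original answer) that avoids the makefixture dependency, Django's built-in XML serializer can dump all instances of a single model straight into the HTTP response; note it does not follow related objects on its own, and the view name is an assumption:
from django.contrib.contenttypes.models import ContentType
from django.core import serializers
from django.http import HttpResponse

def export_model_xml(request, app_label, model_slug):
    # Resolve the model class from the URL parameters, as the commented lines above suggest.
    ct = ContentType.objects.get(app_label=app_label, model=model_slug)
    model_class = ct.model_class()
    # Serialize every instance of the model to XML and return it.
    data = serializers.serialize("xml", model_class.objects.all())
    return HttpResponse(data, content_type="application/xml")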