How to save a model and then deploy the model using AI Platform Notebooks only or through UI? - google-cloud-platform

I am not able to access the Cloud Console or install/run gcloud locally because of firewall issues. Even attaching to a Compute Engine VM instance fails. However, starting a Notebook from AI Platform and working on it in the browser is the only thing working in my environment.
To test a demo and to understand hands-on how to save and deploy a model, I wrote some very basic code for Boston Housing dataset. However the code fails at the part where I call joblib.dump(model, filename). The exact same same code however works without exception locally.
I have a general idea that I need to save/deploy model to Google Cloud Storage and somehow access it from there. But the instructions I find keep referring to some shell environment which I cannot access at the moment.
Q1.) Is there a way to save the model in Cloud Storage without leaving the Notebook environment?
Q2.) Also, how do I again access the model for prediction?
Most Important Question.) Will getting the above 2 questions working, lead me to have a Deployed Model ?
Sample code you can use -
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import joblib
boston_dataset = datasets.load_boston()
X = boston_dataset['data']
Y = boston_dataset['target']
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2, random_state=5)
lin_model = LinearRegression()
lin_model.fit(X_train, Y_train)
y_train_predict = lin_model.predict(X_train)
rmse = (np.sqrt(mean_squared_error(Y_train, y_train_predict)))
r2 = r2_score(Y_train, y_train_predict)
y_test_predict = lin_model.predict(X_test)
# root mean square error of the model
rmse = (np.sqrt(mean_squared_error(Y_test, y_test_predict)))
# r-squared score of the model
r2 = r2_score(Y_test, y_test_predict)
print('RMSE is {}'.format(rmse))
print('R2 score is {}'.format(r2))
filename = 'model.pkl'
joblib.dump(lin_model, filename) # <- this line raises exception
loaded_model = joblib.load(filename)
result = loaded_model.score(X_test, Y_test)
print(result)

Related

Trouble authenticating and writing to database locally

I'm having trouble authenticating and writing data to a spanner database locally. All imports are up to date - google.cloud, google.auth2, etc. I have tried having someone else run this and it works fine, so the problem seems to be something on my end - something wrong or misconfigured on my computer, maybe where the credentials are stored or something?
Anyone have any ideas?
from google.cloud import spanner
from google.api_core.exceptions import GoogleAPICallError
from google.api_core.datetime_helpers import DatetimeWithNanoseconds
import datetime
from google.oauth2 import service_account
def write_to(database):
record = [[
1041613562310836275,
'test_name'
]]
columns = ("id", "name")
insert_errors = []
try:
with database.batch() as batch:
batch.insert_or_update(
table = "guild",
columns = columns,
values = record,
)
except GoogleAPICallError as e:
print(f'error: {e}')
insert_errors.append(e.message)
pass
return insert_errors
if __name__ == "__main__":
credentials = service_account.Credentials.from_service_account_file(r'path\to\a.json')
instance_id = 'instance-name'
database_id = 'database-name'
spanner_client = spanner.Client(project='project-name', credentials=credentials)
print(f'spanner creds: {spanner_client.credentials}')
instance = spanner_client.instance(instance_id)
database = instance.database(database_id)
insert_errors = write_to(database)
some credential tests:
creds = service_account.Credentials.from_service_account_file(a_json)
<google.oauth2.service_account.Credentials at 0x...>
spanner_client.credentials
<google.auth.credentials.AnonymousCredentials at 0x...>
spanner_client.credentials.signer_email
AttributeError: 'AnonymousCredentials' object has no attribute 'signer_email'
creds.signer_email
'...#....iam.gserviceaccount.com'
spanner.Client().from_service_account_json(a_json).credentials
<google.auth.credentials.AnonymousCredentials object at 0x...>
The most common reason for this is that you have accidentally set (or forgot to unset) the environment variable SPANNER_EMULATOR_HOST. If this environment variable has been set, the client library will try to connect to the emulator instead of Cloud Spanner. This will cause the client library to wait for a long time while trying to connect to the emulator (assuming that the emulator is not running on your machine). Unset the environment variable to fix this problem.
Note: This environment variable will only affect Cloud Spanner client libraries, which is why other Google Cloud product will work on the same machine. The script will also in most cases work on other machines, as they are unlikely to have this environment variable set.

SageMaker PyTorch estimator.fit freezes when running in local mode from EC2

I am trying to train a PyTorch model through SageMaker. I am running a script main.py (which I have posted a minimum working example of below) which calls a PyTorch Estimator. I have the code for training my model saved as a separate script, train.py, which is called by the entry_point parameter of the Estimator. These scripts are hosted on a EC2 instance in the same AWS region as my SageMaker domain.
When I try running this with instance_type = "ml.m5.4xlarge", it works ok. However, I am unable to debug any problems in train.py. Any bugs in that file simply give me the error: 'AlgorithmError: ExecuteUserScriptError', and will not allow me to set breakpoint() lines in train.py (encountering a breakpoint throws the above error).
Instead I am trying to run in local mode, which I believe does allow for breakpoints. However, when I reach estimator.fit(inputs), it hangs on that line indefinitely, giving no output. Any print statements that I put at the start of the main function in train.py are not reached. This is true no matter what code I put in train.py. It also did not throw an error when I had an illegal underscore in the base_job_name parameter of the estimator, which suggests that it does not even create the estimator instance.
Below is a minimum example which replicates the issue on my instance. Any help would be appreciated.
### File structure
main.py
customcode/
|
|_ train.py
### main.py
import sagemaker
from sagemaker.pytorch import PyTorch
import boto3
try:
# When running on Studio.
sess = sagemaker.session.Session()
bucket = sess.default_bucket()
role = sagemaker.get_execution_role()
except ValueError:
# When running from EC2 or local machine.
print('Performing manual setup.')
bucket = 'MY-BUCKET-NAME'
region = 'us-east-2'
role = 'arn:aws:iam::MY-ACCOUNT-NUMBER:role/service-role/AmazonSageMaker-ExecutionRole-XXXXXXXXXX'
iam = boto3.client("iam")
sagemaker_client = boto3.client("sagemaker")
boto3.setup_default_session(region_name=region, profile_name="default")
sess = sagemaker.Session(sagemaker_client=sagemaker_client, default_bucket=bucket)
hyperparameters = {'epochs': 10}
inputs = {'data': f's3://{bucket}/features'}
train_instance_type = 'local'
hosted_estimator = PyTorch(
source_dir='customcode',
entry_point='train.py',
instance_type=train_instance_type,
instance_count=1,
hyperparameters=hyperparameters,
role=role,
base_job_name='mwe-train',
framework_version='1.12',
py_version='py38',
input_mode='FastFile',
)
hosted_estimator.fit(inputs) # This is the line that freezes
### train.py
def main():
breakpoint() # Throws an error in non-local mode.
return
if __name__ == '__main__':
print('Reached') # Never reached in local mode.
main()

Script mode py3 and lack of output in s3 after successful training

I've created a script where I define my Tensorflow Estimator, then I pass it to AWS sagemaker sdk and run fit(), the training passes (though doesnt show anything related to training in the console) and in S3 the only output is /source/sourcedir.tar.gz and I believe there also should be at least /model/model.tar.gz which for some reason is not generated and I'm not getting any errors.
sagemaker_session = sagemaker.Session()
role = get_execution_role()
inputs = sagemaker_session.upload_data(path='data', key_prefix='data/NamingConventions')
NamingConventions_estimator = TensorFlow(entry_point='NamingConventions.py',
role=role,
framework_version='1.12.0',
train_instance_count=1,
train_instance_type='ml.m5.xlarge',
py_version='py3',
model_dir="s3://sagemaker-eu-west-2-218566301064/model")
NamingConventions_estimator.fit(inputs, run_tensorboard_locally=True)
and my model_fn from 'NamingConventions.py'
def model_fn(features, labels, mode, params):
net = keras.layers.Embedding(alphabetLen + 1, 8, input_length=maxFeatureLen)(features[INPUT_TENSOR_NAME])
net = keras.layers.LSTM(12)(net)
logits = keras.layers.Dense(len(conventions), activation=tf.nn.softmax)(net) #output
predictions = tf.reshape(logits, [-1])
if mode == tf.estimator.ModeKeys.PREDICT:
return tf.estimator.EstimatorSpec(
mode=mode,
predictions={"ages": predictions},
export_outputs={SIGNATURE_NAME: PredictOutput({"ages": predictions})})
loss = keras.losses.sparse_categorical_crossentropy(labels, predictions)
train_op = tf.contrib.layers.optimize_loss(
loss=loss,
global_step=tf.contrib.framework.get_global_step(),
learning_rate=params["learning_rate"],
optimizer="AdamOptimizer")
predictions_dict = {"ages": predictions}
eval_metric_ops = {
"rmse": tf.metrics.root_mean_squared_error(
tf.cast(labels, tf.float32), predictions)
}
return tf.estimator.EstimatorSpec(
mode=mode,
loss=loss,
train_op=train_op,
eval_metric_ops=eval_metric_ops)
I still can't get it running, I'm trying to use script-mode, it seems like I can't import my model from the same directory.
Currently my script:
import argparse
import os
if __name__ =='__main__':
parser = argparse.ArgumentParser()
# hyperparameters sent by the client are passed as command-line arguments to the script.
parser.add_argument('--epochs', type=int, default=10)
parser.add_argument('--batch_size', type=int, default=100)
parser.add_argument('--learning_rate', type=float, default=0.1)
# input data and model directories
parser.add_argument('--model_dir', type=str)
parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
parser.add_argument('--test', type=str, default=os.environ.get('SM_CHANNEL_TEST'))
args, _ = parser.parse_known_args()
import tensorflow as tf
from NC_model import model_fn, train_input_fn, eval_input_fn
def train(args):
print(args)
estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir=args.model_dir)
train_spec = tf.estimator.TrainSpec(train_input_fn, max_steps=1000)
eval_spec = tf.estimator.EvalSpec(eval_input_fn)
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
if __name__ == '__main__':
train(args)
Is the training job listed as successful in the AWS console? Did you check the training log in Amazon CloudWatch?
I think you need to set your estimator model_dir to the path in the environment variable SM_MODEL_DIR.
This is a bit contrary to the docs which are not clear on this point. I suspect the --model_dir arg is used for distributed training and not saving of the final artifact.
Note that you'll get all your checkpoints and summaries there to so it probably best to use --model_dir in your estimator and copy your model export to SM_MODEL_DIR when training has finished.
Script mode gives you the freedom to write TensorFlow scripts the way you want, but the cost is, you have to do almost everything by yourself. For example, here in your case, if you want the model.tar.gz in S3, you have to export the model locally first. Then SageMaker will upload your local model to S3 automatically.
So what you need to add in your script is:
You need to add an exporter and pass it to eval_spec.
You need to call export_savedmodel to save the model to the local model dir that SageMaker can get. The local model dir is in env variable SM_MODEL_DIR, and should be '/opt/ml/model'.
To finish above, I guess you need to have your serving_input_fn implemented too.
Then SageMaker will upload your model from the local model dir automatically to the S3 model dir you specify. And you can see that in S3 after job succeeds.

Saving data from traceplot in PyMC3

Below is the code for a simple Bayesian Linear regression. After I obtain the trace and the plots for the parameters, is there any way in which I can save the data that created the plots in a file so that if I need to plot it again I can simply plot it from the data in the file rather than running the whole simulation again?
import pymc3 as pm
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0,9,5)
y = 2*x + 5
yerr=np.random.rand(len(x))
def soln(x, p1, p2):
return p1+p2*x
with pm.Model() as model:
# Define priors
intercept = pm.Normal('Intercept', 15, sd=5)
slope = pm.Normal('Slope', 20, sd=5)
# Model solution
sol = soln(x, intercept, slope)
# Define likelihood
likelihood = pm.Normal('Y', mu=sol,
sd=yerr, observed=y)
# Sampling
trace = pm.sample(1000, nchains = 1)
pm.traceplot(trace)
print pm.summary(trace, ['Slope'])
print pm.summary(trace, ['Intercept'])
plt.show()
There are two easy ways of doing this:
Use a version after 3.4.1 (currently this means installing from master, with pip install git+https://github.com/pymc-devs/pymc3). There is a new feature that allows saving and loading traces efficiently. Note that you need access to the model that created the trace:
...
pm.save_trace(trace, 'linreg.trace')
# later
with model:
trace = pm.load_trace('linreg.trace')
Use cPickle (or pickle in python 3). Note that pickle is at least a little insecure, don't unpickle data from untrusted sources:
import cPickle as pickle # just `import pickle` on python 3
...
with open('trace.pkl', 'wb') as buff:
pickle.dump(trace, buff)
#later
with open('trace.pkl', 'rb') as buff:
trace = pickle.load(buff)
Update for someone like me who is still coming over to this question:
load_trace and save_trace functions were removed. Since version 4.0 even the deprecation waring for these functions were removed.
The way to do it is now to use arviz:
with model:
trace = pymc.sample(return_inferencedata=True)
trace.to_netcdf("filename.nc")
And it can be loaded with:
trace = arviz.from_netcdf("filename.nc")
This way works for me :
# saving trace
pm.save_trace(trace=trace_nb, directory=r"c:\Users\xxx\Documents\xxx\traces\trace_nb")
# loading saved traces
with model_nb:
t_nb = pm.load_trace(directory=r"c:\Users\xxx\Documents\xxx\traces\trace_nb")

How to create a google map in jupyter notebook using Latitude and Longitude

import gmaps
import gmaps.datasets
gmaps.configure(api_key="AI...") # Your Google API key
locations = gmaps.datasets.load_dataset("starbucks_uk")
fig = gmaps.Map()
starbucks_layer = gmaps.symbol_layer(
locations, fill_color="green", stroke_color="green", scale=2)
fig.add_layer(starbucks_layer)
fig
I am currently trying to load this in my jupyter notebook, however the map will not display
Any help on this would be appreciated!
first, You need to write this code in your terminal:
jupyter nbextension enable --py gmaps
and then make sure in your terminal you get "validating ok" in your terminal.
for the datasets itself, you need to write 'starbucks_kfc_uk' not "starbucks_uk"
next code should be like this:
starbucks_df = df[df['chain_name'] == 'starbucks']
starbucks_df = starbucks_df[['latitude', 'longitude']]
starbucks_layer = gmaps.symbol_layer(
starbucks_df, fill_color="green", stroke_color="green", scale=2
)
fig = gmaps.figure()
fig.add_layer(starbucks_layer)
fig
And this is my result: