Displaying scalars in TensorBoard without keras.fit or an Estimator in TF 2.0

I have read the TF 2.0 tutorial "Using TensorBoard with other methods".
Then I wrote this simple code, but it does not seem to work:
import tensorflow as tf
train_loss = tf.keras.metrics.Mean('train_loss', dtype=tf.float32)
test_loss = tf.keras.metrics.Mean('test_loss', dtype=tf.float32)
train_summary_writer = tf.summary.create_file_writer('logs/tr')
test_summary_writer = tf.summary.create_file_writer('logs/ts')
for ep in range(1000):
    train_loss(1*ep)  # here I just want to display 1*ep...
    test_loss(2*ep)
    tf.summary.scalar('trloss', train_loss.result(), step=ep)
    tf.summary.scalar('tsloss', test_loss.result(), step=ep)
%tensorboard --logdir logs
I get (in browser):
No scalar data was found.
Probable causes:
You haven’t written any scalar data to your event files.
TensorBoard can’t find your event files.
Did I miss something?

My own answer:
with train_summary_writer.as_default():
    tf.summary.scalar('trloss', train_loss.result(), step=ep)
with test_summary_writer.as_default():
    tf.summary.scalar('tsloss', test_loss.result(), step=ep)
It works! But I couldn't find any proper documentation for this.
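For completeness, here is a minimal end-to-end sketch of the fix, reusing the same writers and metrics as above; the reset_states() calls are my own addition so that each logged point is that epoch's mean rather than a running mean:
import tensorflow as tf

train_loss = tf.keras.metrics.Mean('train_loss', dtype=tf.float32)
test_loss = tf.keras.metrics.Mean('test_loss', dtype=tf.float32)
train_summary_writer = tf.summary.create_file_writer('logs/tr')
test_summary_writer = tf.summary.create_file_writer('logs/ts')

for ep in range(1000):
    train_loss(1*ep)
    test_loss(2*ep)
    with train_summary_writer.as_default():
        tf.summary.scalar('trloss', train_loss.result(), step=ep)
    with test_summary_writer.as_default():
        tf.summary.scalar('tsloss', test_loss.result(), step=ep)
    train_loss.reset_states()  # optional: start a fresh mean for the next epoch
    test_loss.reset_states()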

Related

How to save a model and then deploy it using only AI Platform Notebooks, or through the UI?

I am not able to access the Cloud Console or install/run gcloud locally because of firewall issues. Even attaching to a Compute Engine VM instance fails. Starting a Notebook from AI Platform and working on it in the browser is the only thing that works in my environment.
To test a demo and to understand hands-on how to save and deploy a model, I wrote some very basic code for the Boston Housing dataset. However, the code fails at the part where I call joblib.dump(model, filename). The exact same code works without exception locally.
I have a general idea that I need to save/deploy model to Google Cloud Storage and somehow access it from there. But the instructions I find keep referring to some shell environment which I cannot access at the moment.
Q1.) Is there a way to save the model in Cloud Storage without leaving the Notebook environment?
Q2.) Also, how do I again access the model for prediction?
Most Important Question.) Will getting the above 2 questions working lead me to have a deployed model?
Sample code you can use:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np  # needed for np.sqrt below
import joblib
boston_dataset = datasets.load_boston()
X = boston_dataset['data']
Y = boston_dataset['target']
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2, random_state=5)
lin_model = LinearRegression()
lin_model.fit(X_train, Y_train)
y_train_predict = lin_model.predict(X_train)
rmse = (np.sqrt(mean_squared_error(Y_train, y_train_predict)))
r2 = r2_score(Y_train, y_train_predict)
y_test_predict = lin_model.predict(X_test)
# root mean square error of the model
rmse = (np.sqrt(mean_squared_error(Y_test, y_test_predict)))
# r-squared score of the model
r2 = r2_score(Y_test, y_test_predict)
print('RMSE is {}'.format(rmse))
print('R2 score is {}'.format(r2))
filename = 'model.pkl'
joblib.dump(lin_model, filename) # <- this line raises exception
loaded_model = joblib.load(filename)
result = loaded_model.score(X_test, Y_test)
print(result)
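Regarding Q1 and Q2, here is a hedged sketch (not a verified recipe) of writing the pickled model to Cloud Storage and reading it back, entirely from inside the notebook, using the google-cloud-storage client that AI Platform Notebooks normally ship with; 'my-bucket' and the object path are placeholder names, not from the question:
from google.cloud import storage
import joblib

client = storage.Client()                 # uses the notebook's default credentials
bucket = client.bucket('my-bucket')       # hypothetical bucket name
blob = bucket.blob('models/model.pkl')    # hypothetical object path
blob.upload_from_filename('model.pkl')    # the file written by joblib.dump above

# later, fetch it back for prediction
blob.download_to_filename('model_copy.pkl')
loaded_model = joblib.load('model_copy.pkl')
As far as I know, this alone does not give you a deployed model; AI Platform Prediction additionally needs a model version created that points at the Cloud Storage path.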

Script mode (py3) and lack of output in S3 after successful training

I've created a script where I define my TensorFlow Estimator and pass it to the AWS SageMaker SDK, then run fit(). The training passes (though it doesn't show anything related to training in the console), but in S3 the only output is /source/sourcedir.tar.gz. I believe there should also be at least /model/model.tar.gz, which for some reason is not generated, and I'm not getting any errors.
sagemaker_session = sagemaker.Session()
role = get_execution_role()
inputs = sagemaker_session.upload_data(path='data', key_prefix='data/NamingConventions')

NamingConventions_estimator = TensorFlow(entry_point='NamingConventions.py',
                                         role=role,
                                         framework_version='1.12.0',
                                         train_instance_count=1,
                                         train_instance_type='ml.m5.xlarge',
                                         py_version='py3',
                                         model_dir="s3://sagemaker-eu-west-2-218566301064/model")
NamingConventions_estimator.fit(inputs, run_tensorboard_locally=True)
and my model_fn from 'NamingConventions.py':
def model_fn(features, labels, mode, params):
    net = keras.layers.Embedding(alphabetLen + 1, 8, input_length=maxFeatureLen)(features[INPUT_TENSOR_NAME])
    net = keras.layers.LSTM(12)(net)
    logits = keras.layers.Dense(len(conventions), activation=tf.nn.softmax)(net)  # output
    predictions = tf.reshape(logits, [-1])

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            predictions={"ages": predictions},
            export_outputs={SIGNATURE_NAME: PredictOutput({"ages": predictions})})

    loss = keras.losses.sparse_categorical_crossentropy(labels, predictions)
    train_op = tf.contrib.layers.optimize_loss(
        loss=loss,
        global_step=tf.contrib.framework.get_global_step(),
        learning_rate=params["learning_rate"],
        optimizer="AdamOptimizer")

    predictions_dict = {"ages": predictions}
    eval_metric_ops = {
        "rmse": tf.metrics.root_mean_squared_error(
            tf.cast(labels, tf.float32), predictions)
    }

    return tf.estimator.EstimatorSpec(
        mode=mode,
        loss=loss,
        train_op=train_op,
        eval_metric_ops=eval_metric_ops)
I still can't get it running. I'm trying to use script mode, and it seems like I can't import my model from the same directory.
My current script:
import argparse
import os

if __name__ == '__main__':
    parser = argparse.ArgumentParser()

    # hyperparameters sent by the client are passed as command-line arguments to the script.
    parser.add_argument('--epochs', type=int, default=10)
    parser.add_argument('--batch_size', type=int, default=100)
    parser.add_argument('--learning_rate', type=float, default=0.1)

    # input data and model directories
    parser.add_argument('--model_dir', type=str)
    parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
    parser.add_argument('--test', type=str, default=os.environ.get('SM_CHANNEL_TEST'))

    args, _ = parser.parse_known_args()

import tensorflow as tf
from NC_model import model_fn, train_input_fn, eval_input_fn

def train(args):
    print(args)
    estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir=args.model_dir)
    train_spec = tf.estimator.TrainSpec(train_input_fn, max_steps=1000)
    eval_spec = tf.estimator.EvalSpec(eval_input_fn)
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

if __name__ == '__main__':
    train(args)
Is the training job listed as successful in the AWS console? Did you check the training log in Amazon CloudWatch?
I think you need to set your estimator model_dir to the path in the environment variable SM_MODEL_DIR.
This is a bit contrary to the docs, which are not clear on this point. I suspect the --model_dir arg is used for distributed training, not for saving the final artifact.
Note that you'll get all your checkpoints and summaries there too, so it's probably best to use --model_dir in your estimator and copy your model export to SM_MODEL_DIR when training has finished.
Script mode gives you the freedom to write TensorFlow scripts the way you want, but the cost is that you have to do almost everything yourself. For example, in your case, if you want model.tar.gz in S3, you have to export the model locally first; SageMaker will then upload your local model to S3 automatically.
So what you need to add in your script is:
You need to add an exporter and pass it to eval_spec.
You need to call export_savedmodel to save the model to the local model dir that SageMaker can pick up. The local model dir is in the env variable SM_MODEL_DIR and should be '/opt/ml/model'.
To finish the above, I guess you need to have your serving_input_fn implemented too (see the sketch below).
SageMaker will then upload your model automatically from the local model dir to the S3 model dir you specify, and you can see it in S3 after the job succeeds.
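A rough sketch of what those additions could look like in the training script, assuming model_fn, train_input_fn and eval_input_fn come from NC_model as in the question; the input name, dtype and shape in serving_input_fn are placeholders that must be adjusted to the real feature tensor:
import os
import tensorflow as tf
from NC_model import model_fn, train_input_fn, eval_input_fn

def serving_input_fn():
    # Placeholder input spec; adjust name/shape/dtype to match the real model_fn features.
    inputs = {'inputs': tf.placeholder(tf.int32, [None, None])}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

def train(args):
    estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir=args.model_dir)
    exporter = tf.estimator.LatestExporter('Servo', serving_input_receiver_fn=serving_input_fn)
    train_spec = tf.estimator.TrainSpec(train_input_fn, max_steps=1000)
    eval_spec = tf.estimator.EvalSpec(eval_input_fn, exporters=exporter)
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
    # Export into the dir SageMaker tars and uploads to S3 (SM_MODEL_DIR, usually /opt/ml/model).
    estimator.export_savedmodel(os.environ.get('SM_MODEL_DIR', '/opt/ml/model'),
                                serving_input_receiver_fn=serving_input_fn)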

What could be the reason for "TypeError: 'StratifiedShuffleSplit' object is not iterable"?

I have to deliver a machine learning project, and I received a file called tester.py. After I've finished writing my code in another file, I have to run tester.py to see the results, but I am getting an error: TypeError: 'StratifiedShuffleSplit' object is not iterable
I have researched this error in other topics and websites, and the solution is always the same: import GridSearchCV from sklearn.model_selection. I have been doing that from the beginning, but tester.py still does not run.
The part of tester.py where the problem occurs is:
def main():
    ### load up student's classifier, dataset, and feature_list
    clf, dataset, feature_list = load_classifier_and_data()
    ### Run testing script
    test_classifier(clf, dataset, feature_list)

if __name__ == '__main__':
    main()
My own code works fine.
Any help?
Try changing the following lines of tester.py.
The current version of StratifiedShuffleSplit works differently from what was expected when tester.py was developed.
[..]
from sklearn.model_selection import StratifiedShuffleSplit
[..]
#cv = StratifiedShuffleSplit(labels, folds, random_state = 42)
cv = StratifiedShuffleSplit(n_splits=folds, random_state=42)
[..]
#for train_idx, test_idx in cv:
for train_idx, test_idx in cv.split(features, labels):
[..]
I hope you find it useful
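For reference, a tiny self-contained illustration of the new-style API (toy data, not from tester.py), showing that the splitter object itself is no longer iterable and has to be iterated via .split():
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

features = np.arange(20).reshape(10, 2)
labels = np.array([0, 1] * 5)

cv = StratifiedShuffleSplit(n_splits=3, test_size=0.3, random_state=42)
for train_idx, test_idx in cv.split(features, labels):  # not `for ... in cv:`
    print(train_idx, test_idx)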

Accessing Templated Runtime Parameters in Google Cloud Dataflow - Python

I am experimenting with creating my own templates for Google Cloud Dataflow, so that jobs can be executed from the GUI, making it easier for others to execute. I have followed the tutorials, created my own class of PipelineOptions, and populated it with the parser.add_value_provider_argument() method. When I then try to pass these arguments into the pipeline, using my_options.argname.get(), I get an error telling me the item is not called from a runtime context. I don't understand this. The args aren't part of defining the pipeline graph itself; they are just parameters such as input filename, output tablename, etc.
Below is the code. It works if I hardcode the input filename, output tablename, write disposition, and delimiter. If I replace these with their my_options.argname.get() equivalent, it fails. In the snippet shown, I have hardcoded everything except the outputBQTable name, where I use my_options.outputBQTable.get(). This fails with the following message:
apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: outputBQTable, type: str, default_value: 'dataflow_csv_reader_testing.names').get() not called from a runtime context
I appreciate any guidance on how to get this to work.
import apache_beam
from apache_beam.io.gcp.gcsio import GcsIO
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.value_provider import RuntimeValueProvider
import csv
import argparse
class MyOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_value_provider_argument('--inputGCS', type=str,
                                           default='gs://mybucket/df-python-csv-test/test-dict.csv',
                                           help='Input gcs csv file, full path and filename')
        parser.add_value_provider_argument('--delimiter', type=str,
                                           default=',',
                                           help='Character used as delimiter in csv file, default is ,')
        parser.add_value_provider_argument('--outputBQTable', type=str,
                                           default='dataflow_csv_reader_testing.names',
                                           help='Output BQ Dataset.Table to write to')
        parser.add_value_provider_argument('--writeDisposition', type=str,
                                           default='WRITE_APPEND',
                                           help='BQ write disposition, WRITE_TRUNCATE or WRITE_APPEND or WRITE_EMPTY')

def main():
    optlist = PipelineOptions()
    my_options = optlist.view_as(MyOptions)
    p = apache_beam.Pipeline(options=optlist)
    (p
     | 'create' >> apache_beam.Create(['gs://mybucket/df-python-csv-test/test-dict.csv'])
     | 'read gcs csv dict' >> apache_beam.FlatMap(lambda file: csv.DictReader(apache_beam.io.gcp.gcsio.GcsIO().open(file, 'r'), delimiter='|'))
     | 'write bq record' >> apache_beam.io.Write(apache_beam.io.BigQuerySink(my_options.outputBQTable.get(), write_disposition='WRITE_TRUNCATE'))
    )
    p.run()

if __name__ == '__main__':
    main()
You cannot use my_options.outputBQTable.get() when specifying the pipeline. The BigQuery sink already knows how to use runtime provided arguments, so I think you can just pass my_options.outputBQTable.
From what I gather from the documentation you should only use options.runtime_argument.get() in the process methods of your DoFns passed to ParDo steps.
Note: I tested with 2.8.0 of the Apache Beam SDK and so I used WriteToBigQuery instead of BigQuerySink.
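Putting those two suggestions together, a hedged sketch of the write step (written against the 2.8.0-era API, not verified on every SDK version): pass the ValueProvider object itself to WriteToBigQuery and never call .get() at graph-construction time.
import csv
import apache_beam
from apache_beam.io.gcp.gcsio import GcsIO
from apache_beam.options.pipeline_options import PipelineOptions

def main():
    optlist = PipelineOptions()
    my_options = optlist.view_as(MyOptions)  # MyOptions as defined in the question
    p = apache_beam.Pipeline(options=optlist)
    (p
     | 'create' >> apache_beam.Create(['gs://mybucket/df-python-csv-test/test-dict.csv'])
     | 'read gcs csv dict' >> apache_beam.FlatMap(
           lambda f: csv.DictReader(GcsIO().open(f, 'r'), delimiter='|'))
     | 'write bq record' >> apache_beam.io.WriteToBigQuery(
           my_options.outputBQTable,          # ValueProvider, resolved at template run time
           write_disposition='WRITE_TRUNCATE'))
    p.run()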
This is a feature yet to be developed for the Python SDK.
The related open issue can be found at the Apache Beam project page.
Until the above issue is solved, the workaround for now would be to use the Java SDK.

Saving data from traceplot in PyMC3

Below is the code for a simple Bayesian linear regression. After I obtain the trace and the plots for the parameters, is there any way I can save the data that created the plots to a file, so that if I need to plot it again I can simply plot it from the file rather than running the whole simulation again?
import pymc3 as pm
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 9, 5)
y = 2*x + 5
yerr = np.random.rand(len(x))

def soln(x, p1, p2):
    return p1 + p2*x

with pm.Model() as model:
    # Define priors
    intercept = pm.Normal('Intercept', 15, sd=5)
    slope = pm.Normal('Slope', 20, sd=5)
    # Model solution
    sol = soln(x, intercept, slope)
    # Define likelihood
    likelihood = pm.Normal('Y', mu=sol, sd=yerr, observed=y)
    # Sampling
    trace = pm.sample(1000, nchains=1)

pm.traceplot(trace)
print pm.summary(trace, ['Slope'])
print pm.summary(trace, ['Intercept'])
plt.show()
There are two easy ways of doing this:
Use a version after 3.4.1 (currently this means installing from master, with pip install git+https://github.com/pymc-devs/pymc3). There is a new feature that allows saving and loading traces efficiently. Note that you need access to the model that created the trace:
...
pm.save_trace(trace, 'linreg.trace')

# later
with model:
    trace = pm.load_trace('linreg.trace')
Use cPickle (or pickle in Python 3). Note that pickle is at least a little insecure; don't unpickle data from untrusted sources:
import cPickle as pickle  # just `import pickle` on python 3
...
with open('trace.pkl', 'wb') as buff:
    pickle.dump(trace, buff)

# later
with open('trace.pkl', 'rb') as buff:
    trace = pickle.load(buff)
Update for anyone like me who is still coming across this question:
The load_trace and save_trace functions were removed. Since version 4.0, even the deprecation warning for these functions has been removed.
The way to do it now is to use arviz:
with model:
    trace = pymc.sample(return_inferencedata=True)

trace.to_netcdf("filename.nc")
And it can be loaded with:
trace = arviz.from_netcdf("filename.nc")
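And, to answer the original plotting question, a short sketch of re-creating the trace plot later from the saved file with ArviZ, without re-running the sampler:
import arviz as az

idata = az.from_netcdf("filename.nc")
az.plot_trace(idata, var_names=["Intercept", "Slope"])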
This way works for me:
# saving trace
pm.save_trace(trace=trace_nb, directory=r"c:\Users\xxx\Documents\xxx\traces\trace_nb")

# loading saved traces
with model_nb:
    t_nb = pm.load_trace(directory=r"c:\Users\xxx\Documents\xxx\traces\trace_nb")