Hi, I'm new to TensorFlow and I've been practicing with the tf.estimator library. I ran the built-in tf.estimator.DNNClassifier as shown below:
import tensorflow as tf
def train_input_fn(features, labels, batch_size):
    """An input function for training"""
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle, repeat, and batch the examples.
    return dataset.shuffle(1000).repeat().batch(batch_size)
# Feature columns describe how to use the input.
my_feature_columns = []
for key in landmark_features.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
# Build a DNN with 2 hidden layers and 10 nodes in each hidden layer.
classifier = tf.estimator.DNNClassifier(feature_columns=my_feature_columns, hidden_units=[10, 10], n_classes=10)
dataset = train_input_fn(landmark_features, emotion_labels, batch_size=1375)
However I keep getting the following error:
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpc_tag0rc
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpc_tag0rc', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
Any idea what I can do to fix my code?
I have a tensorflow model deployed on Vertex AI of Google Cloud. The model definition is:
item_model = tf.keras.Sequential([
    tf.keras.layers.StringLookup(
        vocabulary=item_vocab, mask_token=None),
    tf.keras.layers.Embedding(len(item_vocab) + 1, embedding_dim)
])
user_model = tf.keras.Sequential([
    tf.keras.layers.StringLookup(
        vocabulary=user_vocab, mask_token=None),
    # We add an additional embedding to account for unknown tokens.
    tf.keras.layers.Embedding(len(user_vocab) + 1, embedding_dim)
])
class NCF_model(tf.keras.Model):
    def __init__(self, user_model, item_model):
        super(NCF_model, self).__init__()
        # define all layers in init
        self.user_model = user_model
        self.item_model = item_model
        self.concat_layer = tf.keras.layers.Concatenate()
        self.feed_forward_1 = tf.keras.layers.Dense(32, activation='relu')
        self.feed_forward_2 = tf.keras.layers.Dense(64, activation='relu')
        self.final = tf.keras.layers.Dense(1, activation='sigmoid')
    def call(self, inputs, training=False):
        user_id, item_id = inputs[:, 0], inputs[:, 1]
        x = self.user_model(user_id)
        y = self.item_model(item_id)
        x = self.concat_layer([x, y])
        x = self.feed_forward_1(x)
        x = self.feed_forward_2(x)
        x = self.final(x)
        return x
The model has two string inputs and it outputs a probability value.
When I use the following input in the batch prediction file, I get an empty prediction file.
Sample of csv input file:
userid,itemid
yuu,190767
yuu,364
yuu,154828
yuu,72998
yuu,130618
yuu,183979
yuu,588
When I use a jsonl file with the following input.
{"input":["yuu", "190767"]}
I get the following error.
('Post request fails. Cannot get predictions. Error: Exceeded retries: Non-OK result 400 ({\n "error": "Failed to process element: 0 key: input of \'instances\' list. Error: INVALID_ARGUMENT: JSON object: does not have named input: input"\n}) from server, retry=3.', 1)
What seems to be going wrong with these inputs?
After a bit of experimenting, I found out what was wrong with the batch prediction input. In the csv file, the item column was being interpreted as an integer whereas the model has a string as an input. I'm not sure why there was no output at all in that case and I couldn't find the logs for the batch prediction.
The correct format for jsonlines was:
["user1", "item1"]
["user2", "item2"]
["user3", "item3"]
The one I used assumed the input was a named layer, 'input'. In all of this, I found the Google Cloud documentation to be lacking.
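For anyone hitting the same problem, here is a minimal sketch of converting a CSV like the one above into that JSON Lines shape while keeping every value a string. The function name csv_to_jsonl is my own invention, and the sketch assumes a header row followed by userid,itemid rows as in the sample; it is not taken from any Google Cloud tooling.

```python
import csv
import io
import json

def csv_to_jsonl(csv_text):
    """Turn 'userid,itemid' CSV text into JSON Lines of string pairs."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row (assumed to be present)
    # csv.reader yields every field as str, so "190767" stays a string.
    return "\n".join(json.dumps(row) for row in reader if row)

sample = "userid,itemid\nyuu,190767\nyuu,364\n"
print(csv_to_jsonl(sample))
```

Each printed line is a two-element JSON array of strings, which matches the format that worked for me.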
I'm using a framework called FLOW RL. It enables me to use rllib and ray for my RL algorithm. I have been trying to plot non-learning data on tensorboard. Following the ray documentation ( link ), I have tried to add custom metrics. To do that, I need to use the info dict, which is passed to on_episode_step(info). An "episode" element is supposed to be present in this dictionary, and it gives me access to my custom scalars.
However, every time I try to access the episode element, I get an error because it does not exist in the info dict. Is this normal?
File "examples/rllib/newGreenWaveGrid2.py", line 295, in on_episode_start
episode = info["episode"]
KeyError: 'episode'
def on_episode_step(info):
    episode = info["episode"]
    whatever = abs(episode.last_observation_for()[2])
    episode.user_data["whatever"].append(whatever)
if __name__ == '__main__':
    alg_run, gym_name, config = setup_exps()
    ray.init(num_cpus=N_CPUS + 1, redirect_output=False)
    trials = run_experiments({
        flow_params['exp_tag']: {
            'run': alg_run,
            'env': gym_name,
            'config': {
                **config,
                'callbacks': {
                    "on_episode_start": on_episode_start,
                    "on_episode_step": on_episode_step,
                    "on_episode_end": on_episode_end,
                }
            },
            'checkpoint_freq': 20,
            'max_failures': 999,
            'stop': {
                'training_iteration': 200,
            },
        },
    })
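One way to narrow this down is to look at what the callback actually receives. The helper below is only a debugging sketch, not a fix: get_episode is a hypothetical name, and it assumes info is a plain dict, which the KeyError suggests.

```python
def get_episode(info):
    """Return info["episode"] if present; otherwise report the keys that do exist."""
    episode = info.get("episode")
    if episode is None:
        # Printing the available keys shows what this version passes to the callback.
        print("no 'episode' key; available keys:", sorted(info.keys()))
    return episode

# Stand-in dicts for illustration; real callbacks receive whatever rllib passes in.
missing = get_episode({"env": "some_env"})
present = get_episode({"episode": "episode_object"})
```

Calling get_episode(info) at the top of each callback avoids the crash and shows which keys are really there.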
I have a google-cloud-ml model that I can run predictions on by passing a 3-dimensional array of float32...
{ 'instances' : [ { 'input' : '[ [ [ 0.0 ], [ 0.5 ], [ 0.8 ] ] ... ] ]' } ] }
However this is not an efficient format to transmit images, so I'd like to pass base64 encoded png or jpeg. This document talks about doing that, but what is not clear is what the entire json object looks like. Does the { 'b64' : 'x0welkja...' } go in place of the '[ [ [ 0.0 ], [ 0.5 ], [ 0.8 ] ] ... ] ]', leaving the enclosing 'instances' and 'input' the same? Or some other structure? Or does the tensorflow model have to be trained on base64?
The TensorFlow model does not have to be trained on base64 data. Leave your training graph as is. However, when exporting the model, you'll need to export a model that can accept PNG or JPEG (or possibly raw, if it's small) data. Then, when you export the model, you'll need to be sure to use a name for the input that ends in _bytes. This signals to CloudML Engine that you will be sending base64 encoded data. Putting it all together would look something like this:
import tensorflow as tf
from tensorflow.contrib.saved_model.python.saved_model import utils
# Shape of [None] means we can have a batch of images.
image = tf.placeholder(shape = [None], dtype = tf.string)
# Decode the images. decode_jpeg takes one scalar string at a time, so map it
# over the batch (this assumes every image in a batch has the same height and width).
decoded = tf.map_fn(lambda img: tf.image.decode_jpeg(img, channels=3), image, dtype=tf.uint8)
# Do the rest of the processing.
scores = build_model(decoded)
# The input name needs to have "_bytes" suffix.
inputs = { 'image_bytes': image }
outputs = { 'scores': scores }
utils.simple_save(session, export_dir, inputs, outputs)
The request you send will look something like this:
{
  "instances": [{
    "b64": "x0welkja..."
  }]
}
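If you are building that request from Python, a minimal sketch using only the standard library could look like the following. The helper name build_b64_request is hypothetical, and the placeholder bytes stand in for the contents of a real image file.

```python
import base64
import json

def build_b64_request(image_bytes):
    """Wrap raw image bytes in the {"instances": [{"b64": ...}]} body shown above."""
    # b64encode returns bytes; decode to str so the value can be serialized as JSON.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"instances": [{"b64": encoded}]})

# Placeholder bytes for illustration; in practice read them with open(path, "rb").read().
body = build_b64_request(b"abc")
print(body)
```

The resulting string is what you would send as the request body.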
If you just want an efficient way to send images to a model (and not necessarily base64-encode them), I would suggest uploading your image(s) to Google Cloud Storage and then having your model read from GCS. This way, you are not limited by image size and you can take advantage of the multi-part, multithreaded, resumable uploads etc. that the GCS API provides.
TensorFlow's tf.read_file will read directly off GCS. Here's an example of a serving input_fn that will do this. Your request to CMLE would send it an image URL (gs://bucket/some/path/to/image.jpg)
def read_and_preprocess(filename, augment=False):
    # decode the image file starting from the filename
    # end up with pixel values that are in the -1, 1 range
    image_contents = tf.read_file(filename)
    image = tf.image.decode_jpeg(image_contents, channels=NUM_CHANNELS)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32) # 0-1
    image = tf.expand_dims(image, 0) # resize_bilinear needs batches
    image = tf.image.resize_bilinear(image, [HEIGHT, WIDTH], align_corners=False)
    #image = tf.image.per_image_whitening(image) # useful if mean not important
    image = tf.subtract(image, 0.5)
    image = tf.multiply(image, 2.0) # -1 to 1
    return image
def serving_input_fn():
    inputs = {'imageurl': tf.placeholder(tf.string, shape=())}
    filename = tf.squeeze(inputs['imageurl']) # make it a scalar
    image = read_and_preprocess(filename)
    # make the outer dimension unknown (and not 1)
    image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])
    features = {'image' : image}
    return tf.estimator.export.ServingInputReceiver(features, inputs)
Your training code will train off actual images, just as in rhaertel80's suggestion above. See https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/deepdive/08_image/flowersmodel/trainer/task.py#L27 for what the training/evaluation input functions would look like.
I was trying to use @Lak's answer (thanks Lak) to get online predictions for multiple instances in one JSON file, but kept getting the following error (I had two instances in my test JSON, hence the shape [2]):
input filename tensor must be scalar but had shape [2]
The problem is that ML Engine apparently batches all the instances together and passes them to the serving input receiver function, but @Lak's sample code assumes the input is a single instance (it indeed works fine if you have a single instance in your JSON). I altered the code so that it can process a batch of inputs. I hope it will help someone:
def read_and_preprocess(filename):
    image_contents = tf.read_file(filename)
    image = tf.image.decode_image(image_contents, channels=NUM_CHANNELS)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32) # 0-1
    return image
def serving_input_fn():
    # shape=[None] declares a 1-D batch of URLs; (None) is just None, i.e. an unknown shape
    inputs = {'imageurl': tf.placeholder(tf.string, shape=[None])}
    filename = inputs['imageurl']
    image = tf.map_fn(read_and_preprocess, filename, dtype=tf.float32)
    # make the outer dimension unknown (and not 1)
    image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])
    features = {'image': image}
    return tf.estimator.export.ServingInputReceiver(features, inputs)
The key changes are that 1) you don't squeeze the input tensor (that would cause trouble in the special case when your json contains only one instance) and, 2) use tf.map_fn to apply the read_and_preprocess function to a batch of input image urls.
I am trying to use the tensorboard embeddings page to visualize the results of word2vec. After debugging and digging through lots of code, I got to the point where tensorboard runs successfully, reads the configuration file, and reads the tsv files, but the embeddings page does not show any data.
(The page opens, and I can see the menus, items etc.) This is my config file:
embeddings {
  tensor_name: 'word_embedding'
  metadata_path: 'c:\data\metadata.tsv'
  tensor_path: 'c:\data\tensors2.tsv'
}
What can be the problem?
The tensor file is originally 1 GB in size. If I try that file, the app crashes because of memory. So I copied and pasted 1 or 2 pages of the original file into tensor2.tsv and use this file. Maybe this is the problem. Maybe I need to create more data by copy/paste.
thx
tolga
Try the following code snippet to visualize your word embedding in tensorboard. Open tensorboard with the logdir, then check localhost:6006 to view your embedding.
tensorboard --logdir="visual/1"
# code
import gensim
import numpy as np
import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector
fname = "word2vec_model_1000"
model = gensim.models.keyedvectors.KeyedVectors.load(fname)
# project part of the vocab (first 1000 words), 100 dimensions each
max_size = 1000
w2v = np.zeros((max_size, 100))
with open("prefix_metadata.tsv", 'w+') as file_metadata:
    for i, word in enumerate(model.wv.index2word[:max_size]):
        w2v[i] = model.wv[word]
        file_metadata.write(word + '\n')
# define the model without training
sess = tf.InteractiveSession()
with tf.device("/cpu:0"):
    embedding = tf.Variable(w2v, trainable=False, name='prefix_embedding')
tf.global_variables_initializer().run()
path = 'visual/1'
saver = tf.train.Saver()
writer = tf.summary.FileWriter(path, sess.graph)
# adding into projector
config = projector.ProjectorConfig()
embed = config.embeddings.add()
embed.tensor_name = 'prefix_embedding'
embed.metadata_path = 'prefix_metadata.tsv'
projector.visualize_embeddings(writer, config)
saver.save(sess, path + '/prefix_model.ckpt', global_step=max_size)
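If you would rather stay with the TSV files from the question, here is a rough sketch of writing a small, consistent pair of files instead of copy/pasting pages out of the 1 GB file. It assumes the usual projector TSV layout (one tab-separated vector per line in the tensor file, one label per line in the metadata file), and as far as I know the two files need the same number of rows. The function name to_projector_tsv and the toy vectors are made up for illustration.

```python
def to_projector_tsv(words, vectors):
    """Return (tensor_tsv, metadata_tsv) strings with exactly one row per word."""
    if len(words) != len(vectors):
        raise ValueError("need exactly one label per vector")
    # One vector per line, components separated by tabs.
    tensor_tsv = "\n".join("\t".join(str(x) for x in vec) for vec in vectors)
    # One label per line, in the same order as the vectors.
    metadata_tsv = "\n".join(words)
    return tensor_tsv + "\n", metadata_tsv + "\n"

# Toy data for illustration only.
tensors, metadata = to_projector_tsv(["cat", "dog"], [[0.1, 0.2], [0.3, 0.4]])
print(tensors)
print(metadata)
```

Writing both strings from the same slice of the vocabulary keeps the row counts in sync.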
I'm training my model using TensorFlow in C++. Python is used only for constructing the graph. So is there a way to save and restore the graph and its state purely in C++? I know about the Python class tf.train.Saver but as far as I understand it does not exist in C++.
The tf.train.Saver class currently exists only in Python, but (i) it is built from TensorFlow ops that you can run from C++, and (ii) it exposes the Saver.as_saver_def() method that lets you get a SaverDef protocol buffer with the names of ops that you must run to save or restore a model.
In Python, you can get the names of the save and restore ops as follows:
saver = tf.train.Saver(...)
saver_def = saver.as_saver_def()
# The name of the tensor you must feed with a filename when saving/restoring.
print(saver_def.filename_tensor_name)
# The name of the target operation you must run when restoring.
print(saver_def.restore_op_name)
# The name of the tensor you must fetch when saving.
print(saver_def.save_tensor_name)
In C++ to restore from a checkpoint, you call Session::Run(), feeding in the name of the checkpoint file as saver_def.filename_tensor_name, with a target op of saver_def.restore_op_name. To save another checkpoint, you call Session::Run(), again feeding in the name of the checkpoint file as saver_def.filename_tensor_name, and fetching the value of saver_def.save_tensor_name.
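Since the C++ side only needs those three names, one option is to dump them to a small file when you build the graph in Python. The following is just a sketch under that assumption: saver_names is a hypothetical helper, and the SimpleNamespace object with placeholder values stands in for the result of saver.as_saver_def() so the snippet runs without TensorFlow.

```python
import json
from types import SimpleNamespace

def saver_names(saver_def):
    """Collect the three SaverDef names that the C++ code has to feed, run or fetch."""
    return {
        "filename_tensor_name": saver_def.filename_tensor_name,
        "restore_op_name": saver_def.restore_op_name,
        "save_tensor_name": saver_def.save_tensor_name,
    }

# Stand-in for saver.as_saver_def(); these values are placeholders for illustration.
fake_saver_def = SimpleNamespace(
    filename_tensor_name="save/Const:0",
    restore_op_name="save/restore_all",
    save_tensor_name="save/control_dependency:0",
)
names = saver_names(fake_saver_def)
print(json.dumps(names, indent=2))
```

The printed JSON can be written next to the exported graph and read back by the C++ program.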
Recent TensorFlow versions include some helper functions to do the same in C++ without Python. These are generated from the ProtoBuf in the pip-package (${HOME}/.local/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/protobuf/saver.pb.h).
// save
tensorflow::Tensor checkpointPathTensor(tensorflow::DT_STRING, tensorflow::TensorShape());
checkpointPathTensor.scalar<std::string>()() = "some/path";
tensor_dict feed_dict = {{graph_def.saver_def().filename_tensor_name(), checkpointPathTensor}};
status = sess->Run(feed_dict, {}, {graph_def.saver_def().save_tensor_name()}, nullptr);
// restore
tensorflow::Tensor checkpointPathTensor(tensorflow::DT_STRING, tensorflow::TensorShape());
checkpointPathTensor.scalar<std::string>()() = "some/path";
tensor_dict feed_dict = {{graph_def.saver_def().filename_tensor_name(), checkpointPathTensor}};
status = sess->Run(feed_dict, {}, {graph_def.saver_def().restore_op_name()}, nullptr);
This is based on the undocumented python-way (more details) of restoring a model
def restore(sess, metaGraph, fn):
    restore_op_name = metaGraph.as_saver_def().restore_op_name # u'save/restore_all'
    restore_op = tf.get_default_graph().get_operation_by_name(restore_op_name)
    filename_tensor_name = metaGraph.as_saver_def().filename_tensor_name # u'save/Const'
    sess.run(restore_op, {filename_tensor_name: fn})
For a working and complete version see here.