Google Cloud Platform - Vertex AI training with custom data format - google-cloud-platform

I need to train a custom OCR model in Vertex AI. My data consists of a folder of cropped images, where each image is one line of text, and a CSV file with two columns: image name and the text in that image.
But when I tried to import it into a dataset in Vertex AI, I saw that image datasets only support classification, segmentation, and object detection. All of those dataset types have a fixed number of labels, but my data has an effectively unlimited number of labels (if we view the text in an image as its label), so none of the types match my requirement. Can I use Vertex AI for training, and how do I do that?

Since Vertex AI managed datasets do not support OCR applications, you can train and deploy a custom model using Vertex AI’s training and prediction services.
I found a good article on building an OCR system from scratch. This OCR system is implemented in two steps:
Text detection
Text recognition
Please note that this article is not officially supported by Google Cloud.
Once you have tested the model locally, you can train the same on Vertex AI using the custom model training service. Please follow this codelab for step-by-step instructions on training and deploying a custom model.
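For orientation, a minimal sketch of launching such a job with the google-cloud-aiplatform Python SDK is shown below; the project, bucket, container images, and machine configuration are placeholders rather than values from the codelab:

```python
# Minimal sketch of submitting a Vertex AI custom training job.
# All project, bucket, and image names are placeholders for your own values.
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",                       # placeholder project ID
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",    # placeholder bucket
)

job = aiplatform.CustomContainerTrainingJob(
    display_name="custom-ocr-training",
    # Container assumed to package your OCR training code (detection + recognition).
    container_uri="us-docker.pkg.dev/my-project/ocr/trainer:latest",
    # Pre-built serving container so job.run() can register a Model afterwards.
    model_serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-11:latest"
    ),
)

model = job.run(
    replica_count=1,
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
    model_display_name="custom-ocr-model",
)
```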
Once the training is complete, the model can be deployed for inference using a pre-built container offered by Vertex AI or a custom container based on your requirements. You can also choose between online predictions for synchronous, low-latency requests and batch predictions for asynchronous requests.
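Continuing that sketch, deploying the registered model and requesting an online prediction might look roughly like this; the model resource name and the instance payload format are assumptions that depend on your serving container:

```python
# Sketch of deploying a registered model and calling it for online prediction.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

# Hypothetical model resource name; use the Model returned by job.run()
# or the ID shown in the Vertex AI Model Registry.
model = aiplatform.Model("projects/my-project/locations/us-central1/models/1234567890")

endpoint = model.deploy(machine_type="n1-standard-4", min_replica_count=1)

# The instance format depends on what your serving container expects;
# a base64-encoded line image is only one plausible convention.
response = endpoint.predict(
    instances=[{"image_bytes": {"b64": "<base64-encoded-line-image>"}}]
)
print(response.predictions)

# For asynchronous scoring of many images at once, model.batch_predict(...)
# reads input from and writes results to Cloud Storage instead.
```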

Related

How can GCP AutoML handle overfitting?

I have created a Vertex AI AutoML image classification model. How can I assess it for overfitting? I assume I should be able to compare training vs validation accuracy but these do not seem to be available.
And if it is overfitting, can I tweak regularization parameters? Is it already doing cross-validation? Is there anything else that can be done? (More data, early stopping, dropout, i.e. how can these be done?)
Deploy it to an endpoint and test the results by uploading sample images to the endpoint. If it is overfitting, you can see the stats in the analysis. You can also increase the number of training samples and retrain your model to get a better result.
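If it helps, the evaluation metrics AutoML computes on the held-out test split can also be read programmatically; here is a minimal sketch with the Vertex AI Python SDK, where the project and model resource name are placeholders:

```python
# Sketch of listing the evaluation metrics of an AutoML model with the Vertex AI SDK.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

# Hypothetical model resource name from the Vertex AI Model Registry.
model = aiplatform.Model("projects/my-project/locations/us-central1/models/1234567890")

for evaluation in model.list_model_evaluations():
    # Metrics on the test split (e.g. precision/recall at confidence thresholds);
    # comparing them against training behaviour gives a sense of overfitting.
    print(evaluation.metrics)
```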

Vertex AI TensorBoard through the user interface

I have been using the Vertex AI training service with a custom container for my own machine learning pipeline. I would like to get TensorBoard logs into the Experiments tab so I can see the metrics in real time while the model is training.
I was wondering if it is possible to set up a custom training job in the user interface while specifying a TENSORBOARD_INSTANCE_NAME. It seems that this is only possible through a JSON POST request.
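For reference, a rough sketch of attaching a Vertex AI TensorBoard instance to a custom training job through the Python SDK, rather than a raw JSON request; the TensorBoard resource name, service account, and container image are placeholders:

```python
# Sketch: custom training job with a Vertex AI TensorBoard instance attached.
# Resource names, service account, and container image are placeholders.
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",
)

job = aiplatform.CustomContainerTrainingJob(
    display_name="training-with-tensorboard",
    container_uri="us-docker.pkg.dev/my-project/train/image:latest",
)

job.run(
    replica_count=1,
    machine_type="n1-standard-8",
    # The service account needs access to the staging bucket and TensorBoard instance.
    service_account="trainer@my-project.iam.gserviceaccount.com",
    tensorboard="projects/my-project/locations/us-central1/tensorboards/1234567890",
    # Inside the container, write TensorBoard logs to the directory exposed via
    # the AIP_TENSORBOARD_LOG_DIR environment variable.
)
```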

Google AutoML Vision API and Google Vision API Custom Algorithm

I am looking at the Google AutoML Vision API and the Google Vision API. I know that with the Google AutoML Vision API you get a custom model, because you train ML models based on your own images and define your own labels. And when using the Google Vision API, you are using a pretrained model...
However, I am wondering if it is possible to use my own algorithm (one which I created and which is not provided by Google) and use that instead of the Vision / AutoML Vision API? ...
Sure, you can definitely deploy your own ML algorithm on Google Cloud, without being tied to the Vision or AutoML APIs.
Two approaches that I have used many times for this same use case:
Serverless approach, if your model is relatively light in terms of computational resource requirements - Deploy your own custom Cloud Function. More info here.
To be more specific, the way it works is that you just call your Cloud Function, passing your image directly (base64-encoded or pointing to a storage location). The function then automatically allocates all required resources, runs your custom algorithm to process the image and/or run inferences, sends the results back, and vanishes (all resources released, no more running costs). Neat :) A minimal sketch of such a function is included below.
Google AI Platform. More info here
Use AI Platform to train your machine learning models at scale, to host your trained model in the cloud, and to use your model to make predictions about new data.
If in doubt, go for AI Platform, as the whole pipeline is nicely lined up for any of your custom code/models. It is also a good fit for production deployment.
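To make the serverless approach concrete, here is a minimal sketch of an HTTP Cloud Function in Python; the "image_b64" request field and the inference step are assumptions to replace with your own algorithm:

```python
# main.py -- minimal sketch of an HTTP Cloud Function that accepts a
# base64-encoded image and runs a custom algorithm on it.
import base64
import io

from PIL import Image  # Pillow, listed in the function's requirements.txt


def predict_image(request):
    """HTTP entry point; `request` is a Flask request object."""
    payload = request.get_json(silent=True) or {}
    image_bytes = base64.b64decode(payload["image_b64"])  # assumed field name
    image = Image.open(io.BytesIO(image_bytes))

    # Run your own model / algorithm here; returning the image size is a stand-in.
    return {"width": image.width, "height": image.height}
```

And for the AI Platform route, requesting an online prediction from a model you have deployed there, using the Google API client library, looks roughly like this; the project, model name, and instance payload are placeholders:

```python
# Sketch of calling an AI Platform-hosted model for online prediction.
# Project, model name, and instance payload are placeholders.
from googleapiclient import discovery

service = discovery.build("ml", "v1")
name = "projects/my-project/models/my_model"  # optionally append "/versions/v1"

response = service.projects().predict(
    name=name,
    body={"instances": [{"input": [0.1, 0.2, 0.3]}]},  # shape depends on your model
).execute()

print(response.get("predictions"))
```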

Google Cloud Platform

I am building a classification model using AutoML and I have some basic usage questions about GCP.
1 - Data privacy question; if we save behavior data to train our model in BigQuery, does Google have access to that data? Could Google ever use that data to learn more about behavior of individuals we collected data from?
2 - Since training costs are charged by the hour, I would like to understand the relationship between data and training time. Does the time increase linearly with the size of the training data set? For example, we trained a classification using 1.7MB of data and it took 3 hrs. So, would training a model with 17MB of data take 30 hours?
3 - A batch prediction costs 1.16 USD per hour. However, our data is in a CSV file and it seems that we cannot upload a CSV to do a batch prediction. So, we will try using the API. Therefore I have two questions: A) can we do a batch upload using the API, and B) what are the associated costs?
4 - What exactly is an online prediction?
5 - When using the cost calculator (for machine learning), what is a node hour?
1- As is mentioned in the Data Usage FAQ, Google does not use any of your content for any purpose except to provide you with the Cloud AutoML service.
2- The time required to train your model depends on the size and complexity of your training data; for a detailed explanation, take a look at the Vision documentation, for example.
3- You need to upload your CSV file to Google Cloud Storage and then you can use it in the API or any of the available client libraries (a minimal sketch follows after this list). See Natural Language batch prediction, for example. For costs, check the documentation for the desired product. AutoML pricing depends on which feature you are using: Vision, Natural Language, Translation, Video Intelligence.
4- After you have created (trained) a model, you can deploy the model and request online (single, low-latency and real-time) predictions. Online predictions accept one row of data and provide a predicted result based on your model for that data. You use online predictions when you need a prediction as input for your business logic flow.
5- You can think of a node as a single virtual machine whose resources are used for computing purposes. Machine types differ depending on the product and the purpose for which they are used. For example, in image classification, AutoML Vision image classification model training costs $3.15 per node hour, where each node is equivalent to an n1-standard-8 machine with an attached NVIDIA Tesla V100 GPU. A node hour is therefore the use of such a node's resources for one hour.
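Regarding point 3, a minimal sketch of requesting a batch prediction with the google-cloud-automl client library is shown below; the project, model ID, and Cloud Storage paths are placeholders, and the exact input/output configuration varies by AutoML product:

```python
# Sketch of an AutoML batch prediction request reading a CSV from Cloud Storage.
# Project, model ID, and bucket paths are placeholders.
from google.cloud import automl_v1 as automl

prediction_client = automl.PredictionServiceClient()
model_name = "projects/my-project/locations/us-central1/models/MODEL_ID"

input_config = automl.BatchPredictInputConfig(
    gcs_source=automl.GcsSource(input_uris=["gs://my-bucket/batch_input.csv"])
)
output_config = automl.BatchPredictOutputConfig(
    gcs_destination=automl.GcsDestination(output_uri_prefix="gs://my-bucket/results/")
)

# batch_predict returns a long-running operation; result() blocks until it completes.
operation = prediction_client.batch_predict(
    name=model_name,
    input_config=input_config,
    output_config=output_config,
)
print(operation.result())
```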

Can we train a TensorFlow custom object detection model in AWS SageMaker?

Could you just help me with the following points:
Can we train a custom TensorFlow object detection model in AWS SageMaker?
I came across SageMaker's image classification algorithm. Can we use it to detect particular objects in video after training the model?
I am confused by SageMaker's pricing plan. They say "you are offered a monthly free tier of 250 hours of t2.medium notebook usage"; does that mean we can use a t2.medium notebook for free for 250 hours?
The final aim is to train a model for custom object detection, like we used to do on Paperspace or FloydHub, at a very low price.
Thanks in advance.
1- Sure. You can bring any TensorFlow code to SageMaker. https://docs.aws.amazon.com/sagemaker/latest/dg/tf-examples.html
2- This is a classification model (labels only), not a detection model (labels + bounding boxes). Having said that, yes, you can definitely use it to predict on frames extracted from a video.
3- Yes, within the first 12 months following the creation of your AWS account.
Hope this helps.
Any TensorFlow model can be used/ported to SageMaker. You can find examples of TensorFlow models ported to SageMaker here https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk#amazon-sagemaker-examples.
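To give a feel for what bringing your own TensorFlow script looks like, here is a minimal sketch with the SageMaker Python SDK; the IAM role, S3 paths, instance type, and framework versions are placeholders:

```python
# Sketch of running a custom TensorFlow training script on SageMaker.
# The IAM role, S3 paths, instance type, and versions are placeholders.
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="train.py",                                 # your detection training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",    # placeholder execution role
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.11",
    py_version="py39",
)

# Channel names map to S3 prefixes SageMaker mounts into the training container.
estimator.fit({"training": "s3://my-bucket/detection-data/"})
```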