I have been using the Vertex AI training service with a custom container for my own machine learning pipeline. I would like to get TensorBoard logs into the Experiments tab so I can see the metrics in real time while the model is training.
I was wondering whether it is possible to create a custom training job in the user interface and set a TENSORBOARD_INSTANCE_NAME there. It seems that this is only possible through a JSON POST request.
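For context, this is roughly what the programmatic route looks like with the google-cloud-aiplatform Python SDK; the project, container image, TensorBoard resource and service account below are just placeholders.

```python
# Hedged sketch: attach a Vertex AI TensorBoard instance to a custom container training job.
# All resource names are placeholders.
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",
    location="europe-west4",
    staging_bucket="gs://my-staging-bucket",
)

job = aiplatform.CustomContainerTrainingJob(
    display_name="my-custom-training",
    container_uri="europe-west4-docker.pkg.dev/my-project/my-repo/trainer:latest",
)

# Passing a TensorBoard resource name streams the job's logs to that instance;
# a service account with the appropriate Vertex AI roles is required for this.
job.run(
    replica_count=1,
    machine_type="n1-standard-4",
    tensorboard="projects/my-project/locations/europe-west4/tensorboards/1234567890",
    service_account="trainer-sa@my-project.iam.gserviceaccount.com",
)
```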
Related
I have built a model using Google Cloud Vertex AI and uploaded it to the Model Registry. I then tried to create an Endpoint with it, but I found that it is extremely slow: it takes more than 30-40 minutes, and sometimes it just sticks at deploying... Is this normal? Is there any way to speed this up? Or at least, is there anywhere I can see any possible errors?
I have created a Vertex AI AutoML image classification model. How can I assess it for overfitting? I assume I should be able to compare training vs. validation accuracy, but these do not seem to be available.
And if it is overfitting, can I tweak regularization parameters? Is it already doing cross-validation? Is there anything else that can be done (more data, early stopping, dropout), and how can these be done?
Deploy the model to an endpoint and test the results by uploading sample images to that endpoint. If it is overfitting you can see the stats in the analysis. You can also increase the number of training samples and retrain your model to get a better result.
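If it helps, here is a rough sketch of that flow with the google-cloud-aiplatform Python SDK; the model ID, region and sample file are placeholders, and the instance/parameter schema is the one assumed for AutoML image classification.

```python
# Sketch: deploy an uploaded AutoML image classification model and send a test image.
import base64
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model("projects/my-project/locations/us-central1/models/1234567890")
endpoint = model.deploy()  # AutoML image models use automatic resources

# Encode a sample image and request a prediction from the endpoint.
with open("sample.jpg", "rb") as f:
    content = base64.b64encode(f.read()).decode("utf-8")

response = endpoint.predict(
    instances=[{"content": content}],
    parameters={"confidenceThreshold": 0.5, "maxPredictions": 5},
)
print(response.predictions)
```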
I need to train a custom OCR model in Vertex AI. My data is a folder of cropped images, where each image is a single line of text, plus a CSV file with two columns: the image name and the text in the image.
But when I tried to import it into a dataset in Vertex AI, I saw that image datasets only support classification, segmentation and object detection. All of those dataset types have a fixed number of labels, but my data has a practically unlimited number of labels (if we view the text in an image as its label), so none of the types match my requirement. Can I use Vertex AI for training, and how do I do that?
Since Vertex AI managed datasets do not support OCR applications, you can train and deploy a custom model using Vertex AI’s training and prediction services.
I found a good article on building an OCR system from scratch. This OCR system is implemented in two steps:
Text detection
Text recognition
Please note that this article is not officially supported by Google Cloud.
Once you have tested the model locally, you can train the same on Vertex AI using the custom model training service. Please follow this codelab for step-by-step instructions on training and deploying a custom model.
Once the training is complete, the model can be deployed for inference using a pre-built container offered by Vertex AI or a custom container, based on your requirements. You can also choose between online predictions for synchronous requests and batch predictions for asynchronous requests.
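As a very rough illustration of that flow with the Vertex AI Python SDK (the training script, container images, bucket paths and arguments below are placeholders, not part of the linked codelab):

```python
# Hedged outline: train a custom OCR model and run an asynchronous batch prediction.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1", staging_bucket="gs://my-bucket")

# Package the OCR training code (text detection + recognition) as a script or container.
job = aiplatform.CustomTrainingJob(
    display_name="ocr-training",
    script_path="train_ocr.py",  # placeholder training script
    container_uri="us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-13:latest",  # placeholder image
    model_serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/pytorch-gpu.1-13:latest",
)

model = job.run(
    model_display_name="ocr-model",
    args=["--images=gs://my-bucket/lines/", "--labels=gs://my-bucket/labels.csv"],
    replica_count=1,
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
)

# Batch prediction is asynchronous; online prediction would use model.deploy()
# followed by endpoint.predict() instead.
batch_job = model.batch_predict(
    job_display_name="ocr-batch-predict",
    gcs_source="gs://my-bucket/test_lines/*",
    gcs_destination_prefix="gs://my-bucket/predictions/",
    machine_type="n1-standard-4",
)
```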
I am looking at the Google AutoML Vision API and the Google Vision API. I know that if you use the Google AutoML Vision API, it is a custom model because you train ML models based on your own images and define your own labels. And when using the Google Vision API, you are using a pretrained model...
However, I am wondering if it is possible to use my own algorithm (one which I created myself, not one provided by Google) with the Vision / AutoML Vision API instead?...
Sure, you can definitely deploy your own ML algorithm on Google Cloud, without being tied up to the Vision or AutoML API.
Two approaches that I have used many times for this same use case:
Serverless approach, if your model is relatively light in terms of computational resource requirements: deploy your own custom Cloud Function. More info here.
To be more specific, the way it works is that you just call your Cloud Function, passing your image directly (base64-encoded or pointing to a storage location). The function then automatically allocates all required resources, runs your custom algorithm to process the image and/or run inferences, sends the results back and vanishes (all resources released, no more running costs). Neat :)
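A minimal sketch of such a function is below; the my_model module, its load_model/predict helpers and the image_b64 field are purely illustrative placeholders for your own code and payload format.

```python
# Sketch of an HTTP Cloud Function (Python runtime) that runs a custom model on an image.
import base64

import functions_framework
from my_model import load_model, predict  # placeholder for your own inference code

model = load_model()  # loaded once per instance, reused across invocations


@functions_framework.http
def classify_image(request):
    payload = request.get_json(silent=True) or {}
    image_bytes = base64.b64decode(payload["image_b64"])  # base64-encoded image in the request body
    result = predict(model, image_bytes)
    return {"predictions": result}
```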
Google AI Platform. More info here
Use AI Platform to train your machine learning models at scale, to host your trained model in the cloud, and to use your model to make predictions about new data.
If in doubt, go for AI Platform, as the whole pipeline is nicely lined up for any of your custom code/models. Perfect for deployment in production as well.
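For completeness, submitting a training job to AI Platform can also be done programmatically; here is a hedged sketch with the Google API discovery client, where the job ID, package URI, module name and runtime versions are placeholders to adapt to your own trainer.

```python
# Illustrative sketch: submit an AI Platform training job via the REST client library.
from googleapiclient import discovery

ml = discovery.build("ml", "v1")

job_spec = {
    "jobId": "my_custom_training_001",  # placeholder job ID
    "trainingInput": {
        "scaleTier": "BASIC",
        "packageUris": ["gs://my-bucket/packages/trainer-0.1.tar.gz"],  # your packaged trainer
        "pythonModule": "trainer.task",
        "region": "us-central1",
        "runtimeVersion": "2.11",
        "pythonVersion": "3.7",
    },
}

request = ml.projects().jobs().create(parent="projects/my-project", body=job_spec)
response = request.execute()
print(response)
```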
I want to plan an architecture based on the GCP cloud platform. Below are the subject areas I have to cover. Can someone please help me find the proper services that will perform those operations?
Data ingestion (Batch, Real-time, Scheduler)
Data profiling
AI/ML based data processing
Analytical data processing
Elastic search
User interface
Batch and Real-time publish
Security
Logging/Audit
Monitoring
Code repository
If I am missing something that I have to take care of, then please add that too.
GCP offers many products whose functionality can partially overlap. Which product to use would depend on the more specific use case, and you can find an overview about it here.
That being said, an overall summary of the services you asked about would be:
1. Data ingestion (Batch, Real-time, Scheduler)
That will depend on where your data comes from, but the most common options are Dataflow (both for batch and streaming) and Pub/Sub for streaming messages.
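As a bare-bones illustration, a streaming ingestion pipeline built with the Apache Beam Python SDK (which Dataflow runs) could look roughly like this; the project, topic, bucket and table names are placeholders.

```python
# Sketch: read messages from Pub/Sub and stream them into BigQuery via Dataflow.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
    streaming=True,
)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
        | "Parse" >> beam.Map(lambda msg: {"raw": msg.decode("utf-8")})
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",
            schema="raw:STRING",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```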
2. Data profiling
Dataprep (which actually runs on top of Dataflow) can be used for data profiling; here is an overview of how you can do it.
3. AI/ML based data processing
For this, you have several options depending on your needs. For developers with limited machine learning expertise there is AutoML, which allows you to quickly train and deploy models. For more experienced data scientists there is ML Engine, which allows training and prediction with custom models built with frameworks like TensorFlow or scikit-learn.
Additionally, there are some pre-trained models for things like video analysis, computer vision, speech to text, speech synthesis, natural language processing or translation.
Plus, it’s even possible to perform some ML tasks directly in GCP’s data warehouse, BigQuery, using SQL.
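As a small, hedged example of what that looks like through the BigQuery Python client (the dataset, table and column names are made up):

```python
# Sketch: train a BigQuery ML logistic regression model entirely inside BigQuery.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

create_model_sql = """
CREATE OR REPLACE MODEL `my_dataset.churn_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
SELECT tenure_months, monthly_spend, churned
FROM `my_dataset.customers`
"""

client.query(create_model_sql).result()  # training runs as a regular BigQuery job
```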
4. Analytical data processing
Depending on your needs, you can use Dataproc, which is a managed Hadoop and Spark service, or Dataflow for stream and batch data processing.
BigQuery is also designed with analytical operations in mind.
5. Elastic search
There is no managed Elasticsearch service directly provided by GCP, but you can find several options on the Marketplace, like an API service or a Kubernetes app for Google’s Kubernetes Engine.
6. User interface
If you are referring to a user interface for your own use, GCP’s console is what you’d be using. If you are referring to a UI for end-users, I’d suggest using App Engine.
If you are referring to a UI for data exploration, there is Datalab, which is essentially a managed notebook service, and Data Studio, where you can build plots of your data in real time.
7. Batch and Real-time publish
The publishing service in GCP, for both batch and real-time messaging, is Pub/Sub.
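Publishing a message from Python is a one-liner with the client library; a minimal sketch (project and topic names are placeholders):

```python
# Sketch: publish a message to a Pub/Sub topic.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "events")

# publish() is asynchronous; the returned future resolves to the server-assigned message ID.
future = publisher.publish(topic_path, b"payload bytes", origin="batch-job")
print(future.result())
```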
8. Security
Most security concerns in GCP are addressed here. This is a wide topic in itself and would probably warrant a separate question.
9. Logging/Audit
GCP uses Stackdriver for logging of most of its products, and provides many ways to process and analyze those logs.
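For your own application code, a minimal sketch of sending logs to Stackdriver with the google-cloud-logging client:

```python
# Sketch: route the standard Python logging module to Cloud Logging (Stackdriver).
import logging

import google.cloud.logging

client = google.cloud.logging.Client()
client.setup_logging()  # attaches a Cloud Logging handler to the root logger

logging.info("Pipeline step finished")  # now appears in Stackdriver / Cloud Logging
```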
10. Monitoring
Stackdriver also has monitoring features.
11. Code repository
For this there is Cloud Source Repositories, which integrates with GCP’s automated build system and can also be easily synced with a GitHub repository.
12. Analytical data warehouse
You did not ask for this one, but I think it's an important part of a data analysis stack.
In the case of GCP, this would be BigQuery.