I have searched a lot, but I cannot understand what the difference is between Google AI Platform and ML Engine.
It seems that both of them can be used for training and deploying models.
Other names like google-cloud-automl and Google AI Hub are also very confusing.
What are the differences between them? Thanks
The short answer is: there isn't one. In 2019 "ML Engine" was renamed to "AI Platform", and over time some services changed and expanded. To see what has changed, check the release notes, starting from around April. I say "around" because they haven't left much trace that ML Engine ever existed.
Here's one of the pull requests to "Rename Cloud ML Engine to AI Platform" for the Python samples.
Cloud ML Engine = AI Platform Training + AI Platform Prediction (It was just a name change). Used for training and deploying ML models.
AI Platform Training: Bring your own code and submit Training jobs using supported ML frameworks such as TensorFlow, scikit-learn, XGBoost, Keras, etc.
AI Platform Prediction: Host your Model and use AI Platform Prediction to infer target values for new data.
Google Cloud AutoML = You don't need to write code: bring your dataset and GCP automatically picks the best model for you.
Different products:
Vision
Video Intelligence
Natural Language
Translation
Tables.
Google AI Hub = It is a Catalog: Discover Notebooks, Models and Pipelines.
Edit: Now AI Platform is called Vertex AI
Correct, the previous ML Engine service is now part of the Cloud AI Platform portfolio of products, which provides an end-to-end platform to build, run, and manage ML projects.
Please follow the instructions on how to use the service here.
I am looking at Google AutoML Vision API and Google Vision API. I know that if you use Google AutoML Vision API, you get a custom model, because you train ML models based on your own images and define your own labels. And when using Google Vision API, you are using a pretrained model...
However, I am wondering if it is possible to use my own algorithm (one which I created, not one provided by Google) and use that instead with Vision / AutoML Vision API? ...
Sure, you can definitely deploy your own ML algorithm on Google Cloud, without being tied to the Vision or AutoML API.
Two approaches that I have used many times for this same use case:
Serverless approach, if your model is relatively light in terms of computational requirements: deploy your own custom Cloud Function. More info here.
To be more specific, you just call your Cloud Function, passing your image directly (as base64 or as a pointer to a storage location). The function then automatically allocates all required resources, runs your custom algorithm to process the image and/or run inference, sends the results back and vanishes (all resources released, no more running costs). Neat :)
Google AI Platform. More info here
Use AI Platform to train your machine learning models at scale, to host your trained model in the cloud, and to use your model to make predictions about new data.
If in doubt, go for AI Platform, as the whole pipeline is nicely lined up for any of your custom code/models. It is well suited to production deployment as well.
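For the serverless option, here is a rough sketch of what the function body could look like, in plain Python with no Google libraries. The field name image_base64 and the functions handle_request and run_custom_model are hypothetical names made up for illustration; run_custom_model is only a stand-in for your own algorithm.

```python
import base64


def run_custom_model(image_bytes):
    """Placeholder for your own algorithm; here it only reports the size."""
    return {"num_bytes": len(image_bytes)}


def handle_request(payload):
    """Decode a base64 image from a dict and run the model on it.

    Returns a (response_dict, http_status) pair.
    """
    encoded = payload.get("image_base64")  # hypothetical field name
    if encoded is None:
        return {"error": "missing 'image_base64' field"}, 400
    try:
        # validate=True rejects characters outside the base64 alphabet
        image_bytes = base64.b64decode(encoded, validate=True)
    except (ValueError, TypeError):
        return {"error": "image is not valid base64"}, 400
    return run_custom_model(image_bytes), 200


example = {"image_base64": base64.b64encode(b"abc").decode("ascii")}
print(handle_request(example))
```

In a real HTTP-triggered Cloud Function on the Python runtime, the entry point receives a Flask request object, so you would likely pass request.get_json() into something like handle_request.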
We are going to build model serving infrastructure. I am comparing Google AI Platform Prediction and KFServing, but I cannot find enough documentation about the features of Google AI serving and how it is implemented.
It seems that gcloud ai-platform versions create can create a model version resource and start serving, which is the only thing I could find.
I have three questions:
1. What is the relationship between Google AI serving and KFServing?
2. How does gcloud ai-platform versions create work?
3. As for features, does Google AI serving provide all the features listed in https://www.kubeflow.org/docs/components/serving/overview/, such as canary rollout, explainers, monitoring, etc.?
The document you shared contains extensive information about Google AI Platform Prediction. In summary, it is a hosted service in GCP where you don't need to manage the infrastructure. You just deploy your model and a new REST endpoint will be available for you to start sending predictions via SDK or API.
Supports multiple frameworks:
TensorFlow
scikit-learn
XGBoost
PyTorch
Custom Docker containers (soon)
GPU support
Model versions
Online and Batch prediction
Logging and Monitoring
Multiple Regions
REST API
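To make the REST endpoint concrete, here is a minimal sketch of what an online prediction call looks like at the HTTP level. It only builds the URL and the JSON body; it does not send anything or handle authentication. The project, model and version names are made-up placeholders, build_predict_request is a hypothetical helper, and the URL pattern is the global ml.googleapis.com v1 endpoint as I remember it from the AI Platform docs, so double-check it against the current reference.

```python
import json


def build_predict_request(project, model, instances, version=None):
    """Return (url, body) for an AI Platform online prediction call.

    If version is None, the model's default version is addressed.
    """
    name = f"projects/{project}/models/{model}"
    if version is not None:
        name += f"/versions/{version}"
    url = f"https://ml.googleapis.com/v1/{name}:predict"
    # Online prediction expects a JSON object with an "instances" list.
    body = json.dumps({"instances": instances})
    return url, body


# Placeholder names, for illustration only.
url, body = build_predict_request(
    "my-project", "my_model", [[1.0, 2.0]], version="v1"
)
print(url)
print(body)
```

You would then POST the body to that URL with an OAuth 2.0 bearer token, or let the Google API client library do this for you.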
Answer to your questions:
With KFServing you need to manage your own K8s/Kubeflow infrastructure.
Kubeflow supports two model serving systems that allow multi-framework model serving: KFServing and Seldon Core.
With the AI Platform service you don't manage the infrastructure and don't need K8s/Kubeflow; you simply deploy your models and GCP takes care of the infra.
gcloud ai-platform versions create deploys one or more VMs in Google Cloud where, based on the runtime and framework settings, all the dependencies are installed automatically, along with everything needed to load your model, so that you can access it through a REST API.
Canary rollouts can be implemented using different models and versions; it depends on how you route your prediction requests. Check the What-If Tool and model logging.
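If you end up splitting traffic yourself, one way to do a canary is to pick the version on the client side before each prediction request. A minimal sketch, assuming two hypothetical version names (v1 and v2_canary) and a 90/10 split; pick_version is an illustrative helper, not part of any Google SDK.

```python
import random


def pick_version(weights, rng=random):
    """Pick a version name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical version names and traffic split.
split = {"v1": 0.9, "v2_canary": 0.1}

rng = random.Random(0)  # seeded only so the demo is repeatable
counts = {name: 0 for name in split}
for _ in range(10_000):
    counts[pick_version(split, rng)] += 1
print(counts)  # roughly 90% of picks go to v1
```

The chosen name would then go into the version part of the prediction request. To roll the canary forward, you raise its weight step by step while watching the logs for both versions.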
Google AI Platform can be used to manage the following stages in the ML workflow:
-Train an ML model on your data:
Train model
Evaluate model accuracy
Tune hyperparameters
-Deploy your trained model.
-Send prediction requests to your model:
Online prediction
Batch prediction (for TensorFlow only)
-Monitor the predictions on an ongoing basis.
-Manage your models and model versions.
KFServing enables serverless inferencing on Kubernetes and provides performant, high abstraction interfaces for common machine learning (ML) frameworks like TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX to solve production model serving use cases.
I want to plan an architecture based on the GCP cloud platform. Below are the subject areas that I have to cover. Can someone please help me find the proper services for each of them?
Data ingestion (Batch, Real-time, Scheduler)
Data profiling
AI/ML based data processing
Analytical data processing
Elasticsearch
User interface
Batch and Real-time publish
Security
Logging/Audit
Monitoring
Code repository
If I am missing something that I need to take care of, please add it too.
GCP offers many products whose functionality partially overlaps. Which product to use depends on your specific use case, and you can find an overview here.
That being said, an overall summary of the services you asked about would be:
1. Data ingestion (Batch, Real-time, Scheduler)
That will depend on where your data comes from, but the most common options are Dataflow (both for batch and streaming) and Pub/Sub for streaming messages.
2. Data profiling
Dataprep (which actually runs on top of Dataflow) can be used for data profiling; here is an overview of how you can do it.
3. AI/ML based data processing
For this, you have several options depending on your needs. For developers with limited machine learning expertise there is AutoML, which lets you quickly train and deploy models. For more experienced data scientists there is ML Engine, which allows training and prediction with custom models built with frameworks like TensorFlow or scikit-learn.
Additionally, there are some pre-trained models for things like video analysis, computer vision, speech to text, speech synthesis, natural language processing or translation.
Plus, it’s even possible to perform some ML tasks in SQL inside BigQuery, GCP’s data warehouse.
4. Analytical data processing
Depending on your needs, you can use Dataproc, which is a managed Hadoop and Spark service, or Dataflow for stream and batch data processing.
BigQuery is also designed with analytical operations in mind.
5. Elasticsearch
There is no managed Elasticsearch service directly provided by GCP, but you can find several options on the marketplace, like an API service or a Kubernetes app for Google’s Kubernetes Engine.
6. User interface
If you are referring to a user interface for your own use, GCP’s console is what you’d be using. If you are referring to a UI for end-users, I’d suggest using App Engine.
If you are referring to a UI for data exploration, there is Datalab, which is essentially a managed notebook service, and Data Studio, where you can build plots of your data in real time.
7. Batch and Real-time publish
The publishing service in GCP, for both synchronous and asynchronous messages, is Pub/Sub.
8. Security
Most security concerns in GCP are addressed here. This is a broad topic in itself and would probably need a separate question.
9. Logging/Audit
GCP uses Stackdriver for logging of most of its products, and provides many ways to process and analyze those logs.
10. Monitoring
Stackdriver also has monitoring features.
11. Code repository
For this there is Cloud Source Repositories, which integrates with GCP’s automated build system and can also be easily synced with a GitHub repository.
12. Analytical data warehouse
You did not ask for this one, but I think it's an important part of a data analysis stack.
In the case of GCP, this would be BigQuery.
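As mentioned under point 3, BigQuery can also train some models directly in SQL (BigQuery ML). Here is a minimal sketch of what such a statement looks like, built as a Python string. The dataset, table and column names are placeholders, and create_model_sql is a hypothetical helper; it only builds the text and does not run the query.

```python
def create_model_sql(model, source_table, label_col, model_type="linear_reg"):
    """Build a BigQuery ML CREATE MODEL statement as a string.

    The names are interpolated directly, so only pass trusted values.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', "
        f"input_label_cols = ['{label_col}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# Placeholder dataset, table and column names.
sql = create_model_sql("my_dataset.my_model", "my_dataset.training_data", "price")
print(sql)
```

You could paste the printed statement into the BigQuery console, or submit it through a BigQuery client library.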
I understand both are built on Jupyter notebooks but run in the cloud. Why do we have two, then?
Jupyter is the only thing these two services have in common.
Colaboratory is a tool for education and research. It doesn’t require any setup or other Google products to be used (although notebooks are stored in Google Drive). It’s intended primarily for interactive use and long-running background computations may be stopped. It currently only supports Python.
Cloud Datalab allows you to analyse data using Google Cloud resources. You can take full advantage of scalable services such as BigQuery and Machine Learning Engine to analyse, manipulate and visualise data. You can use it with Python, SQL, and JavaScript.
Google Colaboratory is free, but you are limited to one spec of CPU/RAM/disk/GPU.
Google Datalab is paid. You pay for whatever specs you want.
The notebook interface is also a bit different between the two.
As Google Cloud Prediction API is deprecated (https://cloud.google.com/prediction/docs/end-of-life-faq), does ml-engine provide a similar black-box?
Google Cloud ML Engine is managed TensorFlow and supports higher level APIs (see Datalab notebooks for regression and image classification - runnable in Datalab). Compared to Prediction API, there are some capability differences between the data types and some user experience delta that is being addressed in the near term.
Note that TensorFlow and ML Engine allow you a greater degree of freedom to select and tune the model & much larger scale than a blackbox - albeit with some added complexity at present. That too will be addressed soon.
Dinesh Kulkarni
Product Manager, Google Cloud ML & Datalab