Accessing TPUs in the ML Engine

I'd like to train my TensorFlow model on a TPU by setting --scale-tier to basic-tpu. This is listed as an option in the documentation, but there's no mention of TPUs on the pricing page. Is there any way to find out how much it costs to use TPUs for training?

There is a signup form linked at the top of the Cloud TPU page. You can sign up for the alpha there, and also opt in to receive updates on Cloud TPUs.

Related

GCP | How can I see all the running virtual machines in a project?

I wrote some code to automate the training procedure on our company's VM instances.
As you probably know, GCP sometimes can't provide a machine at the moment you request it and raises an 'out of resource' exception.
So I'd like to monitor which of my machines started successfully and which did not.
It would be great if there were a way to show this in BigQuery.
Thanks.

Using Cloud Monitoring (formerly Stackdriver) is a good way to monitor all your VMs.
Here is a detailed guide to implementing Monitoring on a Compute Engine instance.
Hope you find it useful.

You can use Google Cloud's activity logs too:
Activity logging is enabled by default for all Compute Engine projects.
You can see your project's activity logs through the Logs Viewer in the Google Cloud Console:
In the Cloud Console, go to the Logging page.
When in the Logs Viewer, select and filter your resource type from the first drop-down list.
From the All logs drop-down list, select compute.googleapis.com/activity_log to see Compute Engine activity logs.
Here is the official documentation.
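Alongside Monitoring and the logs, a quick status check can also be scripted. The sketch below assumes you have saved the output of `gcloud compute instances list --format=json` and that each entry carries `name` and `status` fields. The helper name `group_by_status` and the sample instances are made up for illustration.

```python
import json
from collections import defaultdict


def group_by_status(instances_json: str) -> dict:
    """Group instance names by the status reported for each instance."""
    groups = defaultdict(list)
    for instance in json.loads(instances_json):
        # Fall back to UNKNOWN if an entry has no status field.
        groups[instance.get("status", "UNKNOWN")].append(instance["name"])
    return {status: sorted(names) for status, names in groups.items()}


# Hypothetical sample standing in for real gcloud output.
sample = json.dumps([
    {"name": "trainer-1", "status": "RUNNING"},
    {"name": "trainer-2", "status": "TERMINATED"},
    {"name": "trainer-3", "status": "RUNNING"},
])

statuses = group_by_status(sample)
print(statuses)
```

Machines that failed to start because of a resource shortage would not show up under RUNNING, so comparing this result with the list of machines you tried to start gives the ones that need a retry.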

Extract gcloud VM Instance Monitoring Data to BigQuery

Outline
We are running an e-commerce platform on Google Cloud on a dedicated VM instance. Most of our traffic occurs on Mondays, when we send newsletters to our customer base, so we see huge traffic peaks each Monday.
Goal
Because of this peak traffic, we need to understand how much server load a single user generates on average. To achieve this, we want to correlate our VM instance monitoring data with our Google Analytics data in Google Data Studio to get a better understanding of these dynamics.
Problem
As far as we are aware (based on the docs), Google Data Studio cannot consume data directly from the gcloud SDK. Given that, we tried to extract the data via BigQuery, but we didn't find a way to access the monitoring data of our VM instance there either.
We are therefore looking for a way to get the monitoring data of our VM instances into Google Data Studio (preferably via BigQuery). Thank you for your help.

Here is Google's official solution for exporting monitoring metrics.
The page describes how to export monitoring metrics to a BigQuery dataset.
The solution is deployed with Pub/Sub, App Engine, Cloud Scheduler and some Python code.
I think you only need to export the metrics listed here.
Once the export is working, you can use Google Data Studio to visualize your metric data.
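To give an idea of what such an export does with each metric, here is a minimal sketch that flattens one Cloud Monitoring time series (in the JSON shape returned by the API, with `metric`, `resource` and `points`) into flat rows suitable for a BigQuery table. It is not the code from the official solution. The row column names, the function name and the sample values are assumptions.

```python
def flatten_time_series(series: dict) -> list:
    """Turn one time series into a list of flat row dicts, one per point."""
    metric = series.get("metric", {})
    resource = series.get("resource", {})
    rows = []
    for point in series.get("points", []):
        value = point.get("value", {})
        # A typed value carries one key; int64 values arrive as strings in JSON.
        number = next(
            (value[key] for key in ("doubleValue", "int64Value") if key in value),
            None,
        )
        rows.append({
            "metric_type": metric.get("type"),
            "instance_id": resource.get("labels", {}).get("instance_id"),
            "end_time": point["interval"]["endTime"],
            "value": float(number) if number is not None else None,
        })
    return rows


# Hypothetical sample: two CPU utilization points for one VM.
sample_series = {
    "metric": {"type": "compute.googleapis.com/instance/cpu/utilization"},
    "resource": {"type": "gce_instance", "labels": {"instance_id": "1234567890"}},
    "points": [
        {"interval": {"endTime": "2020-01-06T10:01:00Z"}, "value": {"doubleValue": 0.42}},
        {"interval": {"endTime": "2020-01-06T10:00:00Z"}, "value": {"doubleValue": 0.35}},
    ],
}

rows = flatten_time_series(sample_series)
for row in rows:
    print(row)
```

With one row per point and a timestamp column, the data can be joined against Google Analytics sessions per minute or per hour in Data Studio.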

GCP auto shutdown and startup using Google Cloud Schedulers

I want to start/stop a set of Compute engine instances in Google Cloud Platform using Google Cloud Scheduler. How can I do it?

To start and stop Compute Engine instances with Cloud Scheduler, you can follow this Google tutorial, or this other one.
I won't copy-paste the required code here because the tutorial is very complete, but I will summarize the steps to follow:
Set up your Compute Engine instances.
Deploy the start Cloud Function. You can see an example here.
Deploy the stop Cloud Function. You can see an example here.
Set up the Cloud Scheduler jobs.
If you need any help with the tutorial, please just let me know!
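As a rough illustration of what the start and stop functions have to do first, the sketch below decodes the message that a Pub/Sub-triggered function receives. Pub/Sub delivers the message body base64-encoded in the event's `data` field. The payload keys `zone` and `label` and the function name are assumptions made for this example, so adjust them to whatever your scheduler job actually sends.

```python
import base64
import json


def parse_schedule_message(event: dict) -> dict:
    """Decode the base64 `data` field of a Pub/Sub event into a dict."""
    raw = base64.b64decode(event["data"]).decode("utf-8")
    payload = json.loads(raw)
    # Assumed payload keys; fail early if the scheduler job sent something else.
    missing = [key for key in ("zone", "label") if key not in payload]
    if missing:
        raise ValueError("payload is missing: " + ", ".join(missing))
    return payload


# Simulate the event a scheduler job would publish through Pub/Sub.
message = json.dumps({"zone": "us-west1-b", "label": "env=dev"})
event = {"data": base64.b64encode(message.encode("utf-8")).decode("ascii")}

target = parse_schedule_message(event)
print(target)
```

The function body would then look up the instances matching the zone and label and call the Compute Engine API to start or stop them.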

I still wonder why GCP doesn't have this feature built in.
Anyway, these simple steps did the job for me:
Create a new Cloud Scheduler job.
Fill in the required details.
Choose a frequency that suits your requirements.
Set the target to Pub/Sub.
Choose the topic name (create a new topic if none exists).
In the payload section, use this stop script:
gcloud compute instances stop instance-name
To verify the change, you can run the job manually and check.

I use the VM instance API directly. No need for a Cloud Function.
Here is the link to the API description:
https://cloud.google.com/compute/docs/reference/rest/v1/instances/stop
The API call: POST https://compute.googleapis.com/compute/v1/projects/{project}/zones/{zone}/instances/{resourceId}/stop
You can start the instance in a similar way.
Example of how to configure the scheduler:
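A minimal sketch, not the answerer's actual configuration: the helper below only builds the URL that a Cloud Scheduler HTTP job would POST to. The function name and the sample project, zone and instance values are hypothetical, and the job would still need credentials that are allowed to start and stop instances.

```python
from urllib.parse import quote

COMPUTE_API = "https://compute.googleapis.com/compute/v1"


def instance_action_url(project: str, zone: str, instance: str, action: str) -> str:
    """Build the Compute Engine REST URL for starting or stopping an instance."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    # Escape each path segment so unexpected characters cannot alter the path.
    project, zone, instance = (quote(part, safe="") for part in (project, zone, instance))
    return f"{COMPUTE_API}/projects/{project}/zones/{zone}/instances/{instance}/{action}"


stop_url = instance_action_url("my-project", "europe-west1-b", "my-vm", "stop")
start_url = instance_action_url("my-project", "europe-west1-b", "my-vm", "start")
print("POST", stop_url)
print("POST", start_url)
```

One scheduler job per action keeps the setup simple: one job POSTs to the stop URL in the evening, another to the start URL in the morning.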

You can follow this Google article to achieve your goal: https://cloud.google.com/scheduler/docs/start-and-stop-compute-engine-instances-on-a-schedule.
Also, if these VM instances are stateless, I would suggest looking at the Cloud Run service, which can help you save cost and the operational overhead of configuring auto-shutdown/auto-startup.
Hope this helps.

The new Google Compute Engine feature of Instance Schedules can now be used to start and stop instances through the Cloud Console UI, using gcloud or via the API:
https://cloud.google.com/compute/docs/instances/schedule-instance-start-stop

Find the Project, Bucket, Compute Instance Details for GCP Platform

How can we programmatically find details about GCP infrastructure, such as the various folders, projects, Compute Engine instances, datasets, etc., to get a better understanding of the GCP platform?
Regards,
Neeraj

There is a service in GCP called Cloud Asset Inventory. Cloud Asset Inventory is a storage service that keeps a five-week history of Google Cloud Platform (GCP) asset metadata.
It allows you to export all asset metadata at a certain timestamp to Google Cloud Storage or BigQuery.
It also allows you to search resources and IAM policies.
It supports a wide range of resource types, including:
Resource Manager
google.cloud.resourcemanager.Organization
google.cloud.resourcemanager.Folder
google.cloud.resourcemanager.Project
Compute Engine
google.compute.Autoscaler
google.compute.BackendBucket
google.compute.BackendService
google.compute.Disk
google.compute.Firewall
google.compute.HealthCheck
google.compute.Image
google.compute.Instance
google.compute.InstanceGroup
...
Cloud Storage
google.cloud.storage.Bucket
BigQuery
google.cloud.bigquery.Dataset
google.cloud.bigquery.Table
Find the full list here.
The equivalent service in AWS is called AWS Config.
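Once an export exists, summarizing it takes only a few lines. The sketch below assumes the export is newline-delimited JSON with one asset per line and the type stored under `asset_type` (it also accepts `assetType`, in case the data came from an API response instead). The function name and the sample assets are hypothetical.

```python
import json
from collections import Counter


def count_asset_types(ndjson_text: str) -> Counter:
    """Count assets per asset type in a newline-delimited JSON export."""
    counts = Counter()
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        asset = json.loads(line)
        asset_type = asset.get("asset_type") or asset.get("assetType") or "UNKNOWN"
        counts[asset_type] += 1
    return counts


# Hypothetical export content using asset types from the list above.
sample_export = "\n".join([
    json.dumps({"name": "vm-a", "asset_type": "google.compute.Instance"}),
    json.dumps({"name": "vm-b", "asset_type": "google.compute.Instance"}),
    json.dumps({"name": "bucket-a", "asset_type": "google.cloud.storage.Bucket"}),
])

asset_counts = count_asset_types(sample_export)
for asset_type, count in asset_counts.most_common():
    print(asset_type, count)
```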

I found an open-source tool named "Forseti Security", which is easy to install and use. It has five major components:
Inventory: Regularly collects data from GCP and stores the results in Cloud SQL in the table “gcp_inventory”. To get the latest inventory information, refer to the maximum value of the column inventory_index_id.
Scanner: Periodically compares the policies applied to GCP resources with the data collected by Inventory. It stores the scanner information in the table “scanner_index”.
Explain: Helps manage Cloud IAM policies.
Enforcer: Uses the Google Cloud API to enforce the policies you have set on the GCP platform.
Notifier: Sends notifications to Slack, Cloud Storage or SendGrid, as shown in the architecture diagram.
You can find the official documentation here.
I tried this tool and found it really useful.

How to suspend/resume GCE VM

We need the ability to suspend/resume GCE VMs to optimize the use of Compute Engine resources.
Right now this feature is available in gcloud alpha:
gcloud alpha compute instances suspend INSTANCE_NAMES [INSTANCE_NAMES …] [--async][--discard-local-ssd] [--zone=ZONE] [GCLOUD_WIDE_FLAG …]
But when I executed this command I got this error:
HTTPError 400: Invalid Resource Usage:'Suspend Instance Feature is not available for this project.'
Can anyone suggest a way to suspend a Google Cloud VM? From the error, I understand that the feature has to be enabled for the project. How can I enable the Suspend Instance feature for this project?

You need to apply for early access to this feature and have your project registered with Google. Contact Google Support. Include your email address, the project ID, and how you will be using the API in your request.
Do not expect a rapid response. My requests sometimes take several weeks to be approved.

This feature is now available in beta on GCP.
gcloud beta compute instances suspend <instance-name>
gcloud beta compute instances resume <instance-name>
I tried this with the gcloud SDK; it prompted me to install the beta component, after which the commands worked for me.
More details can be found at
https://cloud.google.com/sdk/gcloud/reference/beta/compute/instances/suspend
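If you want to drive these commands from a script, one possible approach is to build the argument list in Python and hand it to `subprocess.run`. The sketch below only builds and prints the command, so it runs without gcloud installed. The function name and the sample instance and zone are hypothetical.

```python
import shlex


def suspend_resume_command(action: str, instance: str, zone: str = None) -> list:
    """Build the gcloud argument list for suspending or resuming an instance."""
    if action not in ("suspend", "resume"):
        raise ValueError("action must be 'suspend' or 'resume'")
    argv = ["gcloud", "beta", "compute", "instances", action, instance]
    if zone:
        argv.append(f"--zone={zone}")
    return argv


cmd = suspend_resume_command("suspend", "my-vm", zone="us-central1-a")
# To execute it for real: subprocess.run(cmd, check=True)
print(shlex.join(cmd))
```

Passing a list instead of a shell string avoids quoting problems if an instance name ever contains unexpected characters.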
