How would you use an existing Compute Engine VM instance for a Google Cloud Build pipeline?
I know there's been a similar question in the past; however, the suggested answer is not really what I want: creating and then destroying a Compute Engine instance with every build.
In settings, Cloud Build allows you to enable "service account permissions" for Compute Engine (Compute Instance Admin (v1)), but I've found no information on how to use that permission and service to run the build process on one of your predefined VM instances.
Or maybe I misunderstand the answer in the linked thread above, and
COMMAND=sudo supervisorctl restart
actually restarts supervisorctl on the existing VM? Any help would be appreciated.
You can't run a Cloud Build build on a GCE instance. The most customizable option you have is to run the build on a private pool, but even then the pool is fully managed; you never get access to the underlying VM.
Another option would be to have Cloud Build start a powerful GCE instance via the GCE API, run your operations there, and then stop the instance when the work is done.
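A rough sketch of that approach, assuming a pre-existing instance named my-build-vm with a run-build.sh script on it (both hypothetical), and that the build service account has the Compute Instance Admin role plus working SSH access:

cat > cloudbuild.yaml <<'EOF'
steps:
# Start the (hypothetical) pre-existing build VM
- name: gcr.io/cloud-builders/gcloud
  args: ['compute', 'instances', 'start', 'my-build-vm', '--zone=us-central1-a']
# Run the heavy work on the VM over SSH (requires SSH keys and firewall access)
- name: gcr.io/cloud-builders/gcloud
  args: ['compute', 'ssh', 'my-build-vm', '--zone=us-central1-a', '--command=./run-build.sh']
# Stop the VM again so it only costs money while building
- name: gcr.io/cloud-builders/gcloud
  args: ['compute', 'instances', 'stop', 'my-build-vm', '--zone=us-central1-a']
EOF
gcloud builds submit --config=cloudbuild.yaml --no-source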
I want to start/stop a set of Compute engine instances in Google Cloud Platform using Google Cloud Scheduler. How can I do it?
In order to start and stop a Compute Engine instance using Cloud Scheduler, you can follow this Google tutorial, or this other one.
I won't copy-paste the required code here because the tutorial is very complete, but I will summarize the steps to follow (a sketch of the matching gcloud commands appears after the list).
Set up your Compute Engine instances
Deploy the start Cloud Function. You can see an example here.
Deploy the stop Cloud Function. You can see an example here.
Set up the Cloud Scheduler jobs
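A rough sketch of those steps with gcloud, assuming the function and topic names from Google's sample (treat the names, zone, label, and schedules as placeholders):

gcloud pubsub topics create start-instance-event
gcloud pubsub topics create stop-instance-event
# Deploy the tutorial's start/stop functions from their sample source directory
gcloud functions deploy startInstancePubSub --trigger-topic=start-instance-event --runtime=nodejs16
gcloud functions deploy stopInstancePubSub --trigger-topic=stop-instance-event --runtime=nodejs16
# Start labeled instances at 09:00 and stop them at 17:00 on weekdays
gcloud scheduler jobs create pubsub startup-workday-instances \
    --schedule='0 9 * * 1-5' --topic=start-instance-event \
    --message-body='{"zone":"us-west1-b","label":"env=dev"}'
gcloud scheduler jobs create pubsub shutdown-workday-instances \
    --schedule='0 17 * * 1-5' --topic=stop-instance-event \
    --message-body='{"zone":"us-west1-b","label":"env=dev"}'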
If you need any help with the tutorial please just let me know!
I still wonder why GCP doesn't have this feature built in.
Anyway, these simple steps did the job for me:
Create a new Cloud Scheduler job.
Fill in the required details.
Choose a frequency that suits your requirement.
Choose Pub/Sub as the target.
Choose the topic name (create a new topic if one doesn't exist yet).
In the payload section, use this stop command:
gcloud compute instances stop instance-name
To verify the change, you can run the job manually and check the instance's status.
I use the VM instance API directly; there's no need for a Cloud Function.
Here is the link to the api description:
https://cloud.google.com/compute/docs/reference/rest/v1/instances/stop
The API Call: POST https://compute.googleapis.com/compute/v1/projects/{project}/zones/{zone}/instances/{resourceId}/stop
You can start the instance in a similar way.
Here's an example of how to configure the scheduler job:
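A minimal sketch with gcloud, assuming placeholder project, zone, and instance names, plus a service account that is allowed to stop the instance:

# Nightly at 20:00: call the instances.stop REST method directly
gcloud scheduler jobs create http stop-my-vm \
    --schedule='0 20 * * *' \
    --http-method=POST \
    --uri='https://compute.googleapis.com/compute/v1/projects/my-project/zones/us-central1-a/instances/my-instance/stop' \
    --oauth-service-account-email=vm-scheduler@my-project.iam.gserviceaccount.com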
You can look at this Google article to achieve your goal: https://cloud.google.com/scheduler/docs/start-and-stop-compute-engine-instances-on-a-schedule
Also, if these VM instances are stateless, then I would suggest looking at the Cloud Run service, which can help you save cost and the operational overhead of configuring auto-shutdown/auto-startup.
Hope this helps.
The new Google Compute Engine feature of Instance Schedules can now be used to start and stop instances through the Cloud Console UI, with gcloud, or via the API:
https://cloud.google.com/compute/docs/instances/schedule-instance-start-stop
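A quick sketch with gcloud (names, region, and times are placeholders):

# Create the schedule as a resource policy: start 08:00, stop 20:00 daily
gcloud compute resource-policies create instance-schedule daily-schedule \
    --region=us-central1 \
    --vm-start-schedule='0 8 * * *' \
    --vm-stop-schedule='0 20 * * *' \
    --timezone=UTC
# Attach the schedule to an existing instance
gcloud compute instances add-resource-policies my-instance \
    --zone=us-central1-a --resource-policies=daily-schedule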
I have a batch job that takes a couple of hours to run. How can I run this in a serverless way on Google Cloud?
App Engine, Cloud Functions, and Cloud Run are limited to 10 to 15 minutes. I don't want to rewrite my code in Apache Beam.
Is there an equivalent to AWS Batch on Google Cloud?
Note: Cloud Run and Cloud Functions can now last up to 60 minutes. The answer below remains a viable approach if you have a multi-hour job.
Vertex AI Training is serverless and long-lived. Wrap your batch-processing code in a Docker container, push it to gcr.io, and then run:
gcloud ai custom-jobs create \
--region=LOCATION \
--display-name=JOB_NAME \
--worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,container-image-uri=CONTAINER_IMAGE_URI
You can run any arbitrary Docker container — it doesn’t have to be a machine learning job. For details, see:
https://cloud.google.com/vertex-ai/docs/training/create-custom-job#create_custom_job-gcloud
Today you can also use Cloud Batch: https://cloud.google.com/batch/docs/get-started#create-basic-job
Google Cloud does not offer a product comparable to AWS Batch (see https://cloud.google.com/docs/compare/aws/#service_comparisons).
Instead, you'll need to use Cloud Tasks or Pub/Sub to delegate the work to another product, such as Compute Engine, which lacks the ability to do this in a "serverless" way.
Google finally released (in Beta for the moment) Cloud Batch, which does exactly what you want.
You push jobs (containers or scripts) and it runs them. Simple as that.
https://cloud.google.com/batch/docs/get-started
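A minimal sketch, assuming a container image you have already pushed (all names are placeholders):

# Describe the job: one container task, allowed to run up to 4 hours
cat > job.json <<'EOF'
{
  "taskGroups": [{
    "taskSpec": {
      "runnables": [{
        "container": {"imageUri": "gcr.io/my-project/my-batch-image"}
      }],
      "maxRunDuration": "14400s"
    }
  }]
}
EOF
gcloud batch jobs submit my-batch-job --location=us-central1 --config=job.json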
This answer to "How to make GCE instance stop when its deployed container finishes?" will work for you as well:
In short:
First, dockerize your batch process.
Then, create an instance:
Using a container-optimized image
And using a startup script that pulls your Docker image, runs it, and shuts down the machine at the end.
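A bare-bones sketch of such a startup script (the image name is hypothetical):

#!/bin/bash
# Pull and run the batch container, then power the VM off when it exits
docker pull gcr.io/my-project/my-batch-image
docker run --rm gcr.io/my-project/my-batch-image
shutdown -h now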
I have faced the same problem. In my case I went for:
Cloud Scheduler to start the job by pushing to Pub/Sub.
Pub/Sub triggers Cloud Functions.
Cloud Functions spinning up a Compute Engine instance.
Compute Engine runs the batch workload and kills the instance automatically once it's done.
You can read my post on Medium:
https://link.medium.com/1K3NsElGYZ
It might help you get started. There's also a follow up post showing how to use a Docker container inside the Compute Engine instance: https://medium.com/google-cloud/serverless-batch-workload-on-gcp-adding-docker-and-container-registry-to-the-mix-558f925e1de1
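If you want the instance to remove itself entirely at the end rather than just stop, a hypothetical tail for the startup script can ask the metadata server who it is and delete itself:

# Look up this VM's own name and zone from the metadata server
NAME=$(curl -s -H 'Metadata-Flavor: Google' http://metadata.google.internal/computeMetadata/v1/instance/name)
ZONE=$(curl -s -H 'Metadata-Flavor: Google' http://metadata.google.internal/computeMetadata/v1/instance/zone | awk -F/ '{print $NF}')
# The VM's service account needs compute.instances.delete for this to work
gcloud compute instances delete "$NAME" --zone="$ZONE" --quiet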
You can use Cloud Run. At the time of writing, the timeout for Cloud Run (fully managed) has been increased to 60 minutes, though that is still in beta.
https://cloud.google.com/run/docs/configuring/request-timeout
Important: Although Cloud Run (fully managed) has a maximum timeout of 60 minutes, only timeouts of 15 minutes or less are generally available: setting timeouts greater than 15 minutes is a Beta feature.
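Raising the timeout is a one-liner (service name and region are placeholders):

gcloud run services update my-service --region=us-central1 --timeout=3600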
Another alternative for batch computing is using Google Cloud Life Sciences.
An example application using Cloud Life Sciences is dsub.
Or see the Cloud Life Sciences Quickstart documentation.
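As a rough sketch, a dsub invocation looks something like this (project and bucket are placeholders):

pip install dsub
dsub \
    --provider google-cls-v2 \
    --project my-project \
    --regions us-central1 \
    --logging gs://my-bucket/logs \
    --image ubuntu:20.04 \
    --command 'echo "long-running batch work goes here"' \
    --wait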
I found myself looking for a solution to this problem and built something similar to what mesmacosta has described in a different answer, in the form of a reusable tool called gcp-runbatch.
If you can package your workload into a Docker image then you can run it using gcp-runbatch. When triggered, it will do the following:
Create a new VM
On VM startup, docker run the specified image
When the docker run exits, the VM will be deleted
Some features that are supported:
Invoke batch workload from the command line, or deploy as a Cloud Function and invoke that way (e.g. to trigger batch workloads via Cloud Scheduler)
stdout and stderr will be piped to Cloud Logging
Environment variables can be specified by the invoker, or pulled from Secret Manager
Here's an example command line invocation:
$ gcp-runbatch \
--project-id=long-octane-350517 \
--zone=us-central1-a \
--service-account=1234567890-compute@developer.gserviceaccount.com \
hello-world
Successfully started instance runbatch-38408320. To tail batch logs run:
CLOUDSDK_PYTHON_SITEPACKAGES=1 gcloud beta --project=long-octane-350517
logging tail 'logName="projects/long-octane-350517/logs/runbatch" AND
resource.labels.instance_id="runbatch-38408320"' --format='get(text_payload)'
GCP launched their new "Batch" service in July '22. It's basically Compute Engine packaged with some utilities to easily productionize a batch job, including defining the required resources, the executables (script- or container-based), and a run schedule.
Haven't used it yet, but seems like a great fit for batch jobs that take over 1 hr.
I have a Dataflow job. How do I define metadata tags for it so that I can log in via SSH to the Dataflow workers in GCP?
If you go to the Compute Engine section in your GCP console, you will be able to find a list of VMs running.
A set of workers for a Dataflow job will run under the same instance group, and next to each instance you can see a button with a set of options to SSH into it.
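You can do the same from the command line with gcloud; the worker VMs are normally named after the job, so a name filter (an assumption here) narrows the list down:

# List the worker VMs for a job, then SSH into one of them
gcloud compute instances list --filter='name~^my-dataflow-job'
gcloud compute ssh WORKER_INSTANCE_NAME --zone=us-central1-a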
I'd like to run a very light Ruby script once a day and then shut down my EC2 instance on AWS in order to reduce its costs.
Is it possible to schedule an instance to auto-start once a day, run a script for one hour, and then shut down?
This article covers how to use auto-scaling to launch your instance for a short time to run a job.
As for the job itself, you can trigger its execution on startup by any of the methods suitable to your OS/environment. Some are listed on this page.
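For example, a hypothetical EC2 user-data script that runs the job once at boot and then powers the instance off (so you only pay while it runs):

#!/bin/bash
# Run the daily job as the default user, then shut the instance down
su ec2-user -c 'ruby /home/ec2-user/daily_job.rb'
shutdown -h now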
You can do this with the help of Lambda and CloudWatch.
Please refer: https://aws.amazon.com/premiumsupport/knowledge-center/start-stop-lambda-cloudwatch/
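The wiring amounts to a scheduled rule pointing at a start/stop Lambda; sketched with the AWS CLI (the rule name and Lambda ARN are placeholders):

# Fire a start event every day at 08:00 UTC
aws events put-rule --name start-instance-daily \
    --schedule-expression 'cron(0 8 * * ? *)'
# Point the rule at a Lambda that calls ec2:StartInstances
aws events put-targets --rule start-instance-daily \
    --targets 'Id=1,Arn=arn:aws:lambda:us-east-1:123456789012:function:startInstance'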