GCP: Instance creation failed - google-cloud-platform

I recently tried to create an instance group on the Google Cloud Platform (GCP) with 50 n1-standard-1 instances in zone us-east1-b, each with P100 GPUs. I requested and got approval for 200 P100 GPUs in this zone. My CPU, IP addresses, and Routes for this zone and globally all meet the quotas listed on this page.
However, only 21 of the 50 instances have been created so far. The rest show a yellow hazard sign and the accompanying warning message: Instance 'instance-group-1-<name>' creation failed: The zone 'projects/<project>/zones/us-east1-b' does not have enough resources available to fulfill the request. '(resource type:compute)'.
Is there any place on the quotas page where I can find out exactly which compute quota I forgot to request an increase for? The error message is unfortunately not very descriptive.
Note: I suspect this could refer to exceeding the Compute Engine API rate limit of 2000 queries per 100 seconds. The 7 day peak usage column does show that I have exceeded it at peak times. However, my Current Usage is less than 70 queries per 1000 seconds. When I look at my Compute Engine query usage graphed over time, it doesn't look like I have tripped the 2000 rate limit for several hours. Even so, the instance group still fails to populate fully to all 50 instances.

This is a typical error. It means that, at that point in time, the zone 'us-east1-b' does not have enough resources to scale your instance group, even though you have sufficient quota. You have two alternatives:
1- Try again later.
2- Request GPUs in another region/zone and deploy your instance group there.
Google also recommends distributing your workloads across more than one region and zone.
For more information see this and this.
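The "try again later" option can be automated. Below is a minimal sketch, assuming a caller-supplied `create_instance` callable that raises `CapacityError` when the zone is out of resources. Both names are hypothetical and not part of any Google library, and the delays and attempt count are arbitrary illustrative values.

```python
import time


class CapacityError(Exception):
    """Hypothetical error raised when a zone cannot fulfil the request."""


def retry_with_backoff(create_instance, attempts=5, first_delay=60.0,
                       factor=2.0, sleep=time.sleep):
    """Call create_instance() until it succeeds or attempts run out.

    The wait grows by `factor` after every capacity failure. Any other
    exception, such as a quota error, is not retried.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return create_instance()
        except CapacityError:
            if attempt == attempts:
                raise  # give up and surface the last capacity error
            sleep(delay)
            delay *= factor
```

Only capacity failures are retried here, because waiting does not help when the cause is a quota limit.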

You have requested more vCPU capacity than your current vCPU limit of 0

I am trying to host an Unreal Engine Pixel Streaming build on AWS, and during setup I get stuck when launching the instance. I followed this tutorial here.
Please look at the error below:
Any help would be appreciated since I am a beginner in AWS.
There are default limits for various Amazon EC2 instance types. These are based upon the total number of vCPUs simultaneously running. You can access this information by selecting Limits in the sidebar of the EC2 management console.
You can click the Request limit increase button to submit a request for the limit to be increased.
These limits exist partly to prevent fraud (e.g. people consuming lots of resources and then not paying their bill) and partly to protect people from accidentally running the more expensive instances (e.g. the X family).
The g4dn.4xlarge shown in that video tutorial costs $1.204/hour (depending upon region used).
Check out this AWS doc:
https://aws.amazon.com/premiumsupport/knowledge-center/ec2-on-demand-instance-vcpu-increase/
It lets you see your current limit and shows how to request an increase if you need one (or you can just choose a different instance type/class).
Also double-check you are in the right region. An increase is only for the region you request it in. You can check what your limit is via the Quotas page: https://console.aws.amazon.com/servicequotas/home?region=us-east-1#!/dashboard
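Because the limit counts vCPUs rather than instances, a quick calculation shows whether a launch can fit. This is a minimal sketch: the `VCPUS` table is an illustrative subset (a g4dn.4xlarge has 16 vCPUs), and the function name is made up for this example.

```python
# vCPU counts per instance type (illustrative subset).
VCPUS = {"g4dn.xlarge": 4, "g4dn.4xlarge": 16}


def fits_vcpu_limit(instance_type, count, running_vcpus, vcpu_limit):
    """Return True if launching `count` instances stays within the limit."""
    requested = VCPUS[instance_type] * count
    return running_vcpus + requested <= vcpu_limit
```

With a limit of 0, as in the error above, `fits_vcpu_limit("g4dn.4xlarge", 1, 0, 0)` is `False`, so every launch fails until the limit is raised.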

Google cloud platform free tier limits from compute engine

GCP does not warn me when I create a virtual machine with resources above the free tier limit. Instead, an error message like the following appears in the notifications. So, what are the maximum resources allowed for a Google Cloud Platform virtual machine?
Create VM instance "instance-2" and its boot disk "instance-2"
Quota 'C2_CPUS' exceeded. Limit: 0.0 in region asia-south1.
As written in the documentation:
Compute Engine
1 non-preemptible e2-micro VM instance per month in one of the following US regions:
Oregon: us-west1
Iowa: us-central1
South Carolina: us-east1
30 GB-months HDD.
5 GB-month snapshot storage in the following regions:
Oregon: us-west1
Iowa: us-central1
South Carolina: us-east1
Taiwan: asia-east1
Belgium: europe-west1
1 GB network egress from North America to all region destinations (excluding China and Australia) per month
Your Free Tier e2-micro instance limit is by time, not by instance. Each month, eligible use of all of your e2-micro instances is free until you have used a number of hours equal to the total hours in the current month. Usage calculations are combined across the supported regions.
Google Cloud Free Tier does not include external IP addresses.
Compute Engine offers discounts for sustained use of virtual machines. Your Free Tier use doesn't factor into sustained use.
GPUs and TPUs are not included in the Free Tier offer. You are always charged for GPUs and TPUs that you add to VM instances.
NB: This is subject to changes, check the link for up-to-date information.
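The "by time, not by instance" rule quoted above is simple arithmetic. The following sketch assumes all usage is eligible e2-micro usage in the supported regions and ignores every other charge (disks, external IPs, egress); the function names are made up for illustration.

```python
import calendar


def free_e2_micro_hours(year, month):
    """Hours in the given month, i.e. the monthly free e2-micro allowance."""
    days = calendar.monthrange(year, month)[1]
    return days * 24


def billable_hours(hours_per_instance, year, month):
    """Hours billed after the free allowance, summed over all instances."""
    used = sum(hours_per_instance)
    return max(0, used - free_e2_micro_hours(year, month))
```

A 31-day month gives 744 free hours, so two e2-micro instances running all month would use 1488 hours, of which 744 are billable.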
Step-by-Step guide to create a free instance:
Create instance
Now go create the instance at https://console.cloud.google.com/compute/instancesAdd
region: us-east1 or one of the regions indicated in the documentation.
Select General Purpose -> E2 -> e2-micro. You will see "Your first 744 hours of e2-micro instance usage are free this month"
Select Boot disk -> public image -> ubuntu -> 20.04 LTS -> boot disk type: Standard persistent disk (HDD) -> size 30 GB (or as per documentation)
Allow http and https traffic (or don't check the boxes, if you don't intend to use port 80 and 443)
Click on Create
You can check "view billing report" to make sure you did it right.
You can find more information in the Google Cloud Free Tier documentation:
The Google Cloud Free Tier has two parts:
A 3-month (previously 12-month) free trial with $300 credit to use with any Google Cloud services.
Always Free, which provides limited access to many common Google Cloud resources, free of charge.
In the section 12-month, $300 free trial you can find the Program coverage details:
Your free trial credit applies to all Google Cloud resources, with the
following exceptions:
You can't have more than 8 cores (or virtual CPUs) running at the same time.
You can't add GPUs to your VM instances.
You can't request a quota increase. For an overview of Compute Engine quotas, see Resource quotas.
You can't create VM instances that are based on Windows Server images.
You must upgrade your account to perform any of the actions in the preceding list.
In addition, have a look at the End of the free trial:
The free trial ends when you use all of your credit, or after 12
months, whichever happens first. At that time, the following
conditions apply:
You must upgrade to a paid account to continue using Google Cloud.
All resources you created during the trial are stopped.
Any data you stored in Compute Engine is lost.
Your account enters a 30-day grace period, during which you can recover resources and data you stored in any Google Cloud services
during the trial period.
You might receive a message stating that your account has been canceled, which only indicates that your account has been suspended to
prevent charges.
and at the Recovering data:
Caution: There is no automated way to recover data that you used on VM instances you created with Compute Engine. You must manually
export any data that you want to keep from your Compute Engine VM
instances before the trial period ends.
I recommend that you upgrade your account before the free trial ends.
After the free trial period ends, you just have to register a credit card to continue using the services if/when you accrue charges. If you set it up right, it might charge you .02 cents every now and then. I just set up my first instance with WordPress; at first I was charged .02 cents/month, but once I updated the software and the config it rarely charges me. P.S. I started getting hack attempts pretty quickly.

ERROR: (gcloud.compute.instances.create) Could not fetch resource: - Quota 'GPUS_ALL_REGIONS' exceeded. Limit: 0.0 globally

I would like to try PEGASUS to summarize articles.
https://github.com/google-research/pegasus
I followed these instructions.
https://github.com/google-research/pegasus/tree/f76b63c2886748f7f5c6c9fb547456d8c6002562#setup
I checked which regions offer the NVIDIA Tesla V100 and decided to use us-central1-a.
https://cloud.google.com/compute/docs/gpus
I used this command.
gcloud compute instances create pegasustest --zone=us-central1-a \
  --machine-type=n1-highmem-8 --accelerator type=nvidia-tesla-v100,count=1 \
  --boot-disk-size=500GB --image-project=ml-images --image-family=tf-1-15 \
  --maintenance-policy TERMINATE --restart-on-failure
I got this error message.
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- The zone 'projects/covid19agent/zones/us-central1-a' does not have enough
resources available to fulfill the request.
Try a different zone, or try again later.
I waited 3 hours and tried again, but I got the same result.
So, I changed the zone from us-central1-a to asia-east1-c.
I used this command.
gcloud compute instances create pegasustest --zone=asia-east1-c \
  --machine-type=n1-highmem-8 --accelerator type=nvidia-tesla-v100,count=1 \
  --boot-disk-size=500GB --image-project=ml-images --image-family=tf-1-15 \
  --maintenance-policy TERMINATE --restart-on-failure
Then I got this error message.
WARNING: Some requests generated warnings:
- Disk size: '500 GB' is larger than image size: '10 GB'.
You might need to resize the root repartition manually
if the operating system does not support automatic resizing.
See https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd
for details.
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- Quota 'GPUS_ALL_REGIONS' exceeded. Limit: 0.0 globally.
Is it impossible for me to try PEGASUS? And, does it cost too much to try PEGASUS?
Let's start with the first issue. Have a look again at the error message:
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- The zone 'projects/covid19agent/zones/us-central1-a' does not have enough resources available to fulfill the request. Try a different
zone, or try again later.
When you start an instance, it requests resources such as vCPUs, memory and GPUs. If the zone does not have enough of these resources available, you get this message. More information is available in the documentation:
If you receive a resource error (such as ZONE_RESOURCE_POOL_EXHAUSTED
or ZONE_RESOURCE_POOL_EXHAUSTED_WITH_DETAILS) when requesting new
resources, it means that the zone cannot currently accommodate your
request. This error is due to Compute Engine resource availability,
and is not due to your Compute Engine quota.
Resource availability depends on demand from other users and is therefore dynamic.
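Since a capacity error and a quota error call for different fixes, it can help to tell them apart programmatically. The sketch below is only an illustration: it matches the message wording shown in this question, which is an assumption about the text and could change, and the function name is made up.

```python
def classify_create_error(message):
    """Return 'capacity', 'quota', or 'unknown' for an error message."""
    text = message.lower()
    if ("zone_resource_pool_exhausted" in text
            or "does not have enough resources available" in text):
        return "capacity"  # retry later or pick another zone
    if "quota" in text and "exceeded" in text:
        return "quota"  # request a quota increase instead
    return "unknown"
```

For the two messages in this question, the us-central1-a failure classifies as `capacity` and the `GPUS_ALL_REGIONS` failure as `quota`.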
There are a few ways to solve this issue:
Wait for a while and try to start your VM instance again (as you tried, without success this time).
Move your instance to another zone (as you did).
Reserve resources for your VM by following the documentation, to avoid this issue in the future:
Create reservations for Virtual Machine (VM) instances in a specific
zone, using custom or predefined machine types, with or without
additional GPUs or local SSDs, to ensure resources are available for
your workloads when you need them. After you create a reservation, you
begin paying for the reserved resources immediately, and they remain
available for your project to use indefinitely, until the reservation
is deleted.
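The second option, moving to another zone, can be sketched as a simple fallback loop. This assumes a caller-supplied `create_in_zone(zone)` callable that raises `ZoneExhausted` on a capacity error; both names are hypothetical and not part of gcloud or any Google client library.

```python
class ZoneExhausted(Exception):
    """Hypothetical error for 'zone does not have enough resources'."""


def create_in_first_available_zone(create_in_zone, zones):
    """Try each zone in order; return (zone, result) for the first success."""
    exhausted = []
    for zone in zones:
        try:
            return zone, create_in_zone(zone)
        except ZoneExhausted:
            exhausted.append(zone)  # remember the zone and move on
    raise ZoneExhausted("no capacity in any of: " + ", ".join(exhausted))
```

The candidate zones should all offer the GPU type you need, and a quota error should stop the loop rather than trigger the next zone, which is why only `ZoneExhausted` is caught.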
Now, let's have a look at the second issue. Have a look again at this error message:
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- Quota 'GPUS_ALL_REGIONS' exceeded. Limit: 0.0 globally.
You can find more information about quotas in the documentation.
To solve this issue, follow the steps below:
Ensure that billing is enabled for your project.
Request an increase in quota:
Go to the Quotas page.
In the Quotas page, select the quotas you want to change.
Click the Edit Quotas button on the top of the page.
Check the box of the service you want to edit.
Fill out your name, email, and phone number, and click Next.
Enter your request to increase your quota, and click Next.
Submit your request.
A request to decrease quota is rejected by default. If you must reduce your quota, reply to the support email with an explanation of
your requirements. A support representative from the Compute Engine
team will respond to your request within 24 to 48 hours.
You can't request a quota increase while you are on the 12-month, $300 free trial, because of these limitations:
Your free trial credit applies to all Google Cloud resources, with the
following exceptions:
You can't have more than 8 cores (or virtual CPUs) running at the same time.
You can't add GPUs to your VM instances.
You can't request a quota increase. For an overview of Compute Engine quotas, see Resource quotas.
You can't create VM instances that are based on Windows Server images.
You must upgrade your account to perform any of the actions in
the preceding list.
You can estimate the cost of usage with the Google Cloud Pricing Calculator.
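Before filing a quota request, it can be useful to confirm the current limit. The sketch below reads a quota from JSON such as that printed by `gcloud compute project-info describe --format=json`. It assumes the output has a top-level `quotas` list whose entries carry `metric`, `limit` and `usage`; the sample values are made up, and the function name is hypothetical.

```python
import json


def find_quota(describe_json, metric):
    """Return (limit, usage) for `metric`, or None if it is not listed."""
    project = json.loads(describe_json)
    for quota in project.get("quotas", []):
        if quota.get("metric") == metric:
            return quota.get("limit"), quota.get("usage")
    return None


# Sample shaped like the assumed output; the values are made up.
SAMPLE = '{"quotas": [{"metric": "GPUS_ALL_REGIONS", "limit": 0.0, "usage": 0.0}]}'
```

A limit of `0.0` for `GPUS_ALL_REGIONS`, as in the sample, matches the error in the question: no GPU can be attached in any region until the quota is raised.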

Google Compute returns "Quota 'GPUS_ALL_REGIONS' exceeded" when spinning up a GPU

We need to onboard a client to our project, and for that we need a GPU-enabled instance. In every US region where I try to spin up a GPU instance, the error below is thrown:
The zone /zones/us-east4-c' does not have enough resources available to fulfill the request. Try a different zone, or try again later.
Create VM instance "instance-1" and its boot disk "instance-1"
5 minutes ago
Quota 'GPUS_ALL_REGIONS' exceeded. Limit: 0.0 globally
The zone 'projects/zones/us-central1-f' does not have enough resources available to fulfill the request. Try a different zone, or try again later.
The zone 'projects//zones/us-central1-f' does not have enough resources available to fulfill the request. Try a different zone, or try again later.
The zone 'projects//zones/us-central1-c' does not have enough resources available to fulfill the request. Try a different zone, or try again later.
I have a GPU quota of 1 enabled for all these regions. I am not sure why the error below appeared:
Create VM instance "instance-1" and its boot disk "instance-1"
5 minutes ago
Quota 'GPUS_ALL_REGIONS' exceeded. Limit: 0.0 globally
What can we do to fix this?
It's very possible that the zone just doesn't have enough physical resources to fulfill the request.
I do see Limit: 0.0 globally, however, which tells me that across all regions you are allowed to have 0 GPUs. Even if you have the appropriate quota left in us-central1 you wouldn't be able to spin up a GPU instance because the global quota is too low (both quotas get used for this).
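The point that both quotas apply can be written as a small check. This is a sketch of the reasoning, not of how Compute Engine evaluates quotas internally; the function name, the label for the regional quota and the order of the two checks are assumptions.

```python
def gpu_quota_blocker(requested, regional_limit, regional_usage,
                      global_limit, global_usage):
    """Return the name of the quota that blocks the request, or None."""
    if global_usage + requested > global_limit:
        return "GPUS_ALL_REGIONS"
    if regional_usage + requested > regional_limit:
        return "regional GPU quota"
    return None
```

With a regional limit of 1 but a global limit of 0, `gpu_quota_blocker(1, 1, 0, 0, 0)` returns `"GPUS_ALL_REGIONS"`, which is the situation described in the question.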

A required resource is not available. Google cloud platform instance group

Resources being used: 1 vCPU, 3.75 GB, 1 K80 GPU. (for instance template)
Region: Asia east1.
Image: ubuntu-1804-bionic-v20190918. (for instance template)
I'm currently trying to create an instance group which spans all 3 asia-east1 zones. The creation fails and the error message given is "A required resource is not available.".
This message is very vague. Is there any way to pinpoint the exact cause of this error? If any further information is needed about my environment, feel free to ask.
I have seen similar errors crop up when a quota is exceeded. My guess is that your VMs are configured with external IPs and you have exceeded the number of external IPs allowed. It could be another quota as well. I would suggest trying another region (us-east) or seeing whether you can use VMs with no external IPs.
Update
Just noticed here that K80 is not available in asia-east1-c. Try excluding that from the available zones of your instance group.
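Picking the zones for the instance group can be expressed as a filter over per-zone accelerator availability. In this sketch the mapping is illustrative only: the one fact taken from this answer is that asia-east1-c lacks the K80, and the remaining entries are assumptions that should be checked against the current GPU availability table. The function name is made up.

```python
def zones_with_accelerator(zone_accelerators, accelerator):
    """Return the sorted zones whose accelerator set contains `accelerator`."""
    return sorted(zone for zone, accels in zone_accelerators.items()
                  if accelerator in accels)


# Illustrative data only; verify against the GPU regions and zones page.
AVAILABLE = {
    "asia-east1-a": {"nvidia-tesla-k80"},
    "asia-east1-b": {"nvidia-tesla-k80"},
    "asia-east1-c": {"nvidia-tesla-v100"},
}
```

Under these assumed values, `zones_with_accelerator(AVAILABLE, "nvidia-tesla-k80")` leaves out asia-east1-c, so the instance group would span only the two remaining zones.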
If you get an error like
Resource exhausted (HTTP 429): RESOURCE_EXHAUSTED
it might be a credits issue in a free trial account (which starts with $300).
When you create an instance template on the console page, it also shows the estimated cost details:
1 NVIDIA Tesla K80 GPU $357.70/month
You can also use the pricing calculator to check the cost against your credits before creating VMs.
In short, removing the GPU or activating (upgrading) your account should help.