A required resource is not available. Google Cloud Platform instance group

Resources being used: 1 vCPU, 3.75 GB, 1 K80 GPU. (for instance template)
Region: asia-east1.
Image: ubuntu-1804-bionic-v20190918. (for instance template)
I'm currently trying to create an instance group that spans all three asia-east1 zones. The creation fails with the error message "A required resource is not available.".
This message is very vague. Is there any way to pinpoint the exact cause of this error? If you need any further information about my environment, feel free to ask.

I have seen similar errors crop up when a quota is exceeded. My guess is that your VMs are configured with external IPs and you have exceeded the number of external IPs allowed. It could be another quota as well. I would suggest trying another region (such as us-east) or seeing whether you can use VMs with no external IPs.
Update
I just noticed that the K80 is not available in asia-east1-c. Try excluding that zone from the zones available to your instance group.
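A rough sketch of that suggestion follows: list the asia-east1 zones that offer the K80, then create the regional group in only the zones that have it. The group name `my-gpu-group`, the template name `my-gpu-template`, the size, and the helper `zones_excluding` are hypothetical placeholders, not taken from the question.

```shell
# Hypothetical helper: print a comma-separated list of zones,
# leaving out the one zone that lacks the GPU.
# Usage: zones_excluding EXCLUDED_ZONE ZONE...
zones_excluding() {
  excluded=$1
  shift
  out=""
  for z in "$@"; do
    if [ "$z" != "$excluded" ]; then
      out="${out:+$out,}$z"
    fi
  done
  echo "$out"
}

# List the asia-east1 zones that offer the K80 accelerator type.
list_k80_zones() {
  gcloud compute accelerator-types list \
    --filter="name=nvidia-tesla-k80 AND zone~asia-east1"
}

# Create the regional group in the remaining zones only.
# "my-gpu-group" and "my-gpu-template" are placeholder names.
create_group_without_zone_c() {
  gcloud compute instance-groups managed create my-gpu-group \
    --template=my-gpu-template --size=2 \
    --zones="$(zones_excluding asia-east1-c asia-east1-a asia-east1-b asia-east1-c)"
}
```

Run `list_k80_zones` first to confirm which zones to keep, since GPU availability per zone can change over time.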

If you get an error like
Resource exhausted (HTTP 429): RESOURCE_EXHAUSTED
it might be a credit issue with a free trial account (which starts with $300 in credit).
When you create an instance template on the console page, it also shows the estimated cost details, for example:
1 NVIDIA Tesla K80 GPU $357.70/month
You can also use the pricing calculator to check the cost against your credits before creating VMs.
In short, removing the GPU or activating (upgrading) your account should help.

Related

Failed to start instance: A e2-micro VM instance is currently unavailable in the us-central1-a zone

I have been facing this issue since yesterday. This is the exact error: Failed to start feature-config: A e2-micro VM instance is currently unavailable in the us-central1-a zone. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation.
I had scheduled the Compute Engine instance to turn on and off at specific times using the instance scheduler, but now I am locked out of it. I cannot even create a machine image to deploy in another zone.
I changed the machine configuration. From your answer I gathered that resources might not be available in the us-central1 zone, possibly due to demand. I changed the configuration to n2-highcpu-2 (2 vCPUs, 2 GB memory).
In the end, it seems this was a general issue that multiple users experienced in us-central1, among other regions.
This thread has more discussion, and it seems the problem got worse over the weekend.
As some of the comments suggest, changing the zone, region, or hardware can help, but not always, since this also depends on any constraints you may have.
As the error suggests, there aren't any available resources in that region. I contacted GCP support after facing the same issue and got the following response:
Google Cloud Support: Upon further checking, the reason that the e2-medium VM instance is currently unavailable is because there are limited VMs available to a specific zone and regions. Best we can do is to try another time or select a different zone so that the VMs that you desire to use will start. Rest assured that there is nothing wrong with your account and it was on the us-central1 zone who do not have available VMs you selected as of the moment.
If possible, try deploying a different instance type. Those who need an instance in us-central1 (for Qwiklabs?) may have to wait until more instances are available.
Similar issue here, but coming from a terraform apply. I've tried multiple zones and every one says both 'e2-small' and 'e2-micro' instances are unavailable. It seems Google completely fumbled the "cloud game" here, since AWS doesn't have this problem EVER! (Not that I like using AWS; it's just "ick" compared to Google.)

Can't create GPU instances on GCE

I am trying to create a GPU instance (n1-standard-2 with 1 NVIDIA T4 GPU) on Compute Engine and I have been getting this error since yesterday:
Operation type [insert] failed with message "The zone 'projects/deep-learning-xxxx/zones/us-central1-a' does not have enough resources available to fulfill the request. Try a different zone, or try again later."
It seems that this zone doesn't have enough GPU resources, but I get the same error in other zones too, even after trying multiple times. Regular non-GPU instances work fine, though. I am trying to figure out whether I'm doing something wrong or whether there is just a huge demand for GPU instances on GCP right now.
A GPU VM can fail to be created in a particular region or zone for several reasons:
1. Resource unavailability. Check the "GPU availability across regions and zones" documentation.
2. Quota overuse can restrict the creation of GPUs. Refer to "Checking project quota" for details.
3. A few GCP restrictions. Refer to the list of restrictions in the documentation.
You can check your GPU quota as described in "Create VM with GPUs".
Alternatively, GCP offers a feature called "Reserving Compute Engine zonal resources" to ensure that your project has resources for future use.
Finally, I was able to launch a preemptible GPU instance without a problem. So it really seems like Google Cloud doesn't have enough GPU resources to reserve an on-demand GPU VM at the moment.
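To tell the first two causes in the list above apart (zone capacity versus project quota), it can help to look at the wording of the error itself. The function below is a minimal sketch that matches only the error strings quoted in these threads; `classify_gce_error` is a hypothetical helper name, and other wordings may exist that it does not recognize.

```shell
# Classify a Compute Engine error message as a capacity problem,
# a quota problem, or unknown. The patterns come from the error
# strings quoted in these threads; this list is not exhaustive.
classify_gce_error() {
  case "$1" in
    *"Quota '"*"' exceeded"*)
      echo "quota" ;;
    *ZONE_RESOURCE_POOL_EXHAUSTED* | \
    *"does not have enough resources available"* | \
    *"is currently unavailable in the"*)
      echo "capacity" ;;
    *)
      echo "unknown" ;;
  esac
}
```

A "capacity" result points toward trying another zone, another machine shape, or a later time; a "quota" result points toward the Quotas page instead.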

How much time for GPU quota updating?

I'm trying some things on Google Cloud and I have the following issue. Some days ago I created a Deep Learning VM with Compute Engine, with 8 vCPUs and 1 Tesla K80 GPU. Everything worked fine, but now I want to try another GPU with a different memory size. So I deleted the VM instance (from Compute Engine -> VM instances) and I also deleted the deployment from Deployment Manager. Nevertheless, when I try to create a new VM, I get an error message saying that I have no more resources available, and in fact, on the quotas page, I still see the GPU usage at 1 (with a limit of 1, which is why I can't create a new instance). Does anyone know what the problem could be? Do I just have to wait? Thank you everyone!
If you receive a resource error (such as ZONE_RESOURCE_POOL_EXHAUSTED or ZONE_RESOURCE_POOL_EXHAUSTED_WITH_DETAILS) when requesting new resources, it means that the zone cannot currently accommodate your request.
This error is due to the availability of Compute Engine resources in the zone, so you could try to create the resources in another zone in the region or in another region.
You can look for another available zone in this document: Available regions and zones
If possible, change the shape of the VM you are requesting. It's easier to get smaller machine types than larger ones. A change to your request, such as reducing the number of GPUs or using a custom VM with less memory or vCPUs, might allow your request to proceed.
Also, you can create reservations for Virtual Machine (VM) instances in a specific zone, using custom or predefined machine types, with or without additional GPUs or local SSDs, to ensure resources are available for your workloads when you need them.
Additionally, you can find more information on troubleshooting this issue in the troubleshooting documentation.

ERROR: (gcloud.compute.instances.create) Could not fetch resource: - Quota 'GPUS_ALL_REGIONS' exceeded. Limit: 0.0 globally

I would like to try PEGASUS to summarize articles.
https://github.com/google-research/pegasus
I followed these instructions.
https://github.com/google-research/pegasus/tree/f76b63c2886748f7f5c6c9fb547456d8c6002562#setup
I checked which regions offer the NVIDIA Tesla V100 and decided to use us-central1-a.
https://cloud.google.com/compute/docs/gpus
I used this command.
gcloud compute instances create pegasustest --zone=us-central1-a \
  --machine-type=n1-highmem-8 --accelerator type=nvidia-tesla-v100,count=1 \
  --boot-disk-size=500GB --image-project=ml-images --image-family=tf-1-15 \
  --maintenance-policy TERMINATE --restart-on-failure
I got this error message.
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- The zone 'projects/covid19agent/zones/us-central1-a' does not have enough
resources available to fulfill the request.
Try a different zone, or try again later.
I waited 3 hours and tried again, but I got the same result.
So, I changed the zone from us-central1-a to asia-east1-c.
I used this command.
gcloud compute instances create pegasustest --zone=asia-east1-c \
  --machine-type=n1-highmem-8 --accelerator type=nvidia-tesla-v100,count=1 \
  --boot-disk-size=500GB --image-project=ml-images --image-family=tf-1-15 \
  --maintenance-policy TERMINATE --restart-on-failure
Then I got this error message.
WARNING: Some requests generated warnings:
- Disk size: '500 GB' is larger than image size: '10 GB'.
You might need to resize the root repartition manually
if the operating system does not support automatic resizing.
See https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd
for details.
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- Quota 'GPUS_ALL_REGIONS' exceeded. Limit: 0.0 globally.
Is it impossible for me to try PEGASUS? And, does it cost too much to try PEGASUS?
Let's start with the first issue. Have a look again at the error message:
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- The zone 'projects/covid19agent/zones/us-central1-a' does not have enough resources available to fulfill the request. Try a different
zone, or try again later.
When you start an instance, it requests resources such as vCPUs, memory, and GPUs. If there are not enough resources available in the zone, you get this message. More information is available in the documentation:
If you receive a resource error (such as ZONE_RESOURCE_POOL_EXHAUSTED
or ZONE_RESOURCE_POOL_EXHAUSTED_WITH_DETAILS) when requesting new
resources, it means that the zone cannot currently accommodate your
request. This error is due to Compute Engine resource obtainability,
and is not due to your Compute Engine quota.
Resource availability depends on user demand and is therefore dynamic.
There are a few ways to solve this issue:
1. Wait for a while and try to start your VM instance again (as you tried, without success this time).
2. Move your instance to another zone (as you did).
3. Reserve resources for your VM, following the documentation, to avoid this issue in the future:
Create reservations for Virtual Machine (VM) instances in a specific
zone, using custom or predefined machine types, with or without
additional GPUs or local SSDs, to ensure resources are available for
your workloads when you need them. After you create a reservation, you
begin paying for the reserved resources immediately, and they remain
available for your project to use indefinitely, until the reservation
is deleted.
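The first two options above (retry, or move to another zone) can be combined into a simple loop. This is only a sketch: `first_zone_that_works` and `create_pegasus_vm` are hypothetical helper names, the create command is the one from the question with the zone made a parameter, and the zone list in the usage comment is illustrative.

```shell
# Try a create function in each zone in turn and print the first zone
# that succeeds. The create function receives the zone as its only argument.
# Usage: first_zone_that_works CREATE_FN ZONE...
first_zone_that_works() {
  create_fn=$1
  shift
  for zone in "$@"; do
    # Send the create command's own output to stderr so that stdout
    # carries only the name of the zone that worked.
    if "$create_fn" "$zone" >&2; then
      echo "$zone"
      return 0
    fi
  done
  return 1
}

# The create command from the question, parameterized by zone.
create_pegasus_vm() {
  gcloud compute instances create pegasustest --zone="$1" \
    --machine-type=n1-highmem-8 --accelerator type=nvidia-tesla-v100,count=1 \
    --boot-disk-size=500GB --image-project=ml-images --image-family=tf-1-15 \
    --maintenance-policy TERMINATE --restart-on-failure
}

# Example (check first which zones actually offer the V100):
#   first_zone_that_works create_pegasus_vm us-central1-a asia-east1-c
```

The loop stops at the first success, so at most one VM is created. It does not help with quota errors, which fail in every zone until the quota is raised.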
Now, let's have a look at the second issue. Have a look again at this error message:
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- Quota 'GPUS_ALL_REGIONS' exceeded. Limit: 0.0 globally.
You can find more information about quotas in the documentation.
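Before requesting an increase, it may help to confirm which quota is involved and what its current limit is. The sketch below pulls the quota name and limit out of the error line, then shows two gcloud commands for inspecting project-wide and per-region quotas. The function names are hypothetical, and the project ID and region arguments are placeholders you would supply.

```shell
# Extract the quota name and limit from a "Quota ... exceeded" error line.
# Prints "NAME LIMIT", or nothing if the line does not match.
quota_from_error() {
  printf '%s\n' "$1" |
    sed -n "s/.*Quota '\([A-Z0-9_]*\)' exceeded\. *Limit: \([0-9.]*\).*/\1 \2/p"
}

# Show project-wide quotas; GPUS_ALL_REGIONS appears in this output.
# Usage: show_project_quotas PROJECT_ID
show_project_quotas() {
  gcloud compute project-info describe --project "$1"
}

# Show the quotas for a single region, including regional GPU limits.
# Usage: show_region_quotas REGION
show_region_quotas() {
  gcloud compute regions describe "$1"
}
```

For the error above, `quota_from_error` yields the name `GPUS_ALL_REGIONS` and a limit of `0.0`, meaning the project is not allowed any GPUs in any region until the quota is raised.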
To solve this issue, follow the steps below:
1. Ensure that billing is enabled for your project.
2. Request an increase in quota:
   1. Go to the Quotas page.
   2. In the Quotas page, select the quotas you want to change.
   3. Click the Edit Quotas button on the top of the page.
   4. Check the box of the service you want to edit.
   5. Fill out your name, email, and phone number, and click Next.
   6. Enter your request to increase your quota, and click Next.
   7. Submit your request.
A request to decrease quota is rejected by default. If you must reduce your quota, reply to the support email with an explanation of
your requirements. A support representative from the Compute Engine
team will respond to your request within 24 to 48 hours.
You cannot request a quota increase if you are on the 12-month, $300 free trial, because of these limitations:
Your free trial credit applies to all Google Cloud resources, with the
following exceptions:
You can't have more than 8 cores (or virtual CPUs) running at the same time.
You can't add GPUs to your VM instances.
You can't request a quota increase. For an overview of Compute Engine quotas, see Resource quotas.
You can't create VM instances that are based on Windows Server images.
You must upgrade your account to perform any of the actions in
the preceding list.
You can estimate the cost of usage with the Google Cloud Pricing Calculator.

GCP: Instance creation failed

I recently tried to create an instance group on the Google Cloud Platform (GCP) with 50 n1-standard-1 instances in zone us-east1-b, each with P100 GPUs. I requested and got approval for 200 P100 GPUs in this zone. My CPU, IP addresses, and Routes for this zone and globally all meet the quotas listed on this page.
However, right now only 21 of these 50 instances have been created. The rest show a yellow warning sign with the accompanying message: Instance 'instance-group-1-<name>' creation failed: The zone 'projects/<project>/zones/us-east1-b' does not have enough resources available to fulfill the request. '(resource type:compute)'.
Is there any place on the quotas page where I can get information on exactly which compute quota I forgot to ask more of? The error message is unfortunately not very descriptive.
Note: I suspect that this could refer to exceeding the Compute Engine API query limit of at most 2000 queries per 100 seconds. The 7 day peak usage column does show that I have exceeded it at peak times. However, my Current Usage is at less than 70 queries per 1000 seconds. When I look at my Compute Engine query usage graphed over time, it doesn't look like I have tripped the 2000 rate limit for several hours. However, the instance group still fails to populate fully to all 50 instances.
This is a typical error. It means that, at a certain point in time, the resources in 'us-east1-b' are not sufficient to scale your instance group, even though you have sufficient quota. You have two alternatives:
1. Try again later.
2. Request GPUs in another region/zone and deploy your instance group there.
Google also recommends distributing your workloads across more than one region and zone.
For more information, see the linked documentation.