Add GPU at Google Cloud Platform - google-cloud-platform

When I try to create a new instance, I get an error.
I asked the support team to increase the quota to 2,
but I cannot create an instance even with one GPU.
I followed the instructions exactly, but they do not work. Please help me solve the problem!

Regarding your first screenshot and the increased quota in us-east1-c: you need to increase the global GPU quota as well. Projects have a global GPU quota that applies across all regions.
Also, I recommend editing your screenshots to remove your project ID, since it is publicly visible.
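The interaction between the two quota levels can be pictured with a small sketch. It assumes a GPU request has to fit under both the regional and the global limit, so the smaller remaining allowance decides. The `Quota` class, the function names and the global limit of 0 are made up for illustration and are not read from any real project.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    limit: int       # granted quota
    in_use: int = 0  # GPUs already counted against it

    def remaining(self) -> int:
        return max(self.limit - self.in_use, 0)


def max_new_gpus(regional: Quota, global_: Quota) -> int:
    """GPUs that can still be requested: the tighter of the two quotas wins."""
    return min(regional.remaining(), global_.remaining())


def can_create(requested_gpus: int, regional: Quota, global_: Quota) -> bool:
    return requested_gpus <= max_new_gpus(regional, global_)


# Regional quota raised to 2, global quota still 0 (assumed value).
print(can_create(1, Quota(limit=2), Quota(limit=0)))  # False
# After the global quota is raised to 1 as well.
print(can_create(1, Quota(limit=2), Quota(limit=1)))  # True
```

Under these assumptions, a regional quota of 2 does not help while the global quota is still the bottleneck.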

Related

RESOURCE_EXHAUSTED with Vertex Pipeline by leveraging the free trial on GCP

I am fairly new to GCP and I am playing around with it taking advantage of the free trial.
I would like to run this simple pipeline in Vertex from a notebook, but when I run it, I get this error in the very first task.
com.google.cloud.ai.platform.common.errors.AiPlatformException: code=RESOURCE_EXHAUSTED, message=The following quota metrics exceed quota limits: aiplatform.googleapis.com/custom_model_training_cpus, cause=null;
I've looked at the quota named in the error, and I have 1 CPU for each available region. Of course, I cannot edit these quotas because of the free trial.
I also made these other attempts without success:
Set the CPU limit equal to 1 on the pipeline component;
Use the least powerful machine available (n1-standard-4, which actually uses 4 vCPUs);
Run the pipeline in different regions;
Define and run the pipeline in a completely new project;
Define and run the AutoML pipeline for classification/regression, starting from the available models.
It seems rather strange to me that it is not possible to try this service with free trial, but I don't know how to solve the problem. Any ideas? Thanks
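The mismatch in the attempts above (a quota of 1 CPU against a machine type with 4 vCPUs) can be checked with a few lines of arithmetic. This is only a sketch: it assumes that for predefined N1 types such as n1-standard-4 the trailing number is the vCPU count, and the helper names `vcpus_of` and `fits_cpu_quota` are invented for this example.

```python
import re


def vcpus_of(machine_type: str) -> int:
    """Read the vCPU count from a predefined N1 machine type name."""
    match = re.fullmatch(r"n1-(?:standard|highmem|highcpu)-(\d+)", machine_type)
    if match is None:
        raise ValueError(f"unsupported machine type: {machine_type!r}")
    return int(match.group(1))


def fits_cpu_quota(machine_type: str, cpu_quota: int, replica_count: int = 1) -> bool:
    """True if the requested machines stay within the CPU quota."""
    return vcpus_of(machine_type) * replica_count <= cpu_quota


print(vcpus_of("n1-standard-4"))                     # 4
print(fits_cpu_quota("n1-standard-4", cpu_quota=1))  # False
```

With a quota of 1, even a single n1-standard-4 worker does not fit, which is consistent with the RESOURCE_EXHAUSTED error quoted above.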
Which regions have you tried? You can check in which regions the resource in question is available on the Quotas page of your project. Go to "IAM & Admin" > "Quotas", then search for the resource in the search bar.
Another alternative is to request a quota increase. Be aware, though, that the quota increase request will go through an evaluation before it is granted. For more information about the conditions that apply to quotas, you can visit Google's documentation here.

Availability of V100 and P100 on Google Compute Engine

Description
For some time now I have been trying to set up or reserve a virtual machine for machine learning with my personal account, which I have been using for some months. I want an n1 machine with around 8 GB of RAM or more and either a P100 or a V100. I have tried at least half of all zones with P100/V100 availability and always get a resource error like this one:
Operation type [insert] failed with message "The zone 'projects/lexical-list-285719/zones/us-central1-c' does not have enough resources available to fulfill the request. Try a different zone, or try again later."
In short: no resources available in zone X. I recently switched from the free trial.
Questions:
A) Is that common?
B) Is there a fix?
C) What (if anything) can I do to get a machine with these specifications, or similar performance?
I know that this is because of the zone not having these specifications available and that I'm supposed to try switching. I'm aware too of managed instance groups. But it can't be that difficult, can it?
Is Google that booked out?
Possible Solutions
Currently my ideas to fix it:
multizone managed group (still have to check if my project is compatible with that)
cloud shell script that iterates through all available zones (would need to research how shell scripts work)
Anyone with experience in this topic who can share how these solutions worked for them, or suggest better ones, would be much appreciated.
A good answer for me would not include any of the following:
Zone Switching (tried that)
Smaller machine (tried that and project doesn't work with too small machine)
Reserving (tried that)
Waiting (already know about that and doesn't help if I want a machine right now)
That said, I would recommend exactly those steps to anyone with less persistent or urgent issues.
This is not a bug; events like this happen from time to time.
This error message means that Google has no resources such as CPU, RAM or GPU available in that particular zone. You can find more details in the documentation, under Troubleshooting VM creation, section Resource availability:
Resource errors occur when you try to request new resources in a zone
that cannot accommodate your request due to the current unavailability
of a Compute Engine resource, such as GPUs or CPUs.
Resource errors only apply to new resource requests in the zone and do
not affect existing resources. Resource errors are not related to your
Compute Engine quota and only apply to the resource you specified in
your request at the time you sent the request, not to all resources in
the zone.
Resource availability depends on user requests and is therefore dynamic.
There are a few ways to solve this issue:
Try to create your instance in another zone where the GPU is available (request a quota increase if needed).
Wait for a while and try again.
Request a smaller VM (if possible); later you will be able to request a bigger VM (same principle as for quota requests).
Reserve resources for your VM by following the documentation, to avoid this issue in the future (extra payment required).
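Because resource errors and quota errors call for different fixes, it can help to tell them apart by the error text before choosing one of the options above. The sketch below matches only on phrases taken from the two error messages quoted on this page. The function name `classify_error` and its return labels are invented for illustration, and real error texts may vary.

```python
def classify_error(message: str) -> str:
    """Rough classification of a GCP error message by its wording."""
    text = message.lower()
    # Zone stockout: not related to quota, so a quota increase will not help.
    if "does not have enough resources available" in text:
        return "zone_capacity"
    # Quota problem: request an increase or shrink the request.
    if "quota" in text and ("exceed" in text or "resource_exhausted" in text):
        return "quota"
    return "unknown"


stockout = (
    "The zone 'projects/[...]' does not have enough resources available "
    "to fulfill the request. Try a different zone, or try again later."
)
quota = (
    "code=RESOURCE_EXHAUSTED, message=The following quota metrics exceed "
    "quota limits: aiplatform.googleapis.com/custom_model_training_cpus"
)

print(classify_error(stockout))  # zone_capacity
print(classify_error(quota))     # quota
```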
I had the same issue when trying to create V100s. I got it working by switching to europe-west4.
What I tried, if you're curious: all the zones in us-central1 (failed), one zone in us-west1 (failed), and finally europe-west4 (success).
This tells me it's due to the zones not having the GPU available. I really wish Google wouldn't list a GPU as an option in a zone that cannot actually provision it, or would provide another way of knowing.
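The zone-by-zone search described here, which is also the "script that iterates through all available zones" idea from the question, can be sketched as a simple loop. Everything GCP-specific is left out on purpose: `create_in_zone` stands for whatever call actually creates the VM, `ZoneExhausted` is a hypothetical exception that call would raise on a stockout, and `fake_create` only simulates the outcome reported in this answer.

```python
from typing import Callable, Iterable, Optional


class ZoneExhausted(Exception):
    """Hypothetical error raised when a zone has no capacity for the request."""


def first_zone_with_capacity(
    zones: Iterable[str],
    create_in_zone: Callable[[str], None],
) -> Optional[str]:
    """Try each zone in order; return the first where creation succeeds."""
    for zone in zones:
        try:
            create_in_zone(zone)
        except ZoneExhausted:
            continue  # stockout in this zone, move on to the next one
        return zone
    return None  # every candidate zone was exhausted


def fake_create(zone: str) -> None:
    # Stand-in for a real create call: only europe-west4 has capacity here.
    if not zone.startswith("europe-west4"):
        raise ZoneExhausted(zone)


candidates = ["us-central1-a", "us-central1-c", "us-west1-b", "europe-west4-a"]
print(first_zone_with_capacity(candidates, fake_create))  # europe-west4-a
```

Any other error from the create call is deliberately not caught, so a quota or permission problem stops the loop instead of being retried in every zone.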

Google cloud platform, no quota "GPUs (all regions)"

I want to increase the "GPUs (all regions)", or GPUS_ALL_REGIONS, for a project on Google Cloud.
However, the option is not in the "Metric" list on the "Quotas" page of the project.
Does anyone know how this can happen? For other projects I have on the same Billing Account, the option is present in the list:
Present: https://i.stack.imgur.com/6iQaJ.png
Not present: https://i.stack.imgur.com/FAhJ9.png
Please keep in mind that Compute Engine is enabled on both projects.
Thank you very much for your help!
Regarding your concern: quota is based on reputation. Some projects have the "GPUs (all regions)" metric and others do not. Older projects may have this metric, but most of the time new projects do not.
Besides this, you can submit a quota increase request for each region, and a dedicated team will assist you. Also, please make sure the GPU type is available in the requested region. Please use this link to submit your request.

not have enough resources available to fulfil the request try a different zone

All of my machines, in different zones,
have the same issue and cannot run.
"Starting VM instance "home-1" failed.
Error:
The zone 'projects/extreme-pixel-208800/zones/us-west1-b' does not have enough resources available to fulfill the request. Try a different zone, or try again later."
I am having the same issue. I emailed Google and found out that this has nothing to do with quota. However, you can try to reduce the requirements of your instance (e.g. less RAM, fewer CPUs or GPUs). It might work if you are lucky.
Secondly, if you email Google, you will get a reply based on the following template:
Good day! This is XX from Google Cloud Platform Support and I'll be
glad to help you from here. First, my apologies that you’re
experiencing this issue. Rest assured that the team is working hard to
resolve it.
Our goal is to make sure that there are available resources in all
zones. This type of issue is rare, when a situation like this occurs
or is about to occur, our team is notified immediately and the issue
is investigated.
We recommend deploying and balancing your workload across multiple
zones or regions to reduce the likelihood of an outage. Please review
our documentation [1] which outlines how to build resilient and
scalable architectures on Google Cloud Platform.
Again, we want to offer our sincerest apologies. We are working hard
to resolve this and make this an exceptionally rare event. I'll be
keeping this case open for one (1) business day in case you have
additional question related to this matter, otherwise you may
disregard this email for this ticket to automatically close.
All the best,
XXXX Google Cloud Platform Support
[1] https://cloud.google.com/solutions/scalable-and-resilient-apps
So, if you ask me how long you can expect to wait and when this issue is likely to happen:
I waited 1.5 to 3 days on average.
During daytime EST on weekends (Friday to Sunday), there is a high probability that GCP resources are unavailable.
Usually, when one instance has this issue, the others do too. For me, trying different regions was a waste of time. (But maybe I just had bad luck.)
The error message "The zone 'projects/[...]' does not have enough resources available to fulfill the request. Try a different zone, or try again later." is always in reference to a shortage of resources in a zone.
Google recommends spreading your workload across different zones to reduce the impact of these issues on your workload. Otherwise, there isn't much else to do other than wait or try another zone/region.
I faced this issue yesterday (01/Aug/2020), when my GCP free credit ran out, and the steps below helped me work around it.
I was in the asia-south-c zone and moved to a US zone.
Go to Google Cloud Platform >>> Compute Engine.
Go to Snapshots >>> Create a snapshot >>> select your Compute Engine instance.
Once the snapshot is complete, click on it.
You end up under "Snapshot details". There, at the top, click Create instance. This creates an instance with a copy of your disk.
Select your new zone, re-apply all previous settings (don't forget to attach GPUs), and give the instance a new name.
Click Create. That's it: your instance should now be running in the new zone.
There is no need to worry about losing your configuration either.
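For anyone who prefers the command line, the console steps above correspond roughly to two gcloud invocations. The sketch below only builds the argument lists and prints them; it does not run anything. The disk, snapshot, instance and zone names are placeholders, the accelerator type is just an example, and the flags (`--snapshot-names`, `--source-snapshot`, `--accelerator`, `--maintenance-policy`) are written from memory, so check them against the gcloud reference before use.

```python
from typing import List


def snapshot_command(disk: str, source_zone: str, snapshot: str) -> List[str]:
    """Build a gcloud call that snapshots the disk of the stuck instance."""
    return [
        "gcloud", "compute", "disks", "snapshot", disk,
        f"--zone={source_zone}",
        f"--snapshot-names={snapshot}",
    ]


def create_from_snapshot_command(
    instance: str,
    target_zone: str,
    snapshot: str,
    accelerator_type: str,
    gpu_count: int = 1,
) -> List[str]:
    """Build a gcloud call that creates a new VM in another zone from the snapshot."""
    return [
        "gcloud", "compute", "instances", "create", instance,
        f"--zone={target_zone}",
        f"--source-snapshot={snapshot}",
        f"--accelerator=type={accelerator_type},count={gpu_count}",
        # GPU instances cannot live-migrate, so they must terminate on maintenance.
        "--maintenance-policy=TERMINATE",
    ]


# Placeholder names and zones, for illustration only.
print(" ".join(snapshot_command("my-vm", "us-central1-c", "my-vm-snap")))
print(" ".join(create_from_snapshot_command(
    "my-vm-copy", "europe-west4-a", "my-vm-snap", "nvidia-tesla-v100")))
```

Other settings of the original instance, such as the machine type, are not carried over by the snapshot and would need to be added to the create command in the same way.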

Google Cloud Platform GPU Quotas not always displayed

I am using GCP with several identical projects. For each new project I need a quota of one GPU (Tesla K80). To apply for an increase of my GPU quota, I open the console and navigate to "IAM & Admin" > "Quotas". There I filter for my region (europe-west1) and look for the "NVidia K80 GPUs" entry.
I have noticed that the Compute Engine APIs only appear after visiting the "Compute Engine" menu at least once. So far so good. However, the option for the GPUs only shows up after a lot of browsing around and switching between projects and revisiting the quotas page. It seems completely random.
Here is an example of two identical projects and the available quota options:
Project "examplestudent02" has the GPU option:
Project "examplestudent03" does not have the GPU option:
I cannot figure out what makes this option appear. Did anyone experience something similar? Is there something that needs to be activated before the GPU quota option appears?
There is a related question on Stack Overflow. However, the GPU option also does not appear when changing the Quota type to "All quotas" (which is the default anyway). Opening the quotas page in incognito mode did not help either. Lastly, I normally use Chrome, but I also tried logging in with a different browser (Firefox), which did not help.
The following answer is based on the current scenario as there is no issue with quota display anymore.
To answer your question: you can go to the quotas page of your console and check your GPU quota from there.
The GPUs are currently listed under the names NVIDIA K80 GPUs and NVIDIA P100 GPUs (not just "GPU"). You can filter for them by selecting them in the Metric column on this page.
If none of these can be found, the quota might not have been assigned. To request the GPU quota, you can follow the steps mentioned in this article. Once the request is submitted, it might take 24 to 48 hours to get approved.
That being said, keep in mind that free trial accounts do not receive GPU quota by default. I would also suggest checking the restrictions on instances with GPUs to avoid any future issues.
Hope this helps.