Unable to create GCP Deep Learning VM instance with GPU

I'm trying to get a GCP "Deep Learning VM" instance running with a GPU, following these instructions. I'm being hit with this error: "You've gone over GPUs (all regions) quota by 1 GPU. Please increase your quota in the quotas page." However, when I look at the quotas, I do have a limit of 1 GPU for "NVIDIA V100". I have a limit of 0 for the Committed NVIDIA *** quotas.
When you create a "Deep Learning VM" instance and select GPUs, are you selecting committed GPUs?

When you request GPU quota, you must request a quota for each GPU model that you want to create in each region, plus an additional global quota for the total number of GPUs of all types in all zones. You can request a GPU quota increase from the Quotas page.
Since you already have an NVIDIA_V100_GPUS quota limit of 1 in a region (for example us-west1), all you need to do now is request an increase of the GPUs (all regions) quota through your Quotas page. The value to request depends on the number of GPUs that you want to deploy. This should get rid of the error that you are getting.
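The two-level check described above can be modeled in a few lines of Python. This is only an illustrative sketch, not how GCP implements quotas: the `Quota` class and `gpu_quota_shortfalls` function are hypothetical names, and the global limit of 0 is an assumption inferred from the "over by 1 GPU" error message.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    limit: int
    usage: int = 0

    def shortfall(self, requested: int) -> int:
        """Number of units by which the request exceeds this quota (0 if it fits)."""
        return max(0, self.usage + requested - self.limit)


def gpu_quota_shortfalls(requested: int, regional: Quota, all_regions: Quota) -> dict:
    """A GPU request succeeds only if BOTH shortfalls are 0."""
    return {
        "regional model quota": regional.shortfall(requested),
        "GPUs (all regions)": all_regions.shortfall(requested),
    }


# Situation from the question: V100 limit of 1 in the region.
# Assumption: the global "GPUs (all regions)" limit is still 0.
result = gpu_quota_shortfalls(1, regional=Quota(limit=1), all_regions=Quota(limit=0))
print(result)  # {'regional model quota': 0, 'GPUs (all regions)': 1}
```

The regional quota is satisfied, but the global one is short by 1 GPU, which matches the error in the question.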
If you want to use committed GPUs, you need to create a reservation for the matching GPU types when purchasing the commitment. The GPU type you select when creating the Deep Learning VM must then match your committed GPU type. For example, if you want to reserve 4 V100 GPUs, you must also commit to 4 V100 GPUs; when you create a Deep Learning VM using one of those V100 GPUs, the reservation section shows that 1 V100 GPU is in use. If you choose another GPU type, it will not be drawn from the committed GPUs. Committed GPUs only serve to get discounts on GPU resources.

Please request a GPU quota increase first.
Go here and increase the quota you need from 0 to 1 (for example, search for 'GPU' and increase it for P100, V100, K80, etc.). After receiving approval, you can deploy your VM with GPU=1.

Related

How to request for GPU quota increase for Nvidia A100 (for AI Platform - Notebook)

I am currently trying out GCP before deciding if I would go on a paid service. I am currently on the $300 free trial version.
With the Machine Learning work that I am currently doing, using a GPU with only 16GB VRAM is insufficient. I am thinking of trying out the Nvidia A100 GPU which gives 40GB of VRAM.
However, whenever I request an Nvidia A100 quota increase from 0 to 1, I get an email that says “Unfortunately, we are unable to grant you additional quota at this time”.
Does anyone know what I need to do to get access to Nvidia A100? Have I been doing something wrong when requesting the quota increase for Nvidia A100?
Many thanks in advance if anyone knows this and could help.

You can't use GPUs with the free tier of GCP.
See https://cloud.google.com/free/docs/gcp-free-tier
Your Free Trial credits apply to all Google Cloud resources, including Google Maps Platform usage, but with the following exceptions:
You can't add GPUs to your VM instances.

How can I increase the CPU quota of a VM instance in Google Compute Engine when my requests keep getting rejected?

I have a school project about parallel programming. To get results and check that they are good enough, I need to increase my CPU quota. I applied for CPU quota increases of several sizes (16, 24, 96) in several regions, but the requests were always rejected.
The approaches I have tried:
All Quotas > Select the CPUs > Select the region > Edit Quotas > Write New Quota > Send.
Edit VM instance > Select Higher CPU option > Error: Quota CPUs exceeded. Limit is 8 in region east-west-6.
Wrote to the Sales Support Team (they haven't answered yet).
Wrote to Google Cloud Platform Support; they said: "However after careful evaluation, we have determined that we are unable to grant your quota increase due to insufficient service usage history within your preferred project. We suggest for you to make use of your current quotas and other resources readily available to serve your purposes for the meantime. To discuss further options on higher quota eligibility and to answer your questions, please reach out to your Sales team [1]" (Actually, I have used this project for a while, at least 5-6 months, not every day but frequently.)
I have 8 CPUs right now in my VM instance in Compute Engine. I was using free credit, by the way, but my credit card is added too. I need a higher quota. So what do I need to do to get the CPU quota increased after all?

Try upgrading your free trial account to a paid account. That said, I checked my own account and I do have per-region quota limits as well (24 CPUs for the N1 type, 8 for the other types (N2, N2D, C2)).
Maybe with a paid account your quota increase request will be accepted! In my case, it was accepted in 2 minutes (N2 type, request of 16 CPUs, all regions).

Google cloud ml-engine custom hardware

I tried running my job with BASIC_GPU scale tier but I got an out of memory error. So then I tried running it with a custom configuration but I can't find a way of just using 1 Nvidia K80 with additional memory. All examples and predefined options use a number of GPUs, CPUs and workers and my code is not optimized for that. I just want 1 GPU and additional memory. How can I do that?

GPU memory is currently not extensible (at least until something like Pascal GPUs becomes available).
Reducing the batch size solves some of the out-of-memory issues.
Adding GPUs to workers doesn't help either, as the model is deployed on each worker separately (there is no memory pooling between workers).
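The batch-size suggestion above can be automated with a simple retry loop. This is a minimal, framework-agnostic sketch: `largest_fitting_batch` and `fake_step` are hypothetical names, and Python's built-in `MemoryError` stands in for whatever out-of-memory exception your ML framework actually raises.

```python
from typing import Callable


def largest_fitting_batch(try_batch: Callable[[int], None], start: int, minimum: int = 1) -> int:
    """Halve the batch size until try_batch(size) runs without MemoryError."""
    if minimum < 1:
        raise ValueError("minimum must be at least 1")
    size = start
    while size >= minimum:
        try:
            try_batch(size)  # e.g. run one training step with this batch size
            return size
        except MemoryError:
            size //= 2  # out of memory: retry with half the batch
    raise MemoryError(f"no batch size between {minimum} and {start} fits in memory")


def fake_step(batch_size: int) -> None:
    """Stand-in for a training step; pretends that only 48 samples fit in memory."""
    if batch_size > 48:
        raise MemoryError


print(largest_fitting_batch(fake_step, start=256))  # 32
```

Starting from 256, the loop tries 256, 128 and 64 (all fail) and settles on 32.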

Adding GPU to an existing VM instance on Google Compute Engine Slows Performance

I created a VM instance with Google Compute Engine with a single initial GPU and would like to add a second GPU of the same type. The VM instance runs Windows Server 2016 and has 8 CPUs and 52 GB of memory. I followed the steps to add a GPU at the following location:
https://cloud.google.com/compute/docs/gpus/add-gpus
When using the updated VM instance, the performance is very slow (Click on a window/button and 10 seconds later it opens). The CPU does not appear to be heavily utilized. If I remove the GPU so that only a single GPU is used, the performance goes back to normal.
Am I missing a step (maybe updating something in windows)?

Increasing number of vCPUs for a single computation and billing

While studying basic ML algorithms on the MNIST database, I noticed that my netbook is too weak for this purpose. I started a free trial on Google Cloud and successfully set up a VM instance with 1 vCPU. However, it only boosts performance 3x, and I need much more computing power for some specific algorithms.
I want to do the following:
use 1 vCPU for setting up an algorithm
switch to many vCPUs to run a single algorithm
go back to 1 vCPU
Unfortunately, I am not sure how Google will charge me for such a maneuver. I am afraid that it will drain the $300 I have on my account. It is my very first day playing with VMs and using the cloud for computing, so I really need good advice from someone with experience.
Question: How do I manage the number of vCPUs on Google Cloud Compute Engine to run single expensive algorithms?

COSTS
The quick answer is that you pay for what you use: if you use 16 CPUs for 1 hour, you pay for 16 CPUs for 1 hour.
To get a rough idea of the cost, I would advise you to take a look at the Price Calculator and create your own estimate with the resources you are going to use.
A machine with 1 vCPU and 3.75 GB of RAM running for one day costs around $0.80 (if it is not a preemptible instance and without any committed use discounts); a machine with 32 vCPUs and 120 GB of RAM, on the other hand, would cost around $25/day.
Remember the rule: while it is running, you are paying for it. You can change the machine type as many times as you want according to your needs, and while the machine is stopped for the transition you pay only for the persistent disk. Therefore it makes sense to switch off the machine whenever you are not using it.
Keep in mind that you will also pay for networking and storage, but in your use case these costs are marginal; for example, 100 GB of storage for one day costs $0.13.
Note that in September 2017 Google extended per-second billing, with a one-minute minimum, to Compute Engine. I believe this is how most cloud providers work.
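To make the arithmetic above concrete, here is a rough estimate in Python. It is only a sketch: the daily rates are the approximate figures quoted in this answer (not current list prices), the session lengths are hypothetical, and disk and network costs are ignored.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Approximate daily rates quoted in this answer (not current list prices).
SMALL_PER_DAY = 0.80   # 1 vCPU, 3.75 GB RAM
LARGE_PER_DAY = 25.00  # 32 vCPUs, 120 GB RAM


def billed_seconds(run_seconds: int, minimum: int = 60) -> int:
    """Per-second billing with a one-minute minimum for each run."""
    if run_seconds <= 0:
        return 0
    return max(run_seconds, minimum)


def run_cost(daily_rate_usd: float, run_seconds: int) -> float:
    """Cost of one run of the instance at the given daily rate."""
    return daily_rate_usd * billed_seconds(run_seconds) / SECONDS_PER_DAY


# Hypothetical session: 2 hours of setup on the small machine,
# then 1 hour of computation on the large one.
total = run_cost(SMALL_PER_DAY, 2 * 3600) + run_cost(LARGE_PER_DAY, 1 * 3600)
print(f"${total:.2f}")  # $1.11
```

Under these assumptions, one hour on the large machine costs about as much as a full day on the small one, which is why resizing only for the expensive step keeps the bill low.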
ADDING VCPU
When the machine is off, you can modify the number of vCPUs and the amount of memory from the edit menu; here you can find an official step-by-step guide to follow through the process. You can also change the machine type through the command line, for example setting a custom machine type with 4 vCPUs and 4 GB of memory (custom N1 machine types need at least 0.9 GB of memory per vCPU):
$ gcloud compute instances set-machine-type INSTANCE-NAME --machine-type custom-4-4096
As soon as you are done with your computation, stop the instance and reduce the size of the machine (or leave it off).
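The full stop, resize, start cycle could be scripted along these lines. This is a sketch that only builds and prints the gcloud commands; `resize_commands` is a hypothetical helper, and the instance name, zone and machine type are placeholders to replace with your own values.

```python
import shlex
from typing import List


def resize_commands(instance: str, zone: str, machine_type: str) -> List[List[str]]:
    """Build the gcloud commands to stop an instance, change its machine type, and restart it."""
    base = ["gcloud", "compute", "instances"]
    return [
        base + ["stop", instance, "--zone", zone],
        base + ["set-machine-type", instance, "--zone", zone, "--machine-type", machine_type],
        base + ["start", instance, "--zone", zone],
    ]


# Placeholder values; substitute your own instance name, zone and machine type.
for cmd in resize_commands("INSTANCE-NAME", "us-west1-b", "n1-standard-16"):
    print(shlex.join(cmd))
    # To actually execute a command: subprocess.run(cmd, check=True)
```

Running the same helper again with a small machine type scales the instance back down once the computation is finished.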
