What are 'managed' and 'non-managed' availability sets? - azure-availability-set

I'm trying to provision an Azure VM programmatically and I got this:
An unhandled exception of type 'Microsoft.Rest.Azure.CloudException' occurred in mscorlib.dll
Additional information:
Addition of a VM with managed disks to non-managed Availability Set or addition of a VM with blob based disks to managed Availability Set is not supported.
Please create an Availability Set with 'managed' property set in order to add a VM with managed disks to it.
Unfortunately I don't understand the distinction it's talking about between managed and non-managed availability sets. The API I am using for creating availability sets doesn't have any obvious flag or property for this. What conceptual background am I missing here?

I'm afraid the error means exactly what it says.
You can't mix managed and unmanaged resources in an availability set. If you want a VM with managed disks, it has to be created in a managed availability set. Please follow this link for a more detailed explanation in their help center.

When you create an availability set through the Azure Portal or through the ARM PowerShell module you can choose whether it will contain managed or unmanaged disks.
The following image shows the difference between both types (taken from this video):
Basically, a managed availability set can only contain VMs with managed disks. With managed disks, the disks of the VMs in the set are automatically placed in different storage units, so that if one unit fails it won't take down all of the VMs.
There is more information here: https://learn.microsoft.com/en-gb/azure/virtual-machines/windows/managed-disks-overview
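As a rough illustration of the rule in the error message, the sketch below models the compatibility check in plain Python. It assumes the ARM representation in which an availability set carries a `sku.name` of `Aligned` (managed) or `Classic` (unmanaged); the class and function names are hypothetical and not part of any Azure SDK.

```python
from dataclasses import dataclass

# Assumption: ARM exposes the managed/unmanaged distinction as sku.name,
# with "Aligned" meaning managed and "Classic" meaning unmanaged.
MANAGED_SKU = "Aligned"
UNMANAGED_SKU = "Classic"


@dataclass
class AvailabilitySet:  # hypothetical model, not an SDK type
    name: str
    sku_name: str = UNMANAGED_SKU


@dataclass
class VirtualMachine:  # hypothetical model, not an SDK type
    name: str
    uses_managed_disks: bool


def can_add(vm: VirtualMachine, avset: AvailabilitySet) -> bool:
    """Return True when the VM's disk type matches the availability set type."""
    set_is_managed = avset.sku_name == MANAGED_SKU
    return vm.uses_managed_disks == set_is_managed


vm = VirtualMachine("web-1", uses_managed_disks=True)
print(can_add(vm, AvailabilitySet("legacy-set")))            # False
print(can_add(vm, AvailabilitySet("new-set", MANAGED_SKU)))  # True
```

If the SDK you are using exposes a SKU on the availability set, setting its name to `Aligned` before creating the set is, as far as I know, what marks it as managed; check this against the SDK version you have.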

Related

Failed to start instance: A e2-micro VM instance is currently unavailable in the us-central1-a zone

I have been facing this issue since yesterday. This is the exact error: Failed to start feature-config: A e2-micro VM instance is currently unavailable in the us-central1-a zone. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation.
I had scheduled the Google Compute Engine instance to turn on and off at specific times using the instance scheduler, but now I am locked out of it. I cannot even create a machine image to deploy in another zone.
I changed the machine configuration. From your answer I gathered that resources might not be available in the US Central zone, possibly due to traffic. I changed the configuration to n2-highcpu-2 (2 vCPUs, 2 GB memory).
In the end, it seems this was a general issue that multiple users experienced in us-central1, among other regions.
There is more discussion in this thread, and it seems the problem got worse during the weekend.
As some of the comments suggest, changing the zone, region, or hardware can help, but not always, since that also depends on any constraints you may have.
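The "change the zone or hardware" advice can be sketched as a simple fallback loop. This is only an illustration: `CapacityError`, `create_with_fallback`, and `fake_create` are hypothetical names, and the stand-in creator simply pretends that one zone is out of capacity instead of calling any real API.

```python
from typing import Callable, Iterable, Optional, Tuple


class CapacityError(Exception):
    """Hypothetical error raised when a zone cannot supply the machine type."""


def create_with_fallback(
    create: Callable[[str, str], str],
    zones: Iterable[str],
    machine_types: Iterable[str],
) -> Optional[Tuple[str, str]]:
    """Try each (zone, machine type) pair in order; return the first that works."""
    machine_types = list(machine_types)  # reused for every zone
    for zone in zones:
        for machine_type in machine_types:
            try:
                create(zone, machine_type)
                return zone, machine_type
            except CapacityError:
                continue  # this combination is unavailable, try the next one
    return None


def fake_create(zone: str, machine_type: str) -> str:
    # Stand-in for a real API call: pretend us-central1-a is out of capacity.
    if zone == "us-central1-a":
        raise CapacityError(f"{machine_type} unavailable in {zone}")
    return f"{machine_type}@{zone}"


result = create_with_fallback(
    fake_create,
    zones=["us-central1-a", "us-central1-b"],
    machine_types=["e2-micro", "e2-small"],
)
print(result)  # ('us-central1-b', 'e2-micro')
```

The loop prefers staying on the first machine type over staying in the first zone only after every type in that zone has failed; swap the two loops if the zone matters less to you than the hardware.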
As the error suggests, there aren't any available resources in that region. I contacted GCP support after facing the same issue and got the following response:
Google Cloud Support: Upon further checking, the reason that the e2-medium VM instance is currently unavailable is because there are limited VMs available to a specific zone and regions. Best we can do is to try another time or select a different zone so that the VMs that you desire to use will start. Rest assured that there is nothing wrong with your account and it was on the us-central1 zone who do not have available VMs you selected as of the moment.
If possible, try deploying to a different instance. Those who need an instance in us-central1 (for Qwiklabs?) might have to wait until more instances are available.
Similar issue here, but coming from a terraform apply. I've tried multiple zones and every one says both 'e2-small' and 'e2-micro' instances are unavailable. Seems google completely fumbled the "cloud game" here since AWS doesn't have this problem EVER! (not that I like using AWS, it's just "ick" compared to google).

"Not enough resources available to fulfill the request" error in GCP

In GCP, I'm trying to create a new notebook instance.
However, I got this error from all the zones that I tried:
"tensorflow-2-4-20210214-113312: The zone 'projects/[PROJECT-ID]/zones/europe-west3-b' does not have enough resources available to fulfill the request. Try a different zone, or try again later."
Even though the whole point of Cloud Computing is not to worry about the underlying infrastructure serving your application, at the end of the day there will be some servers with limited capacity and resources hosting your applications or supporting the underlying infrastructure of the product in question that you are using.
In the specific case of AI Platform Notebooks you can use the following command:
gcloud beta notebooks locations list
to get a list of the available locations, and monitor the release notes to check when new locations are added. Try to create a new notebook in another location that does have available resources, or wait for resources to become available in that particular zone.

How much time for GPU quota updating?

I'm trying some stuff on Google Cloud and I have the following issue. Some days ago I created a Deep Learning VM with Compute Engine, with 8 vCPUs and 1 Tesla K80 GPU. All worked fine, but now I want to try another GPU with a different memory size. So I deleted the VM instance (from Compute Engine -> VM instances) and I also deleted the deployment from Deployment Manager. Nevertheless, when I try to create a new VM, I get an error message saying that I have no more resources available, and in fact, on the quotas page, I still see the GPU usage at 1 (with a limit of 1, which is why I can't create a new instance). Does anyone know what the problem could be? Do I just have to wait? Thank you everyone!
If you receive a resource error (such as ZONE_RESOURCE_POOL_EXHAUSTED or ZONE_RESOURCE_POOL_EXHAUSTED_WITH_DETAILS) when requesting new resources, it means that the zone cannot currently accommodate your request.
This error is due to the availability of Compute Engine resources in the zone, so you could try to create the resources in another zone in the region or in another region.
You can look for another available zone in this document: Available regions and zones
If possible, change the shape of the VM you are requesting. It's easier to get smaller machine types than larger ones. A change to your request, such as reducing the number of GPUs or using a custom VM with less memory or vCPUs, might allow your request to proceed.
Also, you can create reservations for Virtual Machine (VM) instances in a specific zone, using custom or predefined machine types, with or without additional GPUs or local SSDs, to ensure resources are available for your workloads when you need them.
Additionally, you can find more information on troubleshooting this issue at the following link.
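Since the question describes a quota limit while this answer describes a zone capacity error, it can help to tell the two apart before deciding what to do. The sketch below maps an error code to a suggested action. The two `ZONE_RESOURCE_POOL_EXHAUSTED` codes come from the answer above; `QUOTA_EXCEEDED` is an assumption from memory, and `suggest_action` is a hypothetical helper, not a Google API.

```python
# Codes named in the answer above: the zone itself is out of capacity.
CAPACITY_CODES = {
    "ZONE_RESOURCE_POOL_EXHAUSTED",
    "ZONE_RESOURCE_POOL_EXHAUSTED_WITH_DETAILS",
}

# Assumption: the code reported when a project quota (for example GPUs) is hit.
QUOTA_CODES = {"QUOTA_EXCEEDED"}


def suggest_action(error_code: str) -> str:
    """Return a short hint for the given error code (hypothetical helper)."""
    if error_code in CAPACITY_CODES:
        return "try another zone, a smaller machine shape, or a reservation"
    if error_code in QUOTA_CODES:
        return "check the quotas page and request an increase"
    return "see the troubleshooting documentation"


for code in ("ZONE_RESOURCE_POOL_EXHAUSTED", "QUOTA_EXCEEDED", "UNKNOWN"):
    print(f"{code}: {suggest_action(code)}")
```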

Is "Zone" different among projects?

The documentation says a "zone" can be mapped to a different cluster for different projects. Is it true that a zone may map to different clusters across projects?
I've never seen a zone mapping difference across projects. Also, since each zone provides different machine types, I'm not even sure a zone could be mapped to different clusters across projects.
If it does, is there a way to find out which cluster my zone is mapped to, like there is in AWS?
Thanks!
A cluster, as defined, is simply a set of physical servers, networks, disks, and cooling; in short, a datacenter. It's impossible to know which one you are on, since that is Google's internal management.
A zone sits on top of one or several clusters. If the initial cluster (aka datacenter) is too small, Google may have chosen to expand it or, if that is not possible, to add another one. But from the user's point of view, it's invisible!
Google tries to locate all the projects of the same organization in the same cluster, especially for security and performance reasons in the case of VPC peering or Shared VPC. However, it's not guaranteed, and because the mapping isn't exposed, you can't check it.
For example, if 2 projects are on 2 different clusters in the same region, there is no issue. But if you create a VPC peering, it's not optimized. To solve this, Google can migrate Compute Engine VMs from one cluster to another, even without stopping the VM (this is called "live migration"); you aren't able to see anything of this VM placement.
Generally the cluster is consistent for a project. With huge resource usage it could be different (HPC for example, or a requirement of 10k+ CPUs), but Googlers can give you more detail in that case if you are a big CPU consumer.
I tried to create a GKE regional cluster in europe-west3, with N2 cpu type, only available in 2 of the 3 zone and I got this error:

Ultramem VM Instances Access

I'm trying to determine how I can obtain access to Google's new ultramem instances, as described here:
https://cloudplatform.googleblog.com/2018/05/Introducing-ultramem-Google-Compute-Engine-machine-types.html
I can't see them from within 'create an instance' in my GCP, and I checked to make sure the region is matching what the blog post advertises as an available region.
Perhaps somebody has some information on this, or can tell me how I can contact Google and ask about this without having to purchase a support package.
GCP offers different machine type alternatives that can be used when creating VMs in GCE, such as predefined and custom machine types.
Predefined machine types - Have a fixed collection of resources and come in four different classes: Standard, High-memory, High-CPU and Memory-optimized (ultramem machines). These machine type options can be selected using the Machine type dropdown displayed in the New VM instance section when creating an instance.
Custom machine types - Used to specify the number of vCPUs and the amount of system memory for your instance. This option can be selected using the Customize link displayed in the New VM instance section when creating an instance.
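Outside the console, a custom machine type is usually referred to by name. The sketch below builds such a name, assuming the `custom-VCPUS-MEMORY_MB` pattern (optionally prefixed with a series such as `n2`) and the rule that memory is a multiple of 256 MB. `custom_machine_type` is a hypothetical helper, and per-series limits on vCPU counts and memory per vCPU are not modeled here.

```python
def custom_machine_type(vcpus: int, memory_mb: int, series: str = "") -> str:
    """Build a custom machine type name (hypothetical helper).

    Assumes the 'custom-VCPUS-MEMORY_MB' naming pattern, with an optional
    series prefix, and that memory must be a multiple of 256 MB.
    """
    if vcpus < 1:
        raise ValueError("at least one vCPU is required")
    if memory_mb <= 0 or memory_mb % 256 != 0:
        raise ValueError("memory must be a positive multiple of 256 MB")
    prefix = f"{series}-" if series else ""
    return f"{prefix}custom-{vcpus}-{memory_mb}"


print(custom_machine_type(4, 5120))        # custom-4-5120
print(custom_machine_type(2, 4096, "n2"))  # n2-custom-2-4096
```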