Optimizing Azure virtual machine size

I have a web app hosted on an Ubuntu-based Azure classic virtual machine (size DS14). The CPU usage, load, memory, disk I/O and network I/O over the previous 7 days are as follows:
Clearly, there's an opportunity to save money here by scaling my infrastructure up and down dynamically along with changes in load, instead of keeping a DS14 instance running all the time.
Can someone please outline the steps I'll need to take to enable this? My VM is not part of any availability set as of now.

You could add a classic VM to an availability set. Please refer to this link: Add an existing virtual machine to an availability set.
Note: ARM VMs do not support adding an existing VM to an availability set.
If you want a VM to scale automatically, you need at least two VMs in the same availability set. You could refer to this link: Automatic scale - CPU.
As for automatically scaling a single VM up and down as you describe, I think it is not possible. Resizing a VM requires a restart, so your service would be interrupted for a few minutes. In a production environment, this is not acceptable.
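In the ARM model, the usual way to get this behavior is a virtual machine scale set with autoscale rules, which adds and removes identical (smaller) instances behind a load balancer instead of resizing one big VM. A minimal sketch with the `az` CLI, assuming the app has been migrated to ARM; resource-group, scale-set, and image names are placeholders:

```shell
# Sketch: CPU-based autoscaling with a VM scale set (all names are placeholders).
az vmss create \
  --resource-group myResourceGroup \
  --name myScaleSet \
  --image Ubuntu2204 \
  --vm-sku Standard_DS4_v2 \
  --instance-count 2 \
  --upgrade-policy-mode automatic

# Autoscale profile: keep between 2 and 6 instances
az monitor autoscale create \
  --resource-group myResourceGroup \
  --resource myScaleSet \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name cpuAutoscale \
  --min-count 2 --max-count 6 --count 2

# Scale out by 1 when average CPU exceeds 70% over 5 minutes,
# and back in by 1 when it drops below 30%
az monitor autoscale rule create \
  --resource-group myResourceGroup \
  --autoscale-name cpuAutoscale \
  --condition "Percentage CPU > 70 avg 5m" \
  --scale out 1
az monitor autoscale rule create \
  --resource-group myResourceGroup \
  --autoscale-name cpuAutoscale \
  --condition "Percentage CPU < 30 avg 5m" \
  --scale in 1
```

The trade-off is horizontal rather than vertical scaling: several smaller instances share the load, so no single-VM restart is needed when demand changes.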

Related

GCE instance group uses only 1 VM at 100% CPU and ignores the others

I'm using a Google Compute Engine instance group, with autoscaling, to run a heavy script whose CPU usage varies during the day. But when I perform a stress test with a maximum of 4 VMs, I notice that CPU usage increases to 100% only on the main VM, while the other 3 remain at 0%. Wasn't it supposed to divide the load between the 4 VMs according to the target I defined? Or did I misunderstand how this API works?
The role of the GCE autoscaler / managed instance group in this case is to provision the correct number of VMs based on your CPU usage. This part works correctly as I understand it, because you got 4 VMs.
Your role is to make sure you spread the load over all these VMs. Typically, a "script" running on a single machine will not automagically start executing on multiple machines.
For (web) serving workloads this is usually achieved using load balancers, which would distribute incoming requests between the machines automatically.
It is a lot more tricky for batch workloads (and you are in this category, if I understand correctly). I don't think Google offers any automated tools here.
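For a batch script, the simplest manual version of "spreading the load" is to partition the input yourself and run one chunk per VM. A sketch, assuming SSH access between the instances; the hostnames, script name, and input file are placeholders:

```shell
# Sketch: manually fanning one batch job out across the group's VMs.
# The instance group only provisions machines -- it will not split the work.
split -n l/4 -d input.txt chunk_        # chunk_00 .. chunk_03, one per VM
hosts=(vm-0 vm-1 vm-2 vm-3)
for i in 0 1 2 3; do
  scp "chunk_0$i" "${hosts[$i]}:/tmp/chunk" &&
    ssh "${hosts[$i]}" 'nohup ./heavy_script.sh /tmp/chunk >/tmp/out.log 2>&1 &' &
done
wait   # all four VMs now process their share in parallel
```

For serving workloads you would put the group behind a load balancer instead; for recurring batch work, a shared work queue that each VM pulls tasks from is the more robust pattern.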

GCP VM stopped responding

We have a VM server on GCP. Yesterday, the server stopped responding, we could not even SSH into the server, but everything was ok after restarting the server. I am having a look at the metrics and this is what I have noticed:
There is no Memory Utilization data for that period. Before this, the Memory Utilization was 90%.
Read throughput is quite high: 13 MiB/s.
What could have gone wrong? What else should I consider looking at?
Harith:
The application processes running in your VM consumed all of the memory assigned to it.
Analyze each application hosted on the VM and evaluate its minimal technical requirements (MTRs) against the actual workload each one represents, in order to estimate whether the assigned memory is enough to support that load.
Consult those applications' log entries, if available, to see whether they reveal the consumption level around the time the VM became unresponsive.
Consider changing the machine type if you need to increase any resource capacity assigned to your VM.
If the resource consumption of the applications running on your VM is highly variable, you will need to consider implementing autoscaled instance groups.
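A quick way to test the memory-exhaustion theory after the reboot is to look for OOM-killer traces in the kernel log (on GCP, the serial console output viewable with `gcloud compute instances get-serial-port-output` also records these, even while SSH is dead). A sketch, run on the VM itself:

```shell
# Sketch: did the Linux OOM killer fire? (dmesg may need root on some systems)
dmesg -T | grep -iE 'out of memory|oom-killer' \
  || echo "no OOM events in the current kernel log"

# Current headroom -- MemAvailable is the realistic "free" figure
grep -E 'MemTotal|MemAvailable' /proc/meminfo
```

If the OOM killer did fire, the log line names the process that was killed, which tells you which application to size for.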

How much time for GPU quota updating?

I'm trying some things on Google Cloud and I have the following issue. Some days ago I created a Deep Learning VM with Compute Engine, with 8 vCPUs and 1 Tesla K80 GPU. All worked fine, but now I want to try another GPU with a different memory size. So I deleted the VM instance (from Compute Engine -> VM instances) and I also deleted the deployment from Deployment Manager. Nevertheless, when I try to create a new VM, I get an error message saying that no more resources are available, and in fact, on the quotas page, I still see the GPU usage at 1 (with a limit of 1, which is why I can't create a new instance). Does anyone know what the problem could be? Do I just have to wait? Thank you everyone!
If you receive a resource error (such as ZONE_RESOURCE_POOL_EXHAUSTED or ZONE_RESOURCE_POOL_EXHAUSTED_WITH_DETAILS) when requesting new resources, it means that the zone cannot currently accommodate your request.
This error is due to the availability of Compute Engine resources in the zone, so you could try to create the resources in another zone in the region, or in another region.
You can look for another available zone in this document: Available regions and zones
If possible, change the shape of the VM you are requesting. It's easier to get smaller machine types than larger ones. A change to your request, such as reducing the number of GPUs or using a custom VM with less memory or vCPUs, might allow your request to proceed.
Also, you can create reservations for Virtual Machine (VM) instances in a specific zone, using custom or predefined machine types, with or without additional GPUs or local SSDs, to ensure resources are available for your workloads when you need them.
Additionally, you can find more information on troubleshooting this issue in the following link
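The zone-hopping and reservation steps above can be sketched with `gcloud`; the zone, machine type, and GPU model below are placeholders for whatever you actually need:

```shell
# Sketch: which zones offer the GPU type you want?
gcloud compute accelerator-types list --filter="name=nvidia-tesla-k80"

# A reservation holds zonal capacity for you until you delete it,
# so a later VM create in that zone cannot hit resource exhaustion.
gcloud compute reservations create my-gpu-reservation \
  --zone=us-central1-a \
  --vm-count=1 \
  --machine-type=n1-standard-8 \
  --accelerator=count=1,type=nvidia-tesla-k80
```

Note that reservations are billed while they exist, whether or not a VM is using them, so delete the reservation when the experiment ends.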

What is the number of cores in aws.data.highio.i3 elastic cloud instance given for a 14 day trial period?

I wanted to make some performance calculations, hence I need to know the number of cores that this aws.data.highio.i3 instance deployed by Elastic Cloud on AWS has. I know that it has 4 GB of RAM, so if anyone can help me with the number of cores, that would be really very helpful.
I am working with Elasticsearch deployed on Elastic Cloud, and my use case requires me to make approximately 40 million writes a day. Could you suggest which machines I should use that would suit this use case and are I/O optimized as well?
The instance used by Elastic Cloud for aws.data.highio.i3 in the background is i3.8xlarge, see here. That means it has 32 virtual CPUs, or 16 cores, see here.
But you don't own the instance in Elastic Cloud; from the reference hardware page:
Host machines are shared between deployments, but containerization and
guaranteed resource assignment for each deployment prevent a noisy
neighbor effect.
Each ES process runs on a large multi-tenant server with resources carved out using cgroups, and ES scales its thread pool sizes automatically. You can see the number of times the CPU was throttled by the cgroup if you go to Stack Monitoring -> Advanced and scroll down to the Cgroup CPU Performance and Cgroup CFS Stats graphs.
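On a self-managed Linux host (not on Elastic Cloud, where the monitoring graphs are all you get), the same throttling counters can be read straight from the cgroup filesystem. A sketch; the path differs between cgroup v1 and v2:

```shell
# Sketch: cgroup CPU throttling counters on a Linux host.
# nr_throttled                   = periods in which the group hit its CPU quota
# throttled_usec / throttled_time = total time runnable tasks were held back
cat /sys/fs/cgroup/cpu.stat 2>/dev/null \
  || cat /sys/fs/cgroup/cpu/cpu.stat 2>/dev/null \
  || echo "no cpu.stat found in this environment"
```

A steadily growing nr_throttled means the process wants more CPU than its quota allows, which is exactly what the Cgroup CPU Performance graph visualizes.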
That being said, if you need full CPU availability all the time, you're better off with the AWS Elasticsearch service or hosting your own cluster.
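For rough sizing of the 40-million-writes-a-day requirement, it helps to convert the daily total into a sustained rate; note this is only the average, and you should size for your peak:

```shell
# Back-of-envelope: 40 million writes/day as an average per-second rate
echo $((40000000 / 86400))   # -> 462 writes/second on average
```

A few hundred writes per second is modest for Elasticsearch if the documents are small and writes are bulked; bulk indexing rather than single-document writes is what matters most here.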

In AWS, how can I get physical instance ID from instance id?

When we use small AWS instances (e.g., d2.xlarge), it is possible that multiple instances are allocated to the same host. I want to check whether two VM instances are on the same host. Is there a way for us to get the physical host ID of VMs? With this info, we can check if two instances are on the same physical host.
The primary motivation behind this is to improve the reliability of running stateful services in the cloud. We use d2.xlarge instances to run HBase/Kafka workloads in the cloud. These services require data replication. Since one physical host can run up to 8 d2.xlarge instances, if one physical node goes down, it may affect multiple VM instances and cause data loss.
As far as I know Amazon wouldn't let you know anything about their underlying infrastructure. And I cannot think of a reason why they should.
But I've found this blog post saying that you can use CPUID instruction to find out the actual CPU of the underlying physical machine.
From that post:
The “cpuid” instruction is supported by all x86 CPU manufacturers, and
it is designed to report the capabilities of the CPU. This instruction
is non-trapping, meaning that you can execute it in user mode without
triggering protection trap. In the Xen paravirtualized hypervisor
(what Amazon uses), it means that the hypervisor would not be able to
intercept the instruction, and change the result that it returns.
Therefore, the output from “cpuid” is the real output from the
physical CPU.
Having said that, if you need this information to ensure your instances don't all fail at once, I'd recommend launching instances from different availability zones. This way, even if a whole AZ goes down, you'd still have some instances up and running.
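Within a single AZ, AWS also offers spread placement groups, which guarantee that each member instance runs on distinct underlying hardware (up to seven instances per AZ); that addresses the replication concern without knowing any host IDs. A CLI sketch; the group name and AMI ID are placeholders:

```shell
# Sketch: a spread placement group keeps each instance on separate hardware.
aws ec2 create-placement-group \
  --group-name kafka-spread \
  --strategy spread

aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type d2.xlarge \
  --count 3 \
  --placement GroupName=kafka-spread
```

Combining a spread group per AZ with replicas in multiple AZs covers both the shared-host and the whole-zone failure modes.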
There is no official support from AWS on getting the VM placement info. Some large AWS customers are able to get customized support on this.