I have used AWS EC2 t2.large server for my web application.
I have setup the custom metric for MemoryUtilization. After setup, When i viewed the MemoryUtilization Metric, it shows more than 85% almost all time.
Also, I have checked the CPU Utilization for the same instance, it is less than 10% in most of the time.
I am wondering how MemoryUtilization has gone such high? What might be the possible options to reduce them? Is it due to the virtualization system of AWS?
Your application has memory leaks or unnecessary memory usage.
Try using any memory leak detection tools to fix the application.
If you don't want to fix your application, try changing the instance type to any Memory Optimized instance type.
Related
We have a VM server on GCP. Yesterday, the server stopped responding, we could not even SSH into the server, but everything was ok after restarting the server. I am having a look at the metrics and this is what I have noticed:
There is no Memory Utilization data for that period. Before this, the Memory Utilization was 90%.
Read Through Put is quite high; 13 MiB/s
What could have gone wrong? What else should I consider looking at?
Harith:
The applications processes running in your VM consumed the totality of the memory assigned to the VM.
Analyze each application hosted on the VM and evaluate its MTRs (Minimal Technical Requirements) and the actual work load that each one represent, this in order to estimate if the memory amount assigned is enough to support that load.
Consult log entries if available on those applications to see if they can reveal the consumption level just after the unresponsive condition.
Consider changing the machine type if you have to increase any resource capcity assigned to your vm.
If the resource consumption of your applications running on your VM will be very variable, you will need consider the implementation of autoscaling groups of instances.
I'm running web service on EC2-small which has 2gig memory.
It has soft and hard limit option.
I had set soft limit to 1gig and my server kept crashing when load is high.
Then I found the following SO post,
AWS ECS Task Memory Hard and Soft Limits says, it's better to use hard limit if I'm memory bound.
So how high should I set my hard limit for my 2gig memory EC2 machine?
I want my EC2 don't crash and scale up with auto scaling group policy.
Check how much memory is being used by the instance when no container is running on it, and then set the limit accordingly.
For ubuntu-based instances the OS uses around 400-500MB, so the hard limit would be 1.4G to play safe.
Also, the hard limit ensures that many replicas can't be launched in the same instance if there are not enough resources, whereas the soft limit does not. In other words, you might have a soft limit of 1.5G and two replicas, and both could be run on the same instance. However, a hard limit of 1.5G won't allow two replicas on the same one.
I deployed the Machine Learning classification model in AWS EC2 (UBUNTU)instance successfully. I am able to access the instance "http://ec2-18-191-31-0.us-east-2.compute.amazonaws.com" and predictions are working fine only for few minutes. After that I or my colleagues are not able to access this. Getting an error "cannot connected to the server".
Security group that I crated as attached.
t2.micro instances are not suitable for any long running calculations. They are burstable. This means that their performance can be sustained only for short periods of time, e.g., sudden, short lived spikes in CPU usage. On top of that they have only 1 GB of RAM which limits its usefulness in machine learning.
For calculations, you could consider Compute optimized or Memory optimized instances. Obviously, these instance types are not free, but they are suited for calculations.
You can change instance type if you want and test with other, more power types. What you are describing indicates that your t2.micro exhausts all its RAM and/or CPU burst credits after few minutes and it freezes.
You can use CloudWatch Metrics for EC2 to monitor your instances and observer its CPU utilization and other metrics which can help you determine what exactly is causing the backlog. You can also monitor RAM and disc usage but this requires CloudWatch Agent setup on the instance.
Our web application has 5 pages (Signin, Dashboard, Map, Devices, Notification)
We have done the load test for this application, and load test script does the following:
Signin and go to Dashboard page
Click Map
Click Devices
Click Notification
We have a basic free plan in AWS.
While performing load test, till about 100 users, we didn’t get any error. please see the below image. We could see NetworkIn, CPUUtilization seems to be normal. But the NetworkOut showed 846K.
But when reach around 114 users, we started getting error in the map page (highlighted in red). During that time, it seems only NetworkOut is high. Please see the below image.
We want to know what is the optimal score for the NetworkOut, If this number is high, is there any way to reduce this number?
Please let me know if you need more information. Thanks in advance for your help.
You are using a t2.micro instance.
This instance type has limitations on CPU that means it is good for bursty workloads, but sustained loads will consume all the available CPU credits. Thus, it might perform poorly under sustained loads over long periods.
The instance also has limited network bandwidth that might impact the throughput of the server. While all Amazon EC2 instances have limited allocations of bandwidth, the t2.micro and t2.nano have particularly low bandwidth allocations. You can see this when copying data to/from the instance and it might be impacting your workloads during testing.
The t2 family, especially at the low-end, is not a good choice for production workloads. It is great for workloads that are sometimes high, but not consistently high. It is also particularly low-cost, but please realise that there are trade-offs for such a low cost.
See:
Amazon EC2 T2 Instances – Amazon Web Services (AWS)
CPU Credits and Baseline Performance for Burstable Performance Instances - Amazon Elastic Compute Cloud
Unlimited Mode for Burstable Performance Instances - Amazon Elastic Compute Cloud
That said, the network throughput showing on the graphs is a result of your application. While the t2 might be limiting the throughput, it is not responsible for the spike on the graph. For that, you will need to investigate the resources being used by the application(s) themselves.
NetworkOut simply refers to volume of outgoing traffic from the instance. You reduce the requests you are sending from this instance to reduce the NetworkOut .So you may need to see which one of click Map, Click Devices and Click Notification is sending traffic outside of the instances. It may not necessarily related only to the number of users but a combination of number of users and application module.
I have the following scenario:
I have two Windows servers on AWS that run an application via IIS. For particularities of the application, they work with HTTP load balancing on the IIs.
To reduce costs, I was asked, that the second instance is only started when the first one reaches 90% CPU usage or 85% memory usage.
In my zone (sa-east-1), there are still no Auto Scaling Groups.
Initially, I created a cloudwatch event to start the second instance when it detected high CPU usage at first. The problem is that Cloudwatch, natively still does not monitor memory and so far I'm having trouble customizing this type of monitoring.
Is there any other way for me to be able to start the second instance based on the above conditions?
Since the first instance is always running, it might be something Windows-level, some powershell that detects the high memory usage and start the second? The script to start instances via powershell I already own, I just need help with how to detect the high memory usage event to start the second instance from it.
or some third-party application that does so...
Thanks!
Auto Scaling groups are available in sa-east-1, so use them
Pick one metric upon which to scale (memory OR CPU), do not pick both otherwise it would be confusing how to scale when one metric is high and the other is low.
If you wish to monitor Windows memory in CloudWatch, see: Sending Logs, Events, and Performance Counters to Amazon CloudWatch - Amazon Elastic Compute Cloud
Also, be careful using a metric such as "memory usage" to measure the need to launch more instances. Some systems use garbage collection to free-up memory, but only when available memory is low (rather than continuously).
Plus, make sure your application is capable of running across multiple instances, such as putting it behind a load balancer (depending on what the application actually does).