Given What is difference between Lightsail and EC2? and my own testing, I am convinced that LightSail Instances are burstable performance instances.
Are LightSail database instances subject to this kind of burstable performance too? I'm interested in both I/O burst and CPU burst.
If so, is there a way to view the remaining CPU or IO burst credits? Unlike EC2, the web console for LightSail doesn't seem to offer these numbers.
This issue hit me hard today, taking down my instances making them unable to serve any request
Just want to inform you that they include the bursts area into their metric now. But I dont see the way to get remaining credits
Also now the burstable "feature" is documented
Related
I have created a Windows custom AMI with some custom Windows application.I use this AMI to generate EC2 instances.
I have run into a strange issue:
All the applications run smoothly in the EC2 instance created from the custom AMI.
However, after 24 hours, when I created an EC2 instance using the same custom image, the performance of the applications deteriorate.
Even opening an application on the EC2 instance is much slower compared to the EC2 instance which was created 24 hours prior.
Any Suggestions would be really helpful.
This might be caused by the use of a T2 instance. These are burstable instances.
From CPU Credits and Baseline Performance for Burstable Performance Instances - Amazon Elastic Compute Cloud:
Traditional Amazon EC2 instance types provide fixed performance, while burstable performance instances provide a baseline level of CPU performance with the ability to burst above that baseline level. The baseline performance and ability to burst are governed by CPU credits. A CPU credit provides the performance of a full CPU core for one minute.
So, if your Amazon EC2 instance is consuming a lot of CPU, then it might run out of the CPU credit balance, and therefore be limited in the amount of CPU it can use.
You can monitor the CPU credit balance in Amazon CloudWatch. You can also see the historical CPU usage in CloudWatch, or do it within the Windows instance itself using the Task Manager.
I got the issue. Apparently any windows app we launch , Microsoft automatically tries to connect to internet for updates for every 24 hours . In my case , Internet was turned off , The updates where not getting downloaded. hence the connection was in wait state of 15 seconds by default, Hence the application was slow to launch
Our web application has 5 pages (Signin, Dashboard, Map, Devices, Notification)
We have done the load test for this application, and load test script does the following:
Signin and go to Dashboard page
Click Map
Click Devices
Click Notification
We have a basic free plan in AWS.
While performing load test, till about 100 users, we didn’t get any error. please see the below image. We could see NetworkIn, CPUUtilization seems to be normal. But the NetworkOut showed 846K.
But when reach around 114 users, we started getting error in the map page (highlighted in red). During that time, it seems only NetworkOut is high. Please see the below image.
We want to know what is the optimal score for the NetworkOut, If this number is high, is there any way to reduce this number?
Please let me know if you need more information. Thanks in advance for your help.
You are using a t2.micro instance.
This instance type has limitations on CPU that means it is good for bursty workloads, but sustained loads will consume all the available CPU credits. Thus, it might perform poorly under sustained loads over long periods.
The instance also has limited network bandwidth that might impact the throughput of the server. While all Amazon EC2 instances have limited allocations of bandwidth, the t2.micro and t2.nano have particularly low bandwidth allocations. You can see this when copying data to/from the instance and it might be impacting your workloads during testing.
The t2 family, especially at the low-end, is not a good choice for production workloads. It is great for workloads that are sometimes high, but not consistently high. It is also particularly low-cost, but please realise that there are trade-offs for such a low cost.
See:
Amazon EC2 T2 Instances – Amazon Web Services (AWS)
CPU Credits and Baseline Performance for Burstable Performance Instances - Amazon Elastic Compute Cloud
Unlimited Mode for Burstable Performance Instances - Amazon Elastic Compute Cloud
That said, the network throughput showing on the graphs is a result of your application. While the t2 might be limiting the throughput, it is not responsible for the spike on the graph. For that, you will need to investigate the resources being used by the application(s) themselves.
NetworkOut simply refers to volume of outgoing traffic from the instance. You reduce the requests you are sending from this instance to reduce the NetworkOut .So you may need to see which one of click Map, Click Devices and Click Notification is sending traffic outside of the instances. It may not necessarily related only to the number of users but a combination of number of users and application module.
I've got an EBS volume (16GB) attached to a EC2 instance that has full access to an RDS instance. The thing is I've extracted the DB to the RDS instance, so I don't use the EC2 instance for storing the web application database anymore. I did this because I was having a lot of problems with the EBS credits (they were consuming very quickly). I thought that by having the DB on a separate instance (RDS) this will decrease to almost cero the EBS credit consumption because I'm not reading nor writing on the EBS but on the RDS. However, the EBS credits keep consuming (and decrease to 0) every time users access to the web application and I don't understand why. Perhaps is because I still don't fully understand how EBS credit usage works... Can anyone enlighten me with this? Thanks a lot in advance.
You can review volume types including info on their burst credits here. You should also review I/O Characteristics and Monitoring. From that page:
If your I/O latency is higher than you require, check
VolumeQueueLength to make sure your application is not trying to drive
more IOPS than you have provisioned. If your application requires a
greater number of IOPS than your volume can provide, you should
consider using a larger gp2 volume with a higher base performance
level or an io1 volume with more provisioned IOPS to achieve faster
latencies.
You should review that metric and the others it mentions if this is causing you performance problems. If your IOPs are constantly above your baseline and causing them to queue you will always consume credits as fast as they are given.
I am using an Amazon EC2 instance with instance type m3.medium and an Amazon RDS database instance.
In my working hours the website goes down because CPU utilization reaches 100%, and at night (not working hours) the CPU utilization is 60%.
So please give me right solution for this site down issue. I am not sure why I am experiencing this problem.
Once I had set a cron job for every minutes, but I was removed it because of slow down issue, but still I have site down issue.
When i try to use "top" command, i had shows below images for cpu usage, in which httpd command consume more cpu usage, so any suggestion for settings to reduce cpu usage with httpd command
Without website use by any user below two images:
http://screencast.com/t/1jV98WqhCLvV
http://screencast.com/t/PbXF5EYI
After website access simultaneously 5 users
http://screencast.com/t/QZgZsiNgdCUl
If you are CPU Utilization is reaching 100% you have two options.
Increase your EC2 Instance Type to large.
Use AutoScaling to launch one more EC2 Instance of same Instance Type.
Looks like you need some scheduled actions as you donot need 100% CPU Utilization during non-working hours.
The best possible option is to use AWS AutoScaling with Scheduled actions.
http://docs.aws.amazon.com/autoscaling/latest/userguide/schedule_time.html
AWS AutoScaling can launch new EC2 instances based on your CPU Utilization (or other metrics like Network Load, Disk read/write etc). This way you can always keep your site alive.
Using the AutoScaling scheduled actions you can specify metrics such that you stop your autoscaled instances during non-working hours and autoscale instances during working hours according to CPU Utilization(or other metrics).
You can even stop your severs if you donot need them at some point of time.
If you are not familiar with AWS AutoScaling you can follow the Documentation which is very precise and easy.
http://docs.aws.amazon.com/autoscaling/latest/userguide/GettingStartedTutorial.html
If the cpu utilization reach 100% bacause of the number of visitors your site have, you must consider to change the instance type, Auto Scaling or AWS CloudFront in order to cache as many http requests as posible (static and dynamic content).
If visitors are not the problem and there are other scheduled tasks on the EC2 isntance, I strongly recomend to decouple these workload via AWS SQS & AWS Elasticbeanstalk - Worker type
We are considering upgrading from an t2.micro AWS server instance to a m3.medium instance based on the recommendation here and some research offline. We feel the need to upgrade primarily for speed issues and to ensure google bots crawl our fast growing site fast enough. We have upward of 8000 products (on magento) and that will grow.
While trying to understand what exactly could be the constraint of the current t2.micro instance, we ran through a lot of logs but couldn't find anything specific that could indicate a bottle-neck as such in the current usage.
Could anyone help point out
1. What are the clues that can be found in logs which could show potential bottleneck issues(if-any) with the current t2.micro instance
2. How could we find out if google-bot had issues while crawling and stopped crawling due to server performance related issues.
There are two things to note about t2.micro instances:
They have CPU limitations based upon a CPU credits system
They have limited network bandwidth
CPU credits
The T2 family is very powerful (see comparison between t2.medium and m3.medium), but there is a limit on the amount of CPU that can be used.
From the T2 documentation:
Each T2 instance starts with a healthy initial CPU credit balance and then continuously (at a millisecond-level resolution) receives a set rate of CPU credits per hour, depending on instance size. The accounting process for whether credits are accumulated or spent also happens at a millisecond-level resolution, so you don't have to worry about overspending CPU credits; a short burst of CPU takes a small fraction of a CPU credit.
Therefore, you should look at the CloudWatch CPUCreditBalance metric for the instance to determine whether it has consumed all available credits. If so, then the CPU will be limited to 10% of the time and you either need a larger T2 instance, or you should move away from the T2 family.
In general, T2 instances are great for bursty workloads, where the CPU only spikes at certain times. It is not good for sustained workloads.
Network Bandwidth
Each Amazon EC2 instance type has a limited amount of network bandwidth. This is done to prevent noisy neighbour situations. While AWS only describes bandwidth as Low/Moderate/High, there are some better details at: EC2 Instance Types's EXACT Network Performance?
You can monitor network traffic of your EC2 instances using CloudWatch. Pay attention to NetworkIn and NetworkOut to determine whether the instances are hitting limits.