AWS t2.medium performance issues after adding 32 GB volume - amazon-web-services

On an AWS EC2 t2.medium instance we run the site http://www.pricesimply.com/
Our database is installed on the same machine.
By default we had 8 GB of storage and the site speed was lightning quick.
Then we added a 32 GB General Purpose (SSD) volume.
The only difference between these 2 volumes -
8 GB default volume - IOPS 24/3000
32 GB newly added volume - IOPS 96/3000
The volume type and availability zone are the same.
The site is MUCH slower now than before.

Some random ideas:
1) Performance does vary between volumes. Do some benchmarks to see if it's really slower (a sample fio run is sketched after this list). (Unlikely, but possible.)
2) Perhaps the volume is a red herring -- Maybe your entire dataset was small enough to fit into RAM before, and now that you've expanded and grown, your data doesn't fit, creating constant I/O?
3) If the drive was created from a snapshot, it may be fetching your data from your snapshot in the background, slowing the drive.
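For the benchmarking idea in point 1, a run along these lines would show whether the new volume is actually slower. This is only a sketch: the device name /dev/xvdf is an assumption, so substitute whatever the 32 GB volume is attached as, and run the same test against the original volume for comparison.

    # Random 16 KiB reads against the raw device; --readonly keeps fio from writing to it.
    sudo fio --name=randread-test --filename=/dev/xvdf --readonly \
        --rw=randread --bs=16k --direct=1 --ioengine=libaio --iodepth=32 \
        --runtime=60 --time_based

Compare the reported IOPS and latency figures between the two volumes.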

Adding an additional disk shouldn't slow down your machine; you'll need to investigate a bit more to identify the bottleneck.
Poor performing infrastructure generally fits into one of three categories (some commands for checking each from inside the instance are sketched after this list):
CPU: Check your CPU utilization to see if the t2.medium instance is a suitable size. Amazon CloudWatch can show you CPU history.
Memory (RAM): Your application may be short on memory, causing page swaps to disk. You'll need to monitor memory utilization from within your instance. (CloudWatch cannot see memory utilization.)
Disk IO: If you are reading and writing to disk a lot, this could be your bottleneck. CloudWatch can give you some metrics, especially the Queue Length, which indicates whether IO is waiting to be processed.
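A quick, hedged way to check all three from inside a Linux instance (iostat comes from the sysstat package, which may need to be installed first):

    top -b -n 1 | head -20    # CPU utilization and the busiest processes
    free -m                   # memory in use and whether swap is being touched
    vmstat 5 5                # the si/so columns show pages swapping in and out
    iostat -x 5 5             # %util and await show how hard the disks are working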
Once you've identified which of these three factors appears to be the bottleneck, try to improve them:
CPU: Use a larger instance type
Memory: Use an instance type with more RAM
Disk: Use a faster disk
You are using General Purpose (SSD) EBS volumes. These volumes have an IOPS (input/output operations per second) allocation tied to volume size. So your "96/3000" volume gives a guaranteed 96 IOPS (about the speed of a magnetic hard disk) with the ability to burst up to 3000 IOPS while you have IO 'credits'. If you are continually using more than 96 IOPS, you will run out of credits and be limited to 96 IOPS.
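As a rough, hedged sketch of how long the burst lasts: gp2 volumes start with about 5.4 million I/O credits, and this volume refills them at its 96 IOPS baseline, so sustained 3,000 IOPS drains the bucket in roughly half an hour.

    echo $(( 5400000 / (3000 - 96) ))    # ~1859 seconds, i.e. about 31 minutes of continuous full burst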
See: Amazon EBS Volume Types

Related

Creating a complete copy of an AWS EBS volume from a snapshot

To Devs,
I am getting lazy reads when I create an EBS volume from a snapshot and attach it to an EC2 node.
I would like to create an EBS volume with a complete copy so that the first read is not slow.
Is there a way to do this?
Thanks,
Marc
You and everybody else. According to an AWS rep that I talked with at the AWS Summit in NYC, Amazon is well aware of the issue. Of course, there's a difference between "being aware" of an issue and actually fixing it ...
For now, the best you can do is follow the AWS instructions and use dd or fio to touch every block on the device before it's mounted. The benefit of fio is that it will run parallel threads.
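As a hedged sketch of what those AWS-documented commands look like (the device name /dev/xvdf is an assumption; point them at the newly attached, not-yet-mounted volume):

    sudo dd if=/dev/xvdf of=/dev/null bs=1M                       # single-threaded read of every block
    sudo fio --filename=/dev/xvdf --rw=read --bs=128k --iodepth=32 \
        --ioengine=libaio --direct=1 --name=volume-initialize     # fio keeps the queue full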
Beware that you will be limited by the IO performance of your volume. One IO is 16 KiB on a gp2 volume, so divide your volume size by that to determine how many IOs it will take to touch every block, and then divide that by the IOPS for your volume (taking into account burst IOPS).
For example (and these are rough numbers!), a 1 TB volume will require 67,108,864 IOs to read fully. The default non-provisioned performance of a 1 TB gp2 volume is 3,000 IOPS, so this will take 22,369 seconds, or somewhat more than 6 hours. Smaller volumes will be able to use burst IOPS to get above their basic allotment, but may run into throughput limits.
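The arithmetic behind that example, for anyone who wants to plug in their own volume size:

    echo $(( 1024 * 1024 * 1024 / 16 ))    # 1 TiB in KiB, divided by 16 KiB per IO = 67108864 IOs
    echo $(( 67108864 / 3000 ))            # ~22369 seconds at 3,000 IOPS, a bit over 6 hours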

AWS EBS block size

Can you point me to some resources on how EBS works behind the scenes for gp2 volumes?
The way I understand it, it is presented as a service, but underneath it is some form of connecting arrays of SSD drives to the instance in a redundant way.
What is the actual, physical method of connecting?
The documentation refers to the fact that data is transferred in 16KB or 256KB blocks, but I can't find anything more about that.
If, for example, my Linux partition is formatted with 4KB blocks, does this mean that EBS will transfer data to and from disk in 16KB blocks? If so, wouldn't it make sense to format the partition with 16KB blocks as well and optimise it upstream?
If I have a set of very random 4k operations, will this trigger the same amount of 16KB block requests?
If anyone's done such testing already, I'd really like to hear it...
The actual, physical means of connection is over the AWS software-defined Ethernet LAN. EBS is essentially a SAN. The volumes are not physically attached to the instance, but they are physically within the same availability zone; access is over the network.
If the instance is "EBS Optimized," there's a separate allocation of Ethernet bandwidth for communication between the instance and EBS. Otherwise, the same Ethernet connection that handles all of the IP traffic for the instance is also used by EBS.
The SSDs behind EBS gp2 volumes are 4KiB page-aligned.
See AWS re:Invent 2015 | (STG403) Amazon EBS: Designing for Performance beginning around 24:15 for this.
As explained in AWS re:Invent 2016: Deep Dive on Amazon Elastic Block Store (STG301), an EBS volume is not a physical volume. They're not handing you an SSD drive. An EBS volume is a logical volume that spans numerous distributed devices throughout the availability zone. (The blocks on the devices are also replicated within EBS within the availability zone to a second device.)
These factors should make it apparent that the performance of the actual SSDs is not an especially significant factor in the performance of EBS. EBS, by all appearances, allocates resources in proportion to what you're paying for the volume... which is of course directly proportional to the size of the volume as well as which feature set (volume type) you've selected.
16KiB is the nominal size of an I/O that EBS uses for establishing performance benchmarks for gp2. It probably has no other special significance, as it appears to be related as much or more to the processing resources that EBS allocates to your volume as to the media devices themselves -- EBS volumes live in storage clusters that have "resources" of their own (CPU, memory, network bandwidth, etc.) and 16KiB seems to be a nominal value related to some kind of resource allocation in the EBS infrastructure.
Note that the sc1 and st1 volumes use a very different nominal I/O size: 1 MiB. Obviously, that can't be related to anything about the physical storage devices, so this lends credence to the conclusion that the 16KiB figure for gp2 (and io1) is likewise a nominal accounting unit rather than a physical block size.
A gp2 volume can perform up to the lowest of several limits:
160 MiB/second, depending on the connected instance type‡
The current number of instantaneous IOPS available to the volume, which is the highest of
100 IOPS regardless of volume size
3 IOPS per provisioned GiB of volume size
The IOPS credits available in your token bucket, capped at 3,000 IOPS
10,000 IOPS per volume regardless of how large the volume is
‡Smaller instance types can't provide 160MiB/second of network bandwidth anyway. For example, the r3.xlarge has only half a gigabit (500 Mbps) of network bandwidth, limiting your total traffic to EBS to approximately 62.5 MiB/sec, so you won't be able to push any more throughput to an EBS volume than this from an instance of that type. Unless you are using very large instances or very small volumes, the most likely constraint on your EBS performance is going to be the limits of the instance, not the limits of EBS. A small worked example follows.
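As a hedged illustration, take a hypothetical 500 GiB gp2 volume attached to that r3.xlarge:

    echo $(( 3 * 500 ))           # baseline: 1,500 IOPS (3 IOPS per provisioned GiB)
    echo $(( 62 * 1024 / 16 ))    # ~3,968 16-KiB I/Os per second fit in ~62 MiB/s of instance bandwidth

For 16 KiB I/Os, the binding limit there would be the 3,000 IOPS burst cap (or the 1,500 IOPS baseline once credits run out), not the instance's network bandwidth.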
Since you are capped at the first (lowest) threshold in the list above, the impact of the nominal 16 KiB I/O size is this: if your I/Os are smaller than 16KiB, your maximum possible IOPS does not increase, and if they are larger, your maximum possible IOPS may decrease:
an I/O size of 4KiB will not improve performance, since the nominal size of an I/O for rate-limiting purposes is established at 16KiB, but
an I/O size of 4KiB is unlikely to meaningfully decrease performance with sequential I/Os since, for EBS's accounting purposes, sequential I/Os are internally combined. So, if your instance were to make 4 × 4 KiB sequential I/O requests, EBS is likely to count that as 1 I/O anyway,
an I/O size of 4KiB with extremely random I/Os would indeed not be combined, so it would theoretically perform poorly relative to the same number of 16KiB extremely random I/Os, but instinct and experience tell me this borders on academic and theoretical territory except perhaps in extremely rare cases. Reformatting to a larger block size could just as easily hurt as help, since small writes would use the same number of IOPS but transfer more unnecessary data across the wire.
if your I/Os are larger than 16KiB, your maximum IOPS will decrease if your disk bandwidth reaches the 160MiB/s threshold before reaching the IOPS threshold.
A final thought: EBS performs best under load. That is to say, a single thread making a series of random I/Os will not keep the EBS volume's queue filled with requests, and if the queue isn't kept filled, you will not see the maximum possible performance.
See also Amazon EBS Volume Performance on Linux Instances for more discussion of EBS performance.

What does IOPS (in Amazon EBS) mean in practice?

I have some images needed for an app. There are many images (50,000+) but the overall size is small (40 MB). Initially I thought I would simply use S3, but it is painfully slow to upload. As a temporary solution, I wanted to attach an EBS volume containing the images and that would be fine. However, reading a bit about EBS General Purpose (gp2) volumes, I noticed the following description:
GP2 is the default EBS volume type for Amazon EC2 instances. These volumes are backed by solid-state drives (SSDs) and are suitable for a broad range of transactional workloads, including dev/test environments, low-latency interactive applications, and boot volumes. GP2 is designed to offer single-digit millisecond latencies, deliver a consistent baseline performance of 3 IOPS/GB to a maximum of 10,000 IOPS, and provide up to 160 MB/s of throughput per volume.
It is that 3 IOPS/GB quantity that is worrying me. What does this mean in practical terms? Suppose that you need an e-commerce site for a small amount of users (e.g. < 10,000 requests per minute) and these images need to be retrieved. Amazon describes how IOPS are measured:
When small I/O operations are physically contiguous, Amazon EBS attempts to merge them into a single I/O up to the maximum size. For example, for SSD volumes, a single 1,024 KiB I/O operation would count as 4 operations, while 256 I/O operations at 4 KiB each would count as 256 operations.
Does this actually mean that if I want to retrieve 50 images of 10kB each in under a second, I would require 50 IOPS and easily exceed the baseline of 3 IOPS?
UPDATE:
Thanks to Mark B's suggestion, I was able to use S3 to upload my files. However, I'm still wondering how many IOPS are needed to perform common tasks such as running a database or serving other files for a web application. I would be glad to hear some reference values for minimum IOPS based on your experience.
You are missing the "/GB" part of that statement. The baseline is 3 IOPS per GB. If your EBS volume is 100GB, then you would have a baseline of 300 IOPS. For a GP2 EBS volume you have to multiply the size of the volume by 3 to get its baseline IOPS.
Note that any GP2 volume under 1TB is also able to burst at up to 3,000 IOPS, so any limited increases in IO should still perform very well.
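Some hedged arithmetic, assuming the 100GB gp2 volume from the answer above and the asker's 50-image scenario:

    echo $(( 100 * 3 ))    # baseline = 300 IOPS
    # Each ~10 kB image fits inside one 16 KiB accounting I/O, so fetching 50 images in a
    # second needs on the order of 50 IOPS -- well under the 100 IOPS floor every gp2
    # volume gets, let alone the 3,000 IOPS burst available to volumes under 1TB.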
Also, I will add that S3 sounds like a better fit for your use case. If you are seeing slow upload speeds to S3, that is a problem that can be solved. You can use CloudFront to provide a nearby edge location that you can upload to.
In my experience uploads to S3 are never any slower than uploads to an EC2 instance that your EBS volume would be attached to.
Update:
To answer your additional question, the minimum IOPS needed will depend on many variables such as the amount of RAM available, the type of application you are running, how well the application caches values in memory, the average size of your IO operations, etc. It's really difficult to pin-down an exact number and state that you need exactly X IOPS for an application.
You also need to remember that any volume under 1TB in size can still burst up to 3,000 IOPS for several seconds. So even if your application needs high IOPS when it is in use, if it doesn't see much usage the IOPS burst feature might be all it ever needs.
In general I usually start with something like a 100GB volume with 300 IOPS and test the performance of my app against that. A web server that operates entirely within RAM might never need more than that. For something like a database you would probably start out with the amount of disk space you think you will need and then start performance testing. CloudWatch will show the amount of IOPS your application is using, and if you see it maxing out at the limits of your volume then you would know you need to increase the available IOPS. Rinse and repeat until you no longer max out the available IOPS during your performance tests.
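If you prefer the command line, a hedged sketch of pulling those numbers with the AWS CLI (the volume ID and time window are placeholders):

    aws cloudwatch get-metric-statistics \
        --namespace AWS/EBS --metric-name VolumeReadOps \
        --dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
        --start-time 2016-01-01T00:00:00Z --end-time 2016-01-01T01:00:00Z \
        --period 300 --statistics Sum
    # Divide each 300-second Sum by 300 to get the average read IOPS for that window;
    # VolumeWriteOps works the same way for writes.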
Mark B's answer is probably correct, in that it points out your IOPS are based on the size of your EBS volume. For what you want, S3 is the best option.
But depending on your use case and requirements, EBS may be needed. This is especially true if you want to run a database. In that case, you have a couple of options.
You can get Provisioned IOPS - if you know you need 5000 IOPS but only need, say, 100GB of storage (which with gp2 would normally provide around 300 IOPS), you can use io1 volumes. There is an extra cost to this, and you'll want to make sure it's attached to an EBS-optimized instance, but you can get up to 20k IOPS if needed.
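As a rough sketch of what that looks like with the AWS CLI (the availability zone is a placeholder and the numbers simply mirror the example above):

    aws ec2 create-volume --volume-type io1 --size 100 --iops 5000 \
        --availability-zone us-east-1a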
If you're doing a lot of sequential reads (reading in a large data set?) then there's a new type of EBS, st1. This is good for 500MB/s, and is less than 1/2 the cost of gp2.
Finally, there's one other scenario you could consider (say, you're a bit of a madman, and want to try doing strange things). If you can grab an archive from somewhere, and all you care about is serving the files up from a really fast file system, you could put them on an instance that has instance storage. This is a locally-attached SSD, so it's very fast. The only drawback is that when your instance stops, your data is gone.
To address your update, "how many IOPS do you need for a database", the answer is "it depends". Every database engine has different requirements, and every database use has different usage patterns. Take a look at this if you want more information. But basically, test & monitor. If you're worried, over provision at launch, and scale down as needed. Or take a guess, and increase if you run into problems - is it more important to minimize costs, or provide good performance to your end users?
For your use case, S3 is a better option, but if you want to use an EBS volume and think you require more IOPS, you can choose the gp3 volume type instead of gp2. With a gp3 volume you can increase IOPS up to 16,000 independent of throughput (and throughput can be increased up to 1,000 MiB/s independently of IOPS).
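As a hedged sketch with the AWS CLI (the volume ID and values are placeholders):

    aws ec2 modify-volume --volume-id vol-0123456789abcdef0 \
        --volume-type gp3 --iops 6000 --throughput 500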
General Purpose SSD (gp2) volumes offer cost-effective storage that is ideal for a broad range of workloads. These volumes deliver single-digit millisecond latencies and the ability to burst to 3,000 IOPS for extended periods of time. Between a minimum of 100 IOPS (at 33.33 GiB and below) and a maximum of 16,000 IOPS (at 5,334 GiB and above), baseline performance scales linearly at 3 IOPS per GiB of volume size. AWS designs gp2 volumes to deliver 90% of the provisioned performance 99% of the time. A gp2 volume can range in size from 1 GiB to 16 TiB.
Performance can also vary by instance:
According to the AWS documentation, some instance types can support their maximum performance for 30 minutes at least once every 24 hours. If you have a workload that requires sustained maximum performance for longer than 30 minutes, select an instance type according to its baseline performance.

When should I use a t2.medium vs. a m3.medium instance type within AWS?

They appear to be approximately the same in terms of performance.
Model      vCPU  Mem (GiB)  SSD Storage (GB)
m3.medium  1     3.75       1 x 4

Model      vCPU  CPU Credits / hour  Mem (GiB)  Storage
t2.medium  2     24                  4          EBS-Only
t2.medium allows for burstable performance whereas m3.medium doesn't. The t2.medium even has more vCPUs (2 vs. 1) and memory (4 vs. 3.75 GiB) than the m3.medium. The only performance gain is the SSD instance storage with an m3.medium, which I recognize could be significant if I'm doing heavy I/O.
Would this be the only scenario where I would choose an m3.medium over a t2.medium?
I'd like to run a web server that gets 20-30k hits a month so I suspect either is okay for my needs, but what's the better option?
30000 hits per month is on average a visitor every 90 seconds. Unless your site is highly atypical, load on the server is likely to be invisibly small. Bursting will handle spikes up to hundreds (or thousands, with some optimizations) of visitors.
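The hedged arithmetic behind that, assuming a 30-day month:

    echo $(( 30 * 24 * 3600 / 30000 ))    # ~86 seconds between hits, i.e. roughly one visitor every 90 seconds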
With appropriate caching, a VPS server of comparable specs to a t2.micro can serve a Wordpress blog with 30000 hits PER MINUTE. If you were saturating that continuously, you couldn't rely on burst performance for the t2.micro, of course. A t2.medium is roughly 4x as powerful in all regards as a micro, and a m3.medium has similar RAM and bandwidth but less peak CPU.
The instance storage will be a few times faster than a large EBS GP2 (SSD) volume on the m3.medium, of course. The t2 & c3 medium instances will both have roughly 300-400 Mbit/s network bandwidth, t2.micro gets ~60-70 Mbit.
One benchmark shows that t2.medium in bursting mode actually beats a c3.large (let alone the m3.medium, which is less than half as powerful, at 3 ECU vs 7).
But as noted, you can probably save money by using something less powerful than either of your suggestions and still have excellent performance.
If you don't need the power to completely configure your server, shared hosting or a platform-as-a-service solution will be easier. I recommend OpenShift, because they explicitly suggest a single small gear for up to 50k hits a month. You get 3 of those for free.
If you do need to configure the server, you really only need enough memory to run your server and/or DB. A t2.nano has 512 MB, and a t2.micro has 1 GB. The real performance bottlenecks will probably be disk I/O and network bandwidth. The first can be improved with a larger general-purpose SSD volume (more IOPS), the second by using multiple instances and an ELB.
Make sure you host all static assets in S3 and use caching well, and even the smaller AWS instances can handle hundreds of requests per second.
Basically: "don't worry about it, use the cheapest and easiest thing that will run it."
Although the "hardware" specs look similar for the T2.medium instance and the M3.medium instance, the difference is when you consider Burstable vs. Fixed Performance. See this link from Amazon Web Services:
http://aws.amazon.com/ec2/faqs/#burst
The following quote comes from that link:
Q: When should I choose a Burstable Performance Instance, such as T2?
Workloads ideal for Burstable Performance Instances (e.g. web servers, developer environments, and small databases) don’t use the full CPU often or consistently, but occasionally need to burst. If your application requires sustained high CPU performance, we recommend our Fixed Performance Instances, such as M3, C3, and R3.
A T2 instance accrues CPU credits, but only as long as it runs. If it is stopped or terminated, the credits accrued are gone.
There is an important piece of information further down the page concerning the CPU credits for the T2 instances:
Q: What happens to CPU performance if my T2 instance is running low on credits (CPU Credit balance is near zero)?
If your T2 instance has a zero CPU Credit balance, performance will remain at baseline CPU performance. For example, the t2.micro provides baseline CPU performance of 10% of a physical CPU core. If your instance’s CPU Credit balance is approaching zero, CPU performance will be lowered to baseline performance over a 15-minute interval.
This means if you run out of burstable credits, your performance will be limited to a fixed percentage of a single core until you accrue more; 10% for T2.micro, 20% for T2.small, and 40% for T2.medium.
Another important difference the OP mentions is that the M3.medium instance can be provisioned with 4GB of ephemeral (instance store) storage, which has much greater I/O capacity than persistent Elastic Block Store (EBS) volumes. T2 instances do not have this option.
Finally, it depends on what a "hit" is. In my opinion, if a hit means a few static page downloads that are less than 64k or small dynamic pages, then I'd explore the T2 option. For longer sessions, more data traffic, or higher numbers of concurrent users, I'd consider the M3. And if performance over an extended time period is a key issue, I think you're definitely in M3 land.
Look at the logs for your present site or a site similar to what you're setting up and determine which situation you're in.
Benchmark your application on both and determine the right fit for you. That's the only way to know for sure. The "better option" is dependent on how your application runs and your cost requirements.
Alternatively, you could simply choose one, based on cost or other criteria, and if it's insufficient, or overly sufficient, then change the instance type to the other.

What AWS disk options should I use for my EC2 instance?

I created a new Ubuntu c3.xlarge instance, and when I get to the storage options I can change the ROOT volume to General Purpose SSD, Provisioned IOPS or Magnetic; if I pick Provisioned IOPS I can also set a value for it. Additional data storage under Instance Store 0 has no options, but if I change it to EBS then I have the same options.
I'm really struggling to understand:
The speed of each option
The costs of each option
The Amazon documentation is very unclear
I'm using this instance to transfer data from text files into a Postgres relational database. These files have to be processed line by line with a number of INSERT statements per line, so it is slow on my local computer (5 million rows of data takes 15 hours). Originally the database was hosted separately on RDS, but that was incredibly slow, so I installed the database locally on the instance itself to remove network latency, which has sped things up a bit, but it is still considerably slower than my humble local Linux server.
Looking at the instance metrics whilst loading the data, CPU utilization is only at 6%, so I'm now thinking that disk may be the limiting factor. The database is using the / disk (not sure if SSD or magnetic - how can I find out?) and the data files are on the /mnt (Instance Store 0) disk.
I only need this instance to do two things:
Load database from datafiles
Create Lucene Search Index from database
(so the database is just an interim step)
The search index is transferred to an EBean server, and then I don't need this instance for another month, when I repeat the process with new data. With that in mind, I can afford to spend more money for faster processing, because I only use the instance one day a month and can stop it afterwards to incur no further costs.
What can I do to determine the problem and speed things up?
Here is my personal guideline:
If the volume is small (<33G) and only requires an occasional burst in performance, such as a boot volume, use magnetic drives.
If you need predictable performance and high throughput, use PIOPS volumes and EBS optimized instances.
Otherwise, use General Purpose SSD.
Your CPU is only at 6%; maybe you can try using multiple processes?
Did you test your remote instance's volume's I/O performance?
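If not, a hedged sketch of a mixed random read/write test with fio (the directory and file size are assumptions; don't point fio at a device that holds live data):

    sudo fio --name=mixed-test --directory=/mnt --size=4G --rw=randrw --rwmixread=70 \
        --bs=16k --direct=1 --ioengine=libaio --iodepth=32 --runtime=60 --time_based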
PIOPS is expensive, but it is not significantly better than gp2; the only advantage is stability.
For example, I created a 500G gp2 volume and a 500G PIOPS volume with 1500 IOPS, then inserted and queried 1,000,000 documents with MongoDB, and checked the IO performance with tools such as mongoperf/iostat/mongostat/dstat.
Each volume's IOPS performance is expected to be 1,500,
but gp2's IOPS is unstable, ranging roughly from 700 to 1,600 (read + write); read-only it can burst to 4,000, but write-only it only reaches about 800.
PIOPS is perfectly stable; its IOPS sits at almost 1,470.
For your situation, I suggest considering gp2 (size the volume to your IOPS demand: 500G gp2 = 1,500 IOPS, 1T gp2 = 3,000 IOPS, which is the burst maximum).