I'm looking to reduce my production RDS cost by converting from GP2 to magnetic storage. Here's an IOPS graph showing how few IOPS my RDS instance consumes versus the amount of storage allocated (2 TB).
The RDS instance type is t2.xlarge with Multi-AZ enabled, so you can imagine the workload this kind of instance handles. Questions I would like to ask:
Are there any real-world cases where magnetic storage is used for a workload like this?
Assuming it's fine to use magnetic storage for a production RDS instance, I understand that IOPS will be depleted during the conversion. However, since I have Multi-AZ enabled, will the conversion take place on the standby RDS instance before it's promoted to become the new master?
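For reference, the change I'm considering would look something like this (a minimal boto3 sketch; the instance identifier is a placeholder, and "standard" is the API name for magnetic storage):

```python
# A minimal sketch, assuming boto3; "my-prod-db" is a hypothetical identifier.
import boto3

rds = boto3.client("rds")

# Request a change from GP2 to magnetic ("standard") storage.
response = rds.modify_db_instance(
    DBInstanceIdentifier="my-prod-db",  # hypothetical
    StorageType="standard",             # "standard" = magnetic
    ApplyImmediately=True,              # otherwise applied in the maintenance window
)
print(response["DBInstance"].get("PendingModifiedValues"))
```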
I've got an EBS volume (16 GB) attached to an EC2 instance that has full access to an RDS instance. I've since extracted the database to the RDS instance, so I no longer use the EC2 instance for storing the web application's database. I did this because I was having a lot of problems with EBS credits (they were being consumed very quickly). I thought that by having the DB on a separate instance (RDS), EBS credit consumption would drop to almost zero, because I'm no longer reading from or writing to the EBS volume but to RDS. However, the EBS credits keep draining (down to 0) every time users access the web application, and I don't understand why. Perhaps it's because I still don't fully understand how EBS credit usage works. Can anyone enlighten me on this? Thanks a lot in advance.
You can review volume types including info on their burst credits here. You should also review I/O Characteristics and Monitoring. From that page:
If your I/O latency is higher than you require, check VolumeQueueLength to make sure your application is not trying to drive more IOPS than you have provisioned. If your application requires a greater number of IOPS than your volume can provide, you should consider using a larger gp2 volume with a higher base performance level or an io1 volume with more provisioned IOPS to achieve faster latencies.
You should review that metric and the others the page mentions if this is causing you performance problems. If your IOPS are constantly above your baseline and requests are queuing, you will consume credits as fast as they accrue.
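If it helps, here is a rough boto3 sketch of pulling those CloudWatch metrics for a volume; the volume ID is a placeholder:

```python
# A rough sketch, assuming boto3; the volume ID below is a placeholder.
import datetime
import boto3

cw = boto3.client("cloudwatch")

def volume_metric(volume_id, metric):
    """Fetch 5-minute averages for an EBS metric over the last hour."""
    end = datetime.datetime.utcnow()
    stats = cw.get_metric_statistics(
        Namespace="AWS/EBS",
        MetricName=metric,
        Dimensions=[{"Name": "VolumeId", "Value": volume_id}],
        StartTime=end - datetime.timedelta(hours=1),
        EndTime=end,
        Period=300,
        Statistics=["Average"],
    )
    return sorted(stats["Datapoints"], key=lambda d: d["Timestamp"])

# BurstBalance shows remaining I/O credits (%); VolumeQueueLength shows
# whether requests are backing up behind the volume.
for point in volume_metric("vol-0123456789abcdef0", "BurstBalance"):
    print(point["Timestamp"], round(point["Average"], 1), "%")
```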
There is really only one, somewhat abstract, upside to using Lightsail: simplicity, that is, a significantly simplified interface.
Also, the first page of Lightsail talks about lower charges.
My question is: how is it supposed to reduce charges compared to EC2? Consider the $5 Lightsail plan, which charges $0.0067/hour per instance (the cheapest plan), whereas EC2's equivalent instance type (t2.nano) costs just $0.0059/hour.
What am I missing? A detailed price comparison showing how Lightsail costs less, as advertised, would be much appreciated.
The $5 Amazon Lightsail plan includes:
A CPU that appears similar to a t2.nano ($0.0059/hr in US regions) = approximately $4.25/month
20 GB SSD storage, similar to Amazon EBS General Purpose SSD ($0.10/GB/month) = $2/month
1 TB data transfer ($0.09/GB = approximately $92 in US regions)
So, the real saving appears to be in Data Transfer.
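A quick back-of-the-envelope check in Python, using the figures above (the 720-hour month and US-region rates are assumptions):

```python
# Rough monthly cost of assembling the Lightsail $5 bundle from EC2 parts.
HOURS_PER_MONTH = 720                 # assumed month length

instance = 0.0059 * HOURS_PER_MONTH   # t2.nano             ~ $4.25
storage  = 20 * 0.10                  # 20 GB gp2 EBS        = $2.00
transfer = 1024 * 0.09                # 1 TB data out       ~ $92.16

print(f"EC2 equivalent: ${instance + storage + transfer:.2f}/month")  # ~ $98.41
print("Lightsail:      $5.00/month")
```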
Also, with Lightsail you get a static IP built into the price; with EC2, a public IPv4 address is about $4/month.
I need to deploy Cassandra on AWS but am confused as to what type of AWS storage is most suitable for Cassandra.
The Datastax documentation here:
http://docs.datastax.com/en/cassandra/3.0/cassandra/planning/planPlanningEC2.html
says that EBS volumes are recommended. At the same time the Datastax AMI documentation:
http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installAMI.html
says that:
Uses RAID0 ephemeral disks for data storage and commit logs. Launches EBS-backed instances for faster start-up, not database storage.
So which is the recommended storage type for Cassandra: EBS storage or instance storage?
Many of the new EC2 instances are EBS-only (http://www.ec2instances.info/). I am not sure when the Cassandra document was written, but EBS volumes have improved a lot recently and Amazon launches new types frequently, so you should be able to find what you're looking for among them.
You can check https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html?icmpid=docs_ec2_console, and it recommends Provisioned IOPS SSD (io1).
To add a reason why AWS is moving to EBS, and why it would be good for Cassandra data: with ephemeral storage, your data disappears if your instance is terminated (because of a crash or a stop you made). With EBS, even when the instance is gone, you still have access to your data and can attach the volume to a new instance, which is also really useful when up/down-grading instances; see the sketch below.
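As a concrete illustration of that recovery path, a minimal boto3 sketch (all IDs are placeholders):

```python
# A minimal sketch, assuming boto3; volume and instance IDs are placeholders.
import boto3

ec2 = boto3.client("ec2")

VOLUME_ID = "vol-0123456789abcdef0"      # hypothetical surviving data volume
NEW_INSTANCE_ID = "i-0123456789abcdef0"  # hypothetical replacement instance

# Detach the volume from the old instance (skip this call if it already
# detached when the instance terminated), then wait until it is free.
ec2.detach_volume(VolumeId=VOLUME_ID)
ec2.get_waiter("volume_available").wait(VolumeIds=[VOLUME_ID])

# Attach the same volume, data intact, to the replacement instance.
ec2.attach_volume(VolumeId=VOLUME_ID, InstanceId=NEW_INSTANCE_ID, Device="/dev/sdf")
```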
I came upon this presentation, which clearly answers the question with a very interesting use case:
https://www.youtube.com/watch?v=1R-mgOcOSd4
To summarize:
EBS has changed a lot since 2011, when major companies like Netflix had problems with it.
EBS GP2 volumes are now the recommended storage for Cassandra, and you should not expect any bottlenecks there.
Datastax have recently updated their documentation to also recommend EBS: http://docs.datastax.com/en/cassandra/3.0/cassandra/planning/planPlanningEC2.html
No doubt, EBS. Memory-optimized instances are best suited for Cassandra:
T2
T2 are Burstable Performance Instances that offer a baseline level of CPU performance with the capability to burst above the baseline
M4
M4 instances are the most recent general-purpose instances. The M4 family offers a balance of memory, network, and compute resources, and is a good option for many applications.
C4
These instances are recent additions to the compute-optimized family, featuring the highest-performing processors and the lowest price/compute performance among EC2 instance types.
X1
These instances are best suited for enterprise-class, large-scale, in-memory applications and offer the lowest price for each GiB of RAM among AWS EC2 instance types. The X1 instances are the latest addition to the EC2 memory-optimized instance group and are intended for executing high-scale, in-memory databases and in-memory applications over the AWS cloud.
For pricing and other information, see:
https://aws.amazon.com/ec2/instance-types/
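If you'd rather pull these specs programmatically than read the pricing page, something like this boto3 sketch works (describe_instance_types needs a reasonably recent boto3; the instance types listed are just examples):

```python
# Query vCPU and memory specs for a few instance types, assuming boto3.
import boto3

ec2 = boto3.client("ec2")

resp = ec2.describe_instance_types(
    InstanceTypes=["t2.nano", "m4.large", "c4.large", "x1.16xlarge"]
)
for it in sorted(resp["InstanceTypes"], key=lambda t: t["MemoryInfo"]["SizeInMiB"]):
    print(it["InstanceType"],
          it["VCpuInfo"]["DefaultVCpus"], "vCPU,",
          it["MemoryInfo"]["SizeInMiB"] // 1024, "GiB RAM")
```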
I am trying to understand the fundamental differences between several different types of storage available on AWS, specifically:
SSD
Magnetic
"Provisioned IOPS"
Snapshot storage
I am stunned to find no clear definition of each of these in the AWS docs, so I ask: How are these storage types different, and what use cases/scenarios are appropriate for each?
You're referring to Elastic Block Store (EBS). EBS provides persistent block level storage volumes for Amazon EC2 instances. EBS volumes come in 3 types:
Provisioned IOPS (SSD)
General Purpose (SSD)
Magnetic
Each type has different performance characteristics and costs. See EBS volume types for more details. The list above is ordered from high to low, by both price and by potential IOPS.
EBS snapshots are something else entirely. All EBS volumes, regardless of volume type, can be snapshotted and durably stored.
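To make the distinction concrete, a boto3 sketch that creates one volume of each type and snapshots one of them; the size, AZ, and IOPS values are illustrative assumptions:

```python
# Illustrative boto3 sketch; sizes, AZ, and IOPS values are assumptions.
import boto3

ec2 = boto3.client("ec2")
AZ = "us-east-1a"  # assumed availability zone

piops    = ec2.create_volume(AvailabilityZone=AZ, Size=100, VolumeType="io1", Iops=3000)
gp2      = ec2.create_volume(AvailabilityZone=AZ, Size=100, VolumeType="gp2")
magnetic = ec2.create_volume(AvailabilityZone=AZ, Size=100, VolumeType="standard")

# Snapshots work the same way regardless of the volume type underneath.
ec2.get_waiter("volume_available").wait(VolumeIds=[gp2["VolumeId"]])
snap = ec2.create_snapshot(VolumeId=gp2["VolumeId"], Description="example snapshot")
print(snap["SnapshotId"])
```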
The storage options for your instances:
Magnetic - slowest/cheapest, backed by magnetic disks
SSD - faster/more expensive, backed by solid-state drives
"Provisioned IOPS" - fastest/most expensive, with a guaranteed (at the physical level) rate of input/output operations per second
From Google:
IOPS (Input/Output Operations Per Second, pronounced eye-ops) is a common performance measurement used to benchmark computer storage devices like hard disk drives (HDD), solid state drives (SSD), and storage area networks (SAN).
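As a toy example of what the number means (throughput is roughly IOPS times I/O size; the block sizes below are illustrative, not AWS specifications):

```python
# Toy arithmetic: throughput ≈ IOPS × I/O size. Block sizes are illustrative.
def throughput_mib_per_s(iops: int, io_size_kib: int) -> float:
    return iops * io_size_kib / 1024

for iops, size_kib in [(100, 16), (3000, 16), (3000, 256)]:
    print(f"{iops:>5} IOPS at {size_kib:>3} KiB ≈ "
          f"{throughput_mib_per_s(iops, size_kib):7.1f} MiB/s")
```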
This link has more fine-grained details on SSD/magnetic disk comparisons, though it is geared towards databases.
Snapshots are backups and are entirely separate from AWS 'hard drive' offerings.
The High I/O instance in EC2 uses SSD. How does one run a database on such an instance while guaranteeing persistence of data?
From my limited understanding, I'm supposed to use Elastic Block Store (EBS) so that even if the machine goes down, the data on the disk doesn't disappear. On the other hand, the instance-store SSD of a High I/O instance is ephemeral and can't be used for database storage because if, for example, the machine loses power, the data isn't preserved. Is my understanding correct?
Point 1) If your workload needs High I/O SSD instances for the DB, then you should have a master-slave setup. Ideally 1 master and 2 slaves spread across 3 AZs is suggested. Even if there is an outage in a single AZ, the other AZs can handle the load and serve your high-availability needs. Between master and slaves you can employ synchronous, semi-synchronous, or asynchronous replication, depending on your DB. This solution is costlier.
Point 2) Generally, if your DB is OLTP in nature, then Amazon EBS PIOPS + EBS-optimized instances give you consistent IOPS. A single EBS volume can provide 4,000 IOPS, and you can RAID 0 multiple volumes to gain 10k+ IOPS (see the sketch below). Even though you may use EBS for persistence, it is still recommended to go with a master-slave architecture for high availability. I have written detailed articles on this topic on my blog; refer to them for more information.
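Here is a sketch of that RAID 0 approach, assuming boto3; IDs and values are placeholders, and the OS-level striping (e.g. with mdadm) happens inside the instance afterwards:

```python
# Provision several io1 volumes and attach them to one instance for RAID 0.
# All IDs and values below are placeholders.
import boto3

ec2 = boto3.client("ec2")
INSTANCE_ID = "i-0123456789abcdef0"  # hypothetical
AZ = "us-east-1a"                    # must match the instance's AZ

volume_ids = []
for device in ("/dev/sdf", "/dev/sdg", "/dev/sdh"):
    vol = ec2.create_volume(AvailabilityZone=AZ, Size=200, VolumeType="io1", Iops=4000)
    ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])
    ec2.attach_volume(VolumeId=vol["VolumeId"], InstanceId=INSTANCE_ID, Device=device)
    volume_ids.append(vol["VolumeId"])

# Three 4,000-IOPS volumes striped as RAID 0 give roughly 12k IOPS, at the
# cost of losing the whole array if any single volume fails.
print(volume_ids)
```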
It is the same as other ephemeral storage; it does not guarantee persistence. Persistence is handled by replication between instances, with at least one instance writing to an EBS volume.
If you want your data to persist, you're going to need to use EBS. Building a database on an ephemeral drive, regardless of performance, seems a dubious design choice.
EBS now offers 4K IOPS volumes which, depending on your database requirements, may well be more than sufficient.
My next question would really be: Do you want to host/run your own database?
Turnkey products such as RDS and DynamoDB may be sufficient for your needs. Using them is much easier than setting up and managing your own database. RDS is now advertising "You can now provision up to 3TB and 30,000 IOPS per DB Instance". That's enough database horsepower for many, many problem sets.
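For what it's worth, provisioning near those advertised limits through boto3 looks roughly like this (identifiers, instance class, and credentials are placeholders, and the exact ceilings vary by engine and region):

```python
# Hedged sketch, assuming boto3; all names/values below are placeholders.
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="big-db",       # hypothetical
    Engine="mysql",
    DBInstanceClass="db.m4.4xlarge",     # illustrative class
    AllocatedStorage=3072,               # ~3 TB
    StorageType="io1",
    Iops=30000,
    MasterUsername="admin",
    MasterUserPassword="change-me-now",  # placeholder; prefer a secrets manager
    MultiAZ=True,
)
```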