How to configure MariaDB(self hosted in EC2) storage on an EBS volume? - amazon-web-services

I have installed MariaDB on an EC2 instance. The instance is small because our product is still in development and we have very little traffic.
I want to keep the MariaDB data on an EBS volume, so that as our traffic increases we can attach the volume to a bigger EC2 instance and the data moves with it.
Is keeping the data on EBS (network storage) a good approach? Won't it add read/write latency?
If it won't hurt performance, how do I configure it?

An EC2 instance is made of 2 things - Compute & Storage.
Storage is basically an EBS volume (like the hard disk in your laptop) that holds your OS files as well as the files of the software installed on your EC2 instance.
Since you are running your database on an EC2 instance, by default the MariaDB database files are stored on the same EBS volume where your OS is installed. You can also attach a second EBS volume and point MariaDB's data directory at it, but that is not required.
So when your data grows, you can expand the EBS volume of your existing EC2 instance. This tutorial shows how: Increase the size of an Amazon EBS volume on an EC2 instance.
The second point is performance. If your traffic increases and you need more processing power (RAM or CPU), you can change to a larger instance type. See: Change the instance type.
Recommendation
As you will be paying a fixed monthly cost to run the EC2 instance anyway, my recommendation would be to use Aurora Serverless v2 with the minimum capacity you need initially. It can then scale automatically as traffic grows.
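For the original question of keeping MariaDB data on its own EBS volume, here is a minimal sketch in Python that renders the two configuration pieces usually involved once a volume has been attached, formatted, and mounted: an /etc/fstab line and a MariaDB datadir setting. The mount point /data, the xfs filesystem, the mysql subdirectory, and both function names are assumptions for illustration, not something stated in the thread.

```python
def fstab_entry(fs_uuid: str, mount_point: str, fs_type: str = "xfs") -> str:
    """Build an /etc/fstab line so the data volume is remounted after a reboot."""
    # "nofail" lets the instance boot even if the volume is not attached.
    return f"UUID={fs_uuid} {mount_point} {fs_type} defaults,nofail 0 2"


def mariadb_datadir_config(mount_point: str) -> str:
    """Build a my.cnf fragment that points MariaDB at the mounted volume."""
    # Assumption: the data lives in a "mysql" subdirectory of the mount point.
    return f"[mysqld]\ndatadir={mount_point}/mysql\n"


if __name__ == "__main__":
    # Placeholder UUID; the real value comes from the formatted volume.
    print(fstab_entry("REPLACE-WITH-VOLUME-UUID", "/data"))
    print(mariadb_datadir_config("/data"))
```

Any existing data directory would still have to be copied to the new location while MariaDB is stopped, with ownership preserved, before the setting takes effect.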

Related

AWS EC2 Local Storage Volumes/nvme volume

I am using an "m5d.8xlarge" EC2 instance, which comes with 2 x 600 GB SSD volumes directly attached. They appear in the OS, but the console does not mention them and I can't retrieve any information about them.
The volume serials also show as AWS-*** rather than vol*** as for normal EBS volumes.
I read that these are ephemeral or something similar. I would like official AWS docs that thoroughly explain how this local storage works, since we are hosting a production workload on it. I'd appreciate it if someone could explain or point to docs.
"m5d.8xlarge" EC2 instances come with 2 ephemeral volumes, which are instance store volumes.
Instance store volumes (docs) are physically attached to the underlying host hardware, which reduces latency and increases IOPS and data throughput.
However, there is a caveat: if your EC2 instance is stopped, hibernated, or terminated, or the underlying hardware shuts down due to some fault, all data stored on these ephemeral volumes is lost.
Instance store volumes are generally used for buffers and caches.
To confirm, you can follow https://aws.amazon.com/premiumsupport/knowledge-center/ec2-linux-instance-store-volumes/ :
SSH into the EC2 instance
install the nvme-cli tool -> sudo yum install nvme-cli
sudo nvme list - to list the NVMe volumes, including the instance store volumes
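As a complement to the steps above, here is a hedged Python sketch that tells the two kinds of NVMe device apart by their model string, read from sysfs on a Linux instance. It assumes instance store devices report a model containing "Instance Storage" and EBS devices report "Elastic Block Store"; those strings and both function names are assumptions to check against the output of nvme list on your own instance.

```python
import os


def classify_nvme_model(model: str) -> str:
    """Map an NVMe model string to 'instance-store', 'ebs', or 'unknown'."""
    text = model.strip().lower()
    if "instance storage" in text:
        return "instance-store"
    if "elastic block store" in text:
        return "ebs"
    return "unknown"


def list_nvme_devices(sys_block: str = "/sys/block") -> dict[str, str]:
    """Return {device name: classification} for every nvme* block device."""
    devices: dict[str, str] = {}
    if not os.path.isdir(sys_block):
        return devices  # not a Linux host, or sysfs is unavailable
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("nvme"):
            continue
        model_path = os.path.join(sys_block, name, "device", "model")
        try:
            with open(model_path) as handle:
                devices[name] = classify_nvme_model(handle.read())
        except OSError:
            continue  # device without a readable model attribute
    return devices


if __name__ == "__main__":
    for device, kind in list_nvme_devices().items():
        print(device, kind)
```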
If you want the data to persist, you should go for EBS or EFS.
EBS docs, EFS docs
In short: if you need very low latency and can afford to lose the data, go for instance store. If the data is business critical, for example a database workload, go for EBS. You can still achieve very high IOPS and throughput with the io1 and io2 volume types, which deliver up to 64,000 IOPS on Nitro-based EC2 instances.
Experiment with EBS volume types to increase IOPS and throughput: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html

AWS: I want to store all my data on an EBS partition. Is that good practice, or should I store it on S3?

I am new to AWS and trying to understand it.
At present I have a small Django app.
I am planning to use an EC2 instance of type t3.small (2 vCPUs and 2 GB RAM).
I will have the following EBS volumes:
1) EBS root volume: 10GB (mainly used for the OS and other config files)
2) Attached EBS volume: 20GB (mainly used to store all the code, images, videos, etc.)
I will take a snapshot of both EBS volumes daily.
I want my images and videos to be accessible only from the EC2 instance.
Is this a good approach, or should I use S3?
I would store them in S3, for several reasons:
Durability: S3 is designed for 99.999999999% (11 nines) durability. Data is stored redundantly across multiple Availability Zones in the selected region. You can also easily set up Cross-Region Replication for even better durability and security.
Price: you pay only for the storage you actually use. You don't have to provision a volume size as you do with EBS.
Scalability: you add data as you go and the bucket scales with no practical limit.
If your application becomes successful and gets more traffic, you should scale horizontally with Auto Scaling and a load balancer, which means multiple EC2 instances. In that case it is much easier to access data in S3 than on an EBS volume attached to a single instance.

Where OS and its settings are stored in EC2 instance

Using an EC2 Windows instance with instance storage (let's say 32GB SSD), where are the OS and its settings stored, such as Program Files and user profiles? Are they all stored on instance storage? As far as I understood from other topics Instance storage is not-persistent and doesn't survive shutdowns/terminations. Does that mean I will lose everything under C: drive if I turn it off?
Can I use EBS storage as a default storage for OS (C drive)? Can I map multiple EBS storages to one Windows storage?
If above is true, then I will be charged for the capacity used by OS on EBS instance? It would be around 20GB I believe. Is that correct?
I am quite new to AWS, and before paying for such instances or EBS I would like to know how the technical and billing model works.
Thank you!
The Storage for the Root device is dependent on the AMI (EBS-Backed or Instance Store-Backed) used to launch the instance.
As far as I understood from other topics Instance storage is
not-persistent and doesn't survive shutdowns/terminations.
If the root storage device is instance store, stopping (shutting down) the instance is not possible. On termination, neither the storage nor the instance survives. The instance does not survive termination even if the AMI is EBS-backed, but you can persist the root volume by setting the DeleteOnTermination flag to false.
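As an illustration of the DeleteOnTermination flag, here is a small Python sketch that builds a block device mapping of the shape accepted by aws ec2 modify-instance-attribute --block-device-mappings. The device name /dev/sda1 and the function name are assumptions; the actual root device name should be read from the instance details.

```python
import json


def keep_root_volume_mapping(device_name: str = "/dev/sda1") -> list[dict]:
    """Build a mapping that keeps the given EBS volume when the instance is terminated."""
    # Assumption: device_name is the root device of an EBS-backed instance.
    return [{"DeviceName": device_name, "Ebs": {"DeleteOnTermination": False}}]


if __name__ == "__main__":
    # Print the JSON so it can be saved to a file and passed to the CLI.
    print(json.dumps(keep_root_volume_mapping(), indent=2))
```

The flag only applies to EBS volumes; it cannot make instance store data survive a termination.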
Does that mean I will lose everything under C: drive if I turn it off?
You cannot turn off (shutdown) an Instance Store-backed instance.
Can I use EBS storage as a default storage for OS (C drive)?
Yes, Choose an EBS backed Windows AMI.
Can I map multiple EBS storages to one Windows storage?
Yes, multiple EBS Volumes can be attached to one EC2 Windows Instance.
If above is true, then I will be charged for the capacity used by OS
on EBS instance?
You will be charged for the total size of the EBS volumes attached to the instance including the Root Device.
It would be around 20GB I believe. Is that correct?
The EBS Volume Size is adjustable. The upper Size limit is 16TiB.
Read Storage for Root Device and Ec2 Root Device Volume
Please spend more time with the AWS documentation; I don't think there is enough room here to cover all your questions.
Only specific EC2 instance types come with attached SSD storage, also known as instance storage. Bear in mind that instance storage does not come with snapshot capabilities, so you must back up the files yourself. It is meant for people who need the fastest disk access to process their data.
Only EBS lets you take multiple snapshots.
You can always create an AMI of your instance after completing the deployment. The AMI is backed by EBS snapshots, so you will not lose the initial setup if you do this; for a new instance, you just launch it from the AMI.
If you "Terminate" an instance, the virtual machine is deleted. There is no way to recover it, even with EBS, unless you made a snapshot. However, additional attached EBS volumes are not deleted by default.
EBS (General Purpose SSD, gp2) is billed per GB and gives you 3 IOPS per GB, with a base of 100 IOPS. That is not enough for anyone who wants to run disk I/O intensive tasks.
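The rule of 3 IOPS per GB with a floor of 100 IOPS can be written as a small formula. This Python sketch assumes the General Purpose SSD (gp2) volume type and also applies gp2's ceiling of 16,000 IOPS, which the answer above does not mention; the function name is hypothetical.

```python
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16000  # assumption: the gp2 ceiling, not stated in the answer


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume: 3 per GiB, clamped to [100, 16000]."""
    return max(GP2_MIN_IOPS, min(GP2_IOPS_PER_GIB * size_gib, GP2_MAX_IOPS))


if __name__ == "__main__":
    for size in (20, 500, 10000):
        print(size, "GiB ->", gp2_baseline_iops(size), "IOPS")
```

A 20 GB volume therefore sits at the 100 IOPS floor, which is why small gp2 volumes feel slow under sustained disk load.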

Working with ECS container instance without the EBS

I am using the free tier of AWS. I am experimenting with ECS and am following the article http://docs.aws.amazon.com/AmazonECS/latest/developerguide/launch_container_instance.html to create an ECS instance. One thing I noticed is that using the community image amzn-ami-2016.03.e-amazon-ecs-optimized adds an EBS volume, which cuts into my free tier usage. My question is: is this EBS volume required, or can I do without it?
Any EC2 instance needs at least a root volume to boot the OS. With an EBS-backed AMI such as this one, that root volume is an EBS volume. So if you were wondering whether you can run this instance without EBS, I don't think that is possible.
However, you can still reduce your EBS cost. An EBS volume costs 10 cents per GB per month. Notice that all Amazon ECS-optimized EC2 instances are configured with 30GB of EBS storage. That means you pay $3.00 extra per EC2 instance per month! 8GB of that 30GB is for root, and 22GB is for Docker use.
Source:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-ami-storage-config.html
By default, the Amazon ECS-optimized Amazon Linux AMI ships with 30
GiB of total storage. You can modify this value at launch time to
increase or decrease the available storage on your container instance.
This storage is used for the operating system and for Docker images
and metadata. The sections below describe the storage configuration of
the Amazon ECS-optimized Amazon Linux AMI, based on the AMI version.
Of course, you don't need the full 8GB for root and the full 22GB for Docker. So you can lower your cost by reducing those volumes to, say, 2GB for root and 2GB for Docker. Then you would pay 40 cents per month instead of $3.00.
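The arithmetic behind the $3.00 and 40 cent figures, as a short Python sketch. It uses the 10 cents per GB per month quoted in this answer; actual prices vary by region and volume type, and the function name is hypothetical.

```python
PRICE_CENTS_PER_GB_MONTH = 10  # figure quoted in the answer, not a current price list


def monthly_ebs_cost_cents(*volume_sizes_gb: int) -> int:
    """Monthly EBS cost in cents for the given volume sizes in GB."""
    return sum(volume_sizes_gb) * PRICE_CENTS_PER_GB_MONTH


if __name__ == "__main__":
    default_cost = monthly_ebs_cost_cents(8, 22)  # default root + Docker volumes
    reduced_cost = monthly_ebs_cost_cents(2, 2)   # reduced root + Docker volumes
    print(f"default: ${default_cost / 100:.2f}, reduced: ${reduced_cost / 100:.2f}")
```

Working in whole cents avoids floating-point rounding when the values are compared.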
Reducing volume size, as far as I know, is not easy.
Since that is out of scope for this question, I will just provide this link for interested parties.
Now that you are aware that ECS-optimized instances use 2 volumes, there is a way to NOT use the 22GB volume at all and simply use the root volume for Docker storage. This is not easy either, but it can be done by creating your own AMI with Docker and the ECS agent installed. You then have to configure Docker to use the root volume instead of the other one. Here is a thread which briefly discusses this issue.
With AWS ECS there is no additional charge for Amazon EC2 Container Service itself. You pay for the AWS resources (e.g. EC2 instances or EBS volumes) you create to store and run your application. In the AWS free tier (https://aws.amazon.com/free/), only Amazon EC2 Container Registry is included, with 500 MB of storage.
Also, if you create ECS container instances from the amzn-ami-2016.03.e-amazon-ecs-optimized AMI, the volumes will be EBS volumes, so you will have to pay for them.

Recommended AWS storage type for Cassandra?

I need to deploy Cassandra on AWS but am confused as to what type of AWS storage is most suitable for Cassandra.
The Datastax documentation here:
http://docs.datastax.com/en/cassandra/3.0/cassandra/planning/planPlanningEC2.html
says that EBS volumes are recommended. At the same time the Datastax AMI documentation:
http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installAMI.html
says that:
Uses RAID0 ephemeral disks for data storage and commit logs.
Launches EBS-backed instances for faster start-up, not database
storage.
So which one is the recommended storage type for Cassandra? The EBS storage or the Instance storage?
Many of the new EC2 instances are EBS-only (http://www.ec2instances.info/). I am not sure when the Cassandra document was written, but EBS disks have improved a lot recently and Amazon launches new types frequently, so you should be able to find what you're looking for among them.
You can check https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html?icmpid=docs_ec2_console; the recommended type there is Provisioned IOPS SSD (io1).
Another reason AWS is moving to EBS, and why it is good for Cassandra data, is that instance storage is ephemeral: you probably don't want your data to disappear if your instance is terminated (because of a crash or a stop you made). With EBS, when your instance is gone you still have access to your data and can attach the volume to a new instance (also really useful when upgrading or downgrading instances).
I came upon this presentation, which clearly answers the question with a very interesting use case:
https://www.youtube.com/watch?v=1R-mgOcOSd4
To summarize:
EBS has changed a lot since 2011 when major companies like Netflix
had problems with it.
EBS and GP2 are now the recommended storage
for Cassandra and you should not expect any bottlenecks there.
Datastax have recently updated their documentation to also recommend
EBS:
http://docs.datastax.com/en/cassandra/3.0/cassandra/planning/planPlanningEC2.html
No doubt, EBS.
Memory-optimized instances are best suited for Cassandra.
T2
T2 are Burstable Performance Instances that offer a baseline level of CPU performance with the capability to burst above the baseline
M4
M4 instances are the most recent general-purpose instances. The M4 family of instances offers a balance of memory, network, and compute resources, and it is a better option for several applications
C4
These instances are recent additions to the compute-optimized family. They feature the highest-performing processors and the lowest price/compute performance among EC2 instance types.
X1
These instances are best suited for enterprise-class, large-scale, in-memory applications and offer the lowest price for each GiB of RAM among AWS EC2 instance types. The X1 instances are the latest addition to the EC2 memory-optimized instance group and are intended for executing high-scale, in-memory databases and in-memory applications over the AWS cloud.
For pricing and other information:
https://aws.amazon.com/ec2/instance-types/