How to automate EC2 Instance snapshots? - amazon-web-services

How can I automate EC2 instance snapshots every X time?
By snapshot, I mean an image of all data and state and configuration of the virtual machine, so I can recover it quickly. Is there an AWS service for this purpose? What's the best way?
My EC2 instance type is m5.2xlarge

You may want to investigate a service called AWS Backup. I have only read about it, as it's relatively new, and I had implemented a custom solution using a Lambda function before it became available.
If I were to do it again, I'd use AWS Backup.

Some of the options are:
Use the AWS Backup service
OR
Schedule the AWS Systems Manager Automation runbook AWS-CreateImage or AWS-CreateSnapshot (either or both, depending on your use case) using Amazon EventBridge or Amazon CloudWatch Events
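For the "every X time" part of the question, a scheduled EventBridge rule takes a rate expression. The helper below is only a sketch of how such an expression could be built; the function name is hypothetical, and it relies on the documented rule that the unit is singular for a value of 1 and plural otherwise.

```python
def rate_expression(value: int, unit: str) -> str:
    """Build an EventBridge rate expression such as 'rate(12 hours)'."""
    allowed_units = {"minute", "hour", "day"}
    if unit not in allowed_units:
        raise ValueError(f"unit must be one of {sorted(allowed_units)}")
    if value < 1:
        raise ValueError("value must be a positive integer")
    # EventBridge expects the singular unit for 1 and the plural otherwise.
    suffix = "" if value == 1 else "s"
    return f"rate({value} {unit}{suffix})"


print(rate_expression(12, "hour"))
```

The resulting string could then be passed as ScheduleExpression when creating the rule, for example with boto3's events client and put_rule, with the Automation runbook as the rule's target.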

AWS has a service called Lifecycle Manager in the EC2 dashboard (Amazon Data Lifecycle Manager). With it, you can automate the backup of your EBS volumes.
You can define backup and retention schedules for EBS snapshots by
creating lifecycle policies based on tags defined for volumes.
With this feature, you no longer have to rely on custom scripts to
create and manage your backups.
This feature is now available in the US East (N. Virginia), US
West (Oregon), and Europe (Ireland) AWS regions at no additional
cost.
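A tag-based lifecycle policy like the one described above could be expressed roughly as follows. This is a minimal sketch, not a complete setup: the function name, tag, schedule name, and default values are assumptions made for illustration, and the dictionary follows the PolicyDetails shape that Data Lifecycle Manager accepts as far as I know it.

```python
# Interval values (in hours) that Data Lifecycle Manager accepts, to my knowledge.
ALLOWED_INTERVALS_HOURS = {1, 2, 3, 4, 6, 8, 12, 24}


def build_snapshot_policy(tag_key, tag_value, interval_hours=24,
                          start_time="03:00", retain_count=7):
    """Return PolicyDetails for snapshotting every volume carrying the given tag."""
    if interval_hours not in ALLOWED_INTERVALS_HOURS:
        raise ValueError(f"interval_hours must be one of {sorted(ALLOWED_INTERVALS_HOURS)}")
    if retain_count < 1:
        raise ValueError("retain_count must be at least 1")
    return {
        "ResourceTypes": ["VOLUME"],
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [{
            "Name": f"every-{interval_hours}h",  # hypothetical schedule name
            "CopyTags": True,
            "CreateRule": {
                "Interval": interval_hours,
                "IntervalUnit": "HOURS",
                "Times": [start_time],  # UTC start time, hh:mm
            },
            "RetainRule": {"Count": retain_count},
        }],
    }


policy = build_snapshot_policy("Backup", "true", interval_hours=12, retain_count=5)
```

With boto3, the dictionary would be passed as PolicyDetails to the dlm client's create_lifecycle_policy call, together with an execution role ARN, a description, and a state.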

You mention "all data and state and configuration of the virtual machine". This really consists of two parts:
Contents of the disks
Configuration of the Amazon EC2 virtual machine ("instance")
Configuration of virtual machine
Backups normally only consist of the contents of the disks. The configuration of the virtual machine is specified when a replacement machine is launched, including:
Instance Type
Subnet & IP Address
Tags
IAM Role
It is not possible to "back-up" these settings, but you could create a CloudFormation template that launches an instance with matching settings. This can then become a repeatable, automated process.
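As an illustration of the CloudFormation idea, the settings listed above can be captured in a small template. The sketch below generates one as JSON; the resource name RestoredInstance and all IDs in the example call are hypothetical placeholders, and only the instance type comes from the question.

```python
import json


def instance_template(ami_id, instance_type, subnet_id, instance_profile, tags):
    """Return a minimal CloudFormation template that launches one EC2 instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "RestoredInstance": {  # hypothetical logical resource name
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": ami_id,
                    "InstanceType": instance_type,
                    "SubnetId": subnet_id,
                    "IamInstanceProfile": instance_profile,
                    "Tags": [{"Key": k, "Value": v} for k, v in sorted(tags.items())],
                },
            }
        },
    }


template = instance_template(
    ami_id="ami-0123456789abcdef0",        # placeholder AMI ID
    instance_type="m5.2xlarge",
    subnet_id="subnet-0123456789abcdef0",  # placeholder subnet ID
    instance_profile="my-instance-profile",  # placeholder profile name
    tags={"Name": "restored-server"},
)
print(json.dumps(template, indent=2))
```

Keeping the template in version control means the instance configuration is recorded alongside the disk backups, even though it is not part of them.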
Contents of the disks
The easiest way to back up the contents of the disks, so that an equivalent instance can be launched, is to create an Amazon Machine Image (AMI). The AMI contains a copy of all disks connected to the instance.
A new Amazon EC2 instance can then be launched from the AMI, and it will contain exactly the same data on the disks. (An AMI consists of Amazon EBS Snapshots, plus some metadata about the instance configuration. A new instance can be launched from an AMI, but not from an EBS Snapshot.)
If you wish to automate the regular creation of an AMI, you can use the Amazon Data Lifecycle Manager.
See: New – Lifecycle Management for Amazon EBS Snapshots | AWS News Blog
I also recommend that you test the backup by launching a new EC2 instance from the AMI.

Related

How to take a backup of EC2 instance in AWS and move to a low cost alternative?

We have an EC2 instance running in AWS. Our ML algorithms and data are on it, and we also host a web-based interface on that machine.
Now there are no new developments happening in that EC2 instance. We would like to terminate AWS subscription for a short period of time (for the purpose of cost-reduction and exploring new cloud services). Most importantly, we want to be in a position where we can purchase a new EC2 instance with a fresh AWS subscription, use the backup which we take now, and resume all operations (web-backend, SMS services for our app which is hosted in AWS, etc.).
What is the best way to do it? Is temporary termination of AWS subscription advisable?
There is no concept of an "AWS Subscription". AWS is charged on-demand, which means you only pay when you use resources.
If you temporarily do not want the Amazon EC2 instance, you could:
Stop the instance, which is like turning off the power. You will not be charged for the instance, but you will still pay for the disk storage attached to the instance. You can simply Start the instance again when you wish to use it. You will only be charged while the instance is running. OR
Create an image of the instance, then terminate the instance. This will create an Amazon Machine Image (AMI), which contains a copy of the disks. You can then launch a new Amazon EC2 instance from the AMI when you wish to use it again. This is a lower-cost option compared to simply stopping the instance, but it takes more effort to stop/start.
It is quite common for companies to stop Amazon EC2 instances at night or over the weekend to reduce costs while they are not needed.
EDIT: Just thought of a third option. Will test it and be back. Not worth it; it would involve creating an image from the EC2 instance, converting that image to a VM image, and storing the VM image in S3. There may be some advantages to this, but I do not see them.
I think you have two options, both of them very reasonably priced. If you can separate the data from the operating system, then your best option would be to use an S3 bucket as a file system within the EC2 instance. Your EC2 instance would use this bucket to store all your "ML algorithms and data" and, possibly, even your "web-based interface". Whenever you decide that you no longer need the processing capacity of the EC2 instance, you would unmount the S3 bucket file system from the instance and terminate that instance. After you configure an appropriate lifecycle rule for the S3 bucket, its objects would transition to Glacier, or even Glacier Deep Archive [you must consider the different options for long-term storage].
In the future, whenever you want to work with your data again, you would restore your data from Glacier back to S3, create a new EC2 instance, install your applications, mount your S3 bucket as a file system, and you would have access to all your data. I think this is your least expensive option and the one with the shortest recovery time objective. To implement this option, look at my answer to this question; everything you need to use an S3 bucket as a regular folder inside the EC2 instance is there.
The second option provides an integrated solution, meaning the operating system and the data stay together, and allows you to restore everything as it was the day you stopped processing your data. It's made up of the following cycle:
Shut down your EC2 instance and make a note of all the specs [you will need them further down].
Export your instance to a virtual image, vmdk for example, and store it in your S3 bucket. Something like this:
aws ec2 create-instance-export-task --instance-id i-0d54b0682aa3998a0 \
  --target-environment vmware \
  --export-to-s3-task DiskImageFormat=VMDK,ContainerFormat=ova,S3Bucket=sm-vm-backup,S3Prefix=vms
Configure an appropriate lifecycle rule for the S3 bucket so that it transitions to Glacier, or even Glacier Deep Archive.
Terminate the EC2 instance.
In the future you will need to implement the inverse, so you will need to restore the archived S3 object [make sure you can live with the time needed by AWS to do this]
Import the virtual image as an EC2 AMI, something like this [this is not complete - you will need some more options that you saved above]:
aws ec2 import-image --disk-containers \
  Format=ova,UserBucket="{S3Bucket=sm-vm-backup,S3Key=vmsexport-i-0a1c382e740f8b0ee.ova}"
Create an EC2 instance based on the image and you're back in business.
Obviously you should do some trial runs and even automate the entire process if it's something that will be done frequently. I have a feeling, based on what you said, that the first option is a better option, provided you can easily install whatever applications they use.
I'm assuming that you launched an EC2 instance from a base Amazon Machine Image and then added your own software and models to it, as opposed to launching an EC2 instance from an AWS Marketplace offering.
The simplest thing to do is to create an Amazon Machine Image (AMI) from your running EC2 instance. That will capture the current state of the instance and persist it in your AWS account. Then you can terminate the instance. Later, when you want to recreate it, launch a new instance, selecting the saved AMI instead of a standard AMI.
An alternative is to avoid the need to capture machine state at all, by using standard DevOps practices to revision-control everything you need to recreate the state of a running machine.
Note that there are costs associated with an AMI, though they are minimal ($0.05 per GB-month of data stored, for example).
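To get a feel for that figure, the arithmetic is simply stored gigabytes times the monthly rate. The snippet below is a rough sketch: the $0.05 rate is the one quoted in this answer and may differ by region or over time, and the 100 GB size is a made-up example.

```python
def monthly_storage_cost(size_gb: float, price_per_gb_month: float = 0.05) -> float:
    """Estimate the monthly cost in USD of keeping AMI snapshot data of the given size."""
    return round(size_gb * price_per_gb_month, 2)


# Hypothetical example: 100 GB of snapshot data behind the AMI.
print(monthly_storage_cost(100))  # 5.0
```

Snapshots only store the blocks that are actually in use, so the billed size is often smaller than the provisioned volume size.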
I had contacted AWS customer care regarding this issue. Given below is the response I received. Please add your comments on which option might be good for me.
Note: I acknowledge the AWS customer care team for their help.
I understand that you require some information on cost saving for your
Instance since you will not be utilizing the service for a while.
To assist you with this I would recommend checking out the Instance
Stop/Start link here:
==>https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Stop_Start.html .
When you stop an Instance, you do not lose any data & you are not
charged for the resources any further. However please keep in mind
that you will still be charged for any EBS Storage Volumes attached to
the stopped Instance(s).
I also recommend checking out the below links on how you can reduce
your costs.
==>https://aws.amazon.com/premiumsupport/knowledge-center/reduce-aws-bill/
==>https://aws.amazon.com/blogs/compute/10-things-you-can-do-today-to-reduce-aws-costs/
That being said, please note that as I am in the billing department,
for the best assistance with the various plans you will require the
assistance of our Sales Team.
The Sales Team will be able to assist with ways to save while
maintaining your configurations.
You will be able to reach the Sales Team here:
==>https://aws.amazon.com/websites/contact-us/.
Once you have completed the details in the link, a member of the team
will be in touch with you at their soonest.

AWS RDS - creating a "replica" database daily automatically

Apologies if this isn't the right place to ask this question.
I'm looking to create an automated copy (backup) of my AWS RDS (MySQL) database daily and have this backup restored daily to another RDS instance and made available to another set of applications.
I already have daily backups running and I can create a new RDS instance from a backup, but I want this to happen automatically within AWS.
I've looked through the AWS documentation and can't find anything that fits this purpose, but maybe there's a service that I'm not aware of.
Amazon Aurora (MySQL- and PostgreSQL-compatible) supports Auto Scaling.
Auto Scaling dynamically adjusts the number of Replicas available for the cluster based on different metrics and policies. When the workload suddenly increases it adds more read replicas, and when the workload decreases it removes them again so you don't have to pay for them.
Aurora AutoScaling
Standard AWS RDS doesn't support this kind of replica autoscaling, but you can always scale horizontally and vertically manually.
Scaling Your Amazon RDS Instance Vertically and Horizontally
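Coming back to the daily restore the question asks about: one possible approach (an assumption on my part, not something the answer above proposes) is a scheduled job that picks the newest available snapshot and restores it to a second instance. The sketch below shows only the selection step, operating on entries shaped like the DBSnapshots list returned by the RDS describe_db_snapshots call; the function name and sample data are hypothetical.

```python
from datetime import datetime, timezone


def latest_available_snapshot(snapshots):
    """Return the identifier of the newest snapshot in 'available' status, or None."""
    available = [s for s in snapshots if s.get("Status") == "available"]
    if not available:
        return None
    newest = max(available, key=lambda s: s["SnapshotCreateTime"])
    return newest["DBSnapshotIdentifier"]


# Hypothetical sample data in the shape of a describe_db_snapshots response.
sample = [
    {"DBSnapshotIdentifier": "rds:mydb-2024-01-01", "Status": "available",
     "SnapshotCreateTime": datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)},
    {"DBSnapshotIdentifier": "rds:mydb-2024-01-02", "Status": "available",
     "SnapshotCreateTime": datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)},
    {"DBSnapshotIdentifier": "rds:mydb-2024-01-03", "Status": "creating",
     "SnapshotCreateTime": datetime(2024, 1, 3, 3, 0, tzinfo=timezone.utc)},
]
print(latest_available_snapshot(sample))  # rds:mydb-2024-01-02
```

The chosen identifier could then be passed to restore_db_instance_from_db_snapshot, for example from a Lambda function on a daily schedule. The target instance identifier must not already exist, so the previous day's copy has to be deleted or renamed first.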

AWS Auto-Scaling

I'm trying AWS auto-scaling for the first time. As far as I understand, it creates instances if, for example, my CPU utilization reaches a critical level that I define.
So I am curious: after I launch my instance I spend a fair amount of time configuring it and copying the data. If AWS auto-scales my instance, how will it configure the new instances and move the data to them?
You can't store any data that you want to keep on an instance that is part of an autoscaling group (well you can, but you will lose it).
There are (at least) two ways to answer your question:
Create a 'golden image', in other words spin-up an instance, configure it, install the software etc and then save it as an AMI (amazon machine image). Then tell the autoscaling group to use that AMI each time an instance starts - it will be pre-configured when it starts.
Put a script on the instance that tells the instance how to configure itself when it starts up (in the user data). So basically each time an instance scales up, it runs the script and does all the steps it needs to configure itself.
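For the user data approach, the startup script has to be base64-encoded when it is supplied through a launch template. The following is a minimal sketch; the function name and the script contents are hypothetical placeholders.

```python
import base64


def encode_user_data(script: str) -> str:
    """Base64-encode a startup script for use as launch template user data."""
    if not script.startswith("#!"):
        raise ValueError("user data scripts should start with a shebang line")
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


# Hypothetical bootstrap script; replace with your own configuration steps.
bootstrap = "#!/bin/bash\napt-get update -y\napt-get install -y nginx\n"
encoded = encode_user_data(bootstrap)
```

The encoded string would go into the UserData field of the launch template data, so every instance the Auto Scaling group starts runs the same script on first boot.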
As for your data, best practice would be to store any data you want to keep in a database or object store that is not on the instance, so something like RDS, DynamoDB or even S3 objects.
You could also use AWS EFS: store the data and scripts that the EC2 instances will share there, and mount it automatically via /etc/fstab every time a new EC2 instance is created.
Once you have configured the EFS to be mounted on the EC2 Instance (/etc/fstab), you should create a new AMI, and use this new AMI to create a new Launch Configuration and AutoScaling Group, so that the new Instances automatically mount your EFS and are able to consume that shared data.
https://aws.amazon.com/efs/faq/
Q. What use cases is Amazon EFS intended for?
Amazon EFS is designed to provide performance for a broad spectrum of
workloads and applications, including Big Data and analytics, media
processing workflows, content management, web serving, and home
directories.
Q. When should I use Amazon EFS vs. Amazon Simple Storage Service (S3)
vs. Amazon Elastic Block Store (EBS)?
Amazon Web Services (AWS) offers cloud storage services to support a
wide range of storage workloads.
Amazon EFS is a file storage service for use with Amazon EC2. Amazon
EFS provides a file system interface, file system access semantics
(such as strong consistency and file locking), and
concurrently-accessible storage for up to thousands of Amazon EC2
instances. Amazon EBS is a block level storage service for use with
Amazon EC2. Amazon EBS can deliver performance for workloads that
require the lowest-latency access to data from a single EC2 instance.
Amazon S3 is an object storage service. Amazon S3 makes data available
through an Internet API that can be accessed anywhere.
https://docs.aws.amazon.com/efs/latest/ug/mount-fs-auto-mount-onreboot.html
You can use the file fstab to automatically mount your Amazon EFS file
system whenever the Amazon EC2 instance it is mounted on reboots.
There are two ways to set up automatic mounting. You can update the
/etc/fstab file in your EC2 instance after you connect to the instance
for the first time, or you can configure automatic mounting of your
EFS file system when you create your EC2 instance.
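The /etc/fstab entry itself is a single line. The sketch below builds one in the form the EFS documentation shows for the EFS mount helper; it assumes the amazon-efs-utils package is installed on the instance, and the function name, file system ID, and mount point are placeholders.

```python
def efs_fstab_entry(file_system_id: str, mount_point: str) -> str:
    """Return an /etc/fstab line that mounts an EFS file system at boot."""
    if not file_system_id.startswith("fs-"):
        raise ValueError("EFS file system IDs start with 'fs-'")
    if not mount_point.startswith("/"):
        raise ValueError("mount_point must be an absolute path")
    # _netdev delays the mount until networking is up; tls encrypts data in transit.
    return f"{file_system_id}:/ {mount_point} efs _netdev,noresvport,tls 0 0"


print(efs_fstab_entry("fs-0123abcd", "/mnt/efs"))  # placeholder ID and path
```

Appending this line before creating the AMI (or from a user data script) means every instance launched by the Auto Scaling group mounts the shared file system on boot.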
I recommend using a shared data container if it is data that is updated and the updated data is needed by all instances that might be spinning up.
If it is database data or you could store the needed data in a database I would consider using an RDS.
If it is static data only used to configure the instances, like dumps or configuration files which are not updated by running instances, then I would recommend pulling them from CloudFlare or S3 if it is not possible to pull them from a repository.
Good luck

AWS EC2 instance snapshot in another region

I'm running an EC2 instance in one region. I want to create snapshots of it directly in another region, without copying and without cross-region replication in S3. Is this possible? If so, how?
Amazon EBS Snapshots are created in the same region as the original EBS Volume. They can then be used to create a new Volume within the same Region.
If you wish to use an Amazon EBS Snapshot in a different region, the snapshot must first be copied to the other Region. This can be done via the Amazon EC2 management console, the AWS Command-Line Interface (CLI) aws ec2 copy-snapshot command, or an AWS API call.
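As a sketch of the API route, the copy is requested from the destination Region and names the source Region and snapshot. The helper below only builds the request parameters; the function name, Region names, and snapshot ID are placeholders.

```python
def copy_snapshot_request(source_region: str, snapshot_id: str, description: str = "") -> dict:
    """Build the keyword arguments for an EC2 copy_snapshot call."""
    if not snapshot_id.startswith("snap-"):
        raise ValueError("snapshot IDs start with 'snap-'")
    return {
        "SourceRegion": source_region,
        "SourceSnapshotId": snapshot_id,
        "Description": description or f"Copy of {snapshot_id} from {source_region}",
    }


request = copy_snapshot_request("us-east-1", "snap-0123456789abcdef0")  # placeholders
```

With boto3, the call would be made on an EC2 client created for the destination Region, along the lines of boto3.client("ec2", region_name="eu-west-1").copy_snapshot(**request), which returns the ID of the new snapshot.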
Please note that snapshots are incremental backups. The first snapshot isn't really a full backup. Rather, every snapshot simply copies any blocks that have been modified since any previous snapshot. Blocks are retained while snapshots still require the blocks. This means that blocks made during the initial snapshot could actually be deleted if they are not required by any active snapshots. This is why I say they are not the same as a full backup, which traditionally never has content deleted.
However, the first time a snapshot is copied to a new region it is copied in full, rather than incrementally. (Later copies of snapshots of the same volume to the same destination can be incremental.)
If you do not wish to copy an EBS snapshot between regions, you would need to find a different way to transfer the disk volume (eg filesystem-level synchronisation).
In fact, there should typically be no need to transfer a disk volume -- rather, your systems should be capable of configuring a new server based upon a startup configuration script and data should be stored in a separate database so that it is accessible to multiple instances. It is a very rare case that requires a complete copy of a disk volume.

Backing up root device (mounted at /) of an AWS t2.micro instance running Ubuntu

I want to back up the root device (mounted at /) of my t2.micro instance running Ubuntu. I think the instance is EBS-backed as it is a t2 instance. So I was going to take snapshots of my root device to back it up.
However, it is recommended that I detach the root device before I back it up. There are two problems with this:
I have to use umount to unmount it first, which may cause my instance to crash. What is a safe way to handle this?
I want to run these backups as a cron job on the instance itself, but if my instance's root device is unmounted, will the cron job even run?
A more general question is: what is the best way to do this?
A possible solution might be: use AWS Lambda and execute a Lambda function based on a schedule executing the following commands by the use of the AWS SDK:
Stop EC2 instance
Create EBS snapshot
Start EC2 instance
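The three steps could be sketched like this. The function takes an EC2 client as a parameter, so it can be given a real boto3 client inside a Lambda handler; the RecordingEC2 class is a hypothetical stand-in for dry runs that only records the calls, and all IDs are placeholders.

```python
class RecordingEC2:
    """Dry-run stand-in for a boto3 EC2 client; it only records the calls made."""

    def __init__(self):
        self.calls = []

    def stop_instances(self, **kwargs):
        self.calls.append("stop_instances")
        return {}

    def start_instances(self, **kwargs):
        self.calls.append("start_instances")
        return {}

    def create_snapshot(self, **kwargs):
        self.calls.append("create_snapshot")
        return {"SnapshotId": "snap-dryrun"}

    def get_waiter(self, name):
        self.calls.append(f"wait:{name}")
        return self

    def wait(self, **kwargs):
        pass


def snapshot_with_downtime(ec2, instance_id, volume_id):
    """Stop the instance, snapshot its root volume, then start it again."""
    ec2.stop_instances(InstanceIds=[instance_id])
    try:
        # Block until the instance has fully stopped before snapshotting.
        ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
        snapshot = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"Root volume backup of {instance_id}",
        )
        return snapshot["SnapshotId"]
    finally:
        # Restart the instance even if the wait or the snapshot request failed.
        ec2.start_instances(InstanceIds=[instance_id])


client = RecordingEC2()
print(snapshot_with_downtime(client, "i-0123456789abcdef0", "vol-0123456789abcdef0"))
```

The instance is restarted as soon as the snapshot has been requested; the snapshot captures the volume as it was at that moment and completes in the background.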
First, I would confirm that your root device is in fact EBS backed.
Here are the basic steps to confirm:
To determine the root device type of an instance using the console
Open the Amazon EC2 console.
In the navigation pane, click Instances, and select the instance.
Check the value of Root device type in the Description tab as follows:
If the value is ebs, this is an Amazon EBS-backed instance.
If the value is instance store, this is an instance store-backed instance.
(Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/RootDeviceStorage.html#display-instance-root-device-type)
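The same check can be scripted. The sketch below reads the RootDeviceType field from a response shaped like the output of the EC2 describe_instances call; the function name and the sample response are made up for illustration.

```python
def root_device_types(describe_instances_response: dict) -> dict:
    """Map each instance ID in the response to its root device type."""
    result = {}
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            result[instance["InstanceId"]] = instance.get("RootDeviceType")
    return result


# Hypothetical, heavily trimmed describe_instances response.
sample_response = {
    "Reservations": [
        {"Instances": [{"InstanceId": "i-0123456789abcdef0", "RootDeviceType": "ebs"}]},
    ]
}
print(root_device_types(sample_response))  # {'i-0123456789abcdef0': 'ebs'}
```

A value of ebs means the root volume can be backed up with EBS snapshots; instance-store means it cannot.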
AWS states that a best practice is to use snapshots or a backup tool:
Regularly back up your instance using Amazon EBS snapshots or a backup
tool.
(Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-best-practices.html)
AWS states in the documentation that an instance should be stopped before taking a snapshot of its root EBS volume:
To create a snapshot for Amazon EBS volumes that serve as root
devices, you should stop the instance before taking the snapshot.
(Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-creating-snapshot.html)
So depending on your RPO (Recovery Point Objective), as a general rule it is a good practice to separate your data from your root volume. Store data that you need to keep on a separate EBS volume and take snapshots of that second EBS volume. This way you never have to worry about the instance itself: if it bonks out, just launch a new instance and attach a volume created from your snapshot.
If you have a special case that prevents you from using EBS snapshots, try using a role for your instance(s) that has permissions to read/write data to S3 buckets, and copy the data there from your cron job.