I am using AWS to launch an EC2 instance. Fortunately I did it without any problems.
What I need now is to make a backup of the data.
I think a snapshot is a good way to do it. I have been doing some research and found a good tool to do it automatically (https://github.com/colinbjohnson/aws-missing-tools/tree/master/ec2-automate-backup).
The problem is that I don't think snapshots alone are enough. In my opinion, a copy of the last snapshot needs to be kept in another region, but I don't know how to do that automatically. I have been searching on the internet and only found this:
http://docs.aws.amazon.com/cli/latest/reference/ec2/copy-snapshot.html. The problem is that I don't know the snapshot ID (since it is generated automatically by the first tool I mentioned).
The question is: do you know of any tool that can help me with this problem? If not, do you know of another approach that would solve it?
It is important to know that the service we provide doesn't need to be up 24 hours a day.
This is my first time using servers, so I don't know how long an AWS region can be down.
You do not need to know a volume ID to use copy-snapshot in the AWS CLI. When executing the command you provide a value to the --source-snapshot-id option. This specifies the ID of the snapshot you want to copy. A snapshot can be copied in the same region or to another region via the --destination-region option.
You can simply call create-snapshot and then copy-snapshot giving it the generated snapshot ID to copy the snapshot to another region. This could be automated via a cron job if necessary.
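For illustration, a minimal sketch of that two-step flow with the AWS CLI (the volume ID, regions, and descriptions are placeholders, not values from the question):

# Take a snapshot of the source volume and capture the new snapshot ID.
SNAP_ID=$(aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
  --description "nightly backup" --region eu-west-1 \
  --query SnapshotId --output text)

# Wait until the snapshot completes before copying it.
aws ec2 wait snapshot-completed --snapshot-ids "$SNAP_ID" --region eu-west-1

# Run copy-snapshot against the destination region, pointing back at the source.
aws ec2 copy-snapshot --region us-east-1 --source-region eu-west-1 \
  --source-snapshot-id "$SNAP_ID" --description "cross-region copy of $SNAP_ID"

Dropping these three commands into a script called from cron would give you the automated cross-region copy.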
You can set up a cron job that invokes the AWS CLI to copy the snapshots to an S3 bucket 'A'. Cross-region bucket replication can then be enabled from source bucket 'A' in region 1 to destination bucket 'B' in region 2. Whenever a snapshot is uploaded to 'A', it will be replicated to 'B' as well, so if the first region goes offline you can restore volumes from the snapshots in bucket 'B' in region 2.
I configured a set of Data Lifecycle Manager policies to back up my EC2 instances last week, but I cannot find any relevant snapshots in the EBS Snapshots section. Can someone please advise whether I should look for the snapshots somewhere else, whether I should review any specific parameters of the policy, or whether I should use another method altogether?
Thank you.
[Screenshot: Schedule details]
Lifecycle Manager will create the backup as a regular EBS snapshot. EBS snapshots are stored in S3, but you do not have access to the snapshot other than through the console/API.
Based on your configuration it will only apply to instances with the tag key Name and the value Graylog v3.3.2. This will happen once a week, at 12:30 PM UTC on Mondays.
If the snapshots are not being generated, check the following:
Do the target instances have this name and value assigned to them?
Does the execution IAM role have permissions to perform this action? If it has the default permissions then it will be fine to run.
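If you want to check from the command line as well, here are a couple of hedged examples (the tag value is taken from the policy description above; everything else is generic):

# List your Data Lifecycle Manager policies and confirm they are ENABLED.
aws dlm get-lifecycle-policies

# Look for snapshots owned by your account that carry the policy's target tag.
aws ec2 describe-snapshots --owner-ids self \
  --filters "Name=tag:Name,Values=Graylog v3.3.2"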
Thank you for the answer, and apologies for this issue; I'm kind of new to AWS and I managed to solve it.
The issue was simply the filter applied to the Snapshots tab: I wasn't seeing the full name. Bottom line, the policy was working fine after all.
We have an EC2 instance running in AWS. Our ML algorithms and data are on that instance, and we have also hosted a web-based interface on the same machine.
Now there are no new developments happening on that EC2 instance. We would like to terminate our AWS subscription for a short period of time (for cost reduction and to explore new cloud services). Most importantly, we want to be in a position where we can purchase a new EC2 instance with a fresh AWS subscription, use the backup we take now, and resume all operations (web backend, SMS services for our app hosted in AWS, etc.).
What is the best way to do it? Is temporary termination of AWS subscription advisable?
There is no concept of an "AWS Subscription". AWS is charged on-demand, which means you only pay when you use resources.
If you temporarily do not want the Amazon EC2 instance, you could:
Stop the instance, which is like turning off the power. You will not be charged for the instance, but you will still pay for the disk storage attached to the instance. You can simply Start the instance again when you wish to use it. You will only be charged while the instance is running. OR
Create an image of the instance, then terminate the instance. This will create an Amazon Machine Image (AMI), which contains a copy of the disks. You can then launch a new Amazon EC2 instance from the AMI when you wish to use it again. This is a lower-cost option compared to simply stopping the instance, but it takes more effort to stop/start.
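A minimal AWS CLI sketch of both options (the instance ID and image name are placeholders):

# Option 1: stop now, start again later.
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 start-instances --instance-ids i-0123456789abcdef0

# Option 2: capture an AMI, wait for it to become available, then terminate the instance.
AMI_ID=$(aws ec2 create-image --instance-id i-0123456789abcdef0 \
  --name "ml-server-backup" --query ImageId --output text)
aws ec2 wait image-available --image-ids "$AMI_ID"
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0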
It is quite common for companies to stop Amazon EC2 instances at night or over the weekend to reduce costs while they are not needed.
EDIT: I just thought of a third option. Will test it and be back. Not worth it: it would involve creating an image from the EC2 instance, converting that image to a VM image, and storing the VM image in S3. There may be some advantages to this, but I do not see them.
I think you have two options, both of them very reasonably priced. If you can separate the data from the operating system, then your best option would be to use an S3 bucket as a file system within the EC2 instance. Your EC2 instance would use this bucket to store all your "ML algorithms and data" and, possibly, even your "web-based interface". Whenever you decide that you no longer need the processing capacity of the EC2 instance, you would unmount the S3 bucket file system from the instance and terminate it. After configuring an appropriate lifecycle rule for the S3 bucket, it would transition to Glacier, or even Glacier Deep Archive (you must consider the different options for long-term storage).

In the future, whenever you want to work with your data again, you would move your data from Glacier back to S3, create a new EC2 instance, install your applications, mount your S3 bucket as a file system, and you would have access to all your data. I think this is your least expensive option and the one with the shortest recovery time. To implement this option, look at my answer to this question; everything you need to use an S3 bucket as a regular folder inside the EC2 instance is there.
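The linked answer has the details; as a rough sketch of the idea, a FUSE mount with the third-party s3fs-fuse tool (the tool choice, bucket name, and mount point here are my assumptions, not something stated above) looks roughly like this:

# Install s3fs-fuse (from EPEL on Amazon Linux, or the Debian/Ubuntu repositories).
sudo yum install -y s3fs-fuse        # or: sudo apt-get install -y s3fs

# Mount the bucket as a regular folder, using the instance's IAM role for credentials.
sudo mkdir -p /mnt/ml-data
sudo s3fs my-ml-data-bucket /mnt/ml-data -o iam_role=auto -o allow_other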
The second option provides an integrated solution, meaning the operating system and the data stay together, and allows you to restore everything as it was the day you stopped processing your data. It's made up of the following cycle:
Shut down your EC2 instance and make a note of all the specs (you will need them further down).
Export your instance to a virtual image, vmdk for example, and store it in your S3 bucket. Something like this:
aws ec2 create-instance-export-task --instance-id i-0d54b0682aa3998a0 \
  --target-environment vmware \
  --export-to-s3-task DiskImageFormat=VMDK,ContainerFormat=ova,S3Bucket=sm-vm-backup,S3Prefix=vms
Configure an appropriate lifecycle rule for the S3 bucket so that it transitions to Glacier, or even Glacier Deep Archive.
Terminate the EC2 instance.
In the future you will need to do the inverse, so you will first need to restore the archived S3 object (make sure you can live with the time AWS needs to do this).
Import the virtual image as an EC2 AMI, something like this (this is not complete; you will need some more options from the specs you saved above):
aws ec2 import-image \
  --disk-containers "Format=ova,UserBucket={S3Bucket=sm-vm-backup,S3Key=vmsexport-i-0a1c382e740f8b0ee.ova}"
Create an EC2 instance based on the image and you're back in business.
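A rough sketch of that restore path with the AWS CLI (the bucket and key come from the export example above; the AMI ID and instance type are placeholders, and the Glacier restore plus the image import can each take hours):

# Bring the archived object back from Glacier before importing it.
aws s3api restore-object --bucket sm-vm-backup \
  --key vmsexport-i-0a1c382e740f8b0ee.ova \
  --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}'

# Once import-image has produced an AMI, launch a new instance from it.
aws ec2 run-instances --image-id ami-0123456789abcdef0 \
  --instance-type t3.large --count 1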
Obviously you should do some trial runs, and even automate the entire process if it's something that will be done frequently. I have a feeling, based on what you said, that the first option is the better one, provided you can easily install whatever applications you use.
I'm assuming that you launched an EC2 instance from a base Amazon Machine Image and then added your own software and models to it, as opposed to launching an EC2 instance from an AWS Marketplace offering.
The simplest thing to do is to create an Amazon Machine Image (AMI) from your running EC2 instance. That will capture the current state of the instance and persist it in your AWS account. Then you can terminate the instance. Later, when you want to recreate it, launch a new instance, selecting the saved AMI instead of a standard AMI.
An alternative is to avoid the need to capture machine state at all, by using standard DevOps practices to revision-control everything you need to recreate the state of a running machine.
Note that there are costs associated with an AMI, though they are minimal ($0.05 per GB-month of data stored, for example).
I had contacted AWS customer care regarding this issue. Given below is the response I received. Please add your comments on which option might be good for me.
Note: I acknowledge the AWS customer care team for their help.
I understand that you require some information on cost saving for your Instance since you will not be utilizing the service for a while.

To assist you with this I would recommend checking out the Instance Stop/Start link here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Stop_Start.html. When you stop an Instance, you do not lose any data & you are not charged for the resources any further. However please keep in mind that you will still be charged for any EBS Storage Volumes attached to the stopped Instance(s).

I also recommend checking out the below links on how you can reduce your costs.
https://aws.amazon.com/premiumsupport/knowledge-center/reduce-aws-bill/
https://aws.amazon.com/blogs/compute/10-things-you-can-do-today-to-reduce-aws-costs/

That being said, please note that as I am in the billing department, for the best assistance with the various plans you will require the assistance of our Sales Team. The Sales Team will be able to assist with ways to save while maintaining your configurations. You will be able to reach the Sales Team here: https://aws.amazon.com/websites/contact-us/. Once you have completed the details in the link, a member of the team will be in touch with you at their soonest.
Currently I am taking a manual backup of our EC2 instance by zipping the data and downloading it locally as well as to Dropbox.
But I am wondering: can I have an option where a complete copy of the whole system is taken automatically every day, so that if something goes wrong or crashes, I can replace it with the previous copy immediately rather than spending hours installing and configuring things?
I can see there is an option to take an "Image", but can I automate it so that only the single latest image is kept, and replace the system with a single click?
You can create a single image of your instance as a backup of your instance configuration.
And to keep a backup of your data you can use snapshots of your volumes.
Snapshots store data incrementally whenever you make changes.
Whenever needed, you can create a volume from the snapshot and attach it to your instance.
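As a rough sketch of that restore path with the AWS CLI (the snapshot ID, Availability Zone, instance ID, and device name are placeholders):

# Create a new volume from the backup snapshot, in the same AZ as the instance.
VOL_ID=$(aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 \
  --availability-zone us-east-1a --query VolumeId --output text)

# Attach it to the instance once it is available.
aws ec2 wait volume-available --volume-ids "$VOL_ID"
aws ec2 attach-volume --volume-id "$VOL_ID" \
  --instance-id i-0123456789abcdef0 --device /dev/sdf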
It is not a good idea to do an "external backup" of an EC2 instance snapshot before you have read the AWS pricing details.
First, AWS charges for every GB of data you transfer out of the AWS cloud. Check out this pricing: generally speaking, after the first GB, the rest will be charged at least $0.09/GB, against S3 Standard pricing of roughly $0.023/GB.
Second, the snapshot created is actually charged at S3 pricing (check:
Copying an Amazon EBS Snapshot), not EBS pricing. After offsetting the transfer cost, perhaps you should consider creating multiple snapshots rather than keeping up the transfer-out backup.
HOWEVER, if you happen to use an instance that uses ephemeral storage, a snapshot will not help. You need to copy the data out of the ephemeral storage yourself; then it is your choice whether to store it in S3 or somewhere else.
Third, if you worry about an AWS region going down, check the multi-AZ option, or check out the option of an alternate AWS region.
Fourth, when storing backup data in S3, you can always store it under Infrequent Access, which saves you some money, and you won't face an insane Glacier bill during an emergency restore (avoid Glacier unless you are pretty sure about your own requirements).
Fifth, after you have settled on a plan of doing everything inside AWS, you can write a bash script (AWS CLI) or use boto3 or a similar API to do the automatic backup.
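For example, a minimal cron-friendly sketch with the AWS CLI that snapshots every volume carrying a hypothetical backup=true tag (the tag key and value are assumptions):

#!/bin/bash
# Snapshot every volume tagged backup=true in the current region.
for vol in $(aws ec2 describe-volumes \
    --filters "Name=tag:backup,Values=true" \
    --query 'Volumes[].VolumeId' --output text); do
  aws ec2 create-snapshot --volume-id "$vol" \
    --description "automated backup of $vol on $(date +%F)"
done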
Lastly, here is the way AWS creates and maintains snapshots. Though each snapshot is deemed "incremental", when you delete an old snapshot:
the snapshot deletion process is designed so that you need to retain
only the most recent snapshot in order to restore the volume.
You can always "test" a restore by creating another EC2 instance that loads the backup snapshot, or you can mount a volume created from the snapshot on another EC2 instance to check the contents.
I would like to set up a batch process as follows on Amazon AWS:
take snapshot of volumes tagged "must_backup"
share those snapshots with account B
make a copy of those snapshots within account B
The purpose of this is to protect the backups in case the first AWS account gets compromised.
I know how to automate steps 1 and 3; however, I cannot find a command-line example of how to perform step 2.
The official documentation (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-modifying-snapshot-permissions.html) does not provide any sample and does not clearly state how to specify the target account on the command line.
I've double-checked the previous solution and it's not OK. Basically, "sharing" a snapshot means allowing other accounts to create a volume from that snapshot.
This implies adding a value to the "createVolumePermission" attribute:
aws ec2 modify-snapshot-attribute --snapshot-id snap-<id> --user-ids <user-id-without-hyphens> --attribute createVolumePermission --operation-type add
The operation might take some time (minutes?). After that, you'll be able to query the attribute this way:
aws ec2 describe-snapshot-attribute --snapshot-id snap-<id> --attribute createVolumePermission
PS: for the purposes mentioned in the question this is probably not enough, since the 'destination' account will not be able to see any of the tags from the source account; thus it will be impossible to perform a correct backup if the source account shares multiple snapshots with the same size.
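For step 3, run from the second account, a sketch could look like this (the profile name and region are placeholders):

# From account B: list snapshots that other accounts have shared with you.
aws ec2 describe-snapshots --restorable-by-user-ids self --profile accountB

# Copy the shared snapshot so account B owns its own, independent copy.
aws ec2 copy-snapshot --profile accountB --region eu-west-1 \
  --source-region eu-west-1 --source-snapshot-id snap-<id> \
  --description "copy of shared snapshot"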
Example command for the AWS CLI to share an EC2 snapshot with another account (so it can then be copied there):
aws ec2 modify-snapshot-attribute --snapshot-id snap-1234567890 --attribute createVolumePermission --operation-type add --user-ids other-amazon-account-id
How can I find out which region an EBS snapshot is in using the aws cli tool?
Well... you can't, and you don't actually need to find out the Region, since you already know it.
Confused? Let me explain...
EBS Snapshots exist in a Region. But the only way to obtain information about a Snapshot is to connect to a Region and make an API call to describe the Snapshot (or use the Command Line Interface to make the API call for you).
API calls are made to an Endpoint, which is a URL that 'points' to a Region.
So... To describe a Snapshot, you first connect to the Region, then ask for details about the Snapshot. It won't tell you the Region in which the Snapshot is located, but you already know the Region since you had to select a Region when making the API call.
So... you can't, but you already know it!
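In CLI terms the Region is an input, not an output (the snapshot ID and Region here are placeholders):

# You have to name the Region just to ask about the snapshot at all,
# so "which region is it in?" is answered by whatever you pass here.
aws ec2 describe-snapshots --snapshot-ids snap-0123456789abcdef0 --region us-east-1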