Changes required to create AMI from OS disk EBS volume manually - amazon-web-services

I have a VMware VM whose raw OS disk is backed up to AWS S3. I can create an AMI from the raw OS disk using import-image, but I cannot use import-image every time because it is extremely slow. I am building an application that backs up a VM to the AWS cloud: the first backup is a FULL backup, which takes longer, and each subsequent INCREMENTAL backup should take much less time (depending on the amount of data changed). I create an AMI during every backup, whether FULL or INCREMENTAL.
So it is acceptable and explainable that a FULL backup takes time, but an INCREMENTAL backup should take less.
The problem is that when I create an AMI from RAW data during an incremental backup, AWS is not aware that an AMI (and its corresponding EBS snapshot) already exists from the FULL backup. Ideally it would compare that existing snapshot with the latest data and build the AMI from the changed data only, which would take less time.
So, I have the following options:
1) import-snapshot API = converts the raw OS disk to an EBS snapshot.
2) copy OS disk data = create an EBS volume and attach it to a running EC2 instance. Then copy all the raw OS disk data to the volume and create a snapshot from the EBS volume. From the EBS snapshot, we can create an AMI.
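Option 1 could be sketched roughly as follows with boto3. This is only an illustration: the function name, bucket, key and polling interval are hypothetical, the `ec2` argument is assumed to be a `boto3.client("ec2")`, and the status strings checked in the loop are what I recall from the EC2 API reference and should be verified.

```python
import time


def import_raw_disk_as_snapshot(ec2, bucket, key, description,
                                poll_seconds=30, sleep=time.sleep):
    """Start an import-snapshot task for a RAW disk in S3 and wait for its snapshot ID.

    `ec2` is assumed to be a boto3 EC2 client; it is passed in so the
    function can also be exercised with a stub.
    """
    task = ec2.import_snapshot(
        Description=description,
        DiskContainer={
            "Format": "RAW",
            "UserBucket": {"S3Bucket": bucket, "S3Key": key},
        },
    )
    task_id = task["ImportTaskId"]

    while True:
        tasks = ec2.describe_import_snapshot_tasks(ImportTaskIds=[task_id])
        detail = tasks["ImportSnapshotTasks"][0]["SnapshotTaskDetail"]
        status = detail.get("Status")
        # Assumption: "completed" marks success, "deleting"/"deleted" a failed task.
        if status == "completed":
            return detail["SnapshotId"]
        if status in ("deleting", "deleted"):
            raise RuntimeError(
                f"import task {task_id} failed: {detail.get('StatusMessage')}"
            )
        sleep(poll_seconds)
```

Usage would look like `import_raw_disk_as_snapshot(boto3.client("ec2"), "my-backup-bucket", "vm/os-disk.raw", "OS disk import")`, where the bucket and key are placeholders.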
I have tried both options, but every time I try to launch an EC2 instance from the AMI, I get the error below:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,0)
After going through various forums, I learned that the above error occurs if there is a mismatch in the AKI and ARI when creating an AMI from a snapshot. The correct AKI and ARI are fetched from the source EC2 instance from which the snapshot was created (as AWS expects).
In my case, I created the snapshot not from a running EC2 instance but from a VMware VM OS disk.
I figured out that import-image API also creates snapshots while creating AMI. So, I compared the snapshot created by import-image and the snapshot created by me using option-1 and option-2.
I compared the list of files in /boot/ and their md5sum. I found out the snapshot created by AWS import-image API has "initramfs-3.10.0-327.36.3.el7.x86_64.img.vmimport" file and has modified many files in /boot/grub2 directory.
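The /boot comparison described above can be reproduced with a short script. This is a minimal sketch, assuming both snapshots have been restored to volumes and mounted on a helper instance; the function names and mount points are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_by_relative_path(root):
    """Map every regular file under `root` to its MD5 digest, keyed by relative path."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            md5 = hashlib.md5()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large initramfs images do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    md5.update(chunk)
            digests[path.relative_to(root).as_posix()] = md5.hexdigest()
    return digests


def compare_trees(reference_root, candidate_root):
    """Report files that exist on only one side or whose contents differ."""
    ref = md5_by_relative_path(reference_root)
    cand = md5_by_relative_path(candidate_root)
    return {
        "only_in_reference": sorted(set(ref) - set(cand)),
        "only_in_candidate": sorted(set(cand) - set(ref)),
        "changed": sorted(p for p in set(ref) & set(cand) if ref[p] != cand[p]),
    }
```

For example, `compare_trees("/mnt/import-image/boot", "/mnt/manual/boot")` would list the extra initramfs file under `only_in_reference` and the modified grub2 files under `changed`, given those (placeholder) mount points.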
As per AWS documentation, https://docs.aws.amazon.com/vm-import/latest/userguide/vm-import-ug.pdf,
AWS modifies the filesystem:
- installs Citrix PV drivers either directly in the OS or modifies initrd/initramfs to contain them,
- modifies /etc/fstab,
- modifies grub bootloader settings such as the default entry and timeout.
So, do I need to make the above changes in my EBS volume as well? How can I make these changes (code, script, tool, etc.)?
Please suggest a better option if you have one.
I explored Packer but found that it needs source_ami to create an AMI, so it is not applicable to me, as I am not creating the AMI from a source AMI. Please correct me if I am wrong.

Related

Which point in time is reflected in the files of an EC2 AMI taken while rebooting?

Suppose you take an AMI from an EC2 instance, the AMI takes, say, 1 hour to become available, and you choose not to skip the reboot.
Will all the files in the AMI:
a) reflect their exact condition at the time the EC2 instance was rebooted? or
b) reflect any condition during the 1-hour interval it took for the AMI to become available?
I always assumed option a, but I'm not so sure any more, especially after I noticed that when you take an AMI in the console, it gives this message:
"Currently creating AMI ..... Check that the AMI status is 'Available' before deleting the instance or carrying out other actions related to this AMI."
I want to know if it's safe to start applying changes in an EC2 instance after an AMI is requested and the EC2 rebooted, but before the AMI is available.
An Amazon Machine Image (AMI) will contain a copy of the disk as it was at exactly the point in time when the API call was issued.
Or, if the instance is rebooted as part of the image creation, it will contain a copy of the disk as it was between the time when the operating system shut down and when the operating system started again.
The time taken for an AMI to become available involves copying disk blocks to the Snapshot used by the AMI. Any disk changes during that time will not be reflected in the AMI. This is possible because the disk is virtual. (It's a bit like a database being able to roll-back due to the use of log files.)
From Create Amazon EBS snapshots - Amazon Elastic Compute Cloud:
Snapshots occur asynchronously; the point-in-time snapshot is created immediately, but the status of the snapshot is pending until the snapshot is complete (when all of the modified blocks have been transferred to Amazon S3), which can take several hours for large initial snapshots or subsequent snapshots where many blocks have changed. While it is completing, an in-progress snapshot is not affected by ongoing reads and writes to the volume... snapshots only capture data that has been written to your Amazon EBS volume at the time the snapshot command is issued.
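In script form, the behaviour described above means the image request can be issued and the instance modified afterwards, while deletion of the instance waits until the AMI is available. This is a minimal sketch, assuming `ec2` is a `boto3.client("ec2")`; the function name is hypothetical.

```python
def create_ami_and_wait(ec2, instance_id, name, no_reboot=False):
    """Request an AMI (rebooting the instance by default) and block until it is available.

    `ec2` is assumed to be a boto3 EC2 client. The point-in-time copy is
    fixed when the call is issued; the waiter only tracks the background
    copy of blocks into the snapshot.
    """
    response = ec2.create_image(
        InstanceId=instance_id,
        Name=name,
        NoReboot=no_reboot,
    )
    image_id = response["ImageId"]
    # Wait before deleting the instance, as the console message advises.
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])
    return image_id
```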

Automate AWS AMI creation without downtime and Data loss

I want to know whether it is possible to automate the creation of an AMI in AWS without downtime or data loss and, if so, how to achieve it.
I have used Systems Manager -> Maintenance Windows, in which I have set reboot to true for data integrity, but I need a way so that the data is not lost.
Any help will be appreciated.
Thank you.
Answering based on the discussion in the comments; the question is still somewhat vague to me.
You have EBS right now. I'm not sure whether your instances are in the same AZ. If they are, you can use the EBS Multi-Attach feature (available for Provisioned IOPS io1/io2 volumes only) to share the same storage with all of them.
For backups, you can use EBS snapshots.
Ideally, my suggestion would be to create a launch template and use EFS, which can be mounted on multiple instances in the same region; if you want it across regions, create mount targets. EFS is natively integrated with AWS Backup.
Whenever a failover happens or an EC2 instance crashes for any reason and the group drops below your target capacity, Auto Scaling automatically provisions a new instance from the launch template, which uses the same EFS.
but i need a way so that the data is not lost.
If you want to achieve this, then according to the docs you need to ensure that the application or OS is not writing to the EBS volume, which can be managed by a script or custom logic.
You can take a snapshot of an attached volume that is in use. However, snapshots only capture data that has been written to your Amazon EBS volume at the time the snapshot command is issued. This might exclude any data that has been cached by any applications or the operating system. If you can pause any file writes to the volume long enough to take a snapshot, your snapshot should be complete. However, if you can't pause all file writes to the volume, you should unmount the volume from within the instance, issue the snapshot command, and then remount the volume to ensure a consistent and complete snapshot.
Once you have achieved the above, you can automate the creation, retention, and deletion of EBS snapshots and EBS-backed AMIs using Data Lifecycle Manager.
I haven't tried this, but I think exporting the VM to S3 and then automating the entire pipeline with EC2 Image Builder should do the trick; you can customise your further images with build components.
Refer to: importing and exporting VMs
Unfortunately, there is no out-of-the-box solution other than compromising data integrity, but you can try the approaches mentioned above, which can ensure data integrity and automation.
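The "pause file writes, issue the snapshot command, resume" advice quoted from the docs could be scripted along these lines. This is a sketch under assumptions: the volume is a non-root data volume mounted at `mount_point`, the util-linux `fsfreeze` tool is installed, the script runs as root on the instance, and `ec2` is a `boto3.client("ec2")`. The function name is hypothetical.

```python
import subprocess


def snapshot_with_frozen_fs(ec2, volume_id, mount_point, description,
                            run=subprocess.run):
    """Freeze a mounted filesystem, issue the snapshot command, then unfreeze.

    The snapshot is point-in-time as soon as the command is issued, so the
    filesystem only stays frozen for the duration of the API call.
    Assumption: `mount_point` is a data volume, not the root filesystem.
    """
    run(["fsfreeze", "--freeze", mount_point], check=True)
    try:
        response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    finally:
        # Always unfreeze, even if the API call fails, so writes can resume.
        run(["fsfreeze", "--unfreeze", mount_point], check=True)
    return response["SnapshotId"]
```

Freezing flushes the filesystem's own caches, but applications that buffer data themselves (databases, for example) may still need their own flush or quiesce step first.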

Restore managed OS disk of GCP instance using snapshot to existing VM

I am trying to restore the OS disk of a GCP instance to an existing VM using a snapshot taken of the OS disk. Is there any API provided by Google Cloud Platform to do this? I can't find anything to help me restore a snapshot to an existing disk. Is this even possible?
You can restore a boot disk from a snapshot as described here; however, it will be given a different Instance ID. VM Instance IDs are unique to each one you create.

What is the difference between Amazon AMI and EBS snapshot

My basic need is to be able to make a new instance from a saved image of my currently running CentOS, with all its settings.
I am thinking of two options:
Create the AMI from any state
Create snapshots of the EBS volumes
I am confused about the difference between them. Are they the same or different?
Can I make new instances from EBS snapshots?
Also, can I use an AMI on my localhost to create the same OS?
There are two types of AMIs/instances: EBS boot and instance-store (sometimes referenced as S3-based). You are probably using EBS boot, so this answer will relate to that type.
An EBS boot AMI is an EBS snapshot of a boot EBS volume with some extra attributes including:
Registered as an AMI with an AMI id
AKI (kernel)
ARI (ramdisk)
architecture (e.g., 64-bit)
block device mappings (e.g., where volumes should be created/attached)
description, name
permissions (who is allowed to run the AMI)
If you create an AMI of the running instance, you should be able to start new instances in the same state. Make sure you test this process so that you know it works.
If you simply snapshot the EBS volume(s) of your running instance, you will be able to create volumes from those snapshots to access the configuration and data.
It is also possible to take an EBS snapshot of an EBS boot volume and register it as an EBS boot AMI so that you can run more instances starting with that state. When registering the AMI, you'll need to specify the correct AKI, architecture, and other meta-data in order for this to work, so research and practice before you trust this approach.
It took me a while to understand this as I am new to it, but here is the thing if you are using EBS-backed instances:
If you create an AMI image right away (which creates an image of the OS and stores the data as an EBS snapshot), the AMI contains the current state of your instance: the installed OS with all config and data files.
If you only take an EBS snapshot, then to restore you need to launch a new AMI and attach a volume created from the snapshot, just to access the data. If your new AMI has a different or upgraded OS, some of your config may not work and you may need to install your packages from scratch. So you should check this first.
In simple words, an EBS snapshot cannot be used as a root volume unless you make and own an AMI from it :-)
In brief, EBS boot AMI = EBS root volume snapshot + metadata
For better understanding, you can try it hands-on:
Create an EBS snapshot for a particular running instance.
Find this snapshot.
Fill in some metadata and build the image (AMI).
You did it. A brand new AMI has been created.
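The "snapshot + metadata" steps above could be sketched with boto3's `register_image`. The defaults here (x86_64, HVM, `/dev/sda1` as the root device) are assumptions for illustration, not values from the answers; the AKI/ARI are passed only when supplied, since as far as I know they apply to paravirtual AMIs. Function names are hypothetical and `ec2` is assumed to be a `boto3.client("ec2")`.

```python
def build_register_image_params(name, snapshot_id, architecture="x86_64",
                                root_device="/dev/sda1",
                                virtualization_type="hvm",
                                kernel_id=None, ramdisk_id=None):
    """Assemble the metadata that turns a root-volume snapshot into an AMI."""
    params = {
        "Name": name,
        "Architecture": architecture,
        "RootDeviceName": root_device,
        "VirtualizationType": virtualization_type,
        "BlockDeviceMappings": [
            {
                "DeviceName": root_device,
                "Ebs": {"SnapshotId": snapshot_id, "DeleteOnTermination": True},
            }
        ],
    }
    # AKI / ARI are optional here; supply them only if the image needs them.
    if kernel_id:
        params["KernelId"] = kernel_id
    if ramdisk_id:
        params["RamdiskId"] = ramdisk_id
    return params


def register_ami_from_snapshot(ec2, name, snapshot_id, **options):
    """Register the snapshot as an EBS-backed AMI and return the new AMI ID."""
    params = build_register_image_params(name, snapshot_id, **options)
    return ec2.register_image(**params)["ImageId"]
```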

Amazon EC2 EBS backup: AMI vs Snapshot

I am trying to create a backup mechanism for our server, so that if my system crashes, I should be able to create the whole system by running a single script
After going through Amazon documentation, this is my understanding of creating a backup and restoring
Backup
Create a AMI Image (this can be updated monthly)
Create a snapshot (This can be done using a daily script creating a snapshot)
Restore (A script to)
Create an EBS instance using AMI
Attach the EBS volume to Instance created
Now my Questions are
Is it the best way to take a backup and restore?
Do we actually need to backup 2 things, AMI and EBS volume (using snapshot), Can we just keep snapshots?
I understand this cannot work for a local instance store instance, as there is no snapshot functionality. So how can I create a backup and restore process for local instance store instances?
As I could not find any better alternative, I am sticking with the initial approach.
For EBS
Backup:
Create an AMI image (this can be updated monthly).
Create a snapshot (this can be done by a daily script).
Restore (a script to):
Create an EBS-backed instance using the AMI.
Attach the EBS volume to the instance created.
For instance store, I am only keeping the application (no database), so no need to keep a backup of that.
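The restore half of this approach could be scripted roughly as follows. This is a sketch only: the instance type and device name are placeholder assumptions, the function name is hypothetical, and `ec2` is assumed to be a `boto3.client("ec2")`.

```python
def restore_from_backup(ec2, ami_id, snapshot_id,
                        instance_type="t3.micro", device="/dev/sdf"):
    """Launch an instance from the AMI and attach a volume built from the latest snapshot."""
    instance = ec2.run_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
    )["Instances"][0]
    instance_id = instance["InstanceId"]
    # The volume must live in the same Availability Zone as the instance.
    zone = instance["Placement"]["AvailabilityZone"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    volume_id = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=zone,
    )["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    ec2.attach_volume(Device=device, InstanceId=instance_id, VolumeId=volume_id)
    return instance_id, volume_id
```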
EBS Snapshots are an excellent way to create backups.
You can perform frequent Snapshots of your EBS Volumes via scripts. Weekly, Daily, Hourly, or as frequently as your Credit Card will allow. The only limit is around how many simultaneous snapshots you can be doing - when you hit that, the EBS API will start giving back errors until a few of the in-flight operations complete.
Snapshots can also be copied from Region to Region in order to provide backup against a catastrophic event.
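A cross-region copy could look like the sketch below. The function name and the regions in the usage note are placeholders. One detail to get right: `copy_snapshot` is called on a client created for the destination region, with the source region passed as a parameter.

```python
def copy_snapshot_to_region(destination_ec2, source_region, snapshot_id,
                            description=""):
    """Copy a snapshot into the region that `destination_ec2` points at.

    `destination_ec2` is assumed to be a boto3 EC2 client created with
    region_name set to the destination (backup) region.
    """
    response = destination_ec2.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description=description,
    )
    # The copy receives a new snapshot ID in the destination region.
    return response["SnapshotId"]
```

For example, with `dr = boto3.client("ec2", region_name="eu-west-1")`, calling `copy_snapshot_to_region(dr, "us-east-1", "snap-0123...")` would start a copy into eu-west-1 (the snapshot ID is left elided on purpose).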
When you snapshot an EBS volume, that snapshot is of the entire volume. Even if it was created from an AMI, your snapshot contains everything you need to create a new instance of the volume. You can pretty easily try this yourself.
If your instances are Linux based, there is no need to create an AMI if you're taking snapshots. You can create the AMI on the fly, from the snapshots, when you need to recover. If you got that process automated, it's pretty easy to do.
In Windows there is a limitation that does not allow launching an EC2 instance from a snapshot, so AMIs must be used. There are ways to work around that limitation; you can check out this post I wrote on our company's blog:
http://www.n2ws.com/blog/3-ways-ec2-windows-backup-and-recovery.html
I would suggest using Auto Scaling in addition to EBS snapshots. If an instance is dying because of hardware failure or is scheduled for retirement by Amazon, Auto Scaling will start a new instance automatically.
But in this case, you have to set up NAS for your dynamic data. Depending on server load, the number of running instances will vary, and all your scaling servers must mount the NAS storage that is shared across them.
Your Database should be on separate server or servers as well. Or you might want to use Amazon RDS as it has great auto-backup / Point-In-Time-Restore features, but you have to pay extra for that.
1) Yes. Snapshots are the best way to back up and restore EBS volumes.
2) It depends. If your root volume is an EBS-backed AMI, you can snapshot it as well, which improves manageability.
3) Rsync and AMI are the options available for instance store.