Will restoring a snapshot recreate the environment EXACTLY as it was at the point of the snapshot? I am specifically referring to the operating system and installed software.
If not, then I assume that a disk image is the correct approach.
Snapshot and Disk Image use the same process. Both take a point in time copy of a storage device by performing a block copy.
Will either restore exactly (bit-for-bit) as the source? Yes, if you shut down the VM instance. Maybe, if you do not. Google (and AWS, Azure, etc.) strongly recommend that you shut down your VM before these types of operations. The reason is that file system data could still be cached in memory and not yet flushed to disk. A consistent snapshot of a running VM requires that all applications and the OS participate in the snapshot process. Few applications do.
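To make the caching point concrete, here is a minimal toy model, not any cloud provider's actual mechanism. The `CachedDisk` class and its method names are made up for illustration; it only shows why a block-level copy taken while writes are still in memory can miss data that a copy taken after a clean shutdown would contain.

```python
class CachedDisk:
    """Toy model: writes land in a memory cache until flush() is called."""

    def __init__(self):
        self.disk = {}   # block number -> data actually on the storage device
        self.cache = {}  # block number -> data that exists only in memory

    def write(self, block, data):
        # The OS acknowledges the write, but it is not on the device yet.
        self.cache[block] = data

    def flush(self):
        # Similar in spirit to sync/fsync: push cached writes to the device.
        self.disk.update(self.cache)
        self.cache.clear()

    def snapshot(self):
        # A block-level snapshot only sees what is on the device.
        return dict(self.disk)


vm = CachedDisk()
vm.write(0, "config v1")
vm.flush()
vm.write(0, "config v2")       # still in memory, not on the device

live_snapshot = vm.snapshot()  # taken while running: misses "config v2"
vm.flush()                     # a clean shutdown flushes everything
clean_snapshot = vm.snapshot()

print(live_snapshot)   # {0: 'config v1'}
print(clean_snapshot)  # {0: 'config v2'}
```

The live snapshot is still a valid point-in-time copy of the device; it just is not a copy of what the running system believed it had written.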
I have several VMs that are configured and powered off for both operational and DR purposes.
How can I achieve the equivalent of an Azure VM + configuration + resources stored only on (blob) storage, and turn it on as needed (with a preferred set of names, or name variants)?
I'm familiar with Azure Automation, but I just want to identify the pieces, and any automation that is needed, to start/stop this "snapshot" of a base OS VM + lots of customization + software add-ins.
Well, it's kind of hard to figure out exactly what you mean, but your best bet is either to create VMs and configure them automatically, or to use a golden image. If you choose an ephemeral OS disk, you are not paying for storage, only for compute, but you can't shut the VM down (all the changes are lost). Also, ephemeral OS disks might not be supported for custom images.
Apart from that, just deallocating a VM means you stop paying for compute and pay only for the storage.
I would like to update Samba on a 3TB NAS. My boss suggested making a clone; however, we have no storage large enough to hold the whole thing. Would a snapshot of the VM take up less space and still let me restore Samba as it was in case of failure? If so, that seems like the better idea.
There's no real guide on how much space snapshots occupy. That depends greatly on the activity on the VM where the snapshot has been taken. If it's an active VM (a database or something similar), a considerable amount of data could be written. If the VM is not heavily used, little or no data may be written to the backend datastore.
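As a rough way to reason about this, a simplified copy-on-write model may help. It assumes the snapshot only has to hold one copy of each distinct block changed after it was taken. The 4096-byte block size and the function name are assumptions for illustration, and real hypervisors add their own metadata and granularity on top.

```python
BLOCK_SIZE = 4096  # bytes; an assumed block size for illustration only


def snapshot_growth_bytes(written_blocks, block_size=BLOCK_SIZE):
    """Estimate snapshot growth: one stored copy per distinct block changed."""
    return len(set(written_blocks)) * block_size


quiet_vm = [10, 10, 10]       # the same block rewritten three times
busy_vm = list(range(1000))   # 1000 different blocks touched once each

print(snapshot_growth_bytes(quiet_vm))  # 4096
print(snapshot_growth_bytes(busy_vm))   # 4096000
```

In this model the growth tracks how many different blocks change, not the size of the disk, which is why a mostly idle 3TB NAS could produce a small snapshot while a busy database could produce a large one.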
On one of my AWS instances running Ubuntu 16.04, I have a MySQL replica database on a 1TB ext4 EBS volume. I plan to increase it to 2TB. Before I increase the size of the volume and extend the filesystem using the resize2fs command, do I need to take any precautions? Is there any possibility of data corruption? If so, would it be sane to create an EBS snapshot of this volume?
Do I need to take any precautions?
You shouldn't need to take any unusual precautions -- just standard best practices, like maintaining backups and having a tested recovery plan. Anything can go wrong at any time, even when you're sitting, doing nothing.
Important
Before modifying a volume that contains valuable data, it is a best practice to create a snapshot of the volume in case you need to roll back your changes.
https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/ebs-modify-volume.html
But this is not indicative of the operation being especially risky. Anecdotally, I've never experienced complications, and have occasionally resized an EBS volume and then its filesystem under a live, master, production database.
Is there any possibility of data corruption?
The possibility of data corruption is always there, no matter what you are doing... but this seems to be a safe operation. The additional space becomes available immediately, and there is no I/O freeze or disruption.
If so, would it be sane to create an EBS snapshot of this volume?
As noted above, yes.
Concerns about errors creeping in later are valid, but EBS maintains internal consistency checks and will disable a volume if they fail. This helps avoid further scrambling of data, so that you can do a controlled recovery and repair operation.
This would not help if EBS is perfectly storing data that was corrupted by something on the instance, such as might be caused by a defect in resize2fs, but it seems to be a solid utility. It doesn't move your existing data -- it just fleshes out the filesystem structures as needed so that the filesystem can use the entire free space that has become available.
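The sequence discussed above (snapshot first, then grow the volume, then grow the filesystem) can be sketched as a dry-run plan. The volume ID and device name below are placeholders, not values from the question, and the sketch assumes the ext4 filesystem sits directly on the device with no partition table. It only prints the commands; waiting for the volume modification to take effect before running resize2fs is left out.

```python
import shlex


def resize_plan(volume_id, new_size_gib, device):
    """Return the commands for: snapshot, grow the volume, grow the filesystem."""
    return [
        # 1. Point-in-time backup to roll back to if anything goes wrong.
        ["aws", "ec2", "create-snapshot", "--volume-id", volume_id,
         "--description", "pre-resize backup"],
        # 2. Grow the EBS volume itself (size is given in GiB).
        ["aws", "ec2", "modify-volume", "--volume-id", volume_id,
         "--size", str(new_size_gib)],
        # 3. Grow the ext4 filesystem to fill the larger device.
        ["sudo", "resize2fs", device],
    ]


# Placeholder identifiers; substitute your own volume ID and device.
plan = resize_plan("vol-0123456789abcdef0", 2048, "/dev/xvdf")
for cmd in plan:
    print(shlex.join(cmd))
```

Keeping the plan as data rather than executing it makes it easy to review each step before running it by hand on the instance.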
A snapshot is a way to save a VM's state so that it can be reverted to the point in time when the snapshot was taken.
Are there any other ways of doing this? For example, create incremental copies of VM files and restore those copies as needed. Copies can contain only incremental data. Are there any such alternatives to snapshots? One other consideration for me is to use only VMware tools/technologies.
Thanks,
Vivek.
A snapshot is one of the best things you have for maintaining virtual machine state.
It locks the current disk and creates a new disk in which the incremental data is stored.
So when you revert to the snapshot, the same state is restored.
VCB (VMware Consolidated Backup) is another way to take backups; internally, it uses snapshots to take the backup.
So AFAIK, taking snapshots is the only available way to maintain the state of a VM.
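The lock-and-redirect behaviour described above can be illustrated with a small toy model. This is not VMware's on-disk format; the class and method names are invented for the example. It assumes a read-only base disk plus a delta that receives every later write, and a revert that simply discards the delta.

```python
class SnapshottedDisk:
    """Toy model of a locked base disk plus a delta disk for later writes."""

    def __init__(self, blocks):
        self.base = dict(blocks)  # locked at snapshot time, never modified
        self.delta = {}           # new disk that receives all later writes

    def write(self, block, data):
        self.delta[block] = data

    def read(self, block):
        # Newest data wins: check the delta first, then fall back to the base.
        return self.delta.get(block, self.base.get(block))

    def revert(self):
        # Discarding the delta restores the state at snapshot time.
        self.delta.clear()


disk = SnapshottedDisk({0: "app v1", 1: "data"})
disk.write(0, "app v2")
after_upgrade = disk.read(0)
disk.revert()
after_revert = disk.read(0)

print(after_upgrade, "->", after_revert)  # app v2 -> app v1
```

Because only changed blocks land in the delta, the snapshot already stores incremental data, which is the property the question asks about.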
I put our application on EC2 (Windows 2003 x64 server) and attached up to 7 EBS volumes. The app is very I/O intensive to storage -- typically we use DAS with NTFS mount points (usually around 32 mount points, each to a 1TB drive), so I tried to replicate that using EBS, but the I/O rates are bad, as in 22MB/s tops. We suspect the NIC to the EBS volumes (which are dynamic SANs, if I read correctly) is limiting the pipeline. Our app uses mostly streaming for disk access (not random), so for us it works better when very little gets in the way of our talking to the disk controllers and handling I/O directly.
Also, when I create a volume and attach it, I see it appear in the instance (fine), and then I make it into a dynamic disk pointing to my mount point, then quick format it. When I do this, does all the data on the volume get wiped? It certainly seems so when I attach it to another AMI. I must be missing something.
I'm curious if anyone has any experience putting I/O intensive apps up on the EC2 cloud and, if so, what's the best way to set up the volumes?
Thanks!
I've had limited experience, but I have noticed one small thing:
The initial write is generally slower than subsequent writes.
So if you're streaming a lot of data to disk, like writing logs, this will likely bite you. But if you make a big file, fill it with data, and then do a lot of random access I/O to it, it gets better the second time you write to any specific location.
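One way to check this on your own volume is to time two write passes over the same file. The sketch below is a rough probe under assumptions: the function name, chunk size, and chunk count are arbitrary, and it writes to a temporary directory, so you would need to point `target` at the volume under test. On a local disk the two passes will probably look similar.

```python
import os
import tempfile
import time


def time_two_passes(path, chunk=1024 * 1024, count=16):
    """Write `count` chunks to `path` twice; return (first_s, second_s)."""
    payload = b"\0" * chunk
    timings = []
    with open(path, "wb") as f:
        for _pass in range(2):
            f.seek(0)  # the second pass overwrites the same locations
            start = time.perf_counter()
            for _chunk in range(count):
                f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the data out to the device
            timings.append(time.perf_counter() - start)
    return tuple(timings)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "probe.bin")  # point this at the volume to test
    first, second = time_two_passes(target)
    size = os.path.getsize(target)

print(f"first pass: {first:.3f}s, second pass: {second:.3f}s")
```

If the second pass is consistently faster on the volume, pre-writing the file once before the real workload starts may be worth trying.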