How to restore an instance from scheduled snapshots in GCP

I have a scheduled daily snapshot in GCP for one of my instances. I have several snapshots now. The first one is the full snapshot and the rest of the snapshots only contain changed data.
I want to be able to restore and boot the instance but it fails to boot. Checking the serial console I see reference to a blue screen and then it reboots and shows the same errors again, repeating the reboot cycle.
I have followed the guide in GCP on how to restore an instance from a snapshot by creating a new instance, selecting the snapshot tab and then selecting my snapshot. After saving the instance and trying to boot it I get the blue screen message.
Also, if I create a new instance from a Windows 2008 R2 Datacenter image, the system obviously boots fine, but if I try to attach the snapshot disk as a secondary (non-boot) disk I get the error: Editing VM instance failed. Error: Supplied fingerprint does not match current metadata fingerprint. I'm not sure whether this is related to my inability to boot the OS from my snapshot.
I did find a workaround:
1) create an image of the running instance (my instance is Windows 2008 R2 Datacenter)
2) enable scheduled snapshots of this new instance (with VSS)
3) wait for a scheduled snapshot to get created (hourly so must wait 1 hr)
4) create new instance from the scheduled snapshot
After all this work the instance boots just fine with all my data. Obviously this is not a very good workaround, as I now have two instances with the same data. I have to schedule the production system for maintenance so that I can bring it down and switch to the new instance, so that future scheduled snapshots work if I need to restore again. A major pain in the butt.
Anyone have any ideas as to why none of my instances boot from scheduled snapshots without my workaround? Manual snapshots work fine. And new instances also work fine with the same snapshot schedule.

I had this exact same problem. I tried multiple scheduled snapshots with the same result UNTIL I made a change in the VM Instance when attaching the restored snapshot. Maybe it's just Windows but if you named the disk something different then it seems to fail.
My original disk was called disk-1, for example. When I restored the snapshot I did it to disk-1-a and attached it to my instance. It failed the same way yours did. When I attached it and, under "Device name" for the boot disk, selected Custom and entered my original disk name of disk-1, it booted and RDP worked.
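The fix described above can probably be scripted with the gcloud CLI instead of the console. The following Python sketch only builds and prints the two commands; it does not run them. The snapshot, instance, and zone names are hypothetical placeholders, and it assumes that the `device-name` key of the `--disk` flag corresponds to the console's "Device name" field.

```python
import shlex


def restore_with_original_device_name(snapshot, new_disk, device_name, instance, zone):
    """Build gcloud commands that restore a boot disk and reuse the old device name."""
    # Step 1: create a new disk (e.g. disk-1-a) from the scheduled snapshot.
    create_disk = [
        "gcloud", "compute", "disks", "create", new_disk,
        f"--source-snapshot={snapshot}",
        f"--zone={zone}",
    ]
    # Step 2: boot a new instance from that disk, but present it to the guest
    # under the ORIGINAL device name (e.g. disk-1), as the answer describes.
    create_instance = [
        "gcloud", "compute", "instances", "create", instance,
        f"--zone={zone}",
        f"--disk=name={new_disk},boot=yes,device-name={device_name}",
    ]
    return [create_disk, create_instance]


# Hypothetical names, for illustration only.
for command in restore_with_original_device_name(
    snapshot="my-scheduled-snapshot",
    new_disk="disk-1-a",
    device_name="disk-1",
    instance="restored-vm",
    zone="us-central1-a",
):
    print(shlex.join(command))
```

Review the printed commands before running them; the only part that reflects the answer is the `device-name=disk-1` setting.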

Related

Which point in time is reflected in the files of an EC2 AMI taken while rebooting?

Suppose you take an AMI from an EC2 instance, the AMI takes, say, 1 hour to become available, and you choose not to skip the reboot.
All the files in the AMI will:
a) reflect their exact condition from the time the EC2 was rebooted? or
b) they may reflect any condition in this 1 hour interval which is what it took for the AMI to be available.
I always assumed option a, but I'm not so sure any more, especially after I noticed that when you take an AMI in the console, it gives this message:
"Currently creating AMI ..... Check that the AMI status is 'Available' before deleting the instance or carrying out other actions related to this AMI."
I want to know if it's safe to start applying changes in an EC2 instance after an AMI is requested and the EC2 rebooted, but before the AMI is available.
An Amazon Machine Image (AMI) will contain a copy of the disk as it was at exactly the point in time when the API call was issued.
Or, if the instance is rebooted as part of the image creation, it will contain a copy of the disk as it was between the time when the operating system shut down and the time when it started again.
The time taken for an AMI to become available involves copying disk blocks to the Snapshot used by the AMI. Any disk changes during that time will not be reflected in the AMI. This is possible because the disk is virtual. (It's a bit like a database being able to roll-back due to the use of log files.)
From Create Amazon EBS snapshots - Amazon Elastic Compute Cloud:
Snapshots occur asynchronously; the point-in-time snapshot is created immediately, but the status of the snapshot is pending until the snapshot is complete (when all of the modified blocks have been transferred to Amazon S3), which can take several hours for large initial snapshots or subsequent snapshots where many blocks have changed. While it is completing, an in-progress snapshot is not affected by ongoing reads and writes to the volume... snapshots only capture data that has been written to your Amazon EBS volume at the time the snapshot command is issued.
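To make the point-in-time behaviour concrete, here is a toy Python model of a snapshot that stays "pending" while blocks are copied in the background. It is not how EBS is implemented internally; the `Volume` and `Snapshot` classes and the preserve-on-write approach are assumptions made purely for illustration. It only demonstrates the property quoted above: writes made while the snapshot is in progress do not end up in it.

```python
class Snapshot:
    """Toy snapshot: fills up block by block, frozen at the moment it was started."""

    def __init__(self, block_count):
        self.block_count = block_count
        self.copied = {}  # block index -> contents as of snapshot start

    @property
    def status(self):
        return "completed" if len(self.copied) == self.block_count else "pending"

    def contents(self):
        if self.status != "completed":
            raise RuntimeError("snapshot is still pending")
        return [self.copied[i] for i in range(self.block_count)]


class Volume:
    """Toy volume that keeps pending snapshots consistent while being written to."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def start_snapshot(self):
        snap = Snapshot(len(self.blocks))
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting, save the old block for any snapshot that has not
        # copied it yet, so the snapshot still sees the pre-write contents.
        for snap in self.snapshots:
            if index not in snap.copied:
                snap.copied[index] = self.blocks[index]
        self.blocks[index] = data

    def copy_one_block(self, snap):
        # Background transfer: copy the next block that is not yet in the snapshot.
        for index in range(snap.block_count):
            if index not in snap.copied:
                snap.copied[index] = self.blocks[index]
                return True
        return False


volume = Volume(["boot", "config-v1", "data-v1"])
snap = volume.start_snapshot()       # the point in time is fixed here
volume.copy_one_block(snap)          # transfer begins
volume.write(1, "config-v2")         # change made while the snapshot is pending
while volume.copy_one_block(snap):   # transfer finishes later
    pass
print(snap.status, snap.contents())
print(volume.blocks)
```

In this model the snapshot ends up holding `config-v1` while the live volume holds `config-v2`, which matches answer a) in the question.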

Google Cloud snapshot's boot issue

Hope all are safe and doing well.
I have a few servers running on Google Cloud, and incremental snapshots are scheduled for them on a daily basis.
I am trying to create a new instance in a different VPC zone from the same snapshots, but it gives me an error.
For reference, I have added an attachment to this question.
Please help me to resolve this issue and thanks in advance.
Assuming that you created the snapshot with application consistency (VSS) enabled:
When you create a VSS snapshot, Windows Server marks the volume in the snapshot as read-only. Any disks that you create from the VSS snapshot are also in read-only mode. So, the read-only flag on the new boot disk prevents the VM instance from booting correctly.
You can follow the documentation to resolve your issue:
If the disk you created from the VSS snapshot is a boot disk and you want to use it to boot a VM instance, you must temporarily attach the disk to a separate, existing VM instance. Once you complete the following steps, you can detach the disk from that existing VM instance and use it to boot a new VM instance.
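The attach and detach steps around that repair can be sketched with the gcloud CLI. The Python snippet below only builds and prints the commands; the helper instance, disk, and zone names are hypothetical, and the in-guest steps that clear the read-only flag are deliberately not shown because they come from the documentation referenced above.

```python
import shlex


def vss_disk_repair_commands(disk, helper_instance, zone):
    """Build the gcloud commands that bracket the in-guest repair of a VSS disk."""
    # Attach the restored disk as a secondary disk of an existing, working VM.
    attach = [
        "gcloud", "compute", "instances", "attach-disk", helper_instance,
        f"--disk={disk}",
        f"--zone={zone}",
    ]
    # ... clear the read-only flag inside the helper VM, following the
    # documented steps (not reproduced here) ...
    # Detach the disk again so it can be used to boot a new VM.
    detach = [
        "gcloud", "compute", "instances", "detach-disk", helper_instance,
        f"--disk={disk}",
        f"--zone={zone}",
    ]
    return attach, detach


# Hypothetical names, for illustration only.
attach_cmd, detach_cmd = vss_disk_repair_commands(
    disk="restored-boot-disk",
    helper_instance="helper-vm",
    zone="us-central1-a",
)
print(shlex.join(attach_cmd))
print(shlex.join(detach_cmd))
```

The disk and the helper VM must be in the same zone for the attach step to work.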

The zone 'projects/xxxx/zones/us-west3-a' does not have enough resources available to fulfill the request. Try a different zone, or try again later

I have a suspended instance that I want to bring up, but cannot because of the error in the title. I'd expect that there'd be an obvious way to choose a different zone in which to resume the instance, but I see nothing. As long as us-west3-a is overbooked, how can I resume execution of this instance elsewhere?
I'm not running a major service - this one instance is the entire operation, and given what I'm running (an ancient game server) load balancing or multi-region availability is out of the question. I just need to be able to run this instance somewhere when the need strikes.
To resume your instance in another zone, you first need to create a snapshot and then create a new instance from that snapshot. There is no way to directly transfer an instance to another zone. Below is the step-by-step procedure:
How to create snapshot
Go to the Compute Engine page > then select Snapshot.
Click Create Snapshot.
Select the disk of your instance.
Please check all your settings
Once you're done, please click the "Create" button.
How to create instance from a snapshot with new zone
Go to Compute Engine > Snapshots
Select the snapshot you need
Click Create Instance
Provide a name for your new instance
Select the new Region or Zone
Select any other options needed, e.g. machine type or GPU
Edit other settings like network and disk if needed
Click Create
Once the instance is created and started, it will be in the same state as it was at the time you created the snapshot.
For more information and troubleshooting about the Stockout error, you can check the GCP official documentation.
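The console procedure above has a rough command-line equivalent. The Python sketch below builds (but does not run) three gcloud commands: snapshot the existing disk, create a disk from that snapshot in the new zone, and boot a new instance from it. All resource names and zones are hypothetical placeholders, and machine type, network, and other settings are left at their defaults.

```python
import shlex


def relocate_commands(disk, source_zone, target_zone, snapshot, new_disk, new_instance):
    """Build gcloud commands that recreate an instance's boot disk in another zone."""
    if source_zone == target_zone:
        raise ValueError("target zone must differ from the zone that is out of resources")
    return [
        # 1. Snapshot the disk in the zone where the instance currently lives.
        ["gcloud", "compute", "disks", "snapshot", disk,
         f"--snapshot-names={snapshot}", f"--zone={source_zone}"],
        # 2. Snapshots are global resources, so a disk can be created from
        #    the snapshot in a different zone.
        ["gcloud", "compute", "disks", "create", new_disk,
         f"--source-snapshot={snapshot}", f"--zone={target_zone}"],
        # 3. Boot a new instance from the new disk in the target zone.
        ["gcloud", "compute", "instances", "create", new_instance,
         f"--zone={target_zone}", f"--disk=name={new_disk},boot=yes"],
    ]


# Hypothetical names, for illustration only.
for command in relocate_commands(
    disk="game-server-disk",
    source_zone="us-west3-a",
    target_zone="us-west3-b",
    snapshot="game-server-snap",
    new_disk="game-server-disk-b",
    new_instance="game-server-b",
):
    print(shlex.join(command))
```

The guard against identical zones is there only because retrying in the same overbooked zone would hit the same error.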

Accidentally deleted GCP instance connected to AI notebook

I accidentally deleted my ai notebook vm and I hadn't downloaded the notebooks connected to it. I still have the url. Does anybody know if there's a way for me to recover my work?
According to the documentation, there is a life cycle for the instances. Verify the state of your AI Notebook VM to make sure that it is deleted or just turned off.
Unfortunately, if an AI Notebook instance is deleted and no snapshot was configured, there is no way to restore that instance or to recover the notebooks stored there. There are three ways to prevent this from happening in the future:
Create snapshots, manually or on a schedule, to periodically back up data from your zonal persistent disks (snapshots can be located in multiple zones) or regional persistent disks (you must indicate the region where the disk is located).
Edit the VM instance and enable the deletion protection checkbox, which is disabled by default. This prevents your Notebook instance from being deleted by accident.
In the VM instance, go to the boot disk and, in the drop-down list under “When deleting instance”, select “Keep Disk” (or use the gcloud set-disk-auto-delete command to disable auto-delete).
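The last two safeguards can also be applied from the command line. The Python sketch below builds and prints the two gcloud commands without running them. The instance, disk, and zone names are hypothetical, and it assumes you are acting on the Compute Engine VM that backs the notebook.

```python
import shlex


def protection_commands(instance, boot_disk, zone):
    """Build gcloud commands that guard a VM and its boot disk against deletion."""
    # Refuse instance deletion until the flag is cleared again.
    enable_deletion_protection = [
        "gcloud", "compute", "instances", "update", instance,
        "--deletion-protection",
        f"--zone={zone}",
    ]
    # Keep the boot disk even if the instance itself is deleted.
    keep_boot_disk = [
        "gcloud", "compute", "instances", "set-disk-auto-delete", instance,
        f"--disk={boot_disk}",
        "--no-auto-delete",
        f"--zone={zone}",
    ]
    return [enable_deletion_protection, keep_boot_disk]


# Hypothetical names, for illustration only.
for command in protection_commands("my-notebook-vm", "my-notebook-vm", "us-central1-a"):
    print(shlex.join(command))
```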

cloning an amazon machine instance

I have two Amazon machine instances running. Both of them are m3.xlarge instances. One of them has the right software and configuration that I want to use. I want to create a snapshot of the EBS volume for that machine and use it as the EBS volume to boot the second machine from. Can I do that and expect it to work without shutting down the first machine?
It is well described in the AWS documentation...
"You can take a snapshot of an attached volume that is in use. However, snapshots only capture data that has been written to your Amazon EBS volume at the time the snapshot command is issued. This might exclude any data that has been cached by any applications or the operating system. If you can pause any file writes to the volume long enough to take a snapshot, your snapshot should be complete. However, if you can't pause all file writes to the volume, you should unmount the volume from within the instance, issue the snapshot command, and then remount the volume to ensure a consistent and complete snapshot."
I use Amazon as well, with 3 different clusters. In one of my clusters, after setting up 25 servers, I realized there was a small issue in the configuration, and they had live traffic going to them, so I couldn't shut them down.
You can snapshot the first machine's volume while it's still running; I had to do this myself. It took a little while, but ultimately it worked out. Please note that Amazon cannot guarantee the consistency of the disk when doing this.
I took a snapshot of the entire thing, fixed what needed to be fixed, spooled up 25 new servers, and terminated the other 25 (easier than modifying volumes, etc.). But you can create a new volume from the new snapshot, attach it to an instance, and do what needs to be done to get it to boot off that volume without much of a headache.
Since I went the easy route of spooling up new instances after my snapshot was complete, I can't walk you through how to get a running instance to boot off a new volume.
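For reference, the snapshot, new volume, and attach steps mentioned above map onto three AWS CLI calls. The Python sketch below builds and prints them without running anything. All IDs, the availability zone, and the device name are hypothetical placeholders, and making the second instance actually boot from the attached volume is not covered.

```python
import shlex


def snapshot_command(volume_id, description):
    """Snapshot a volume; this can be issued while the volume is in use."""
    return ["aws", "ec2", "create-snapshot",
            "--volume-id", volume_id,
            "--description", description]


def volume_from_snapshot_command(snapshot_id, availability_zone):
    """Create a new volume from a snapshot in a given availability zone."""
    # The volume must be in the same availability zone as the instance
    # it will be attached to.
    return ["aws", "ec2", "create-volume",
            "--snapshot-id", snapshot_id,
            "--availability-zone", availability_zone]


def attach_command(volume_id, instance_id, device):
    """Attach an existing volume to an instance under the given device name."""
    return ["aws", "ec2", "attach-volume",
            "--volume-id", volume_id,
            "--instance-id", instance_id,
            "--device", device]


# Hypothetical IDs, for illustration only. The snapshot and volume IDs come
# from the output of the preceding commands.
print(shlex.join(snapshot_command("vol-0123456789abcdef0", "clone of configured machine")))
print(shlex.join(volume_from_snapshot_command("snap-0123456789abcdef0", "us-east-1a")))
print(shlex.join(attach_command("vol-0fedcba9876543210", "i-0123456789abcdef0", "/dev/sdf")))
```

As the quoted documentation warns, a snapshot taken while the first machine is running may miss data that is still cached, so pause writes first if you can.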