Permanently binding a static IP to a preemptible Google Cloud VM - google-cloud-platform

For our project we need a static IP binding to our Google Cloud VM instance due to IP whitelisting.
Since it's a preemptible VM in a managed instance group, it will be terminated once in a while.
However, when it terminates I see compute.instances.preempted in the operations log, directly followed by compute.instances.repair.recreateInstance with the note:
Instance Group Manager 'xxx' initiated recreateInstance on instance
'xxx'.
Reason: instance's intent is RUNNING but instance's status is
STOPPING.
After that, a delete and an insert operation follow in order to restore the instance.
The documentation states:
You can simulate an instance preemption by stopping the instance.
In that case the IP address stays attached when the VM is started again.
A) So my question: is it possible to have the instance group manager stop and start the VM in the event of preemption, instead of recreating it? Recreating means that the static IP is detached and needs to be manually reattached each time.
B) If option A is not possible, how can I attach the static IP address automatically so that I don't have to attach it manually when the VM is recreated? I'd rather not have an extra NAT VM instance to take care of this problem.
Thanks in advance!

I figured out a workaround to this (specifically, keeping a static IP address assigned to a preemptible VM instance between recreations), with the caveat that your managed instance group has the following properties:
Not autoscaling.
Max group size of 1 (i.e. there is only ever meant to be one VM in this group)
Autohealing is default (i.e. only recreates VMs after they are terminated).
The steps you need to follow are:
Reserve a static IP.
Create an instance template, configured as preemptible.
Create your managed group, assigning your template to the group.
Wait for the group to spin up your VM.
After the VM has spun up, assign the static IP that you reserved in step 1 to the VM.
Create a new instance template derived from the VM instance via gcloud (see https://cloud.google.com/compute/docs/instance-templates/create-instance-templates#gcloud_1).
View the newly created instance template in the Console, and note that you see your External IP assigned to the template.
Update the MiG (Managed Instance Group) to use the new template, created in step 6.
Perform a proactive rolling update on the MiG using the Replace method.
Confirm that your VM was recreated with the same name, the disks were preserved (or not, depending on how you configured the disks in your original template), and the VM has maintained its IP address.
Regarding step 6, my gcloud command looked like this:
gcloud compute instance-templates create vm-template-with-static-ip \
--source-instance=source-vm-id \
--source-instance-zone=us-east4-c
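For reference, the other steps can be scripted with gcloud as well. A rough sketch, assuming placeholder names (my-static-ip, my-vm, my-mig), a placeholder reserved address of 203.0.113.10, the same zone/region as above, and that the VM's access config uses the default name "external-nat":
# Step 1: reserve the static IP
gcloud compute addresses create my-static-ip --region=us-east4
# Step 5: swap the VM's ephemeral external IP for the reserved one
gcloud compute instances delete-access-config my-vm --zone=us-east4-c --access-config-name="external-nat"
gcloud compute instances add-access-config my-vm --zone=us-east4-c --access-config-name="external-nat" --address=203.0.113.10
# Steps 8 and 9: point the MIG at the new template and do a Replace rolling update
gcloud compute instance-groups managed set-instance-template my-mig --template=vm-template-with-static-ip --zone=us-east4-c
gcloud compute instance-groups managed rolling-action replace my-mig --zone=us-east4-c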
It almost goes without saying that this sort of setup is only useful if you want to:
Minimize your costs by using a single preemptible VM.
Not have to deal with the hassle of turning on a VM again after it's been preempted, ensuring as much uptime as possible.
If you don't mind turning the VM back on manually (and possibly not being aware it's been shut down for who knows how long) after it has been preempted, then do yourself a favor, don't bother with the MiG, and just stand up the single VM.

Answering your questions:
(A) It is not possible at the moment, and I am not sure if it will ever be possible. By design, preemptible VMs are deleted either to make space for normal VMs (if there are capacity constraints in the given zone) or periodically, to differentiate them from normal VMs. In the latter case preemption might look like a stop/start event, but in the former it may take a substantial amount of time before the VM is recreated.
(B) At the moment there is no good way to achieve this in general.
If you have the special case where your group has only one instance, you can hardcode the IP address in the Instance Template.
Otherwise, the only solution I can think of at the moment (other than using a Load Balancer) is to write a startup script that attaches the NAT IP.
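As a rough sketch of such a startup script (assuming a reserved address of 203.0.113.10, that the VM's service account is allowed to modify access configs, and that the existing access config is named "external-nat"):
#!/bin/bash
NAME=$(curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/name)
ZONE=$(curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/zone | awk -F/ '{print $NF}')
# Drop the ephemeral external IP and attach the reserved static one instead.
gcloud compute instances delete-access-config "$NAME" --zone "$ZONE" --access-config-name "external-nat"
gcloud compute instances add-access-config "$NAME" --zone "$ZONE" --access-config-name "external-nat" --address 203.0.113.10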

I've found one way to ensure that all VMs in your network have the same outgoing IP address: using Cloud NAT you can assign a static IP which all VMs will use. There is a downside, though:
GCP forwards traffic using Cloud NAT only when there are no other matching routes or paths for the traffic. Cloud NAT is not used in the following cases, even if it is configured:
You configure an external IP on a VM's interface.
If you configure an external IP on a VM's interface, IP packets with the VM's internal IP as the source IP will use the VM's external IP to reach the Internet. NAT will not be performed on such packets. However, alias IP ranges assigned to the interface can still use NAT because they cannot use the external IP to reach the Internet. With this configuration, you can connect directly to a GKE VM via SSH, and yet have the GKE pods/containers use Cloud NAT to reach the Internet.
Note that making a VM accessible via a load balancer external IP does not prevent a VM from using NAT, as long as the VM network interface itself does not have an external IP address.
Removing the VM's external IP also prevents direct SSH access to the VM, even SSH access from the Google Cloud Console itself. The quote above shows an alternative with a load balancer; another option is a bastion host, but that doesn't directly solve access from, for example, Kubernetes/kubectl.
If that's no problem for you, this is the way to go.
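For completeness, a minimal Cloud NAT setup with a reserved address might look roughly like this (region and names are placeholders):
gcloud compute addresses create nat-ip --region=us-east4
gcloud compute routers create nat-router --network=default --region=us-east4
gcloud compute routers nats create nat-config --router=nat-router --region=us-east4 \
--nat-external-ip-pool=nat-ip --nat-all-subnet-ip-ranges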

One solution is to let the instances have dynamically chosen ephemeral IPs, but set the group as the target of a load balancer with a static IP. This way, even as instances are created and destroyed, the LB acts as a frontend that keeps the IP constant over time.
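A rough sketch of that setup with gcloud (an HTTP load balancer with a global static IP in front of the managed instance group; all names, the port, and the zone are placeholders):
gcloud compute addresses create lb-ip --global
gcloud compute health-checks create tcp basic-check --port=80
gcloud compute backend-services create web-backend --protocol=HTTP --health-checks=basic-check --global
gcloud compute backend-services add-backend web-backend --instance-group=my-mig --instance-group-zone=us-east4-c --global
gcloud compute url-maps create web-map --default-service=web-backend
gcloud compute target-http-proxies create http-proxy --url-map=web-map
gcloud compute forwarding-rules create http-rule --address=lb-ip --global --target-http-proxy=http-proxy --ports=80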

Related

Coordinating multiple VMs in a VPC

I'm using a CloudFormation stack that deploys 3 EC2 VMs. Each needs to be configured to be able to discover the other 2, either via IP or hostname, doesn't matter.
Amazon's private internal DNS seems very unhelpful, because it's based on the IP address, which can't be known at provisioning time. As a result, I can't configure the nodes with just what I know at CloudFormation stack time.
As far as I can tell, I have a couple of options. All of them seem to me more complex than necessary - are there other options?
Use Route53, set up a private DNS hosted zone, make an entry for each of the VMs which is attached to their network interface, and then by naming the entries, I should know ahead of time the private DNS I assign to them.
Stand up yet another service to have the 3 VMs "phone home" once initialized, which could then report back to them who is ready.
Come up with some other VM-based shell magic, and do something goofy like using nmap to scan the local subnet for machines alive on a certain port.
On other clouds I've used (like GCP) when you provision a VM it gets an internal DNS name based on its resource name in the deploy template, which makes this kind of problem extremely trivial. Boy I wish I had that.
What's the best approach here? (1) seems straightforward, but requires people using my stack to have extra permissions they don't really need. (2) is extra resource usage that's kinda wasted. (3) Seems...well goofy.
Use Route53, set up a private DNS hosted zone, make an entry for each of the VMs which is attached to their network interface, and then by naming the entries
This is the best solution, but there's a simpler implementation.
Give each of your machines a "resource name".
In the CloudFormation stack, create an AWS::Route53::RecordSet resource that associates a hostname based on that "resource name" with the EC2 instance via its logical ID.
Inside your application, use the resource-name-based hostname to access the other instance(s).
An alternative may be to use an Application Load Balancer, with your application instances in separate target groups. The various EC2 instances then send all traffic through the ALB, so you only have one reference that you need to propagate (and it can be stored in the UserData for the EC2 instance). But that's a lot more work.
This assumes that you already have the private hosted zone set up.
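A minimal sketch of such a record, assuming a hypothetical private hosted zone internal.example.com and an EC2 instance with the logical ID NodeA defined in the same template:
NodeADnsRecord:
  Type: AWS::Route53::RecordSet
  Properties:
    HostedZoneName: internal.example.com.   # existing private hosted zone (note the trailing dot)
    Name: node-a.internal.example.com.
    Type: A
    TTL: '60'
    ResourceRecords:
      - !GetAtt NodeA.PrivateIp             # private IP of the instance with logical ID NodeA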
I think what you are talking about is known as service discovery.
If you deploy the EC2 instances in the same subnet in the same VPC with the same security group that allows the port they want to communicate over, they will be "discoverable" to each other.
You can then take this a step further. If autoscaling is enabled on the group and machines die and respawn, they can write their IPs into a registry, e.g. DynamoDB, so that other machines know where to find them.
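As an illustration of that registry idea, each instance could publish its private IP on boot, e.g. to a hypothetical DynamoDB table service-registry keyed on service:
MY_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
aws dynamodb put-item --table-name service-registry \
--item '{"service": {"S": "worker-1"}, "ip": {"S": "'"$MY_IP"'"}}'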

Assigning static IPs to auto scaled EC2 instance

We have a 3rd party integration which needs the EC2 instance IP to be whitelisted. The 3rd party whitelists the IP on their server and then only the EC2 instance can communicate with them.
In the case of single instance this works.
However when auto scaling kicks in, we would end up in more than 1 instance. These new instances automatically get new IPs for every autoscale action.
Is it possible for us to ask AWS to assign IPs from a say a set of 4 predefined Elastic IPs? ( Assumption is that autoscaling is restricted to say 4 and we have 4 floating EIPs )
I'm trying to avoid a NAT gateway since there is a big cost associated with it.
Any ideas?
With autoscaling, it is not directly possible to assign an Elastic IP to the instances. However, there are a couple of options you can consider.
After an instance is launched by autoscaling, have a boot-up script (e.g. UserData on Linux) run AWS EC2 CLI commands to associate one of the Elastic IP addresses you have allocated to your account. Note that you need to handle the health checks accordingly for the transition to happen smoothly.
Have a CloudWatch alarm trigger a Lambda function that associates an Elastic IP address with the newly started instance. For this you can use the AWS SDK to find the instance without an EIP and associate an available EIP with it.
Auto Scaling will not automatically assign an Elastic IP address to an instance.
You could write some code to do this and include it as part of the User Data that is executed when an instance starts. It would:
Retrieve a list of Elastic IP addresses
Find one that is not currently associated with an EC2 instance
Associate it with itself (that is, with the EC2 instance that is running the User Data script)
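A rough sketch of such a User Data script, assuming the AWS CLI is available on the instance and the instance profile allows ec2:DescribeAddresses and ec2:AssociateAddress:
#!/bin/bash
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | sed 's/.$//')
# Pick an allocated Elastic IP that currently has no association.
ALLOC_ID=$(aws ec2 describe-addresses --region "$REGION" \
--query 'Addresses[?AssociationId==`null`].AllocationId | [0]' --output text)
aws ec2 associate-address --region "$REGION" --instance-id "$INSTANCE_ID" --allocation-id "$ALLOC_ID"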
Use a NAT instance. There's only a small cost associated with a t2.nano and you should find that more than adequate for the purpose.
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_NAT_Instance.html
While not as reliable as a NAT Gateway (you're paying for hands-off reliability and virtually infinite scalability), it's unlikely you'll have trouble with a NAT instance unless the underlying hardware fails, and you can help mitigate this by configuring Instance Recovery:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-recover.html

Launching an AWS EC2 with a specified private address and/or hostname?

We do continuous integration from Jenkins, and have Jenkins deploy to an EC2 instance. This EC2 instance exports an NFS share of the deployed code to EC2 processing nodes. The processing nodes mount the NFS share.
Jenkins needs to be able to "find" this code-sharing EC2 instance and scp freshly-built code, and the processing nodes need to "find" this code-sharing EC2 instance and mount its NFS share.
These communications happen over private IP space, with our on-premise Jenkins communicating with our EC2 in a Direct Connect VPC subnet, not using public IP addresses.
Is there a straightforward way to reliably "address" (by static private IP address, hostname, or some other method) this code-sharing EC2 that receives scp'd builds and exports them via NFS? We determine the subnet at launch, of course, but we don't know how to protect against changes of IP address if the instance is terminated and relaunched.
We're also eagerly considering other methods for deployment, such as the new EFS or S3, but those will have to wait a little bit until we have the bandwidth for them.
Thanks!
-Greg
If it is a single instance at any given time that acts as this "code-sharing" instance, you can assign an Elastic IP to it after you've launched it. This will give you a fixed public IP that you can target.
Elastic IPs are reserved and static until you release them. Keep in mind that they cost money while they are allocated but not associated with a running instance.
Further on you can use SecurityGroups to limit access to the instance.
In the end, we created and saved a network interface, assigning one of our private IPs to it. When recycling the EC2 instance and making a new one in the same role, we just assign that saved interface (with its IP) to the new instance. Seems to get the job done for us!
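For reference, that approach maps roughly to the following AWS CLI calls (subnet, security group, and resource IDs are placeholders):
# One-time: create the interface with a fixed private IP and keep it around.
aws ec2 create-network-interface --subnet-id subnet-0123456789abcdef0 --private-ip-address 10.0.0.10 \
--groups sg-0123456789abcdef0 --description "code-sharing NFS endpoint"
# After relaunching the EC2 instance, attach the saved interface to it.
aws ec2 attach-network-interface --network-interface-id eni-0123456789abcdef0 \
--instance-id i-0123456789abcdef0 --device-index 1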

Is it possible to re-associate an elastic ip with an ec2 instance after reboot if the elastic ip is associated to another running ec2 instance?

We have a setup where 3 ec2 instances each are associated with an elastic ip on its primary network interface eth0 so incoming requests can be served by these instances.
Each of these instances has a secondary network interface eth1 where in the event of a failure/ crash/ reboot of an instance, the elastic ip associated with that instance would be associated to one of the remaining running ec2 instances on that interface. This is some sort of failover mechanism as we always want those elastic ips to be served by some running instance so we don't lose any incoming requests.
The problem I have experienced is specifically on reboot of an instance. When an instance reboots, it cannot get back the public IP it had, because that public IP is the elastic IP which is now associated with another instance. Thus this instance cannot access the internet unless I manually re-assign the elastic IP back to it.
Is it possible to automatically reclaim/re-associate the elastic ip it once had onto its eth1 interface on reboot? If not, do you have suggestions for a workaround?
Reboot is necessary as we would be doing unattended upgrades on the instances.
Update:
Also note that I need to use these elastic IPs as they are the ones allowed in the firewall of a partner company we integrate with. Using ELBs won't work as their IPs change over time.
So here's how I finally solved this problem. What I missed was that Amazon only provides a new public IP to an instance under two conditions:
Its elastic IP is detached
It has just one network interface
So based on this, on startup I configure the instance with two network interfaces but detach the secondary eth1 interface. This makes the instance eligible for getting a new public IP (if it reboots for any reason).
Now for failover: once one of the running instances detects that an instance has gone offline from the cluster (in this case, let's say it rebooted), it attaches the secondary interface on the fly and associates the elastic IP with it. Hence, the elastic IP is now being served by at least one of the running instances. The effect is instant.
When the failed instance comes back up after the reboot, Amazon has already provided it with a new non-elastic public IP. This is because it fulfilled the two conditions: it has just one network interface, and its elastic IP was disassociated and re-associated with another running instance. Hence, this rebooted instance now has a new public IP, can connect to the internet on startup, and can do the necessary tasks to configure itself and re-join the cluster. After that it re-associates the elastic IP it needs to have.
Also, when the running instance that took over the elastic IP detects that a new instance or the rebooted instance has come online, it detaches the secondary interface again so that it too would be eligible to get a new public IP if it rebooted.
This is how I handle the failover and make sure the elastic IPs are always served. However this solution is not perfect and can be improved. It can scale to handling N failed/rebooted instances, provided N network interfaces can be used for failover!
However, if an instance that attached secondary interface(s) during failover reboots, it will not get a new public IP and will remain disconnected from the cluster, but at least the elastic IPs would still be served by the remaining live instances. This is only in the case of reboots.
BTW, at least from all that I read, these conditions for getting a new public IP weren't clearly mentioned in the Amazon docs.
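A sketch of the failover step described above (resource IDs are placeholders; the surviving instance needs ec2:AttachNetworkInterface and ec2:AssociateAddress permissions):
# On a surviving instance, after detecting that a peer went offline:
aws ec2 attach-network-interface --network-interface-id eni-0123456789abcdef0 \
--instance-id "$(curl -s http://169.254.169.254/latest/meta-data/instance-id)" --device-index 1
# Point the failed peer's Elastic IP at the newly attached secondary interface.
aws ec2 associate-address --allocation-id eipalloc-0123456789abcdef0 \
--network-interface-id eni-0123456789abcdef0 --allow-reassociation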
It sounds like you would be better served by using an elastic load balancer (ELB). You could just use one ELB and it would serve requests to your 3 application servers.
If one goes down, the ELB detects that and stops routing requests there. When it comes back online, the ELB detects that and adds it to the routing group again.
http://aws.amazon.com/elasticloadbalancing/

Connect to an EC2 instance running MySQL from another EC2 instance

It takes several minutes for a version newly deployed to Elastic Beanstalk to become available, so I am hoping that someone can spare me all the testing/experimenting :-)
Scenario 1:
I need to connect to an EC2 instance running MySQL from another EC2 instance but belonging to a different security group. Do I use the public DNS or the private IP to specify the MySQL host?
Scenario 2:
Same as above except both instances belong to the same security group. I believe that I need to use the private IP in this case, correct? Would the public DNS also work?
Thank you!
You should always use the private IP when possible for EC2 instances communicating with each other.
Among other reasons, you will get charged for traffic over the public IP even when the machines are in the same availability zone.
Also, a security group is just a set of inbound and outbound rules; it doesn't matter that the two machines are in different groups with different rules, so long as your MySQL server can accept traffic on the port from the other EC2 instance based on the ruleset.
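For example, allowing MySQL traffic from one security group to the other would look something like this (hypothetical group IDs):
# Allow instances in the client group to reach MySQL (3306) on instances in the DB group.
aws ec2 authorize-security-group-ingress --group-id sg-0db0db0db0db0db0d \
--protocol tcp --port 3306 --source-group sg-0c11c11c11c11c11c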
If you're going to be starting and stopping instances frequently, you might benefit from creating an elastic IP and attaching it to instances as needed instead of constantly changing configuration files.