Google Cloud VM Autoscaling in Terraform - Updating Images - google-cloud-platform

I can create a fully-functioning autoscaling group in GCP Compute via Terraform, but it's not clear to me how to update the ASG to use a new image.
Here's an example of a working ASG:
resource "google_compute_region_autoscaler" "default" {
name = "example-autoscaler"
region = "us-west1"
project = "my-project
target = google_compute_region_instance_group_manager.default.id
autoscaling_policy {
max_replicas = 10
min_replicas = 3
cooldown_period = 60
cpu_utilization {
target = 0.5
}
}
}
resource "google_compute_region_instance_group_manager" "default" {
name = "example-igm"
region = "us-west1"
version {
instance_template = google_compute_instance_template.default.id
name = "primary"
}
target_pools = [google_compute_target_pool.default.id]
base_instance_name = "example"
}
resource "google_compute_target_pool" "default" {
name = "example-pool"
}
resource "google_compute_instance_template" "default" {
name = "example-template"
machine_type = "e2-medium"
can_ip_forward = false
tags = ["my-tag"]
disk {
source_image = data.google_compute_image.default.id
}
network_interface {
subnetwork = "my-subnet"
}
}
data "google_compute_image" "default" {
name = "my-image"
}
My goal is to be able to create a new Image (out of band) and then update my infrastructure to utilize it. It doesn't appear possible to change a google_compute_instance_template while it's in use.
One option I can think of is to create two separate templates, and then adjust the google_compute_region_instance_group_manager to refer to a different google_compute_instance_template which references the new image.
Another possible option is to use the version block inside the instance group manager. You could keep two versions, one at 100% and one at 0%, and when a new image is ready, update the 0% version to point to the new image and swap the percentages, roughly as sketched below. I'm not actually sure this works, because you'd still have to update the template of the version that's at 0%, and that template might still be in use.
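Roughly what I have in mind (the second template and the sizing here are only illustrative):
resource "google_compute_region_instance_group_manager" "default" {
  name               = "example-igm"
  region             = "us-west1"
  base_instance_name = "example"

  # the version without a target_size gets all remaining instances (the "100%" version)
  version {
    instance_template = google_compute_instance_template.default.id
    name              = "blue"
  }

  # kept at zero instances (the "0%" version) until a new image exists, then the two are swapped
  version {
    instance_template = google_compute_instance_template.green.id
    name              = "green"
    target_size {
      fixed = 0
    }
  }
}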
Regardless, both of those methods are incredibly bulky for a large-scale production environment where we have multiple autoscaling groups in multiple regions and update them frequently.
Ideally I'd be able to change a variable that represents the image, run terraform apply, and that's it. Is there any way to do what I'm describing?

You need to use a lifecycle block with create_before_destroy = true, and the name attribute of the google_compute_instance_template resource must either be omitted or replaced by the name_prefix attribute.
With this setup Terraform generates a unique name for your Instance Template and can then update the Instance Group manager without conflict before destroying the previous Instance Template.
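A minimal sketch of that setup, based on the template from the question (only the relevant attributes are shown):
resource "google_compute_instance_template" "default" {
  # omit "name" and let Terraform generate a unique name from this prefix
  name_prefix    = "example-template-"
  machine_type   = "e2-medium"
  can_ip_forward = false
  tags           = ["my-tag"]

  disk {
    source_image = data.google_compute_image.default.id
  }

  network_interface {
    subnetwork = "my-subnet"
  }

  lifecycle {
    create_before_destroy = true
  }
}
Pointing data.google_compute_image.default at a new image (for example via a variable holding its name) then creates the replacement template first, updates the instance group manager to reference it, and only afterwards destroys the old template.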
References:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#using-with-instance-group-manager

Related

How can I configure Terraform to update a GCP compute engine instance template without destroying and re-creating?

I have a service deployed on GCP compute engine. It consists of a compute engine instance template, instance group, instance group manager, and load balancer + associated forwarding rules etc.
We're forced into using compute engine rather than Cloud Run or some other serverless offering due to the need for docker-in-docker for the service in question.
The deployment is managed by terraform. I have a config that looks something like this:
data "google_compute_image" "debian_image" {
family = "debian-11"
project = "debian-cloud"
}
resource "google_compute_instance_template" "my_service_template" {
name = "my_service"
machine_type = "n1-standard-1"
disk {
source_image = data.google_compute_image.debian_image.self_link
auto_delete = true
boot = true
}
...
metadata_startup_script = data.local_file.startup_script.content
metadata = {
MY_ENV_VAR = var.whatever
}
}
resource "google_compute_region_instance_group_manager" "my_service_mig" {
version {
instance_template = google_compute_instance_template.my_service_template.id
name = "primary"
}
...
}
resource "google_compute_region_backend_service" "my_service_backend" {
...
backend {
group = google_compute_region_instance_group_manager.my_service_mig.instance_group
}
}
resource "google_compute_forwarding_rule" "my_service_frontend" {
depends_on = [
google_compute_region_instance_group_manager.my_service_mig,
]
name = "my_service_ilb"
backend_service = google_compute_region_backend_service.my_service_backend.id
...
}
I'm running into issues where Terraform is unable to perform any kind of update to this service without running into conflicts. It seems that instance templates are immutable in GCP, and doing anything like updating the startup script, adding an env var, or similar forces it to be deleted and re-created.
Terraform prints info like this in that situation:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
-/+ destroy and then create replacement
Terraform will perform the following actions:
# module.connectors_compute_engine.google_compute_instance_template.airbyte_translation_instance1 must be replaced
-/+ resource "google_compute_instance_template" "my_service_template" {
~ id = "projects/project/..." -> (known after apply)
~ metadata = { # forces replacement
+ "TEST" = "test"
# (1 unchanged element hidden)
}
The only solution I've found for getting out of this situation is to delete the entire service and all associated entities, from the load balancer down to the instance template, and re-create them.
Is there some way to avoid this so that I'm able to change the instance template without having to manually update all the Terraform config twice? At this point I'm even fine if it ends up causing some downtime for the service in question rather than a full rolling update, since that's what's happening now anyway.
I ran into this issue as well.
However, according to:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#using-with-instance-group-manager
Instance Templates cannot be updated after creation with the Google
Cloud Platform API. In order to update an Instance Template, Terraform
will destroy the existing resource and create a replacement. In order
to effectively use an Instance Template resource with an Instance
Group Manager resource, it's recommended to specify
create_before_destroy in a lifecycle block. Either omit the Instance
Template name attribute, or specify a partial name with name_prefix.
I would also test and plan with this lifecycle meta-argument:
+ lifecycle {
+ prevent_destroy = true
+ }
}
Or more realistically in your specific case, something like:
resource "google_compute_instance_template" "my_service_template" {
version {
instance_template = google_compute_instance_template.my_service_template.id
name = "primary"
}
+ lifecycle {
+ create_before_destroy = true
+ }
}
So run terraform plan with either create_before_destroy or prevent_destroy = true on google_compute_instance_template before terraform apply to see the results.
Ultimately, you can also remove google_compute_instance_template.my_service_template from the state file and import it back.
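For that route, the commands would look roughly like this (the resource address and template ID below are placeholders for your own values):
terraform state rm google_compute_instance_template.my_service_template
terraform import google_compute_instance_template.my_service_template projects/PROJECT_ID/global/instanceTemplates/TEMPLATE_NAME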
Some suggested workarounds in this thread:
terraform lifecycle prevent destroy

Is it possible to update existing compute engine using terraform?

I'm a beginner in Terraform and I'm planning to create a Terraform module that updates the tags and service account of an existing Compute Engine instance in Google Cloud.
The problem is that these existing VMs were provisioned manually, and there is a need to update the tags and service account occasionally.
So far, I created a small module that uses a data block to fetch the details and a resource block to apply my changes, but this doesn't fit the requirement as it's creating a new resource.
data "google_compute_instance" "data_host" {
name = var.instance_name
project = var.project_name
zone = var.project_zone
}
resource "google_compute_instance" "host" {
name = data.google_compute_instance.data_host.self_link
zone = data.google_compute_instance.data_host.self_link
machine_type = data.google_compute_instance.data_host.self_link
boot_disk {
source = data.google_compute_instance.data_host.self_link
}
network_interface {
network = data.google_compute_instance.data_host.self_link
}
service_account {
email = var.service_account
scopes = ["cloud-platform"]
}
etc...
}
Is there any way to update my existing VMs without recreating the entire system?

Dynamically add resources in Terraform

I set up a Jenkins pipeline that launches Terraform to create a new EC2 instance in our VPC and register it in our private hosted zone on R53 (which is created at the same time) at every run.
I also managed to save the state into S3 so it doesn't fail with the hosted zone being re-created.
The main issue I have is that at every run Terraform keeps replacing the previous instance with the new one instead of adding it to the pool of instances.
How can I avoid this?
Here's a snippet of my code:
terraform {
  backend "s3" {
    bucket = "<redacted>"
    key    = "<redacted>/terraform.tfstate"
    region = "eu-west-1"
  }
}

provider "aws" {
  region = "${var.region}"
}

data "aws_ami" "image" {
  # limit search criteria for performance
  most_recent = "${var.ami_filter_most_recent}"
  name_regex  = "${var.ami_filter_name_regex}"
  owners      = ["${var.ami_filter_name_owners}"]

  # filter on tag purpose
  filter {
    name   = "tag:purpose"
    values = ["${var.ami_filter_purpose}"]
  }

  # filter on tag os
  filter {
    name   = "tag:os"
    values = ["${var.ami_filter_os}"]
  }
}

resource "aws_instance" "server" {
  # use extracted ami from image data source
  ami                    = data.aws_ami.image.id
  availability_zone      = data.aws_subnet.most_available.availability_zone
  subnet_id              = data.aws_subnet.most_available.id
  instance_type          = "${var.instance_type}"
  vpc_security_group_ids = ["${var.security_group}"]
  user_data              = "${var.user_data}"
  iam_instance_profile   = "${var.iam_instance_profile}"

  root_block_device {
    volume_size = "${var.root_disk_size}"
  }

  ebs_block_device {
    device_name = "${var.extra_disk_device_name}"
    volume_size = "${var.extra_disk_size}"
  }

  tags = {
    Name = "${local.available_name}"
  }
}

resource "aws_route53_zone" "private" {
  name = var.hosted_zone_name

  vpc {
    vpc_id = var.vpc_id
  }
}

resource "aws_route53_record" "record" {
  zone_id = aws_route53_zone.private.zone_id
  name    = "${local.available_name}.${var.hosted_zone_name}"
  type    = "A"
  ttl     = "300"
  records = [aws_instance.server.private_ip]

  depends_on = [
    aws_route53_zone.private
  ]
}
The outcome is that my previously created instance is destroyed and a new one is created. What I want is to keep adding instances with this code.
Thank you.
Your code declares only one instance, aws_instance.server, and because your backend is in S3 it acts as a single global state shared by every pipeline run, so any change to its properties will modify that one instance only. The same goes for aws_route53_record.record and anything else in your script.
If you want different pipelines to reuse the exact same script, you should either use different workspaces or create a separate TF state for each pipeline. The other alternative is to redefine your TF script to take a map of instances as an input variable and use for_each to create the different instances, as sketched below.
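A rough sketch of the for_each approach, assuming a map variable (the variable and attribute names here are illustrative):
variable "servers" {
  type = map(object({
    instance_type = string
  }))
}

resource "aws_instance" "server" {
  # one instance per key in the map; existing keys are left untouched
  for_each      = var.servers
  ami           = data.aws_ami.image.id
  subnet_id     = data.aws_subnet.most_available.id
  instance_type = each.value.instance_type

  tags = {
    Name = each.key
  }
}
Each pipeline run then adds its own key to the map instead of redefining the single aws_instance.server.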
If those instances should all be the same, you should manage their count using an aws_autoscaling_group and its desired capacity.
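A minimal autoscaling-group sketch for that identical-instances case (names and sizes are illustrative):
resource "aws_launch_template" "server" {
  name_prefix   = "server-"
  image_id      = data.aws_ami.image.id
  instance_type = "${var.instance_type}"
}

resource "aws_autoscaling_group" "servers" {
  # scale by changing desired_capacity instead of adding resources
  desired_capacity    = 3
  min_size            = 1
  max_size            = 5
  vpc_zone_identifier = [data.aws_subnet.most_available.id]

  launch_template {
    id      = aws_launch_template.server.id
    version = "$Latest"
  }
}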

How to set the auto-delete option for an additional attached_disk on a GCP instance using Terraform?

I am trying to create a VM instance in GCP with a boot_disk and an additional attached_disk using Terraform. I could not find any parameter to auto-delete the additional attached_disk when the instance is deleted.
The auto-delete option is available in the GCP console.
Terraform code:
resource "google_compute_disk" "elastic-disk" {
count = var.no_of_elastic_intances
name = "elastic-disk-${count.index+1}-data"
type = "pd-standard"
size = "10"
}
resource "google_compute_instance" "elastic" {
count = var.no_of_elastic_intances
name = "${var.elastic_instance_name_prefix}-${count.index+1}"
machine_type = var.elastic_instance_machine_type
boot_disk {
auto_delete = true
mode = "READ_WRITE"
initialize_params {
image = var.elastic_instance_image_type
type = var.elastic_instance_disc_type
size = var.elasitc_instance_disc_size
}
}
attached_disk {
source = "${element(google_compute_disk.elastic-disk.*.self_link, count.index)}"
mode = "READ_WRITE"
}
network_interface {
network = var.elastic_instance_network
access_config {
}
}
}
The feature to set auto-delete for attached disks is not supported; HashiCorp/Google decided not to support it in Terraform.
Reference this issue:
If Terraform were told to remove the instance, but not the disks, and
auto-delete were enabled, then it would not specifically delete the
disks, but they would still be deleted by GCP. This behaviour would
not be shown in a plan run, and so could lead to unwanted outcomes, as
well as the state still showing the disks existing.
My opinion is that Terraform should manage the entire lifecycle from creation to destruction. For disks that you want to attach to a new instance, create those disks as part of your Terraform HCL and destroy them as part of your HCL.

Terraform and DigitalOcean: assign volume to specific droplet created with count parameter

I just started exploring Terraform to spin up droplets and volumes on DigitalOcean.
My question is about the right way to do the following:
create a certain number of droplet instances using count within a digitalocean_droplet resource named ubuntu16
assign a digitalocean_volume only to one or a subset of the previously created droplets.
How can I do this? I was assuming I could use the droplet_ids property on the digitalocean_volume resource. Something like:
resource "digitalocean_volume" "foovolume" {
...
droplet_ids = ["${digitalocean_droplet.ubuntu16.0.id}"]
}
Validating it with terraform validate I got:
Error: digitalocean_volume.foovolume: "droplet_ids": this field cannot be set
Any advice? Thanks for any input on this.
Regards
The way the Terraform provider for DigitalOcean is currently implemented requires that you take the opposite approach. You can specify which volumes are attached to which Droplets by defining the volume_ids of the Droplet resource. For example:
resource "digitalocean_volume" "volume" {
region = "nyc3"
count = 3
name = "volume-${count.index + 1}"
size = 100
description = "an example volume"
}
resource "digitalocean_droplet" "web" {
count = 3
image = "ubuntu-17-10-x64"
name = "web-${count.index + 1}"
region = "nyc3"
size = "1gb"
volume_ids = ["${element(digitalocean_volume.volume.*.id, count.index)}"]
}
If you look at the docs for the volume resource, you'll see that droplet_ids is a "computed" field. This means that you are unable to set the field, and that its value is computed by Terraform via the provider's API.
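For example, you can read the computed value back out, but not assign it (a sketch using the resources above):
output "volume_droplet_ids" {
  # droplet_ids is populated by the provider after the attachment is made
  value = "${digitalocean_volume.volume.0.droplet_ids}"
}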