Terraform: order of creating array of resources - amazon-web-services

I have two ec2 instances defined in terraform using the count method.
resource "aws_instance" "example" {
  count         = "2"
  ami           = "ami-2d39803a"
  instance_type = "t2.micro"

  tags {
    Name = "example-${count.index}"
  }
}
How can I enforce that they are launched one after the other? e.g. the second instance should be created when the first one finishes.
Attempt 1:
depends_on = [aws_instance.example[0]]
result:
Error: aws_instance.example: resource depends on non-existent resource 'aws_instance.example[0]'
Attempt 2:
tags {
Name = "example-${count.index}"
Active = "${count.index == "1" ? "${aws_instance.example.1.arn}" : "this"}"
}
result:
Error: aws_instance.example[0]: aws_instance.example[0]: self reference not allowed: "aws_instance.example.0.arn"
This leads me to believe the interpolation is evaluated only after all instance configurations are resolved, so Terraform doesn't see that there is, in fact, no circular dependency.
Any ideas?
Thanks

Use terraform apply -parallelism=1 to limit Terraform to one concurrent operation at a time.
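If the goal is a hard ordering between these two instances specifically, rather than serializing the entire run, another option (not from the answer above, just an alternative sketch in modern 0.12+ syntax) is to split them into separate resources and chain them with depends_on:

```
resource "aws_instance" "first" {
  ami           = "ami-2d39803a"
  instance_type = "t2.micro"

  tags = {
    Name = "example-0"
  }
}

resource "aws_instance" "second" {
  # Not created until aws_instance.first has been created.
  depends_on = [aws_instance.first]

  ami           = "ami-2d39803a"
  instance_type = "t2.micro"

  tags = {
    Name = "example-1"
  }
}
```

The trade-off is losing count: with two named resources you can no longer scale by changing a single number.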

Related

How can I configure Terraform to update a GCP compute engine instance template without destroying and re-creating?

I have a service deployed on GCP compute engine. It consists of a compute engine instance template, instance group, instance group manager, and load balancer + associated forwarding rules etc.
We're forced into using compute engine rather than Cloud Run or some other serverless offering due to the need for docker-in-docker for the service in question.
The deployment is managed by terraform. I have a config that looks something like this:
data "google_compute_image" "debian_image" {
  family  = "debian-11"
  project = "debian-cloud"
}

resource "google_compute_instance_template" "my_service_template" {
  name         = "my_service"
  machine_type = "n1-standard-1"

  disk {
    source_image = data.google_compute_image.debian_image.self_link
    auto_delete  = true
    boot         = true
  }
  ...
  metadata_startup_script = data.local_file.startup_script.content

  metadata = {
    MY_ENV_VAR = var.whatever
  }
}

resource "google_compute_region_instance_group_manager" "my_service_mig" {
  version {
    instance_template = google_compute_instance_template.my_service_template.id
    name              = "primary"
  }
  ...
}

resource "google_compute_region_backend_service" "my_service_backend" {
  ...
  backend {
    group = google_compute_region_instance_group_manager.my_service_mig.instance_group
  }
}

resource "google_compute_forwarding_rule" "my_service_frontend" {
  depends_on = [
    google_compute_region_instance_group_manager.my_service_mig,
  ]

  name            = "my_service_ilb"
  backend_service = google_compute_region_backend_service.my_service_backend.id
  ...
}
I'm running into issues where Terraform is unable to perform any kind of update to this service without running into conflicts. It seems that instance templates are immutable in GCP, and doing anything like updating the startup script, adding an env var, or similar forces it to be deleted and re-created.
Terraform prints info like this in that situation:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # module.connectors_compute_engine.google_compute_instance_template.airbyte_translation_instance1 must be replaced
-/+ resource "google_compute_instance_template" "my_service_template" {
      ~ id       = "projects/project/..." -> (known after apply)
      ~ metadata = { # forces replacement
          + "TEST" = "test"
            # (1 unchanged element hidden)
        }
The only solution I've found for getting out of this situation is to entirely delete the entire service and all associated entities from the load balancer down to the instance template and re-create them.
Is there some way to avoid this situation, so that I can change the instance template without manually updating the Terraform config twice? At this point I'd even accept some downtime for the service rather than a full rolling update, since that's what's happening now anyway.
I ran into this issue as well.
However, according to:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#using-with-instance-group-manager
Instance Templates cannot be updated after creation with the Google
Cloud Platform API. In order to update an Instance Template, Terraform
will destroy the existing resource and create a replacement. In order
to effectively use an Instance Template resource with an Instance
Group Manager resource, it's recommended to specify
create_before_destroy in a lifecycle block. Either omit the Instance
Template name attribute, or specify a partial name with name_prefix.
I would also test and plan with this lifecycle meta-argument:
  lifecycle {
    prevent_destroy = true
  }
Or, more realistically in your specific case, something like:
resource "google_compute_instance_template" "my_service_template" {
  # Per the docs above: omit "name", or use name_prefix instead.
  name_prefix  = "my-service-"
  machine_type = "n1-standard-1"
  ...
  lifecycle {
    create_before_destroy = true
  }
}
So run terraform plan with either create_before_destroy or prevent_destroy = true on google_compute_instance_template before terraform apply to see the results.
Ultimately, you can remove google_compute_instance_template.my_service_template from the state file and import it back.
Some suggested workarounds in this thread:
terraform lifecycle prevent destroy
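The state-surgery route mentioned above can be sketched as follows (the project and template path are illustrative placeholders, not from the question):

```
terraform state rm google_compute_instance_template.my_service_template
terraform import google_compute_instance_template.my_service_template \
  projects/MY_PROJECT/global/instanceTemplates/MY_TEMPLATE
```

After the import, run terraform plan to verify the resource is back in state with no pending destroy.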

How do I iterate over a 'count' resource in Terraform?

In this example, I'm trying to create 3 EC2 instances which each have an elastic IP assigned. I want to achieve this by saying the following.
resource "aws_instance" "web_servers" {
  ami           = "ami-09e67e426f25ce0d7"
  instance_type = "t3.micro"
  ...
  count = 3
}
and, along with other networking instances,
resource "aws_eip" "elastic_ip" {
  for_each = aws_instance.web_servers
  instance = each.key
  vpc      = true
}
However, this is saying the following:
The given "for_each" argument value is unsuitable: the "for_each" argument must be a map, or set of strings, and you have provided a value of type tuple.
I have tried wrapping the for_each in a toset() which also says there is an issue with an unknown number of instances - I know there are 3 though. Is there something I'm missing around the count & for_each keywords?
If you really want to use for_each, rather than count again, it should be:
resource "aws_eip" "elastic_ip" {
  for_each = { for idx, val in aws_instance.web_servers : idx => val }
  instance = each.value.id
  vpc      = true
}
But since you are using count in the first place, it would probably be better to use count for your aws_eip as well:
resource "aws_eip" "elastic_ip" {
  count    = length(aws_instance.web_servers)
  instance = aws_instance.web_servers[count.index].id
  vpc      = true
}
In AWS, is your instance field an EC2 instance ID or an ARN?
If it's the ARN, you may be able to access the arn attribute on the aws_instance resource block.
You could try toset(aws_instance.web_servers[*].arn) and keep instance = each.key. I generally just use each.value, but they should be the same when dealing with a set.

Terraform: Handle error if no EC2 Instance Type offerings found in AZ

We are spinning up G4 instances in AWS through Terraform and often encounter issues where one or two of the AZs in a given Region don't support the G4 instance type.
For now I have hardcoded our TF configuration as below, creating a map of Regions to AZs in the "azs" variable. From this map I can spin up clusters in the targeted AZs of each Region where G4 instances are supported.
I am using the AWS command line mentioned in this AWS article to find which AZs are supported in a given Region, and I update our "azs" variable as we expand to other Regions.
variable "azs" {
  default = {
    "us-west-2" = "us-west-2a,us-west-2b,us-west-2c"
    "us-east-1" = "us-east-1a,us-east-1b,us-east-1e"
    "eu-west-1" = "eu-west-1a,eu-west-1b,eu-west-1c"
    "eu-west-2" = "eu-west-2a,eu-west-2b,eu-west-2c"
    "eu-west-3" = "eu-west-3a,eu-west-3c"
  }
}
However, the above approach requires human intervention and frequent updates (e.g. if AWS later adds support in previously unsupported AZs of a given Region).
There is a Stack Overflow question where the user is trying to do the same thing; however, they can fall back to another instance type if some AZs don't support the given one.
In my use case, I can't use any fallback instance type, since our app servers only run on G4.
I have tried the workaround mentioned in the answer to that question, however it fails with the following error message:
Error: no EC2 Instance Type Offerings found matching criteria; try different search

  on main.tf line 8, in data "aws_ec2_instance_type_offering" "example":
   8: data "aws_ec2_instance_type_offering" "example" {
I am using the TF config as below where my preferred_instance_types is g4dn.xlarge.
provider "aws" {
  version = "2.70"
}

data "aws_availability_zones" "all" {
  state = "available"
}

data "aws_ec2_instance_type_offering" "example" {
  for_each = toset(data.aws_availability_zones.all.names)

  filter {
    name   = "instance-type"
    values = ["g4dn.xlarge"]
  }

  filter {
    name   = "location"
    values = [each.value]
  }

  location_type            = "availability-zone"
  preferred_instance_types = ["g4dn.xlarge"]
}

output "foo" {
  value = { for az, details in data.aws_ec2_instance_type_offering.example : az => details.instance_type }
}
I would like to know how to handle this failure, since Terraform cannot find the G4 instance type in one of the AZs of a given Region and fails.
Is there any Terraform error handling I can use to bypass this error for now and get the supported AZs as an output?
I had checked that other question you mentioned earlier, but I could never get the output correct. Thanks to @ydaetskcoR for the response in that post - I was able to learn a bit and get my loop working.
Here is one way to accomplish what you are looking for... Let me know if it works for you.
Instead of "aws_ec2_instance_type_offering", use "aws_ec2_instance_type_offerings" (note the 's' at the end; they are different data sources).
I will just paste the code here and assume you will be able to decode the logic. I am filtering for one specific instance type; if it's not supported, instance_types will be blank, and I build a list of the AZs that do not have blank values.
variable "az" {
  default = "us-east-1"
}

variable "my_inst" {
  default = "g4dn.xlarge"
}

data "aws_availability_zones" "example" {
  filter {
    name   = "opt-in-status"
    values = ["opt-in-not-required"]
  }
}

data "aws_ec2_instance_type_offerings" "example" {
  for_each = toset(data.aws_availability_zones.example.names)

  filter {
    name   = "instance-type"
    values = [var.my_inst]
  }

  filter {
    name   = "location"
    values = [each.key]
  }

  location_type = "availability-zone"
}

output "az_where_inst_avail" {
  value = keys({ for az, details in data.aws_ec2_instance_type_offerings.example :
    az => details.instance_types if length(details.instance_types) != 0 })
}
The output will look like below. us-east-1e does not have the instance type, so it's absent from the output. Do test a few cases to see if it works every time.
Outputs:
az_where_inst_avail = [
"us-east-1a",
"us-east-1b",
"us-east-1c",
"us-east-1d",
"us-east-1f",
]
I think there's a cleaner way. The data source already filters on availability zone based on the given filter, and there is a locations attribute that produces a list of the desired location_type.
provider "aws" {
  region = var.region
}

data "aws_ec2_instance_type_offerings" "available" {
  filter {
    name   = "instance-type"
    values = [var.instance_type]
  }

  location_type = "availability-zone"
}

output "azs" {
  value = data.aws_ec2_instance_type_offerings.available.locations
}
Where the instance_type is t3.micro and region is us-east-1, this accurately produces:
azs = tolist([
"us-east-1d",
"us-east-1a",
"us-east-1c",
"us-east-1f",
"us-east-1b",
])
You don't need to feed it a list of availability zones because it already gets those from the supplied region.
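If other resources then need to be constrained to those AZs, the locations list can be fed onward. A minimal sketch (the subnet resource and variables here are illustrative, not from the answers above); sorting the list keeps count indices stable across runs, since the API does not guarantee ordering:

```
resource "aws_subnet" "gpu" {
  count             = length(data.aws_ec2_instance_type_offerings.available.locations)
  vpc_id            = var.vpc_id
  availability_zone = sort(data.aws_ec2_instance_type_offerings.available.locations)[count.index]
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index)
}
```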

terraform count dependent on data from target environment

I'm getting the following error when trying to initially plan or apply a resource that is using the data values from the AWS environment to a count.
$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
------------------------------------------------------------------------
Error: Invalid count argument
on main.tf line 24, in resource "aws_efs_mount_target" "target":
24: count = length(data.aws_subnet_ids.subnets.ids)
The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.
$ terraform --version
Terraform v0.12.9
+ provider.aws v2.30.0
I tried using the target option, but it doesn't seem to work on data sources.
$ terraform apply -target aws_subnet_ids.subnets
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
The only solution I found that works is:
remove the resource
apply the project
add the resource back
apply again
Here is a terraform config I created for testing.
provider "aws" {
  version = "~> 2.0"
}

locals {
  project_id = "it_broke_like_3_collar_watch"
}

terraform {
  required_version = ">= 0.12"
}

resource aws_default_vpc default {
}

data aws_subnet_ids subnets {
  vpc_id = aws_default_vpc.default.id
}

resource aws_efs_file_system efs {
  creation_token = local.project_id
  encrypted      = true
}

resource aws_efs_mount_target target {
  depends_on     = [aws_efs_file_system.efs]
  count          = length(data.aws_subnet_ids.subnets.ids)
  file_system_id = aws_efs_file_system.efs.id
  subnet_id      = tolist(data.aws_subnet_ids.subnets.ids)[count.index]
}
Finally figured out the answer after researching the answer by Dude0001.
Short Answer. Use the aws_vpc data source with the default argument instead of the aws_default_vpc resource. Here is the working sample with comments on the changes.
locals {
  project_id = "it_broke_like_3_collar_watch"
}

terraform {
  required_version = ">= 0.12"
}

// Delete this --> resource aws_default_vpc default {}

// Add this
data aws_vpc default {
  default = true
}

data "aws_subnet_ids" "subnets" {
  // Update this from aws_default_vpc.default.id
  vpc_id = data.aws_vpc.default.id
}

resource aws_efs_file_system efs {
  creation_token = local.project_id
  encrypted      = true
}

resource aws_efs_mount_target target {
  depends_on     = [aws_efs_file_system.efs]
  count          = length(data.aws_subnet_ids.subnets.ids)
  file_system_id = aws_efs_file_system.efs.id
  subnet_id      = tolist(data.aws_subnet_ids.subnets.ids)[count.index]
}
What I couldn't figure out was why my workaround of removing aws_efs_mount_target on the first apply worked. It's because after the first apply the aws_default_vpc was loaded into the state file.
So an alternate solution without making change to the original tf file would be to use the target option on the first apply:
$ terraform apply --target aws_default_vpc.default
However, I don't like this as it requires a special case on first deployment which is pretty unique for the terraform deployments I've worked with.
The aws_default_vpc isn't a resource TF can create or destroy. It is the default VPC for your account, which AWS creates automatically in each region and protects from deletion. You can only (and need to) adopt it into management and your TF state. This lets you begin managing it and inspect it when you run plan or apply. Otherwise, TF doesn't know what the resource is or what state it is in, and it cannot create a new one for you, as it is a special type of protected resource as described above.
With that said, go get the default VPC id from the correct region you are deploying in your account. Then import it into your TF state. It should then be able to inspect and count the number of subnets.
For example
terraform import aws_default_vpc.default vpc-xxxxxx
https://www.terraform.io/docs/providers/aws/r/default_vpc.html
Using the data element for this looks a little odd to me as well. Can you change your TF script to get the count directly through the aws_default_vpc resource?

Terraform: length() on data source cannot be determined until apply?

I am trying to dynamically declare multiple aws_nat_gateway data sources by retrieving the list of public subnets through the aws_subnet_ids data source. However, when I set the count parameter to the length of the subnet IDs, I get an error saying The "count" value depends on resource attributes that cannot be determined until apply....
This is almost in direct contradiction to the example in their documentation! How do I fix this? Is their documentation wrong?
I am using Terraform v0.12.
data "aws_vpc" "environment_vpc" {
  id = var.vpc_id
}

data "aws_subnet_ids" "public_subnet_ids" {
  vpc_id = data.aws_vpc.environment_vpc.id

  tags = {
    Tier = "public"
  }

  depends_on = [data.aws_vpc.environment_vpc]
}

data "aws_nat_gateway" "nat_gateway" {
  count      = length(data.aws_subnet_ids.public_subnet_ids.ids) # <= Error
  subnet_id  = tolist(data.aws_subnet_ids.public_subnet_ids.ids)[count.index]
  depends_on = [data.aws_subnet_ids.public_subnet_ids]
}
I expect to be able to apply this template successfully, but I am getting the following error:
Error: Invalid count argument
on ../src/variables.tf line 78, in data "aws_nat_gateway" "nat_gateway":
78: count = "${length(data.aws_subnet_ids.public_subnet_ids.ids)}"
The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.
It seems you are trying to fetch subnets that haven't been created yet, or that can't be determined at plan time. The terraform output suggests using the -target flag to create the VPC and subnets (or complete whatever prerequisite task) first, and only then applying the nat_gateway resource. I suggest you use a list of AZs instead of subnet IDs; I'll add a simple example below.
variable "enable_nat_gateways" {
  default = true
}

variable "vpc_azs_list" {
  default = [
    "us-east-1d",
    "us-east-1e"
  ]
}

resource "aws_nat_gateway" "nat" {
  count         = var.enable_nat_gateways ? length(var.vpc_azs_list) : 0
  allocation_id = "xxxxxxxxx"
  subnet_id     = "xxxxxxxxx"

  depends_on = [
    aws_internet_gateway.main,
    aws_eip.nat_eip,
  ]

  tags = {
    "Name"       = "nat-gateway-name"
    "costCenter" = "xxxxxxxxx"
    "owner"      = "xxxxxxxxx"
  }
}
I hope this is useful to you and other users.
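Extending the static-list idea above: driving count from a plain variable rather than a data source sidesteps the "cannot be determined until apply" error entirely, because the value is known at plan time. A sketch against the question's nat_gateway lookup (the variable name is illustrative):

```
variable "public_subnet_ids" {
  type    = list(string)
  default = []
}

data "aws_nat_gateway" "nat_gateway" {
  count     = length(var.public_subnet_ids)
  subnet_id = var.public_subnet_ids[count.index]
}
```

The cost is that the subnet IDs must be supplied as input instead of discovered, which may be acceptable when they rarely change.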