I'm getting the following error when trying to initially plan or apply a resource that is using the data values from the AWS environment to a count.
$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
------------------------------------------------------------------------
Error: Invalid count argument
on main.tf line 24, in resource "aws_efs_mount_target" "target":
24: count = length(data.aws_subnet_ids.subnets.ids)
The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.
$ terraform --version
Terraform v0.12.9
+ provider.aws v2.30.0
I tried using the target option but doesn't seem to work on data type.
$ terraform apply -target aws_subnet_ids.subnets
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
The only solution I found that works is:
remove the resource
apply the project
add the resource back
apply again
Here is a terraform config I created for testing.
provider "aws" {
version = "~> 2.0"
}
locals {
project_id = "it_broke_like_3_collar_watch"
}
terraform {
required_version = ">= 0.12"
}
resource aws_default_vpc default {
}
data aws_subnet_ids subnets {
vpc_id = aws_default_vpc.default.id
}
resource aws_efs_file_system efs {
creation_token = local.project_id
encrypted = true
}
resource aws_efs_mount_target target {
depends_on = [ aws_efs_file_system.efs ]
count = length(data.aws_subnet_ids.subnets.ids)
file_system_id = aws_efs_file_system.efs.id
subnet_id = tolist(data.aws_subnet_ids.subnets.ids)[count.index]
}
Finally figured out the answer after researching the answer by Dude0001.
Short Answer. Use the aws_vpc data source with the default argument instead of the aws_default_vpc resource. Here is the working sample with comments on the changes.
locals {
project_id = "it_broke_like_3_collar_watch"
}
terraform {
required_version = ">= 0.12"
}
// Delete this --> resource aws_default_vpc default {}
// Add this
data aws_vpc default {
default = true
}
data "aws_subnet_ids" "subnets" {
// Update this from aws_default_vpc.default.id
vpc_id = "${data.aws_vpc.default.id}"
}
resource aws_efs_file_system efs {
creation_token = local.project_id
encrypted = true
}
resource aws_efs_mount_target target {
depends_on = [ aws_efs_file_system.efs ]
count = length(data.aws_subnet_ids.subnets.ids)
file_system_id = aws_efs_file_system.efs.id
subnet_id = tolist(data.aws_subnet_ids.subnets.ids)[count.index]
}
What I couldn't figure out was why my work around of removing aws_efs_mount_target on the first apply worked. It's because after the first apply the aws_default_vpc was loaded into the state file.
So an alternate solution without making change to the original tf file would be to use the target option on the first apply:
$ terraform apply --target aws_default_vpc.default
However, I don't like this as it requires a special case on first deployment which is pretty unique for the terraform deployments I've worked with.
The aws_default_vpc isn't a resource TF can create or destroy. It is the default VPC for your account in each region that AWS creates automatically for you that is protected from being destroyed. You can only (and need to) adopt it in to management and your TF state. This will allow you to begin managing and to inspect when you run plan or apply. Otherwise, TF doesn't know what the resource is or what state it is in, and it cannot create a new one for you as it s a special type of protected resource as described above.
With that said, go get the default VPC id from the correct region you are deploying in your account. Then import it into your TF state. It should then be able to inspect and count the number of subnets.
For example
terraform import aws_default_vpc.default vpc-xxxxxx
https://www.terraform.io/docs/providers/aws/r/default_vpc.html
Using the data element for this looks a little odd to me as well. Can you change your TF script to get the count directly through the aws_default_vpc resource?
Related
I have a service deployed on GCP compute engine. It consists of a compute engine instance template, instance group, instance group manager, and load balancer + associated forwarding rules etc.
We're forced into using compute engine rather than Cloud Run or some other serverless offering due to the need for docker-in-docker for the service in question.
The deployment is managed by terraform. I have a config that looks something like this:
data "google_compute_image" "debian_image" {
family = "debian-11"
project = "debian-cloud"
}
resource "google_compute_instance_template" "my_service_template" {
name = "my_service"
machine_type = "n1-standard-1"
disk {
source_image = data.google_compute_image.debian_image.self_link
auto_delete = true
boot = true
}
...
metadata_startup_script = data.local_file.startup_script.content
metadata = {
MY_ENV_VAR = var.whatever
}
}
resource "google_compute_region_instance_group_manager" "my_service_mig" {
version {
instance_template = google_compute_instance_template.my_service_template.id
name = "primary"
}
...
}
resource "google_compute_region_backend_service" "my_service_backend" {
...
backend {
group = google_compute_region_instance_group_manager.my_service_mig.instance_group
}
}
resource "google_compute_forwarding_rule" "my_service_frontend" {
depends_on = [
google_compute_region_instance_group_manager.my_service_mig,
]
name = "my_service_ilb"
backend_service = google_compute_region_backend_service.my_service_backend.id
...
}
I'm running into issues where Terraform is unable to perform any kind of update to this service without running into conflicts. It seems that instance templates are immutable in GCP, and doing anything like updating the startup script, adding an env var, or similar forces it to be deleted and re-created.
Terraform prints info like this in that situation:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
-/+ destroy and then create replacement
Terraform will perform the following actions:
# module.connectors_compute_engine.google_compute_instance_template.airbyte_translation_instance1 must be replaced
-/+ resource "google_compute_instance_template" "my_service_template" {
~ id = "projects/project/..." -> (known after apply)
~ metadata = { # forces replacement
+ "TEST" = "test"
# (1 unchanged element hidden)
}
The only solution I've found for getting out of this situation is to entirely delete the entire service and all associated entities from the load balancer down to the instance template and re-create them.
Is there some way to avoid this situation so that I'm able to change the instance template without having to manually update all the terraform config two times? At this point I'm even fine if it ends up creating some downtime for the service in question rather than a full rolling update or something since that's what's happening now anyway.
I was triggered by this issue as well.
However, according to:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#using-with-instance-group-manager
Instance Templates cannot be updated after creation with the Google
Cloud Platform API. In order to update an Instance Template, Terraform
will destroy the existing resource and create a replacement. In order
to effectively use an Instance Template resource with an Instance
Group Manager resource, it's recommended to specify
create_before_destroy in a lifecycle block. Either omit the Instance
Template name attribute, or specify a partial name with name_prefix.
I would also test and plan with this lifecycle meta argument as well:
+ lifecycle {
+ prevent_destroy = true
+ }
}
Or more realistically in your specific case, something like:
resource "google_compute_instance_template" "my_service_template" {
version {
instance_template = google_compute_instance_template.my_service_template.id
name = "primary"
}
+ lifecycle {
+ create_before_destroy = true
+ }
}
So terraform plan with either create_before_destroy or prevent_destroy = true before terraform apply on google_compute_instance_template to see results.
Ultimately, you can remove google_compute_instance_template.my_service_template.id from state file and import it back.
Some suggested workarounds in this thread:
terraform lifecycle prevent destroy
I current have code that I have been using for quiet sometime that calls a custom S3 module. Today I tried to run the same code and I started getting an error regarding the provider.
╷ │ Error: Failed to query available provider packages │ │ Could not
retrieve the list of available versions for provider hashicorp/s3:
provider registry registry.terraform.io does not have a provider named
│ registry.terraform.io/hashicorp/s3 │ │ All modules should specify
their required_providers so that external consumers will get the
correct providers when using a module. To see which modules │ are
currently depending on hashicorp/s3, run the following command: │
terraform providers
Doing some digging seems that terraform is looking for a module registry.terraform.io/hashicorp/s3, which doesn't exist.
So far, I have tried the following things:
Validated that the S3 Resource code meets the standards of the upgrade Hashicorp did to 4.x this year. Plus I have been using it for a couple of months with no issues.
Delete .terraform directory and rerun terraform init (No success same error)
Delete .terraform directory and .terraform.hcl lock and run terraform init -upgrade (No Success)
I have tried to update my provider's file to try to force an upgrade (no Success)
I tried to change the provider to >= current version to pull the latest version with no success
Reading further, it refers to a caching problem of the terraform modules. I tried to run terraform providers lock and received this error.
Error: Could not retrieve providers for locking │ │ Terraform failed
to fetch the requested providers for darwin_amd64 in order to
calculate their checksums: some providers could not be installed: │ -
registry.terraform.io/hashicorp/s3: provider registry
registry.terraform.io does not have a provider named
registry.terraform.io/hashicorp/s3.
Kind of at my wits with what could be wrong. below is a copy of my version.tf which I changed from providers.tf based on another post I was following:
version.tf
# Configure the AWS Provider
provider "aws" {
region = "us-east-1"
use_fips_endpoint = true
}
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 4.9.0"
}
local = {
source = "hashicorp/local"
version = "~> 2.2.1"
}
}
required_version = ">= 1.2.0" #required terraform version
}
S3 Module
I did not include locals, outputs, or variables unless someone thinks we need to see them. As I said before, the module was running correctly until today. Hopefully, this is all you need for the provider's issue. Let me know if other files are needed.
resource "aws_s3_bucket" "buckets" {
count = length(var.bucket_names)
bucket = lower(replace(replace("${var.bucket_names[count.index]}-s3", " ", "-"), "_", "-"))
force_destroy = var.bucket_destroy
tags = local.all_tags
}
# Set Public Access Block for each bucket
resource "aws_s3_bucket_public_access_block" "bucket_public_access_block" {
count = length(var.bucket_names)
bucket = aws_s3_bucket.buckets[count.index].id
block_public_acls = var.bucket_block_public_acls
ignore_public_acls = var.bucket_ignore_public_acls
block_public_policy = var.bucket_block_public_policy
restrict_public_buckets = var.bucket_restrict_public_buckets
}
resource "aws_s3_bucket_acl" "bucket_acl" {
count = length(var.bucket_names)
bucket = aws_s3_bucket.buckets[count.index].id
acl = var.bucket_acl
}
resource "aws_s3_bucket_versioning" "bucket_versioning" {
count = length(var.bucket_names)
bucket = aws_s3_bucket.buckets[count.index].id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_lifecycle_configuration" "bucket_lifecycle_rule" {
count = length(var.bucket_names)
bucket = aws_s3_bucket.buckets[count.index].id
rule {
id = "${var.bucket_names[count.index]}-lifecycle-${count.index}"
status = "Enabled"
expiration {
days = var.bucket_backup_expiration_days
}
transition {
days = var.bucket_backup_days
storage_class = "GLACIER"
}
}
}
# AWS KMS Key Server Encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "bucket_encryption" {
count = length(var.bucket_names)
bucket = aws_s3_bucket.buckets[count.index].id
rule {
apply_server_side_encryption_by_default {
kms_master_key_id = aws_kms_key.bucket_key[count.index].arn
sse_algorithm = var.bucket_sse
}
}
}
Looking for any other ideas I can use to fix this issue. thank you!!
Although you haven't included it in your question, I'm guessing that somewhere else in this Terraform module you have a block like this:
resource "s3_bucket" "example" {
}
For backward compatibility with modules written for older versions of Terraform, terraform init has some heuristics to guess what provider was intended whenever it encounters a resource that doesn't belong to one of the providers in the module's required_providers block. By default, a resource "belongs to" a provider by matching the prefix of its resource type name -- s3 in this case -- to the local names chosen in the required_providers block.
Given a resource block like the above, terraform init would notice that required_providers doesn't have an entry s3 = { ... } and so will guess that this is an older module trying to use a hypothetical legacy official provider called "s3" (which would now be called hashicorp/s3, because official providers always belong to the hashicorp/ namespace).
The correct name for this resource type is aws_s3_bucket, and so it's important to include the aws_ prefix when you declare a resource of this type:
resource "aws_s3_bucket" "example" {
}
This resource is now by default associated with the provider local name "aws", which does match one of the entries in your required_providers block and so terraform init will see that you intend to use hashicorp/aws to handle this resource.
My colleague and I finally found the problem. Turns out that we had a data call to the S3 bucket. Nothing was wrong with the module but the place I was calling the module had a local.tf action where I was calling s3 in a legacy format see the change below:
WAS
data "s3_bucket" "MyResource" {}
TO
data "aws_s3_bucket" "MyResource" {}
Appreciate the responses from everyone. Resource was the root of the problem but forgot that data is also a resource to check.
TLDR: We deploy Lambda functions using Terraform. A new lambda requires VPC attachment to an existing VPC. How do I define this network attachment in terraform? My current solution passes all terraform steps, but when inspect my Lambda in the console, it's not attached to any VPC.
I found this article Deploy AWS Lambda to VPC with Terraform insightful, but the example involves adding a new VPC (with subnets, security groups, etc.) as opposed to attaching to existing VPC, existing subnets, security groups etc.
Here's my current solution. From my project's main.tf I call a module...
module "lambda" {
source = "git::https://corpsource.io/corp-cloud-platform-team/corpcloudv2/terraform/lambda-modules.git?ref=dev"
lambda_name = var.name
lambda_role = "arn:aws:iam::${var.ACCOUNT}:role/${var.lambda_role}"
lambda_handler = var.handler
lambda_runtime = var.runtime
default_lambda_timeout = var.timeout
ACCOUNT = var.ACCOUNT
vpc_subnet_ids = "${var.SUBNET_IDS}"
vpc_security_group_ids = "${var.SECURITY_GROUP_IDS}"
}
And here is the module:
resource "aws_lambda_function" "lambda_function" {
filename = "lambda_package.zip"
function_name = var.lambda_name
role = var.lambda_role
handler = var.lambda_handler
runtime = var.lambda_runtime
memory_size = 256
timeout = var.default_lambda_timeout
source_code_hash = filebase64sha256("lambda_code/lambda_package.zip")
vpc_config {
subnet_ids = var.vpc_subnet_ids
security_group_ids = var.vpc_security_group_ids
}
}
It passes all Terraform steps without error, and yet doesn't appear to attach my Lambda to a VPC. What am I doing wrong?
Thanks in advance.
Update:
Output of Terraform Plan:
$ terraform plan
Acquiring state lock. This may take a few moments...
module.lambda.aws_lambda_function.lambda_function: Refreshing state... [id=create-vault-entry]
module.lambda_iam.aws_iam_policy.base_policy: Refreshing state... [id=arn:aws:iam::############:policy/create-vault-entry-role]
module.lambda_iam.aws_iam_role.module_role: Refreshing state... [id=create-vault-entry-role]
module.lambda_iam.aws_iam_role_policy_attachment.lambda_attach: Refreshing state... [id=create-vault-entry-role-############################]
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# module.lambda.aws_lambda_function.lambda_function will be updated in-place
~ resource "aws_lambda_function" "lambda_function" {
id = "create-vault-entry"
~ last_modified = "2022-01-11T19:48:18.000+0000" -> (known after apply)
~ source_code_hash = "g/hash/hash=" -> "hash/hash"
tags = {}
# (18 unchanged attributes hidden)
# (2 unchanged blocks hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
Warning: Interpolation-only expressions are deprecated
on main.tf line 3, in locals:
3: vault_HOST = "${var.vault_HOST}",
Terraform 0.11 and earlier required all non-constant expressions to be
provided via interpolation syntax, but this pattern is now deprecated. To
silence this warning, remove the "${ sequence from the start and the }"
sequence from the end of this expression, leaving just the inner expression.
Template interpolation syntax is still used to construct strings from
expressions when the template includes multiple interpolation sequences or a
mixture of literal strings and interpolations. This deprecation applies only
to templates that consist entirely of a single interpolation sequence.
(and 5 more similar warnings elsewhere)
You appear to be converting lists to strings. The Lambda VPC subnet_ids and security_group_ids attributes expect a list, not a string. I'm really not sure how your current code is working without any errors being reported.
It looks like you need to change this:
vpc_subnet_ids = "${var.SUBNET_IDS}"
vpc_security_group_ids = "${var.SECURITY_GROUP_IDS}"
To this:
vpc_subnet_ids = var.SUBNET_IDS
vpc_security_group_ids = var.SECURITY_GROUP_IDS
First of all, I am surprised that I have found very few resources on Google that mention this issue with Terraform.
This is an essential feature for optimizing the cost of cloud instances though, so I'm probably missing out on a few things, thanks for your tips and ideas!
I want to create an instance and manage its start and stop daily, programmatically.
The resource "google_compute_resource_policy" seems to meet my use case. However, when I change the stop or start time, Terraform plans to destroy and recreate the instance... which I absolutely don't want!
The resource "google_compute_resource_policy" is attached to the instance via the argument resource_policies where it is specified: "Modifying this list will cause the instance to recreate."
I don't understand why Terraform handles this simple update so badly. It is true that it is not possible to update a scheduler, whereas it is perfectly possible to detach it manually from the instance, then to destroy it before recreating it with the new stop/start schedule and the attach to the instance again.
Is there a workaround without going through a null resource to run a gcloud script to do these steps?
I tried to add an "ignore_changes" lifecycle on the "resource_policies" argument of my instance, Terraform no longer wants to destroy my instance, but it gives me the following error:
Error when reading or editing ResourcePolicy: googleapi: Error 400: The resource_policy resource 'projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule' is already being used by 'projects/my-project-id/zones/europe-west1-b/instances/my-instance', resourceInUseByAnotherResource"
Here is my Terraform code
resource "google_compute_resource_policy" "instance_schedule" {
name = "my-instance-schedule"
region = var.region
description = "Start and stop instance"
instance_schedule_policy {
vm_start_schedule {
schedule = var.vm_start_schedule
}
vm_stop_schedule {
schedule = var.vm_stop_schedule
}
time_zone = "Europe/Paris"
}
}
resource "google_compute_instance" "my-instance" {
// ******** This is my attempted workaround ********
lifecycle {
ignore_changes = [resource_policies]
}
name = "my-instance"
machine_type = var.machine_type
zone = "${var.region}-b"
allow_stopping_for_update = true
resource_policies = [
google_compute_resource_policy.instance_schedule.id
]
boot_disk {
device_name = local.ref_name
initialize_params {
image = var.boot_disk_image
type = var.disk_type
size = var.disk_size
}
}
network_interface {
network = data.google_compute_network.default.name
access_config {
nat_ip = google_compute_address.static.address
}
}
}
If it can be useful, here is what the terraform apply returns
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
- destroy
-/+ destroy and then create replacement
Terraform will perform the following actions:
# google_compute_resource_policy.instance_schedule must be replaced
-/+ resource "google_compute_resource_policy" "instance_schedule" {
~ id = "projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule" -> (known after apply)
name = "my-instance-schedule"
~ project = "my-project-id" -> (known after apply)
~ region = "https://www.googleapis.com/compute/v1/projects/my-project-id/regions/europe-west1" -> "europe-west1"
~ self_link = "https://www.googleapis.com/compute/v1/projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule" -> (known after apply)
# (1 unchanged attribute hidden)
~ instance_schedule_policy {
# (1 unchanged attribute hidden)
~ vm_start_schedule {
~ schedule = "0 9 * * *" -> "0 8 * * *" # forces replacement
}
# (1 unchanged block hidden)
}
}
Plan: 1 to add, 0 to change, 1 to destroy.
Do you want to perform these actions in workspace "prd"?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
google_compute_resource_policy.instance_schedule: Destroying... [id=projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule]
Error: Error when reading or editing ResourcePolicy: googleapi: Error 400: The resource_policy resource 'projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule' is already being used by 'projects/my-project-id/zones/europe-west1-b/instances/my-instance', resourceInUseByAnotherResource
NB: I am working with Terraform 0.14.7 and I am using google provider version 3.76.0
An instance inside GCP can be power off without destroy it with the module google_compute_instance using the argument desired_status, keep in mind that if you are creating the instance for the first time this argument needs to be on “RUNNING”. This module can be used as the following.
resource "google_compute_instance" "default" {
name = "test"
machine_type = "f1-micro"
zone = "us-west1-a"
desired_status = "RUNNING"
}
You can also modify your “main.tf” file if you need to stop the VM first and then started creating a dependency in terraform with depends_on.
As you can see in the following comment, the service account will be created but the key will be assigned until the first sentence is done.
resource "google_service_account" "service_account" {
account_id = "terraform-test"
display_name = "Service Account"
}
resource "google_service_account_key" "mykey" {
service_account_id = google_service_account.service_account.id
public_key_type = "TYPE_X509_PEM_FILE"
depends_on = [google_service_account.service_account]
}
If the first component already exists, terraform only deploys the dependent.
I faced same problem with snapshot policy.
I controlled resource policy creation using a flag input variable and using count. For the first time, I created policy resource using flag as 'true'. When I want to change schedule time, I change the flag as 'false' and apply the plan. This will detach the resource.
I then make flag as 'true' again and apply the plan with new time.
This worked for me for snapshot policy. Hope it could solve yours too.
I solved the "resourceInUseByAnotherResource" error by adding the following lifecycle to the google_compute_resource_policy resource:
lifecycle {
create_before_destroy = true
}
Also, this requires to have a unique name with each change, otherwise, the new resource can't be created, because the resource with the same name already exists. So I appended a random ID to the end of the schedule name:
resource "random_pet" "schedule" {
keepers = {
start_schedule = "${var.vm_start_schedule}"
stop_schedule = "${var.vm_stop_schedule}"
}
}
...
resource "google_compute_resource_policy" "schedule" {
name = "schedule-${random_pet.schedule.id}"
...
lifecycle {
create_before_destroy = true
}
}
I have the following deploy.tf file:
provider "aws" {
region = "us-east-1"
}
provider "aws" {
alias = "us_west_1"
region = "us-west-2"
}
resource "aws_us_east_1" "my_test" {
# provider = "aws.us_east_1"
count = 1
ami = "ami-0820..."
instance_type = "t2.micro"
}
resource "aws_us_west_1" "my_test" {
provider = "aws.us_west_1"
count = 1
ami = "ami-0d74..."
instance_type = "t2.micro"
}
I am trying to use it to deploy 2 servers, one in each region. I keep getting errors like:
aws_us_east_1.narc_test: Provider doesn't support resource: aws_us_east_1
I have tried setting alias's for both provider blocks, and referring to the correct region in a number of different ways. I've read up on multi region support, and some answers suggest this can be accomplished with modules, however, this is a simple test, and I'd like to keep it simple. Is this currently possible?
Yes it can be used to create resources in different regions even inside just one file. There is no need to use modules for your test scenario.
Your error is caused by a typo probably. If you want to launch an ec2 instance the resource you wanna create is aws_instance and not aws_us_west_1 or aws_us_east_1.
Sure enough Terraform does not know this kind of resource since it does simply not exist. Change it to aws_instance and you should be good to go! Additionally you should probably name them differently to avoid double naming using my_test for both resources.
Step 1
Add region alias in the main.tf file where you gonna execute the terraform plan.
provider "aws" {
region = "eu-west-1"
alias = "main"
}
provider "aws" {
region = "us-east-1"
alias = "useast1"
}
Step 2
Add providers block inside your module definition block
module "lambda_edge_rule" {
providers = {
aws = aws.useast1
}
source = "../../../terraform_modules/lambda"
tags = var.tags
}
Step 3
Define "aws" as providers inside your module. ( source = ../../../terraform_modules/lambda")
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 2.7.0"
}
}
}
resource "aws_lambda_function" "lambda" {
function_name = "blablabla"
.
.
.
.
.
.
.
}
Note: Terraform version v1.0.5 as of now.