How to fix LimitExceeded issue in AWS with terraform? - amazon-web-services

When I run terraform apply, I get this error:
running "...terraform apply" in ".../aws-config": exit status 1
module.aws_config_role.aws_iam_policy.default: Modifying... [id=arn:aws:iam::10391031030:policy/my-aws-config-role]
Error: Error updating IAM policy arn:aws:iam::10391031030:policy/my-aws-config-role: LimitExceeded: Cannot exceed quota for PolicySize: 6144
status code: 409, request id: 313b0l20-a29d-0121-la32-01210d0103c01
This run updates the IAM policy and adds many new items, as seen in the terraform plan output:
~ update in-place
Terraform will perform the following actions:
# module.aws_config_role.aws_iam_policy.default will be updated in-place
~ resource "aws_iam_policy" "default" {
arn = "arn:aws:iam::10391031030:policy/my-aws-config-role"
id = "arn:aws:iam::10391031030:policy/my-aws-config-role"
name = "my-aws-config-role"
path = "/"
~ policy = jsonencode(
~ {
~ Statement = [
~ {
~ Action = [
"..." // old
"..." // old
"..." // old
+ "..." // new
+ "..." // new
+ "..." // new
+ "..." // new
+ "..." // new
...
Plan: 0 to add, 1 to change, 0 to destroy.
This change appeared after an AWS provider version upgrade. How can I avoid it? Should I ignore this resource, or is it possible to extend AWS's policy size limit?

The 6,144-character limit for a managed policy is imposed by AWS, not Terraform, so you have to fix or work around it on the AWS side, not in Terraform.
You can follow official AWS recommendations:
How can I increase the default managed policies or character size limit for an IAM role or user?
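Common workarounds are to consolidate statements with wildcard actions, or to split the document into several smaller managed policies attached to the same role. A sketch of the splitting approach (the variable, role, and naming scheme here are illustrative assumptions, not from the original question):

```hcl
# Instead of one oversized aws_iam_policy, split the statements into
# several smaller managed policies and attach each one to the role.
resource "aws_iam_policy" "config_part" {
  count  = length(var.policy_documents)      # hypothetical list of JSON documents
  name   = "my-aws-config-role-part-${count.index}"
  policy = var.policy_documents[count.index] # each must stay under 6,144 characters
}

resource "aws_iam_role_policy_attachment" "config_part" {
  count      = length(var.policy_documents)
  role       = aws_iam_role.config.name      # hypothetical role
  policy_arn = aws_iam_policy.config_part[count.index].arn
}
```

Note that AWS also limits how many managed policies can be attached to a single role (10 by default), so check that quota before splitting aggressively.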

Related

Terraform always updating iam_policy_document's condition block

I have an aws_iam_policy_document data source with a condition block. In Terraform, condition blocks have a required argument named values, which is the set of possible values for the condition. In my case, however, the variable is a boolean, so the set is just one value, ["false"].
When I run terraform apply and then later run either terraform plan or terraform apply, Terraform always says it will update this value from "false" to ["false"]. Is there a way to get it to stop updating this policy when nothing has actually changed?
Reference: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document#values
I am using v0.15.1 on AWS.
Here is the output of terraform plan when no Terraform code has changed and no manual manipulation of the policy on AWS has happened. Notice it is updating the policy.
Terraform will perform the following actions:
# module.s3_test_bucket.aws_s3_bucket.main will be updated in-place
~ resource "aws_s3_bucket" "main" {
id = "test-for-stackoverflow"
~ policy = jsonencode(
~ {
~ Statement = [
~ {
~ Condition = {
~ Bool = {
~ aws:SecureTransport = "false" -> [
+ "false",
]
}
}
[...]
When you set aws:SecureTransport to an array of one element, AWS normalizes it to a plain string, so there is a permanent diff between the deployed configuration and your Terraform configuration file.
Change the value from an array to a string:
aws:SecureTransport = ["false"]
to
aws:SecureTransport = "false"
And it should solve it.
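For context, here is a minimal sketch of the corrected bucket policy; the resource names, Sid, and the Deny statement are illustrative assumptions, not from the original question:

```hcl
# Sketch of the fix: write the condition value as a plain string.
resource "aws_s3_bucket_policy" "main" {
  bucket = aws_s3_bucket.main.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid       = "DenyInsecureTransport"
      Effect    = "Deny"
      Principal = "*"
      Action    = "s3:*"
      Resource  = ["${aws_s3_bucket.main.arn}/*"]
      Condition = {
        Bool = {
          # a plain string matches what AWS stores, so there is no
          # perpetual "false" -> ["false"] diff on subsequent plans
          "aws:SecureTransport" = "false"
        }
      }
    }]
  })
}
```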

What happens if running "terraform apply" twice? (Terraform)

What happens if running "terraform apply" twice?
Does it create all the resources twice?
The first time you run terraform apply against an entirely new configuration, Terraform will propose to create new objects corresponding with each of the resource instances you declared in the configuration. If you accept the plan and thus allow Terraform to really apply it, Terraform will create each of those objects and record information about them in the Terraform state.
If you then run terraform apply again, Terraform will compare your configuration with the state to see if there are any differences. This time, Terraform will propose changes only if the configuration doesn't match the existing objects that are recorded in the state. If you accept that plan then Terraform will take each of the actions it proposed, which can be a mixture of different action types: update, create, destroy.
This means that in order to use Terraform successfully you need to make sure to keep the state snapshots safe between Terraform runs. With no special configuration at all Terraform will by default save the state in a local file called terraform.tfstate, but when you are using Terraform in production you'll typically use remote state, which is a way to tell Terraform to store state snapshots in a remote data store separate from the computer where you are running Terraform. By storing the state in a location that all of your coworkers can access, you can collaborate together.
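As a minimal sketch of configuring remote state (the bucket, key, and table names below are placeholders, not from the original answer), an S3 backend block looks like this:

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"    # hypothetical bucket holding state snapshots
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"       # optional: enables state locking
    encrypt        = true
  }
}
```

With this in place, every run of terraform apply reads and writes the same remote snapshot, so coworkers see a consistent view of the infrastructure.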
If you use Terraform Cloud, a complementary hosted service provided by HashiCorp, you can configure Terraform to store the state snapshots in Terraform Cloud itself. Terraform Cloud has various other capabilities too, such as running Terraform in a remote execution environment so that everyone who uses that environment can be sure to run Terraform with a consistent set of environment variables stored remotely.
The first time you run the terraform apply command, it creates the resources shown in the terraform plan.
If you run terraform apply a second time, it checks whether each resource already exists; if it does, no duplicate resource is created.
Before running terraform apply the second time, you can run terraform plan to get the list of resources that would be created, changed, or deleted.
Apr, 2022 Update:
The first run of "terraform apply" creates (adds) resources.
The second or later runs of "terraform apply" create (add), update (change), or delete (destroy) existing resources if there are changes to them. In general, changing a mutable value of an existing resource updates it in place, while changing an immutable value deletes the resource and then recreates it rather than updating it.
*A mutable value is a value which can change after the resource is created.
*An immutable value is a value which cannot change after the resource is created.
For example, I create(add) the Cloud Storage bucket "kai_bucket" with the Terraform code below:
resource "google_storage_bucket" "bucket" {
name = "kai_bucket"
location = "ASIA-NORTHEAST1"
force_destroy = true
uniform_bucket_level_access = true
}
So, do the first run of the command below:
terraform apply -auto-approve
Then, one resource "kai_bucket" is created (added) as shown below:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# google_storage_bucket.bucket will be created
+ resource "google_storage_bucket" "bucket" {
+ force_destroy = true
+ id = (known after apply)
+ location = "ASIA-NORTHEAST1"
+ name = "kai_bucket"
+ project = (known after apply)
+ self_link = (known after apply)
+ storage_class = "STANDARD"
+ uniform_bucket_level_access = true
+ url = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
google_storage_bucket.bucket: Creating...
google_storage_bucket.bucket: Creation complete after 1s [id=kai_bucket]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Now, I change the mutable value "uniform_bucket_level_access" from "true" to "false":
resource "google_storage_bucket" "bucket" {
name = "kai_bucket"
location = "ASIA-NORTHEAST1"
force_destroy = true
uniform_bucket_level_access = false # Here
}
Then, do the second run of the command below:
terraform apply -auto-approve
Then, "uniform_bucket_level_access" is updated (changed) from "true" to "false" as shown below:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# google_storage_bucket.bucket will be updated in-place
~ resource "google_storage_bucket" "bucket" {
id = "kai_bucket"
name = "kai_bucket"
~ uniform_bucket_level_access = true -> false
# (9 unchanged attributes hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
google_storage_bucket.bucket: Modifying... [id=kai_bucket]
google_storage_bucket.bucket: Modifications complete after 1s [id=kai_bucket]
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
Now, I change the immutable value "location" from "ASIA-NORTHEAST1" to "US-EAST1":
resource "google_storage_bucket" "bucket" {
name = "kai_bucket"
location = "US-EAST1" # Here
force_destroy = true
uniform_bucket_level_access = false
}
Then, do the third run of the command below:
terraform apply -auto-approve
Then, one resource "kai_bucket" with "ASIA-NORTHEAST1" is deleted (destroyed), and one resource "kai_bucket" with "US-EAST1" is created (added), as shown below:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement
Terraform will perform the following actions:
# google_storage_bucket.bucket must be replaced
-/+ resource "google_storage_bucket" "bucket" {
- default_event_based_hold = false -> null
~ id = "kai_bucket" -> (known after apply)
- labels = {} -> null
~ location = "ASIA-NORTHEAST1" -> "US-EAST1" # forces replacement
name = "kai_bucket"
~ project = "myproject-272234" -> (known after apply)
- requester_pays = false -> null
~ self_link = "https://www.googleapis.com/storage/v1/b/kai_bucket" -> (known after apply)
~ url = "gs://kai_bucket" -> (known after apply)
# (3 unchanged attributes hidden)
}
Plan: 1 to add, 0 to change, 1 to destroy.
google_storage_bucket.bucket: Destroying... [id=kai_bucket]
google_storage_bucket.bucket: Destruction complete after 1s
google_storage_bucket.bucket: Creating...
google_storage_bucket.bucket: Creation complete after 1s [id=kai_bucket]
Apply complete! Resources: 1 added, 0 changed, 1 destroyed.

Terraform wants to replace Google compute engine if its start/stop scheduler is modified

First of all, I am surprised that I have found very few resources on Google that mention this issue with Terraform.
This is an essential feature for optimizing the cost of cloud instances though, so I'm probably missing out on a few things, thanks for your tips and ideas!
I want to create an instance and manage its start and stop daily, programmatically.
The resource "google_compute_resource_policy" seems to meet my use case. However, when I change the stop or start time, Terraform plans to destroy and recreate the instance... which I absolutely don't want!
The resource "google_compute_resource_policy" is attached to the instance via the argument resource_policies where it is specified: "Modifying this list will cause the instance to recreate."
I don't understand why Terraform handles this simple update so badly. It is true that a scheduler cannot be updated in place, but it is perfectly possible to detach it manually from the instance, destroy it, recreate it with the new stop/start schedule, and then attach it to the instance again.
Is there a workaround without going through a null resource to run a gcloud script to do these steps?
I tried adding an "ignore_changes" lifecycle rule on the "resource_policies" argument of my instance. Terraform no longer wants to destroy my instance, but it gives me the following error:
Error when reading or editing ResourcePolicy: googleapi: Error 400: The resource_policy resource 'projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule' is already being used by 'projects/my-project-id/zones/europe-west1-b/instances/my-instance', resourceInUseByAnotherResource"
Here is my Terraform code
resource "google_compute_resource_policy" "instance_schedule" {
name = "my-instance-schedule"
region = var.region
description = "Start and stop instance"
instance_schedule_policy {
vm_start_schedule {
schedule = var.vm_start_schedule
}
vm_stop_schedule {
schedule = var.vm_stop_schedule
}
time_zone = "Europe/Paris"
}
}
resource "google_compute_instance" "my-instance" {
// ******** This is my attempted workaround ********
lifecycle {
ignore_changes = [resource_policies]
}
name = "my-instance"
machine_type = var.machine_type
zone = "${var.region}-b"
allow_stopping_for_update = true
resource_policies = [
google_compute_resource_policy.instance_schedule.id
]
boot_disk {
device_name = local.ref_name
initialize_params {
image = var.boot_disk_image
type = var.disk_type
size = var.disk_size
}
}
network_interface {
network = data.google_compute_network.default.name
access_config {
nat_ip = google_compute_address.static.address
}
}
}
In case it is useful, here is what terraform apply returns:
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
- destroy
-/+ destroy and then create replacement
Terraform will perform the following actions:
# google_compute_resource_policy.instance_schedule must be replaced
-/+ resource "google_compute_resource_policy" "instance_schedule" {
~ id = "projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule" -> (known after apply)
name = "my-instance-schedule"
~ project = "my-project-id" -> (known after apply)
~ region = "https://www.googleapis.com/compute/v1/projects/my-project-id/regions/europe-west1" -> "europe-west1"
~ self_link = "https://www.googleapis.com/compute/v1/projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule" -> (known after apply)
# (1 unchanged attribute hidden)
~ instance_schedule_policy {
# (1 unchanged attribute hidden)
~ vm_start_schedule {
~ schedule = "0 9 * * *" -> "0 8 * * *" # forces replacement
}
# (1 unchanged block hidden)
}
}
Plan: 1 to add, 0 to change, 1 to destroy.
Do you want to perform these actions in workspace "prd"?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
google_compute_resource_policy.instance_schedule: Destroying... [id=projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule]
Error: Error when reading or editing ResourcePolicy: googleapi: Error 400: The resource_policy resource 'projects/my-project-id/regions/europe-west1/resourcePolicies/my-instance-schedule' is already being used by 'projects/my-project-id/zones/europe-west1-b/instances/my-instance', resourceInUseByAnotherResource
NB: I am working with Terraform 0.14.7 and I am using google provider version 3.76.0
A GCP instance can be powered off without destroying it by using the desired_status argument of the google_compute_instance resource; keep in mind that if you are creating the instance for the first time, this argument needs to be set to "RUNNING". It can be used as follows:
resource "google_compute_instance" "default" {
name = "test"
machine_type = "f1-micro"
zone = "us-west1-a"
desired_status = "RUNNING"
}
You can also modify your main.tf file if you need one resource to wait for another, by declaring a dependency in Terraform with depends_on.
As the following example shows, the service account is created first, and the key is only assigned once the service account exists.
resource "google_service_account" "service_account" {
account_id = "terraform-test"
display_name = "Service Account"
}
resource "google_service_account_key" "mykey" {
service_account_id = google_service_account.service_account.id
public_key_type = "TYPE_X509_PEM_FILE"
depends_on = [google_service_account.service_account]
}
If the first resource already exists, Terraform only deploys the dependent one.
I faced the same problem with a snapshot policy.
I controlled resource policy creation with a boolean input variable and count. Initially I created the policy resource with the flag set to 'true'. When I want to change the schedule time, I set the flag to 'false' and apply the plan, which detaches and destroys the policy.
I then set the flag back to 'true' and apply the plan with the new time.
This worked for my snapshot policy; hope it solves yours too.
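The flag-and-count approach described above can be sketched as follows; the variable name and two-apply workflow are assumptions based on the description, not the answerer's exact code:

```hcl
variable "enable_schedule" {
  type    = bool
  default = true
}

resource "google_compute_resource_policy" "instance_schedule" {
  # Setting enable_schedule = false makes count 0, so an apply destroys
  # the policy (detaching it). Set it back to true with the new schedule
  # and apply again to recreate and reattach it.
  count  = var.enable_schedule ? 1 : 0
  name   = "my-instance-schedule"
  region = var.region

  instance_schedule_policy {
    vm_start_schedule {
      schedule = var.vm_start_schedule
    }
    vm_stop_schedule {
      schedule = var.vm_stop_schedule
    }
    time_zone = "Europe/Paris"
  }
}
```

The instance's resource_policies argument then has to follow the flag as well, e.g. resource_policies = var.enable_schedule ? [google_compute_resource_policy.instance_schedule[0].id] : [].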
I solved the "resourceInUseByAnotherResource" error by adding the following lifecycle to the google_compute_resource_policy resource:
lifecycle {
create_before_destroy = true
}
Also, this approach requires a unique name for each change; otherwise the new resource can't be created, because a resource with the same name already exists. So I appended a random ID to the end of the schedule name:
resource "random_pet" "schedule" {
keepers = {
start_schedule = "${var.vm_start_schedule}"
stop_schedule = "${var.vm_stop_schedule}"
}
}
...
resource "google_compute_resource_policy" "schedule" {
name = "schedule-${random_pet.schedule.id}"
...
lifecycle {
create_before_destroy = true
}
}

Make Terraform ignore the order of list items returned from the service

I'm creating an AWS IAM policy that grants access to a resource to a number of remote accounts. I've got these accounts in a list - all good. However, when Terraform refreshes the current state on a subsequent plan, the list comes back in a different order and Terraform thinks it must be corrected. How can I ignore the list order?
This is my resource:
resource "aws_ecr_repository_policy" "repo" {
policy = jsonencode({
Statement = [
{
Principal = {
AWS = [
"arn:aws:iam::123456789012:root",
"arn:aws:iam::567890123456:root",
"arn:aws:iam::987654321098:root",
]
...
Now on subsequent terraform plan runs I get some variations of this:
~ {
~ Principal = {
~ AWS = [
+ "arn:aws:iam::987654321098:root", <<< swapped order
"arn:aws:iam::123456789012:root",
"arn:aws:iam::567890123456:root",
- "arn:aws:iam::987654321098:root", <<< and here
]
}
AWS returns the list in an unpredictable order that changes each time. Can I somehow ignore the order? Ideally without ignoring the whole policy block with lifecycle / ignore_changes.
This was an issue in terraform-provider-aws (fixed in provider version v4.23.0); see
https://github.com/hashicorp/terraform-provider-aws/issues/22274
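The fix, then, is to require a provider version that includes the patch; a sketch of the corresponding required_providers block:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      # v4.23.0 is the first release containing the fix for issue #22274
      version = ">= 4.23.0"
    }
  }
}
```

After upgrading, run terraform init -upgrade so the new provider version is actually installed, then terraform plan to confirm the spurious ordering diff is gone.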

Terraform - Updating S3 Access Control: Question on replacing acl with grant

I have an S3 bucket which is used as Access logging bucket.
Here is my current module and resource TF code for that:
module "access_logging_bucket" {
source = "../../resources/s3_bucket"
environment = "${var.environment}"
region = "${var.region}"
acl = "log-delivery-write"
encryption_key_alias = "alias/ab-data-key"
name = "access-logging"
name_tag = "Access logging bucket"
}
resource "aws_s3_bucket" "default" {
bucket = "ab-${var.environment}-${var.name}-${random_id.bucket_suffix.hex}"
acl = "${var.acl}"
depends_on = [data.template_file.dependencies]
tags = {
name = "${var.name_tag}"
. . .
}
lifecycle {
ignore_changes = [ "server_side_encryption_configuration" ]
}
}
The default value of the acl variable in my case is variable "acl" { default = "private" }, as also stated in the Terraform S3 bucket attribute reference doc.
And for this bucket it is set to log-delivery-write.
I want to update it to add following grants and remove acl as they conflict with each other:
grant {
permissions = ["READ_ACP", "WRITE"]
type = "Group"
uri = "http://acs.amazonaws.com/groups/s3/LogDelivery"
}
grant {
id = data.aws_canonical_user_id.current.id
permissions = ["FULL_CONTROL"]
type = "CanonicalUser"
}
My Questions are:
Does removing the acl attribute and adding the above-mentioned grants still maintain the correct access control for the bucket? That is, is that grant configuration still suitable for an access logging bucket?
If I remove acl from the resource config, the bucket will fall back to private, the default value. Is that the correct thing to do, or should it be set to null or something else?
While checking the documentation for the Log Delivery group, I found the following, which leads me to think I can go ahead with replacing the acl with the grants I mentioned:
Log Delivery group – Represented by
http://acs.amazonaws.com/groups/s3/LogDelivery . WRITE permission on a
bucket enables this group to write server access logs (see Amazon S3
server access logging) to the bucket. When using ACLs, a grantee can
be an AWS account or one of the predefined Amazon S3 groups.
Based on the grant-log-delivery-permissions-general documentation, I went ahead and ran the terraform apply.
On the first run it set the bucket owner permission correctly but removed the S3 log delivery group. So I ran terraform plan again, and it showed the following acl grant differences. Most likely the first apply updated the acl value, which removed the grant for the log delivery group.
I then re-ran terraform apply, and it corrected the log delivery group as well.
# module.buckets.module.access_logging_bucket.aws_s3_bucket.default will be updated in-place
~ resource "aws_s3_bucket" "default" {
acl = "private"
bucket = "ml-mxs-stage-access-logging-9d8e94ff"
force_destroy = false
. . .
tags = {
"name" = "Access logging bucket"
. . .
}
+ grant {
+ permissions = [
+ "READ_ACP",
+ "WRITE",
]
+ type = "Group"
+ uri = "http://acs.amazonaws.com/groups/s3/LogDelivery"
}
+ grant {
+ id = "ID_VALUE"
+ permissions = [
+ "FULL_CONTROL",
]
+ type = "CanonicalUser"
}
. . .
}
Plan: 0 to add, 1 to change, 0 to destroy.