Terraform - Encrypting a db instance forces replacement

I have a postgres RDS instance in AWS that I created using terraform.
resource "aws_db_instance" "..." {
...
}
Now I'm trying to encrypt that instance by adding
resource "aws_db_instance" "..." {
...
storage_encrypted = true
}
But when I run terraform plan, it says that it's going to force a replacement:
# aws_db_instance.... must be replaced
...
~ storage_encrypted = false -> true # forces replacement
What can I do to prevent terraform from replacing my db instance?

Terraform is not at fault here. You simply cannot change the encryption setting of an RDS instance after it has been created. Instead, you need to take a snapshot of the current DB, copy and encrypt that snapshot, and then restore from the encrypted snapshot: https://aws.amazon.com/premiumsupport/knowledge-center/update-encryption-key-rds/
This will cause downtime for the DB, and Terraform does not do it for you automatically; you need to perform these steps manually. After the DB has been restored, Terraform should no longer try to replace it, since the expected config now matches the actual config.
Technically you can use ignore_changes on the storage_encrypted property, but of course that causes Terraform to simply ignore any storage-encryption changes.
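For reference, that ignore_changes approach looks roughly like this (the resource name and the other arguments are just placeholders):
resource "aws_db_instance" "..." {
  ...
  storage_encrypted = true

  lifecycle {
    # Terraform will no longer plan a replacement when this attribute
    # differs between the config and the real instance.
    ignore_changes = [storage_encrypted]
  }
}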

Related

Terraform resets credentials when importing existing RDS db resource

We have a bunch of existing resources on AWS that we want to bring under Terraform management. One of these resources is an RDS DB. So we wrote something like this:
resource "aws_db_instance" "db" {
engine = "postgres"
username = var.rds_username
password = var.rds_password
# other stuff...
}
variable "rds_username" {
type = string
default = "master_username"
}
variable "rds_password" {
type = string
default = "master_password"
sensitive = true
}
Note these are the existing master credentials. That's important. Then we did this:
terraform import aws_db_instance.db db-identifier
We then ran terraform plan a few times, tweaking the code to fit the existing resource until, finally, terraform plan indicated there were no changes to be made (which was the goal).
However, once we ran terraform apply, it reset the master credentials of the DB instance. Worse, other resources that had previously connected to that DB using these exact credentials suddenly can't anymore.
Why is this happening? Is terraform encrypting the password behind the scenes and not telling me? Why can't other services connect to the DB using the same credentials? Is there any way to import an RDS instance without resetting credentials?

Need to enable backup replication feature in AWS RDS through Terraform

I need to enable the backup replication feature for an AWS RDS Oracle instance through Terraform. Is there any attribute on the Terraform side for that particular feature?
The only argument on the Terraform side is aws_db_instance's replicate_source_db:
replicate_source_db - (Optional) Specifies that this resource is a Replicate database, and to use this value as the source database. This correlates to the identifier of another Amazon RDS Database to replicate (if replicating within a single region) or ARN of the Amazon RDS Database to replicate (if replicating cross-region). Note that if you are creating a cross-region replica of an encrypted database you will also need to specify a kms_key_id.
The replicate_source_db should have the ID or ARN of the source database.
resource "aws_db_instance" "oracle" {
# ... other arguments
}
resource "aws_db_instance" "oracle_replicant" {
# ... other arguments
replicate_source_db = aws_db_instance.oracle.id
}
Reference
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/oracle-read-replicas.html
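As noted in the documentation excerpt above, a cross-region replica of an encrypted database also needs a kms_key_id and must reference the source by ARN. A rough sketch under those assumptions (the provider alias, KMS key, identifier, and instance class below are made up for illustration):
resource "aws_db_instance" "oracle_replicant_cross_region" {
  provider            = aws.replica_region              # assumed provider alias for the target region
  identifier          = "oracle-cross-region-replica"   # placeholder identifier
  instance_class      = "db.m5.large"                   # placeholder instance class
  replicate_source_db = aws_db_instance.oracle.arn      # ARN, because the source is in another region
  kms_key_id          = aws_kms_key.replica_region.arn  # assumed KMS key in the replica's region
  skip_final_snapshot = true
}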

How to Promote Cloud SQL replica to primary using terraform so promoted instance should be in TF control

I am creating a GCP Cloud SQL instance using Terraform, with a cross-region Cloud SQL replica. I am testing the DR scenario: when DR happens, I promote the read replica to a primary instance using the gcloud API (as there is no setting/resource available in Terraform to promote a replica). Because I use the gcloud command, the promoted instance and the state file are out of sync, so afterwards the promoted instance is no longer under Terraform's control.
Cross-region replica setups become out of sync with the primary right after the promotion is complete. Promoting a replica is done manually and intentionally; it is not the same as high availability, where a standby instance (which is not a replica) automatically becomes the primary in case of a failure or zonal outage. You can promote the read replica manually using gcloud or the Google API, but either way the instance ends up out of sync with Terraform. So what you are looking for does not seem to be available when promoting a replica in Cloud SQL.
As a workaround, I would suggest promoting the replica to primary outside of Terraform and then importing the resource back into state, which resets the state file.
Promoting an instance to primary is not supported by Terraform's Google Cloud Provider, but there is an issue (which you should upvote if you care) to add support for this to the provider.
Here's how to work around the lack of support in the meantime. Assume you have the following minimal setup: an instance, a database, a user, and a read replica:
resource "google_sql_database_instance" "instance1" {
name = "old-primary"
region = "us-central1"
database_version = "POSTGRES_14"
}
resource "google_sql_database" "db" {
name = "test-db"
instance = google_sql_database_instance.instance1.name
}
resource "google_sql_user" "user" {
name = "test-user"
instance = google_sql_database_instance.instance1.name
password = var.db_password
}
resource "google_sql_database_instance" "instance2" {
name = "new-primary"
master_instance_name = google_sql_database_instance.instance1.name
region = "europe-west4"
database_version = "POSTGRES_14"
replica_configuration {
failover_target = false
}
}
Steps to follow:
You promote the replica out of band, either using the Console or the gcloud CLI.
Next you manually edit the state file:
# remove the old read-replica state; it's now the new primary
terraform state rm google_sql_database_instance.instance2
# import the new-primary as "instance1"
terraform state rm google_sql_database_instance.instance1
terraform import google_sql_database_instance.instance1 your-project-id/new-primary
# import the new-primary db as "db"
terraform state rm google_sql_database.db
terraform import google_sql_database.db your-project-id/new-primary/test-db
# import the new-primary user as "user"
terraform state rm google_sql_user.user
terraform import google_sql_user.user your-project-id/new-primary/test-user
Now you edit your terraform config to update the resources to match the state:
resource "google_sql_database_instance" "instance1" {
name = "new-primary" # this is the former replica's name
region = "europe-west4" # this is the former replica's region
database_version = "POSTGRES_14"
}
resource "google_sql_database" "db" {
name = "test-db"
instance = google_sql_database_instance.instance1.name
}
resource "google_sql_user" "user" {
name = "test-user"
instance = google_sql_database_instance.instance1.name
password = var.db_password
}
# instance2 has now been promoted and is now managed as "instance1", so the
# following block can be deleted.
# resource "google_sql_database_instance" "instance2" {
#   name                 = "new-primary"
#   master_instance_name = google_sql_database_instance.instance1.name
#   region               = "europe-west4"
#   database_version     = "POSTGRES_14"
#
#   replica_configuration {
#     failover_target = false
#   }
# }
Then you run terraform apply and see that only the user is updated in place with the existing password. (This happens because Terraform can't read the password back from the API, and it was removed as part of the promotion, so it has to be re-applied for Terraform's sake.)
What you do with your old primary is up to you; it's no longer managed by Terraform. So either delete it manually or re-import it.
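If you do want to re-import the old primary, a minimal sketch might look like this (the resource address "old_primary" and the project ID are placeholders):
# New resource block describing the old primary, matching its real settings.
resource "google_sql_database_instance" "old_primary" {
  name             = "old-primary"
  region           = "us-central1"
  database_version = "POSTGRES_14"
}
# Then bring it back into state:
# terraform import google_sql_database_instance.old_primary your-project-id/old-primary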
Caveats
Everyone's Terraform setup is different and so you'll probably have to iterate through the steps above until you reach the desired result.
Remember to use a testing environment first with lots of calls to terraform plan to see what's changing. Whenever a resource is marked for deletion, Terraform will report why.
Nonetheless, you can use the process above to work your way to a terraform setup that reflects a promoted read replica. And in the meantime, upvote the issue because if it gets enough attention, the Terraform team will prioritize it accordingly.

Terraform RDS: Restore data after destroy due to modification?

I am trying to create an RDS Aurora MySQL cluster in AWS using Terraform. However, I notice that any time I alter the cluster in a way that requires it to be replaced, all data is lost. I have configured it to take a final snapshot and would like to restore from that snapshot, or restore the original data through an alternative measure.
Example: Change Cluster -> TF Destroys the original cluster -> TF Replaces with new cluster -> Restore Data from original
I have attempted to use the same snapshot identifier for both aws_rds_cluster.snapshot_identifier and aws_rds_cluster.final_snapshot_identifier, but Terraform bombs because the final snapshot of the destroyed cluster doesn't yet exist.
I've also attempted to use the rds-finalsnapshot module, but it turns out it is primarily used for spinning environments up and down, preserving the data. i.e. Destroying an entire cluster, then recreating it as part of a separate deployment. (Module: https://registry.terraform.io/modules/connect-group/rds-finalsnapshot/aws/latest)
module "snapshot_maintenance" {
source="connect-group/rds-finalsnapshot/aws//modules/rds_snapshot_maintenance"
identifier = local.cluster_identifier
is_cluster = true
database_endpoint = element(aws_rds_cluster_instance.cluster_instance.*.endpoint, 0)
number_of_snapshots_to_retain = 3
}
resource "aws_rds_cluster" "provisioned_cluster" {
cluster_identifier = module.snapshot_maintenance.identifier
engine = "aurora-mysql"
engine_version = "5.7.mysql_aurora.2.10.0"
port = 1234
database_name = "example"
master_username = "example"
master_password = "example"
iam_database_authentication_enabled = true
storage_encrypted = true
backup_retention_period = 2
db_subnet_group_name = "example"
skip_final_snapshot = false
final_snapshot_identifier = module.snapshot_maintenance.final_snapshot_identifier
snapshot_identifier = module.snapshot_maintenance.snapshot_to_restore
vpc_security_group_ids = ["example"]
}
What I find is if a change requires destroy and recreation, I don't have a great way to restore the data as part of the same deployment.
I'll add that I don't think this is an issue with my code. It's more of a lifecycle limitation of TF. I believe I can't be the only person who wants to preserve the data in their cluster in the event TF determines the cluster must be recreated.
If I wanted to prevent loss of data due to a change to the cluster that results in a destroy, do I need to destroy the cluster outside of terraform or through the cli, sync up Terraform's state and then apply?
The solution ended up being rather simple, albeit obscure. I tried over 50 different approaches using combinations of existing resource properties, provisioners, null resources (with triggers) and external data blocks with AWS CLI commands and Powershell scripts.
The challenge here was that I needed to ensure the provisioning happened in this order to ensure no data loss:
Stop DMS replication tasks from replicating more data into the database.
Take a new snapshot of the cluster, once incoming data had been stopped.
Destroy and recreate the cluster, using the snapshot_identifier to specify the snapshot taken in the previous step.
Destroy and recreate the DMS tasks.
Of course these steps were based on how Terraform decided it needed to apply updates. It may determine it only needed to perform an in-place update; this wasn't my concern. I needed to handle scenarios where the resources were destroyed.
The final solution was to eliminate the use of external data blocks and go exclusively with local provisioners, because external data blocks would execute even when only running terraform plan. I used the local provisioners to tap into lifecycle events like "create" and "destroy" to ensure my Powershell scripts would only execute during terraform apply.
On my cluster, I set both final_snapshot_identifier and snapshot_identifier to the same value.
final_snapshot_identifier = local.snapshot_identifier
snapshot_identifier = data.external.check_for_first_run.result.isFirstRun == "true" ? null : local.snapshot_identifier
snapshot_identifier is only set after the first deployment; an external data block lets me check whether the resource already exists so I can build that condition. The condition is necessary because on a first deployment the snapshot won't exist, and Terraform would otherwise fail during the planning step.
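The check_for_first_run data source itself isn't shown here; a sketch of the shape it might take (the script name is made up, and only the isFirstRun output key comes from the snippet above) could be:
data "external" "check_for_first_run" {
  # Assumed helper script that prints JSON such as {"isFirstRun": "true"}
  # depending on whether the cluster/snapshot already exists.
  program = ["PowerShell", "-File", "/powershell_scripts/check_for_first_run.ps1"]
}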
Then I execute a Powershell script in a local provisioner on the "destroy" to stop any DMS tasks and then delete the snapshot by the name of local.snapshot_identifier.
provisioner "local-exec" {
when = destroy
# First, stop the inflow of data to the cluster by stopping the dms tasks.
# Next, we've tricked TF into thinking the snapshot we want to use is there by using the same name for old and new snapshots, but before we destroy the cluster, we need to delete the original.
# Then TF will create the final snapshot immediately following the execution of the below script and it will be used to restore the cluster since we've set it as snapshot_identifier.
command = "/powershell_scripts/stop_dms_tasks.ps1; aws rds delete-db-cluster-snapshot --db-cluster-snapshot-identifier benefitsystem-cluster"
interpreter = ["PowerShell"]
}
This clears out the last snapshot and allows Terraform to create a new final snapshot by the same name as the original, just in time to be used to restore from.
Now, I can run Terraform the first time and get a brand-new cluster. All subsequent deployments will use the final snapshot to restore from and data is preserved.

terraform create k8s secret from gcp secret

I have managed to achieve a flow for creating sensitive resources in Terraform without revealing the sensitive values at any point, so they are never stored in plain text in our GitHub repo. I have done this by letting TF create a service account and its associated SA key, and then creating a GCP secret that references the output from the SA key, for example.
I now want to see if there's any way to do the same for some pre-defined database passwords. The flow will be slightly different:
Manually create the GCP secret (in Secret Manager), whose value is a list of plain-text database passwords that our PgBouncer instance will use (more info later in the flow)
I import this using terraform import, so Terraform state is now aware of this resource even though it was created outside of TF, but in the secret version I've set secret_data = "" (otherwise putting the plain-text password details here would defeat the object!)
I now want to grab the secret_data from the google_secret_manager_secret_version to add into the kubernetes_secret so it can be used within our GKE cluster.
However, when I run terraform plan, it wants to change the value of my manually created GCP secret:
# google_secret_manager_secret_version.pgbouncer-secret-uat-v1 must be replaced
-/+ resource "google_secret_manager_secret_version" "pgbouncer-secret-uat-v1" {
~ create_time = "2021-08-26T14:42:58.279432Z" -> (known after apply)
+ destroy_time = (known after apply)
~ id = "projects/********/secrets/pgbouncer-secret-uat/versions/1" -> (known after apply)
~ name = "projects/********/secrets/pgbouncer-secret-uat/versions/1" -> (known after apply)
~ secret = "projects/********/secrets/pgbouncer-secret-uat" -> "projects/*******/secrets/pgbouncer-secret-uat" # forces replacement
- secret_data = (sensitive value) # forces replacement
Any ideas how I can get round this? I want to import the Google secret version to use in Kubernetes but not set the secret_data value in the resource, as I don't want it to overwrite what I created manually. Here is the relevant Terraform config I'm talking about:
resource "google_secret_manager_secret" "pgbouncer-secret-uat" {
provider = google-beta
secret_id = "pgbouncer-secret-uat"
replication {
automatic = true
}
depends_on = [google_project_service.secretmanager]
}
resource "google_secret_manager_secret_version" "pgbouncer-secret-uat-v1" {
provider = google-beta
secret = google_secret_manager_secret.pgbouncer-secret-uat.id
secret_data = ""
}
If you just want to retrieve/READ the secret without actively managing it, then you can use the associated data instead:
data "google_secret_manager_secret_version" "pgbouncer-secret-uat-v1" {
provider = google-beta
secret = google_secret_manager_secret.pgbouncer-secret-uat.id
}
You can then use the value in your Kubernetes cluster as a secret with the data's exported resource attribute: data.google_secret_manager_secret_version.pgbouncer-secret-uat-v1.secret_data. Note that the provider will likely mark this exported resource attribute as sensitive, and that carries the normal consequences with it.
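For illustration, a minimal sketch of wiring that attribute into a Kubernetes secret (the kubernetes provider is assumed to be configured, and the secret name and key below are placeholders):
resource "kubernetes_secret" "pgbouncer" {
  metadata {
    name = "pgbouncer-secret-uat"
  }

  data = {
    # The provider base64-encodes values supplied via "data" automatically.
    "pgbouncer-passwords" = data.google_secret_manager_secret_version.pgbouncer-secret-uat-v1.secret_data
  }
}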
You can also check the full documentation for more information.
Additionally, since you imported the resource into your state to manage it within your Terraform config, you will need to remove it from your state concurrently with removing it from your config. You can do that most easily with terraform state rm google_secret_manager_secret_version.pgbouncer-secret-uat-v1. Then you may safely remove the resource from your config and solely include the data in your config.