Creating RDS Instances from Snapshot Using Terraform

I am working on a Terraform project that creates an RDS cluster from the most recent production DB snapshot:
# Get latest snapshot from production DB
data "aws_db_snapshot" "db_snapshot" {
  most_recent            = true
  db_instance_identifier = "${var.db_instance_to_clone}"
}

# Create RDS instance from snapshot
resource "aws_db_instance" "primary" {
  identifier                = "${var.app_name}-primary"
  snapshot_identifier       = "${data.aws_db_snapshot.db_snapshot.id}"
  instance_class            = "${var.instance_class}"
  vpc_security_group_ids    = ["${var.security_group_id}"]
  skip_final_snapshot       = true
  final_snapshot_identifier = "snapshot"
  parameter_group_name      = "${var.parameter_group_name}"
  publicly_accessible       = true

  timeouts {
    create = "2h"
  }
}
The issue with this approach is that subsequent runs of the Terraform code (once another snapshot has been taken) want to re-create the primary RDS instance (and subsequently, the read replicas) from the latest snapshot of the DB. I was thinking of something along the lines of a boolean count parameter that indicates a first run, but setting count = 0 on the snapshot data source causes issues with the snapshot_identifier parameter of the DB resource. Likewise, setting count = 0 on the DB resource would tell Terraform to destroy the DB.
The use case for this is to be able to change other aspects of the production infrastructure that this Terraform plan manages without having to re-create the entire RDS cluster, which is a very time-consuming resource to destroy and create.

Try placing an ignore_changes lifecycle block within your aws_db_instance definition:
lifecycle {
  ignore_changes = [
    snapshot_identifier,
  ]
}
This will cause Terraform to only look for changes to the database's snapshot_identifier upon initial creation.
If the database already exists, Terraform will ignore any changes to the existing database's snapshot_identifier field -- even if a new snapshot has been created since then.
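For context, here is a minimal sketch of where that block would sit in the resource from the question. It is trimmed to the relevant arguments and assumes the same variable names as the question; the variable declarations are added only so the fragment stands on its own.

```hcl
# Variable names follow the question; the declarations are assumed.
variable "db_instance_to_clone" { type = string }
variable "app_name" { type = string }
variable "instance_class" { type = string }

# Look up the most recent snapshot of the production instance.
data "aws_db_snapshot" "db_snapshot" {
  most_recent            = true
  db_instance_identifier = var.db_instance_to_clone
}

resource "aws_db_instance" "primary" {
  identifier          = "${var.app_name}-primary"
  snapshot_identifier = data.aws_db_snapshot.db_snapshot.id
  instance_class      = var.instance_class
  skip_final_snapshot = true

  lifecycle {
    # The snapshot is only used when the instance is first created.
    # A newer snapshot showing up in the data source later no longer
    # forces the instance to be replaced.
    ignore_changes = [snapshot_identifier]
  }
}
```

With this in place, rebuilding from a newer snapshot becomes a deliberate action, for example by tainting the instance or, on newer Terraform versions, running terraform apply -replace=aws_db_instance.primary.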

Related

How can I configure Terraform to update a GCP compute engine instance template without destroying and re-creating?

I have a service deployed on GCP compute engine. It consists of a compute engine instance template, instance group, instance group manager, and load balancer + associated forwarding rules etc.
We're forced into using compute engine rather than Cloud Run or some other serverless offering due to the need for docker-in-docker for the service in question.
The deployment is managed by terraform. I have a config that looks something like this:
data "google_compute_image" "debian_image" {
  family  = "debian-11"
  project = "debian-cloud"
}

resource "google_compute_instance_template" "my_service_template" {
  name         = "my_service"
  machine_type = "n1-standard-1"

  disk {
    source_image = data.google_compute_image.debian_image.self_link
    auto_delete  = true
    boot         = true
  }
  ...
  metadata_startup_script = data.local_file.startup_script.content

  metadata = {
    MY_ENV_VAR = var.whatever
  }
}

resource "google_compute_region_instance_group_manager" "my_service_mig" {
  version {
    instance_template = google_compute_instance_template.my_service_template.id
    name              = "primary"
  }
  ...
}

resource "google_compute_region_backend_service" "my_service_backend" {
  ...
  backend {
    group = google_compute_region_instance_group_manager.my_service_mig.instance_group
  }
}

resource "google_compute_forwarding_rule" "my_service_frontend" {
  depends_on = [
    google_compute_region_instance_group_manager.my_service_mig,
  ]
  name            = "my_service_ilb"
  backend_service = google_compute_region_backend_service.my_service_backend.id
  ...
}
I'm running into issues where Terraform is unable to perform any kind of update to this service without running into conflicts. It seems that instance templates are immutable in GCP, and doing anything like updating the startup script, adding an env var, or similar forces it to be deleted and re-created.
Terraform prints info like this in that situation:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # module.connectors_compute_engine.google_compute_instance_template.airbyte_translation_instance1 must be replaced
-/+ resource "google_compute_instance_template" "my_service_template" {
      ~ id       = "projects/project/..." -> (known after apply)
      ~ metadata = { # forces replacement
          + "TEST" = "test"
            # (1 unchanged element hidden)
        }
The only solution I've found for getting out of this situation is to delete the entire service and all associated entities, from the load balancer down to the instance template, and re-create them.
Is there some way to avoid this situation so that I'm able to change the instance template without having to manually update all the Terraform config twice? At this point I'm even fine if it ends up creating some downtime for the service in question, rather than a full rolling update or something, since that's what's happening now anyway.

I ran into this issue as well.
According to the provider documentation:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#using-with-instance-group-manager
Instance Templates cannot be updated after creation with the Google
Cloud Platform API. In order to update an Instance Template, Terraform
will destroy the existing resource and create a replacement. In order
to effectively use an Instance Template resource with an Instance
Group Manager resource, it's recommended to specify
create_before_destroy in a lifecycle block. Either omit the Instance
Template name attribute, or specify a partial name with name_prefix.
I would also test a plan with this lifecycle meta-argument:
+ lifecycle {
+   prevent_destroy = true
+ }
}
Or, more realistically in your specific case, something like:
resource "google_compute_instance_template" "my_service_template" {
  # name_prefix instead of a fixed name, so the replacement template
  # can exist alongside the old one while it is being swapped in
  name_prefix = "my-service-"
  ...
+ lifecycle {
+   create_before_destroy = true
+ }
}
So run terraform plan with either create_before_destroy = true or prevent_destroy = true on the google_compute_instance_template before running terraform apply, to see the results.
As a last resort, you can remove google_compute_instance_template.my_service_template from the state file and import it back.
Some suggested workarounds in this thread:
terraform lifecycle prevent destroy
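To make the documentation's recommendation concrete, here is a hedged sketch of how the template and the managed instance group could be wired together. The name prefix, region, network, image shorthand, and target size are placeholder assumptions, not values from the question; the elided parts of the original config (startup script, backend service, forwarding rule) are left out.

```hcl
# Assumed declaration so the fragment stands on its own.
variable "whatever" { type = string }

resource "google_compute_instance_template" "my_service_template" {
  # A prefix rather than a fixed name: Terraform appends a unique suffix,
  # so the new template can be created while the old one still exists.
  name_prefix  = "my-service-"
  machine_type = "n1-standard-1"

  disk {
    source_image = "debian-cloud/debian-11"
    auto_delete  = true
    boot         = true
  }

  network_interface {
    network = "default" # placeholder network
  }

  metadata = {
    MY_ENV_VAR = var.whatever
  }

  lifecycle {
    # Create the replacement template first, repoint the group manager
    # at it, and only then delete the old template.
    create_before_destroy = true
  }
}

resource "google_compute_region_instance_group_manager" "my_service_mig" {
  name               = "my-service-mig" # placeholder
  base_instance_name = "my-service"     # placeholder
  region             = "us-central1"    # placeholder
  target_size        = 1

  version {
    name              = "primary"
    instance_template = google_compute_instance_template.my_service_template.id
  }
}
```

The design point is the ordering: because the old template is still referenced by the group manager, it cannot be deleted first. With create_before_destroy the group manager is updated in place to the new template's id before the old template is removed.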

How to recreate aws_rds_cluster in Terraform

I am trying to create an encrypted version of my existing unencrypted aws_rds_cluster by updating my resource. I added:
kms_key_id        = "mykmskey"
storage_encrypted = true
This is what my resource should look like:
resource "aws_rds_cluster" "my_rds_cluster" {
  cluster_identifier      = "${var.service_name}-rds-cluster"
  database_name           = var.db_name
  master_username         = var.db_username
  master_password         = random_password.db_password.result
  engine                  = var.db_engine
  engine_version          = var.db_engine_version
  kms_key_id              = "mykmskey"
  storage_encrypted       = true
  db_subnet_group_name    = aws_db_subnet_group.fleet_service_db_subnet_group.name
  vpc_security_group_ids  = [aws_security_group.fleet_service_service_db_security_group.id]
  skip_final_snapshot     = true
  backup_retention_period = var.environment != "prod" ? null : 7
  # snapshot_identifier   = "my-rds-instance-snapshot"
  tags = { Name = "${var.service_name}-rds-cluster" }
}
The problem is that the original resource had deletion_protection = true defined. I removed it as well, but the original cluster still cannot be deleted to make way for the new one, neither through changes in Terraform nor manually in the AWS console. It just throws an error like:
error creating RDS cluster: DBClusterAlreadyExistsFault: DB Cluster already exists
Any ideas what to do in such cases?

To do that purely through Terraform, you would have to:
1. Remove deletion protection from the original Terraform resource.
2. Run terraform apply, which will remove deletion protection from the actual resource in AWS.
3. Make the modifications to the Terraform resource that will result in a delete or replace of the current resource.
4. Run terraform apply again, during which Terraform will delete and/or replace the resource.
The key thing here is that you can't remove deletion protection at the same time as you are deleting the resource, because Terraform isn't going to update an existing resource to modify an attribute before attempting to delete it.
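The two-pass sequence above could be sketched as follows. This is an illustrative fragment, not the asker's full resource: the identifier, engine, username, and password variable are placeholders, and only the arguments relevant to the ordering are shown.

```hcl
# Placeholder for the credentials used in the question's config.
variable "db_password" {
  type      = string
  sensitive = true
}

resource "aws_rds_cluster" "my_rds_cluster" {
  cluster_identifier  = "my-service-rds-cluster" # placeholder
  engine              = "aurora-mysql"           # placeholder
  master_username     = "dbadmin"                # placeholder
  master_password     = var.db_password
  skip_final_snapshot = true

  # Pass 1: apply with ONLY this change. It is an in-place update, so the
  # live cluster has deletion protection switched off in AWS.
  deletion_protection = false

  # Pass 2: once the first apply has completed, uncomment the encryption
  # settings and apply again. These force a replacement, which can now
  # proceed because the cluster is no longer protected.
  # kms_key_id        = "mykmskey"
  # storage_encrypted = true
}
```

Keeping the two changes in separate applies is the whole point: a single apply that both drops deletion protection and forces replacement goes straight to the delete, which AWS rejects.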

Trigger random_id resource recreation on rds instance destroy and recreate

Folks, I am trying to find a way to make the Terraform random_id resource regenerate and provide a new random value whenever the RDS instance is destroyed and re-created because of a change, say when the username on the RDS instance changes.
I am attaching this random value to the final_snapshot_identifier of the aws_db_instance resource so that the final snapshot gets a unique ID every time it is created when the RDS instance is destroyed.
Current code:
resource "random_id" "snap_id" {
  byte_length = 8
}

locals {
  inst_id      = "test-rds-inst"
  inst_snap_id = "${local.inst_id}-snap-${format("%.4s", random_id.snap_id.dec)}"
}

resource "aws_db_instance" "rds" {
  .....
  identifier                = local.inst_id
  final_snapshot_identifier = local.inst_snap_id
  skip_final_snapshot       = false
  username                  = "foo"
  apply_immediately         = true
  .....
}

output "snap_id" {
  value = aws_db_instance.rds.final_snapshot_identifier
}
Output after terraform apply:
snap_id = "test-rds-inst-snap-5553"
Use cases I am trying out:
#1:
Modify value in rds instance to simulate a destroy & recreate:
Modify username to "foo-tmp"
terraform apply -auto-approve
Output:
snap_id = "test-rds-inst-snap-5553"
I was expecting the random_id to kick in and output a unique id, but it didn't.
Observation:
rds instance in deleting state
snapshot "test-rds-inst-snap-5553" in creating state
rds instance recreated and in available state
snapshot "test-rds-inst-snap-5553" in available state
#2:
Modify value again in rds instance to simulate a destroy & recreate:
Modify username to "foo-new"
terraform apply -auto-approve
I kind of expected the error below, because the snapshot ID didn't get a new value in the prior attempt, but I tried anyway.
Observation:
**Error:** error deleting DB Instance (test-rds-inst): DBSnapshotAlreadyExists: Cannot create the snapshot because a snapshot with the identifier test-rds-inst-snap-5553 already exists.
I am aware of the keepers map for the random_id resource, but I am not sure what from the aws_db_instance I need to put in the map so that the random_id resource is re-created and provides a new unique value for the snapshot ID suffix.
I also feel that using any attribute of the RDS instance in the random_id keepers might cause a circular dependency. I may be wrong, as I haven't tried it.
Any suggestions will be helpful. Thanks.

The easiest way to do this would be to use taint on the random_id resource, as per the documentation [1]:
To force a random result to be replaced, the taint command can be used to produce a new result on the next run.
Alternatively, looking at the example from the documentation, you could do something like:
resource "random_id" "snap_id" {
  byte_length = 8

  # keepers is a map argument, so it needs "="
  keepers = {
    snapshot_id = var.snapshot_id
  }
}

resource "aws_db_instance" "rds" {
  .....
  identifier                = local.inst_id
  final_snapshot_identifier = random_id.snap_id.keepers.snapshot_id
  skip_final_snapshot       = false
  username                  = "foo"
  apply_immediately         = true
  .....
}
This means that the random_id will keep generating the same result until the value of the variable snapshot_id changes. Not sure if that would work with locals, but you could try replacing var.snapshot_id with a local value (though not local.inst_snap_id as defined in the question, since that local itself references random_id.snap_id and would create a dependency cycle). If that works, you could then name the snapshot using built-in functions like formatdate [2] and timestamp [3] to create a snapshot ID tied to the time at which you ran apply, something like:
locals {
  inst_id      = "test-rds-inst"
  snap_time    = formatdate("YYYYMMDD", timestamp())
  inst_snap_id = "${local.inst_id}-snap-${format("%.4s", random_id.snap_id.dec)}-${local.snap_time}"
}
[1] https://registry.terraform.io/providers/hashicorp/random/latest/docs#resource-keepers
[2] https://www.terraform.io/language/functions/formatdate
[3] https://www.terraform.io/language/functions/timestamp

Restoring an RDS Instance to a backup/snapshot

I'm trying to RESTORE an RDS instance to one of its previous backups/snapshots, but when I follow the instructions provided by Amazon, it CREATES a new instance from the backup instead of restoring the existing one.
I would like to just restore the DB to a previous state, because I have an EC2 instance pointing to it (managed through a load balancer) that I'd prefer not to have to repoint to the new RDS instance.
How can I restore an RDS instance back to a previous point in time AND NOT create a new instance from the backup/snapshot?

Restore to a new RDS instance. Then rename the original RDS instance to something else and apply the change immediately. Once the change is processed (2-3 min), rename the restored instance to the same name the original RDS instance had and apply immediately.
Within a single account, the RDS DNS name only varies by its initial part, which is taken from the RDS instance name.

Short answer is you can't:
You can't restore from a DB snapshot to an existing DB instance; a new
DB instance is created when you restore.
from here:
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_RestoreFromSnapshot.html

If you have terraform experience, I found this to be the most reproducible way of restoring from snapshots.
resource "aws_db_instance" "prod" {
  allocated_storage    = 10
  engine               = "mysql"
  engine_version       = "5.6.17"
  instance_class       = "db.t2.micro"
  name                 = "mydb"
  username             = "foo"
  password             = "bar"
  db_subnet_group_name = "my_database_subnet_group"
  parameter_group_name = "default.mysql5.6"
}

data "aws_db_snapshot" "latest_prod_snapshot" {
  db_instance_identifier = aws_db_instance.prod.id
  most_recent            = true
}

# Use the latest production snapshot to create a dev instance.
resource "aws_db_instance" "dev" {
  instance_class      = "db.t2.micro"
  name                = "mydbdev"
  snapshot_identifier = data.aws_db_snapshot.latest_prod_snapshot.id

  lifecycle {
    ignore_changes = [snapshot_identifier]
  }
}
From the terraform documentation.

error listing tags for RDS DB Cluster Snapshot

So I have a workflow that looks like this:
[Production]
Snap cluster
Share snapshot to Staging
[Staging]
Create new cluster out of shared snapshot
I'm using Terraform, so my config looks like this (for brevity I have excluded other attributes and resources):
data "aws_db_cluster_snapshot" "development_final_snapshot" {
  db_cluster_identifier = "arn:prod_id:my_cluster"
  include_shared        = true
  most_recent           = true
  snapshot_type         = "shared"
}

resource "aws_rds_cluster" "aurora" {
  snapshot_identifier = "${data.aws_db_cluster_snapshot.development_final_snapshot.id}"
}
Then I get this error
error listing tags for RDS DB Cluster Snapshot (arn:prod_id:my_cluster):
InvalidParameterValue: The specified resource name does not match an RDS resource in this region.
This was working fine last week :(.
