How to specify dead letter dependency using modules?

I have the following core module based off this official module:
module "sqs" {
  source = "github.com/terraform-aws-modules/terraform-aws-sqs?ref=0d48cbdb6bf924a278d3f7fa326a2a1c864447e2"
  name = "${var.site_env}-sqs-${var.service_name}"
}
I'd like to create two queues: xyz and xyz_dead. xyz sends its dead letter messages to xyz_dead.
module "xyz_queue" {
  source = "../helpers/sqs"
  service_name = "xyz"
  redrive_policy = <<POLICY
  {
    "deadLetterTargetArn" : "${data.TODO.TODO.arn}",
    "maxReceiveCount" : 5
  }
  POLICY
  site_env = "${var.site_env}"
}
module "xyz_dead_queue" {
  source = "../helpers/sqs"
  service_name = "xyz_dead"
  site_env = "${var.site_env}"
}
How do I specify the deadLetterTargetArn dependency?
If I do:
data "aws_sqs_queue" "dead_queue" {
  filter {
    name = "tag:Name"
    values = ["${var.site_env}-sqs-xyz_dead"]
  }
}
and set deadLetterTargetArn to "${data.aws_sqs_queue.dead_queue.arn}", then I get this error:
Error: data.aws_sqs_queue.thumbnail_requests_queue_dead: "name": required field is not set
Error: data.aws_sqs_queue.thumbnail_requests_queue_dead: : invalid or unknown key: filter

The best way to do this is to use the outputted ARN from the module:
module "xyz_queue" {
  source = "../helpers/sqs"
  service_name = "xyz"
  site_env = "${var.site_env}"
  redrive_policy = <<POLICY
{
  "deadLetterTargetArn" : "${module.xyz_dead_queue.this_sqs_queue_arn}",
  "maxReceiveCount" : 5
}
POLICY
}
module "xyz_dead_queue" {
  source = "../helpers/sqs"
  service_name = "xyz_dead"
  site_env = "${var.site_env}"
}
NB: I've also changed the indentation of your HEREDOC here because you normally need to remove the indentation with these.
This will pass the ARN of the SQS queue directly from the xyz_dead_queue module to the xyz_queue.
As for the errors you were getting, the aws_sqs_queue data source takes only a name argument, not a filter block like some of the other data sources do.
If you wanted to use the aws_sqs_queue data source then you'd just want to use:
data "aws_sqs_queue" "dead_queue" {
  name = "${var.site_env}-sqs-${var.service_name}"
}
That said, if you are creating both queues in the same run, using a data source to refer to one of them will cause problems unless that queue already exists. Data sources are read before resources are created, so if neither queue exists yet the data source will not find the dead letter queue and will fail. If the dead letter queue already existed, it would be okay. In general, though, you're best off avoiding data sources for this and only using them to refer to things created in a separate terraform apply (or created outside of Terraform).
You are also much better off simply passing the outputs of resources or modules to other resources/modules and allowing Terraform to correctly build a dependency tree for them as well.
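For module.xyz_dead_queue.this_sqs_queue_arn to resolve, the ../helpers/sqs wrapper has to accept a redrive_policy input and re-export the queue ARN as an output. A minimal sketch of what that wrapper might contain follows. It assumes the pinned upstream module accepts a redrive_policy argument and exposes an output named this_sqs_queue_arn; neither is shown in the question, so check both against the pinned commit.

```hcl
# ../helpers/sqs/main.tf (sketch; input and output names are assumptions)

variable "site_env" {}

variable "service_name" {}

# Optional: an empty string is assumed to mean "no dead letter queue".
variable "redrive_policy" {
  default = ""
}

module "sqs" {
  source = "github.com/terraform-aws-modules/terraform-aws-sqs?ref=0d48cbdb6bf924a278d3f7fa326a2a1c864447e2"

  name           = "${var.site_env}-sqs-${var.service_name}"
  redrive_policy = "${var.redrive_policy}"
}

# Re-export the ARN so callers can write module.<name>.this_sqs_queue_arn.
output "this_sqs_queue_arn" {
  value = "${module.sqs.this_sqs_queue_arn}"
}
```

Because xyz_queue then references an output of xyz_dead_queue, Terraform should create the dead letter queue first without any explicit depends_on.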

Related

Importing terraform resources into a local block

I have a project that has many SQS queues in AWS that we need to manage.
I need to import those queues into my terraform code, but since they're already being used, I can't destroy and recreate them.
Since we have many queues, we use a locals block rather than individual resource blocks to define arguments such as name and delay_seconds. (We don't want to add over 10 resource blocks, and 100+ lines of code, just to import the queues into them.)
Below is example code that we use to create them:
provider "aws" {
  region = "us-east-2"
}
locals {
  sqs_queues = {
    test-01 = {
      name = "test-import-terraform-01"
      delay_seconds = 30
    }
    test-02 = {
      name = "test-import-terraform-02"
      delay_seconds = 30
    }
  }
}
resource "aws_sqs_queue" "queue" {
  for_each = local.sqs_queues
  name = each.value.name
  delay_seconds = each.value.delay_seconds
}
This in turn will create the following queues: test-import-terraform-01 and test-import-terraform-02 as usual.
Querying my state file, I can see them defined as such:
aws_sqs_queue.queue["test-01"]
aws_sqs_queue.queue["test-02"]
Based on that, I would like to import two existing queues into my code: test-import-terraform-03 and test-import-terraform-04.
I thought about adding these two maps to my locals block:
test-03 = {
  name = "test-import-terraform-03"
  delay_seconds = 30
}
test-04 = {
  name = "test-import-terraform-04"
  delay_seconds = 30
}
But when I try to import them, I get the following error for either queue:
$ terraform import aws_sqs_queue.queue["test-03"] https://sqs.us-east-2.amazonaws.com/12345678910/test-import-terraform-03
zsh: no matches found: aws_sqs_queue.queue[test-03]
Is doing something like that possible?
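Your problem is not to do with Terraform, but with shell expansion (note the error message comes from zsh). Unquoted, zsh treats the square brackets as a glob pattern and strips the double quotes, which is why the error shows queue[test-03].
Try quoting your shell arguments properly: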
terraform import 'aws_sqs_queue.queue["test-03"]' 'https://sqs.us-east-2.amazonaws.com/12345678910/test-import-terraform-03'
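If the project can use Terraform 1.5 or later (an assumption; the question does not state a version), an import block avoids shell quoting altogether, because the address is written in configuration rather than on the command line. A sketch using the address and queue URL from the question:

```hcl
# Requires Terraform >= 1.5. The import is performed by the next
# terraform plan / terraform apply; no terraform import command is run.
import {
  to = aws_sqs_queue.queue["test-03"]
  id = "https://sqs.us-east-2.amazonaws.com/12345678910/test-import-terraform-03"
}
```

The test-03 key still has to exist in local.sqs_queues so that the target address is valid. Once the import has been applied, the import block can be removed.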

AWS Glue pipeline with Terraform

We are working with AWS Glue as a pipeline tool for ETL at my company. So far, the pipelines were created manually via the console and I am now moving to Terraform for future pipelines as I believe IaC is the way to go.
I have been trying to work on a module (or modules) that I can reuse as I know that we will be making several more pipelines for various projects. The difficulty I am having is in creating a good level of abstraction with the module. AWS Glue has several components/resources to it, including a Glue connection, databases, crawlers, jobs, job triggers and workflows. The problem is that the number of databases, jobs, crawlers and/or triggers and their interactions (i.e. some triggers might be conditional while others might simply be scheduled) can vary depending on the project, and I am having a hard time abstracting this complexity via modules.
I am having to create a lot of for_each "loops" and dynamic blocks within resources to try to render the module as generic as possible (e.g. so that I can create N number of jobs and/or triggers from the root module and define their interactions).
I understand that modules should actually be quite opinionated and specific, and be good at one task so to speak, which means my problem might simply be conceptual. The fact that these pipelines vary significantly from project to project makes them a poor use case for modules.
On a side note, I have not been able to find any robust examples of modules online for AWS Glue so this might be another indicator that it is indeed not the best use case.
Any thoughts here would be greatly appreciated.
EDIT:
As requested, here is some of my code from my root module:
module "glue_data_catalog" {
  source = "../../modules/aws-glue/data-catalog"
  # Connection
  create_connection = true
  conn_name = "SAMPLE"
  conn_description = "SAMPLE."
  conn_type = "JDBC"
  conn_url = "jdbc:sqlserver:"
  conn_sg_ids = ["sampleid"]
  conn_subnet_id = "sampleid"
  conn_az = "eu-west-1a"
  conn_user = var.conn_user
  conn_pass = var.conn_pass
  # Databases
  db_names = [
    "raw",
    "cleaned",
    "consumption"
  ]
  # Crawlers
  crawler_settings = {
    Crawler_raw = {
      database_name = "raw"
      s3_path = "bucket-path"
      jdbc_paths = []
    },
    Crawler_cleaned = {
      database_name = "cleaned"
      s3_path = "bucket-path"
      jdbc_paths = []
    }
  }
  crawl_role = "SampleRole"
}
Glue data catalog module:
#############################
# Glue Connection
#############################
resource "aws_glue_connection" "this" {
  count = var.create_connection ? 1 : 0
  name = var.conn_name
  description = var.conn_description
  connection_type = var.conn_type
  connection_properties = {
    JDBC_CONNECTION_URL = var.conn_url
    USERNAME = var.conn_user
    PASSWORD = var.conn_pass
  }
  catalog_id = var.conn_catalog_id
  match_criteria = var.conn_criteria
  physical_connection_requirements {
    security_group_id_list = var.conn_sg_ids
    subnet_id = var.conn_subnet_id
    availability_zone = var.conn_az
  }
}
#############################
# Glue Database Catalog
#############################
resource "aws_glue_catalog_database" "this" {
  for_each = var.db_names
  name = each.key
  description = var.db_description
  catalog_id = var.db_catalog_id
  location_uri = var.db_location_uri
  parameters = var.db_params
}
#############################
# Glue Crawlers
#############################
resource "aws_glue_crawler" "this" {
  for_each = var.crawler_settings
  name = each.key
  database_name = each.value.database_name
  description = var.crawl_description
  role = var.crawl_role
  configuration = var.crawl_configuration
  s3_target {
    connection_name = var.crawl_s3_connection
    path = each.value.s3_path
    exclusions = var.crawl_s3_exclusions
  }
  dynamic "jdbc_target" {
    for_each = each.value.jdbc_paths
    content {
      connection_name = var.crawl_jdbc_connection
      path = jdbc_target.value
      exclusions = var.crawl_jdbc_exclusions
    }
  }
  recrawl_policy {
    recrawl_behavior = var.crawl_recrawl_behavior
  }
  schedule = var.crawl_schedule
  table_prefix = var.crawl_table_prefix
  tags = var.crawl_tags
}
It seems to me that I'm not actually providing any abstraction in this way but simply overcomplicating things.
I think I found a good solution to the problem, though it happened "by accident". We decided to divide the pipelines into two distinct projects:
ETL on source data
BI jobs to compute various KPIs
I then noticed that I could group resources together for both projects and standardize the way we have them interact (e.g. one connection, n tables, n crawlers, n etl jobs, one trigger). I was then able to create a module for the ETL process and a module for the BI/KPIs process which provided enough abstraction to actually be useful.

How to send lifecycle_rules to a s3 module in terraform

I have a terraform module that creates a s3 bucket. I want the module to be able to accept lifecycle rules.
resource "aws_s3_bucket" "somebucket" {
  bucket = "my-versioning-bucket"
  acl = "private"
  lifecycle_rule {
    prefix = "config/"
    enabled = true
    noncurrent_version_transition {
      days = 30
      storage_class = "STANDARD_IA"
    }
  }
}
I want to be able to send the above lifecycle_rule block of code when I call the module. I tried to send it through a variable but it did not work. I have done some research but no luck. Any help is highly appreciated.
Try using an output: expose the desired value as an output from one module,
e.g.
output "lifecycle_rule" {
  value = aws_s3_bucket.somebucket.id
}
and pass this value into your other module,
like:
module "somename" {
  source = "/somewhere"
  lifecycle_rule = module.amodule-name-where-output-is-applied.lifecycle_rule
  ...
You would need to play around with this.
Just give it a try; this is my best guess as far as I understand Terraform and your question.
The link below may also help you:
Terraform: Output a field from a module
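A different approach from the output-based suggestion above, offered only as a sketch: declare the rules as a module variable and expand them with dynamic blocks. This assumes Terraform 0.12 or later and an AWS provider version in which aws_s3_bucket still accepts inline lifecycle_rule blocks, as in the question. The variable name lifecycle_rules and its object shape are hypothetical.

```hcl
# Inside the module (sketch). The variable name and shape are illustrative.
variable "lifecycle_rules" {
  type = list(object({
    prefix  = string
    enabled = bool
    noncurrent_version_transitions = list(object({
      days          = number
      storage_class = string
    }))
  }))
  default = []
}

resource "aws_s3_bucket" "somebucket" {
  bucket = "my-versioning-bucket"
  acl    = "private"

  # One lifecycle_rule block is generated per element of the list.
  dynamic "lifecycle_rule" {
    for_each = var.lifecycle_rules
    content {
      prefix  = lifecycle_rule.value.prefix
      enabled = lifecycle_rule.value.enabled

      # Nested blocks are expanded the same way.
      dynamic "noncurrent_version_transition" {
        for_each = lifecycle_rule.value.noncurrent_version_transitions
        content {
          days          = noncurrent_version_transition.value.days
          storage_class = noncurrent_version_transition.value.storage_class
        }
      }
    }
  }
}
```

The caller would then pass the rule from the question as data rather than as a block (the module name and source path here are placeholders):

```hcl
module "bucket" {
  source = "./modules/s3-bucket"

  lifecycle_rules = [
    {
      prefix  = "config/"
      enabled = true
      noncurrent_version_transitions = [
        { days = 30, storage_class = "STANDARD_IA" }
      ]
    }
  ]
}
```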

Preventing destroy of resources when refactoring Terraform to use indices

When I was just starting to use Terraform, I more or less naively declared resources individually, like this:
resource "aws_cloudwatch_log_group" "image1_log" {
  name = "${var.image1}-log-group"
  tags = module.tagging.tags
}
resource "aws_cloudwatch_log_group" "image2_log" {
  name = "${var.image2}-log-group"
  tags = module.tagging.tags
}
resource "aws_cloudwatch_log_stream" "image1_stream" {
  name = "${var.image1}-log-stream"
  log_group_name = aws_cloudwatch_log_group.image1_log.name
}
resource "aws_cloudwatch_log_stream" "image2_stream" {
  name = "${var.image2}-log-stream"
  log_group_name = aws_cloudwatch_log_group.image2_log.name
}
Then, 10-20 different log groups later, I realized this wasn't going to work well as infrastructure grew. I decided to define a variable list:
variable "image_names" {
  type = list(string)
  default = [
    "image1",
    "image2"
  ]
}
Then I replaced the resources using indices:
resource "aws_cloudwatch_log_group" "service-log-groups" {
  name = "${element(var.image_names, count.index)}-log-group"
  count = length(var.image_names)
  tags = module.tagging.tags
}
resource "aws_cloudwatch_log_stream" "service-log-streams" {
  name = "${element(var.image_names, count.index)}-log-stream"
  log_group_name = aws_cloudwatch_log_group.service-log-groups[count.index].name
  count = length(var.image_names)
}
The problem here is that when I run terraform apply, I get 4 resources to add, 4 resources to destroy. I tested this with an old log group, and saw that all my logs were wiped (obviously, since the log was destroyed).
The names and other attributes of the log groups/streams are identical- I'm simply refactoring the infrastructure code to be more maintainable. How can I maintain my existing log groups without deleting them yet still refactor my code to use lists?
You'll need to move the existing resources within the Terraform state.
Try running terraform show to get the strings under which the resources are stored, this will be something like [module.xyz.]aws_cloudwatch_log_group.image1_log ...
You can move it with terraform state mv [module.xyz.]aws_cloudwatch_log_group.image1_log '[module.xyz.]aws_cloudwatch_log_group.service-log-groups[0]'.
You can choose which index to assign to each resource by changing [0] accordingly.
Delete the old resource definition for each moved resource, as Terraform would otherwise try to create a new group/stream.
Try it with the first resource and check with terraform plan that it was moved correctly before moving the rest.
Also check whether you need to choose a particular index for the image_names list, just to be sure, but I think that won't be necessary.
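On Terraform 1.1 or later (an assumption; the question does not state a version), the same moves can be declared in configuration with moved blocks instead of running terraform state mv by hand. A sketch for the two images in the question, assuming image1 and image2 sit at positions 0 and 1 of var.image_names so that the computed names stay the same:

```hcl
# Each block tells Terraform that an existing state entry now lives at a
# new address, so the plan shows a move rather than a destroy and create.
moved {
  from = aws_cloudwatch_log_group.image1_log
  to   = aws_cloudwatch_log_group.service-log-groups[0]
}

moved {
  from = aws_cloudwatch_log_group.image2_log
  to   = aws_cloudwatch_log_group.service-log-groups[1]
}

moved {
  from = aws_cloudwatch_log_stream.image1_stream
  to   = aws_cloudwatch_log_stream.service-log-streams[0]
}

moved {
  from = aws_cloudwatch_log_stream.image2_stream
  to   = aws_cloudwatch_log_stream.service-log-streams[1]
}
```

With these in place alongside the new count-based resources, and the old resource blocks deleted, terraform plan should report the moves and nothing to add or destroy. If an index maps to a different name than the existing resource has, the plan will show a replacement instead.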

Terraform not uploading a new ZIP

I want to use Terraform for deployment of my lambda functions. I did something like:
provider "aws" {
  region = "ap-southeast-1"
}
data "archive_file" "lambda_zip" {
  type = "zip"
  source_dir = "src"
  output_path = "build/lambdas.zip"
}
resource "aws_lambda_function" "test_terraform_function" {
  filename = "build/lambdas.zip"
  function_name = "test_terraform_function"
  handler = "test.handler"
  runtime = "nodejs8.10"
  role = "arn:aws:iam::000000000:role/xxx-lambda-basic"
  memory_size = 128
  timeout = 5
  source_code_hash = "${data.archive_file.lambda_zip.output_base64sha256}"
  tags = {
    "Cost Center" = "Consulting"
    Developer = "Jiew Meng"
  }
}
I find that when there is no change to test.js, terraform correctly detects no change
No changes. Infrastructure is up-to-date.
When I do change the test.js file, terraform does detect a change:
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
~ aws_lambda_function.test_terraform_function
last_modified: "2018-12-20T07:47:16.888+0000" => <computed>
source_code_hash: "KpnhsytFF0yul6iESDCXiD2jl/LI9dv56SIJnwEi/hY=" => "JWIYsT8SszUjKEe1aVDY/ZWBVfrZYhhb1GrJL26rYdI="
It does zip up the new zip, however, it does not seem to update the function with the new ZIP. It seems like it thinks since the filename has no change, it does not upload ... How can I fix this behaviour?
=====
Following some of the answers here, I tried:
Using null_resource
Using S3 bucket/object with etag
And it does not update ... Why is that?
I ran into the same issue and what solved it for me was publishing the Lambda functions automatically using the publish argument. To do so simply set publish = true in your aws_lambda_function resource.
Note that your function will be versioned after this and each change will create a new version. Therefore you should make sure that you use the qualified_arn attribute reference if you're referring to the function in any of your other Terraform code.
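As a rough sketch of that suggestion applied to the configuration from the question (written in the same interpolation style): publish is set to true, and filename is taken from the archive_file data source so the function depends on the freshly built zip. The output name function_qualified_arn is hypothetical, and whether this resolves the upload problem in every setup is not guaranteed.

```hcl
data "archive_file" "lambda_zip" {
  type        = "zip"
  source_dir  = "src"
  output_path = "build/lambdas.zip"
}

resource "aws_lambda_function" "test_terraform_function" {
  # Referencing the data source (rather than repeating the path) makes the
  # dependency on the rebuilt archive explicit.
  filename         = "${data.archive_file.lambda_zip.output_path}"
  source_code_hash = "${data.archive_file.lambda_zip.output_base64sha256}"

  function_name = "test_terraform_function"
  handler       = "test.handler"
  runtime       = "nodejs8.10"
  role          = "arn:aws:iam::000000000:role/xxx-lambda-basic"

  # Publish a new version whenever the code or configuration changes.
  publish = true
}

# Hypothetical output: the version-qualified ARN for use elsewhere.
output "function_qualified_arn" {
  value = "${aws_lambda_function.test_terraform_function.qualified_arn}"
}
```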
There is a workaround to trigger the resource to be refreshed. This example assumes the Lambda source files are src/main.py and src/handler.py; if you have more files to manage, add them one by one.
resource "null_resource" "lambda" {
  triggers {
    main = "${base64sha256(file("src/main.py"))}"
    handler = "${base64sha256(file("src/handler.py"))}"
  }
}
data "archive_file" "lambda_zip" {
  type = "zip"
  source_dir = "src"
  output_path = "build/lambdas.zip"
  depends_on = ["null_resource.lambda"]
}
Let me know if this works for you.
There are two things you need to take care of:
upload zip file to S3 if its content has changed
update Lambda function if zip file content has changed
I can see you are taking care of the latter with source_code_hash. I don't see how you handle the former. It could look like this:
resource "aws_s3_bucket_object" "zip" {
  bucket = "${aws_s3_bucket.zip.bucket}"
  key = "myzip.zip"
  source = "${path.module}/myzip.zip"
  etag = "${md5(file("${path.module}/myzip.zip"))}"
}
etag is the most important option here.
I created this module to help ease some of the issues around deploying Lambda with Terraform: https://registry.terraform.io/modules/rojopolis/lambda-python-archive/aws/0.1.4
It may be useful in this scenario. Basically, it replaces the "archive_file" data source with a specialized lambda archive data source to better manage stable source code hash, etc.