I am currently running into problems with the way count indexes resources in Terraform, and I am looking for help converting this to for_each.
# Data Source for github repositories. This is used for adding all repos to the teams.
data "github_repositories" "repositories" {
query = "org:theorg"
}
resource "github_team_repository" "business_analysts" {
count = length(data.github_repositories.repositories.names)
team_id = github_team.Business_Analysts.id
repository = element(data.github_repositories.repositories.names, count.index)
permission = "pull"
}
I have tried the following with no success:
resource "github_team_repository" "business_analysts" {
for_each = toset(data.github_repositories.repositories.names)
team_id = github_team.Business_Analysts.id
repository = "${each.value}"
permission = "pull"
}
I am querying a GitHub organization and receiving a large list of repositories, then using count to add those repositories to a team. Unfortunately, Terraform errors out once a new repository is added or changed. I think the new for_each feature could solve this dilemma for me; however, I am having trouble wrapping my head around how to implement it in this particular scenario. Any help would be appreciated.
False alarm. I had the answer all along; the issue was the way I was referencing a variable.
If someone stumbles upon this, you want to structure your loop like so:
resource "github_team_repository" "business_analysts" {
  for_each   = toset(data.github_repositories.repositories.names)
  team_id    = github_team.Business_Analysts.id
  repository = each.key
  permission = "pull"
}
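A caveat for anyone converting an existing count-based resource this way: the state addresses change from numeric indices to string keys, so Terraform may plan to destroy and recreate the team memberships unless the old entries are moved in state first, roughly like this (the repository name below is just a placeholder):

# "some-repo" is a hypothetical key; use the actual repository names from your state.
terraform state mv \
  'github_team_repository.business_analysts[0]' \
  'github_team_repository.business_analysts["some-repo"]'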
I've been scratching my head over this one for longer than I'd like to admit, but I'm throwing in the towel...
I have a large Terraform package, and in the Terraform plan I get this error:
Terraform Plan (Error) Log
Exception Error in plan - TypeError: planResultMessage.search is not a function
I do not use planResultMessage.search anywhere in my code, so my guess is that it is a Terraform error?
What I do know is that the set of resources being deployed is a bunch of YAML documents that I am using to create SSM documents. They are being loaded like this:
member_data.tf
data "template_file" "member_createmultiregiontrail" {
template = file("${path.module}/member-runbooks/member-asr-CreateCloudTrailMultiRegionTrail.yml")
}
data "template_file" "member_createlogmetricsfilteralarm" {
template = file("${path.module}/member-runbooks/member-asr-CreateLogMetricFilterAndAlarm.yml")
}
asr-member.tf
resource "aws_ssm_document" "asr_document_cloudtrail_multiregion" {
provider = aws.customer
count = var.enabled == true && var.child_account == true ? 1 : 0
name = "ASR-CreateCloudTrailMultiRegionTrail"
document_format = "YAML"
document_type = "Automation"
content = data.template_file.member_createmultiregiontrail.template
}
resource "aws_ssm_document" "asr_document_logs_metricsfilter_alarm" {
provider = aws.customer
count = var.enabled == true && var.child_account == true ? 1 : 0
name = "ASR-CreateLogMetricFilterAndAlarm"
document_format = "YAML"
document_type = "Automation"
content = data.template_file.member_createlogmetricsfilteralarm.template
}
That is just one example. I think the cause might be in these document files, because the Terraform error shows up in the middle of the contents of these documents; it is always a random location in one of them...
Example:
This one fell inside the document for Security Hub's AFSBP Redshift.6 control, even though at the beginning of that section the plan acknowledges that the resource will be created:
# module.aws-securityhub-master.aws_ssm_document.AFSBP_Redshift_6[0] will be created
I have tried loading the contents directly, using yamlencode, using simply "file", loading them into locals, pulling a file from locals, and now I'm on data sources.
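For reference, the "simply file" variant, which also avoids the now-deprecated template provider, looks roughly like this and should be equivalent here since nothing is interpolated into the documents:

resource "aws_ssm_document" "asr_document_cloudtrail_multiregion" {
  provider        = aws.customer
  count           = var.enabled == true && var.child_account == true ? 1 : 0
  name            = "ASR-CreateCloudTrailMultiRegionTrail"
  document_format = "YAML"
  document_type   = "Automation"
  # Read the runbook straight from disk instead of going through a data source.
  content         = file("${path.module}/member-runbooks/member-asr-CreateCloudTrailMultiRegionTrail.yml")
}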
If anyone can offer any help, it would be greatly appreciated.
DISCLAIMER:
This Terraform build-out is a deconstruction of Amazon's SHARR solution:
https://aws.amazon.com/solutions/implementations/automated-security-response-on-aws/
You can see the various YAML build-outs here, organized by security control:
https://github.com/aws-solutions/aws-security-hub-automated-response-and-remediation/tree/main/source/playbooks
The two that I specifically called out in my data sources are:
https://github.com/aws-solutions/aws-security-hub-automated-response-and-remediation/blob/main/source/remediation_runbooks/CreateCloudTrailMultiRegionTrail.yaml
and
https://github.com/aws-solutions/aws-security-hub-automated-response-and-remediation/blob/main/source/remediation_runbooks/CreateLogMetricFilterAndAlarm.yaml
and the AFSBP yaml can be found here (just in case it matters):
https://github.com/aws-solutions/aws-security-hub-automated-response-and-remediation/blob/main/source/playbooks/AFSBP/ssmdocs/AFSBP_Redshift.6.yaml
Thank you in advance!
This turned out to be a buffer overflow issue. Expanding the resources allocated to the deployment solved it.
I have a Terraform project that creates multiple Cloud Functions.
I know that if I change the name of the google_storage_bucket_object related to a function, Terraform will see the change in the zip name and redeploy that Cloud Function.
My question is: is there a way to obtain the same behaviour, but only for the Cloud Functions that have actually changed?
resource "google_storage_bucket_object" "zip_file" {
# Append file MD5 to force bucket to be recreated
name = "${local.filename}#${data.archive_file.source.output_md5}"
bucket = var.bucket.name
source = data.archive_file.source.output_path
}
# Create Java Cloud Function
resource "google_cloudfunctions_function" "java_function" {
name = var.function_name
runtime = var.runtime
available_memory_mb = var.memory
source_archive_bucket = var.bucket.name
source_archive_object = google_storage_bucket_object.zip_file.name
timeout = 120
entry_point = var.function_entry_point
event_trigger {
event_type = var.event_trigger.event_type
resource = var.event_trigger.resource
}
environment_variables = {
PROJECT_ID = var.env_project_id
SECRET_MAIL_PASSWORD = var.env_mail_password
}
timeouts {
create = "60m"
}
}
By appending the MD5, every Cloud Function ends up with a different zip file name, so Terraform re-deploys all of them; without the MD5, Terraform does not see any changes to deploy at all.
If I have changed code inside only one function, how can I tell Terraform to re-deploy only that one (for example, by changing only its zip file name)?
I hope my question is clear, and I want to thank everyone who tries to help me!
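For what it's worth, the behaviour described above falls out naturally if each function gets its own archive_file, since output_md5 is then computed from that function's sources only. A rough sketch, assuming a per-function source directory (the functions/ and build/ paths are made up):

# Hypothetical per-function source directory; adjust to the real layout.
data "archive_file" "source" {
  type        = "zip"
  source_dir  = "${path.module}/functions/${var.function_name}"
  output_path = "${path.module}/build/${var.function_name}.zip"
}

resource "google_storage_bucket_object" "zip_file" {
  # output_md5 only changes when this function's sources change, so only the
  # affected function gets a new object name and is re-deployed.
  name   = "${var.function_name}#${data.archive_file.source.output_md5}"
  bucket = var.bucket.name
  source = data.archive_file.source.output_path
}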
As per the subject, I have set up log-based metrics for a platform in GCP, i.e. firewall, audit, route etc. monitoring.
Now I need to set up alert policies tied to these log-based metrics, which is easy enough to do manually in GCP.
However, I need to do it via Terraform, using this resource:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy#nested_alert_strategy
I might be missing something very simple, but I am finding this hard to understand, as the alert strategy is apparently required yet does not seem to be supported?
I am also a bit confused about which kind of condition I should be using to match my already-created log-based metric.
This is my module so far. PS: I have tried using both the same filter I used for setting up the log-based metric and the name of the log-based filter:
resource "google_monitoring_alert_policy" "alert_policy" {
display_name = var.display_name
combiner = "OR"
conditions {
display_name = var.display_name
condition_matched_log {
filter = var.filter
#duration = "600s"
#comparison = "COMPARISON_GT"
#threshold_value = 1
}
}
user_labels = {
foo = "bar"
}
}
var.filter is:
resource.type="gce_route" AND (protoPayload.methodName:"compute.routes.delete" OR protoPayload.methodName:"compute.routes.insert")
Got this resolved in the end.
It turns out to be a common issue:
https://issuetracker.google.com/issues/143436657?pli=1
I had to add this to the filter parameter in my Terraform module, after the metric name: AND resource.type="global"
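For anyone landing here, the resulting filter presumably ends up looking something like the following; the metric name is just a placeholder for whatever the log-based metric is actually called:

# "route-changes" is a hypothetical log-based metric name, shown for illustration only.
filter = "metric.type=\"logging.googleapis.com/user/route-changes\" AND resource.type=\"global\""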
We are working with AWS Glue as a pipeline tool for ETL at my company. So far, the pipelines have been created manually via the console, and I am now moving to Terraform for future pipelines, as I believe IaC is the way to go.
I have been trying to work on a module (or modules) that I can reuse, as I know that we will be making several more pipelines for various projects. The difficulty I am having is in creating a good level of abstraction with the module. AWS Glue has several components/resources to it, including a Glue connection, databases, crawlers, jobs, job triggers and workflows. The problem is that the number of databases, jobs, crawlers and/or triggers and their interactions (i.e. some triggers might be conditional while others might simply be scheduled) can vary depending on the project, and I am having a hard time abstracting this complexity via modules.
I am having to create a lot of for_each "loops" and dynamic blocks within resources to try to make the module as generic as possible (e.g. so that I can create N jobs and/or triggers from the root module and define their interactions).
I understand that modules should actually be quite opinionated and specific, and be good at one task so to speak, which means my problem might simply be conceptual. The fact that these pipelines vary significantly from project to project makes them a poor use case for modules.
On a side note, I have not been able to find any robust examples of modules online for AWS Glue so this might be another indicator that it is indeed not the best use case.
Any thoughts here would be greatly appreciated.
EDIT:
As requested, here is some of my code from my root module:
module "glue_data_catalog" {
source = "../../modules/aws-glue/data-catalog"
# Connection
create_connection = true
conn_name = "SAMPLE"
conn_description = "SAMPLE."
conn_type = "JDBC"
conn_url = "jdbc:sqlserver:"
conn_sg_ids = ["sampleid"]
conn_subnet_id = "sampleid"
conn_az = "eu-west-1a"
conn_user = var.conn_user
conn_pass = var.conn_pass
# Databases
db_names = [
"raw",
"cleaned",
"consumption"
]
# Crawlers
crawler_settings = {
Crawler_raw = {
database_name = "raw"
s3_path = "bucket-path"
jdbc_paths = []
},
Crawler_cleaned = {
database_name = "cleaned"
s3_path = "bucket-path"
jdbc_paths = []
}
}
crawl_role = "SampleRole"
}
Glue data catalog module:
#############################
# Glue Connection
#############################
resource "aws_glue_connection" "this" {
  count = var.create_connection ? 1 : 0

  name            = var.conn_name
  description     = var.conn_description
  connection_type = var.conn_type

  connection_properties = {
    JDBC_CONNECTION_URL = var.conn_url
    USERNAME            = var.conn_user
    PASSWORD            = var.conn_pass
  }

  catalog_id     = var.conn_catalog_id
  match_criteria = var.conn_criteria

  physical_connection_requirements {
    security_group_id_list = var.conn_sg_ids
    subnet_id              = var.conn_subnet_id
    availability_zone      = var.conn_az
  }
}

#############################
# Glue Database Catalog
#############################
resource "aws_glue_catalog_database" "this" {
  for_each = var.db_names

  name         = each.key
  description  = var.db_description
  catalog_id   = var.db_catalog_id
  location_uri = var.db_location_uri
  parameters   = var.db_params
}

#############################
# Glue Crawlers
#############################
resource "aws_glue_crawler" "this" {
  for_each = var.crawler_settings

  name          = each.key
  database_name = each.value.database_name
  description   = var.crawl_description
  role          = var.crawl_role
  configuration = var.crawl_configuration

  s3_target {
    connection_name = var.crawl_s3_connection
    path            = each.value.s3_path
    exclusions      = var.crawl_s3_exclusions
  }

  dynamic "jdbc_target" {
    for_each = each.value.jdbc_paths
    content {
      connection_name = var.crawl_jdbc_connection
      path            = jdbc_target.value
      exclusions      = var.crawl_jdbc_exclusions
    }
  }

  recrawl_policy {
    recrawl_behavior = var.crawl_recrawl_behavior
  }

  schedule     = var.crawl_schedule
  table_prefix = var.crawl_table_prefix
  tags         = var.crawl_tags
}
It seems to me that I'm not actually providing any abstraction in this way but simply overcomplicating things.
I think I found a good solution to the problem, though it happened "by accident". We decided to divide the pipelines into two distinct projects:
ETL on source data
BI jobs to compute various KPIs
I then noticed that I could group resources together for both projects and standardize the way we have them interact (e.g. one connection, n tables, n crawlers, n ETL jobs, one trigger). I was then able to create a module for the ETL process and a module for the BI/KPI process, which provided enough abstraction to actually be useful.
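As a rough illustration only (the module path and variable names below are invented, not the actual code), a caller for such an ETL module might look like:

module "project_etl" {
  source = "../../modules/aws-glue/etl-pipeline" # hypothetical module

  # One connection per pipeline
  conn_name = "SAMPLE"

  # n databases
  db_names = ["raw", "cleaned", "consumption"]

  # n crawlers
  crawlers = {
    raw     = { database_name = "raw", s3_path = "bucket-path" }
    cleaned = { database_name = "cleaned", s3_path = "bucket-path" }
  }

  # n ETL jobs
  jobs = {
    clean   = { script_location = "s3://bucket/scripts/clean.py" }
    publish = { script_location = "s3://bucket/scripts/publish.py" }
  }

  # One trigger kicking off the chain
  trigger_schedule = "cron(0 2 * * ? *)"
}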
When I was just starting to use Terraform, I more or less naively declared resources individually, like this:
resource "aws_cloudwatch_log_group" "image1_log" {
name = "${var.image1}-log-group"
tags = module.tagging.tags
}
resource "aws_cloudwatch_log_group" "image2_log" {
name = "${var.image2}-log-group"
tags = module.tagging.tags
}
resource "aws_cloudwatch_log_stream" "image1_stream" {
name = "${var.image1}-log-stream"
log_group_name = aws_cloudwatch_log_group.image1_log.name
}
resource "aws_cloudwatch_log_stream" "image2_stream" {
name = "${var.image2}-log-stream"
log_group_name = aws_cloudwatch_log_group.image2_log.name
}
Then, 10-20 different log groups later, I realized this wasn't going to work well as infrastructure grew. I decided to define a variable list:
variable "image_names" {
type = list(string)
default = [
"image1",
"image2"
]
}
Then I replaced the resources using indices:
resource "aws_cloudwatch_log_group" "service-log-groups" {
name = "${element(var.image_names, count.index)}-log-group"
count = length(var.image_names)
tags = module.tagging.tags
}
resource "aws_cloudwatch_log_stream" "service-log-streams" {
name = "${element(var.image_names, count.index)}-log-stream"
log_group_name = aws_cloudwatch_log_group.service-log-groups[count.index].name
count = length(var.image_names)
}
The problem here is that when I run terraform apply, I get 4 resources to add and 4 resources to destroy. I tested this with an old log group and saw that all my logs were wiped (obviously, since the log group was destroyed).
The names and other attributes of the log groups/streams are identical; I'm simply refactoring the infrastructure code to be more maintainable. How can I keep my existing log groups, without deleting them, and still refactor my code to use lists?
You'll need to move the existing resources within the Terraform state.
Try running terraform show to get the addresses under which the resources are stored; these will look something like [module.xyz.]aws_cloudwatch_log_group.image1_log ...
You can move it with terraform state mv [module.xyz.]aws_cloudwatch_log_group.image1_log '[module.xyz.]aws_cloudwatch_log_group.service-log-groups[0]'.
You can choose which index to assign to each resource by changing [0] accordingly.
Delete the old resource definition for each moved resource, as Terraform would otherwise try to create a new group/stream.
Try it with the first move and check with terraform plan whether the resource was moved correctly...
Also check whether you need to choose a specific index for the image_names list, just to be sure, but I think that won't be necessary.
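Concretely, for the resources in the question that would look something like the following (drop the module.xyz. prefix if these live in the root module); the indices follow the order of var.image_names:

# Move each old single resource onto the corresponding index of the new counted resource.
terraform state mv 'aws_cloudwatch_log_group.image1_log' 'aws_cloudwatch_log_group.service-log-groups[0]'
terraform state mv 'aws_cloudwatch_log_group.image2_log' 'aws_cloudwatch_log_group.service-log-groups[1]'
terraform state mv 'aws_cloudwatch_log_stream.image1_stream' 'aws_cloudwatch_log_stream.service-log-streams[0]'
terraform state mv 'aws_cloudwatch_log_stream.image2_stream' 'aws_cloudwatch_log_stream.service-log-streams[1]'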