I'm working on a Terraform module to deploy our various AWS data lake services. I occasionally (not always) hit an error when deploying a table that we have set up as a governed table in Lake Formation.
The error comes from the AWS provider and reads along the lines of "Resource was present, now absent". When I run apply again I get an error because the resource actually exists in AWS from the initial run, but Terraform is confused and thinks it wasn't created. I've tried adding depends_on blocks to upstream resources, but that hasn't had the desired effect so far.
Are there any workarounds or solutions folks have encountered to an issue like this?
resource "aws_glue_catalog_table" "governed_table" {
name = "orders"
database_name = "sales"
table_type = "GOVERNED"
partition_keys {
name = "country"
type = "string"
}
storage_descriptor {
input_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
output_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"
location = "s3://${var.s3_bucket}/bronze/sales/orders/"
ser_de_info {
name = "sales_orders"
parameters = {
"serialization.format" = "1"
}
serialization_library = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
}
columns {
name = "sale_id"
type = "string"
comment = ""
}
columns {
name = "sale_amount"
type = "float"
comment = ""
}
columns {
name = "sale_date"
type = "int"
comment = ""
}
}
}
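For reference, the depends_on attempt mentioned above looked roughly like this; the upstream resource names here are hypothetical stand-ins for our Lake Formation registration and Glue database:

resource "aws_glue_catalog_table" "governed_table" {
  # ... arguments as above ...

  # Hypothetical upstream resources; ours differ in name.
  depends_on = [
    aws_lakeformation_resource.data_lake_location,
    aws_glue_catalog_database.sales,
  ]
}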
I'm using the following local values to pass a map of account_id and account_type to my child module, which creates Patch Manager resources based on account_type.
locals {
  org_sub_accounts_map = zipmap(module.accounts.account_id, module.accounts.tags_all.*.AccountType)

  org_sub_accounts = [
    for k, v in local.org_sub_accounts_map : {
      id   = k
      type = v
    }
  ]
}
module "ssm_patch_manager" {
source = "../../../../modules/aws/cloudformation/stacksets"
accounts = local.org_sub_accounts
account_exception_list = var.account_exception_list
regions = var.region_list
stackset_name = "SSM-PatchManager"
template = "ssm_patch_manager"
parameters = var.patch_manager_default_params
parameter_overrides = var.patch_manager_params_overrides
stackset_admin_role_arn = module.stackset_admin_role.role_arn
depends_on = [module.accounts]
}
local.org_sub_accounts is something like this:
org_sub_accounts = [
  {
    "id"   = "111111111111"
    "type" = "Dev"
  },
  {
    "id"   = "222222222222"
    "type" = "Prod"
  },
  {
    "id"   = "33333333333"
    "type" = "Dev"
  },
]
This works fine with all the existing AWS accounts, since Terraform already knows the account IDs. The problem: when I create a new AWS account from module.accounts and then run terraform plan, I get the error below:
Error: Invalid for_each argument
on ../../../../modules/aws/cloudformation/stacksets/main.tf line 25, in resource "aws_cloudformation_stack_set_instance" "stack":
25: for_each = {
26: for stack_instance in local.instance_data : "${stack_instance.account}.${stack_instance.region}" => stack_instance if contains(var.account_exception_list, stack_instance.account) == false
27: }
├────────────────
│ local.instance_data will be known only after apply
│ var.account_exception_list is list of string with 1 element
The "for_each" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the for_each depends on.
I understand this is because Terraform doesn't know the account_id when evaluating the local values. Can anyone suggest a way to resolve this?
Please note, this solution is already implemented; we only discovered the problem later, when we tried to create a new account. So any suggestion that avoids major structural changes to the code would be really helpful.
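To make the constraint concrete: for_each map keys must be known at plan time, while the values may be unknown. A minimal sketch of the distinction (illustrative only; it assumes the number of accounts is known at plan time even when a new account's ID is not):

# Fails at plan time when a new account's ID is not yet known,
# because the unknown ID becomes part of the map key.
for_each = { for a in local.org_sub_accounts : a.id => a }

# Plans fine, because the keys are list indexes known at plan time,
# even though the ID values are unknown.
for_each = { for i, a in local.org_sub_accounts : "account-${i}" => a }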
Update
instance_data is a local value in the child module, shown here with the resource that consumes it:
locals {
  instance_data = flatten([
    for account in var.accounts : [
      for region in var.regions : {
        account = account.id
        type    = try(length(account.type), 0) > 0 ? account.type : "default" # Supports the legacy var.accounts input, where the 'type' key is not passed.
        region  = region
      }
    ]
  ])
}

resource "aws_cloudformation_stack_set_instance" "stack" {
  for_each = {
    for stack_instance in local.instance_data :
    "${stack_instance.account}.${stack_instance.region}" => stack_instance
    if contains(var.account_exception_list, stack_instance.account) == false
  }

  account_id          = each.value.account
  region              = each.value.region
  parameter_overrides = lookup(var.parameter_overrides, each.value.type, null) # Different parameters based on the 'AccountType' tag on sub accounts.
  stack_set_name      = aws_cloudformation_stack_set.stackset.name
}
I finally came up with the solution below. Posting it here in case anyone needs to reference it in the future.
data "aws_organizations_resource_tags" "account" {
count = length(module.organization.non_master_accounts.*.id)
resource_id = module.organization.non_master_accounts.*.id[count.index]
}
# “for_each” value in child module depends on resource attributes (AWS account_id) that cannot be determined until terraform apply, so Terraform cannot predict how many instances will be created.
# Because of this, we use `aws_organizations_resource_tags` data source to create stacksets, instead of module outputs.
locals {
org_sub_accounts = [for account in data.aws_organizations_resource_tags.account : {
id = account.id
type = try(length(account.tags.AccountType), 0) > 0 ? account.tags.AccountType : "default" # In case AccountType tag missing in existing/invited accounts.
}
]
}
# Output in Organization module
output "non_master_accounts" {
value = aws_organizations_organization.root.non_master_accounts
}
UPDATE
I got the variable working, and it passes terraform plan with flying colors. That said, when I run terraform apply I get a new error:
creating CodePipeline (dev-mgt-mytest-cp): ValidationException: 2
validation errors detected: Value at
'pipeline.stages.1.member.actions.1.member.configuration' failed to
satisfy constraint: Map value must satisfy constraint: [Member must
have length less than or equal to 50000, Member must have length
greater than or equal to 1]; Value at
'pipeline.stages.2.member.actions.1.member.configuration' failed to
satisfy constraint: Map value must satisfy constraint: [Member must
have length less than or equal to 50000, Member must have a length
greater than or equal to 1]
I don't believe this is a CodePipeline limit, since I have built this pipeline manually without dynamic stages and it works fine. I'm not sure whether it's a Terraform hard limit. Looking for some help here. Also, I have updated the code with the working variable for those looking for the syntax.
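My current suspicion (unverified) is that it isn't a length limit at all: the configuration map in my dynamic block fills unset keys with lookup(..., null) defaults, and those may reach the API as empty values, which would trip the "length greater than or equal to 1" constraint. One way to test that would be to filter out the nulls so each action only sends the keys it actually sets, e.g.:

configuration = {
  for k, v in {
    RepositoryName       = lookup(action.value, "repository_name", null)
    ProjectName          = lookup(action.value, "ProjectName", null)
    BranchName           = lookup(action.value, "branch_name", null)
    PollForSourceChanges = lookup(action.value, "poll_for_sourcechanges", null)
    OutputArtifactFormat = lookup(action.value, "ouput_format", null)
  } : k => v if v != null
}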
OLD POST
================================================================
I am taking my first stab at creating dynamic stages and really struggling with the documentation out there. What I have put together so far is based on articles found here on Stack Overflow and a few resources online. I think the syntax is good so far, but the value I am passing from my main.tf is getting an error:
The given value is not suitable for the module.test_code.var.stages declared at ../dynamic_pipeline/variables.tf:60,1-18: all list elements must have the same type.
Part 1
All I am basically trying to do is pass dynamic stages into the pipeline. Once I get the stages working, I will add the new dynamic variables. I am providing the dynamic module, the variables.tf for the module, and then my test run along with its variables.
dynamic_pipeline.tf
resource "aws_codepipeline" "cp_plan_pipeline" {
name = "${local.cp_name}-cp"
role_arn = var.cp_run_role
artifact_store {
type = var.cp_artifact_type
location = var.cp_artifact_bucketname
}
dynamic "stage" {
for_each = [for s in var.stages : {
name = s.name
action = s.action
} if(lookup(s, "enabled", true))]
content {
name = stage.value.name
dynamic "action" {
for_each = stage.value.action
content {
name = action.value["name"]
owner = action.value["owner"]
version = action.value["version"]
category = action.value["category"]
provider = action.value["provider"]
run_order = lookup(action.value, "run_order", null)
namespace = lookup(action.value, "namespace", null)
region = lookup(action.value, "region", data.aws_region.current.name)
input_artifacts = lookup(action.value, "input_artifacts", [])
output_artifacts = lookup(action.value, "output_artifacts", [])
configuration = {
RepositoryName = lookup(action.value, "repository_name", null)
ProjectName = lookup(action.value, "ProjectName", null)
BranchName = lookup(action.value, "branch_name", null)
PollForSourceChanges = lookup(action.value, "poll_for_sourcechanges", null)
OutputArtifactFormat = lookup(action.value, "ouput_format", null)
}
}
}
}
}
}
variables.tf
#---------------------------------------------------------------------------------------------------
# General
#---------------------------------------------------------------------------------------------------
variable "region" {
  type        = string
  description = "The AWS Region to be used when deploying region-specific resources (Default: us-east-1)"
  default     = "us-east-1"
}

#---------------------------------------------------------------------------------------------------
# CODEPIPELINE VARIABLES
#---------------------------------------------------------------------------------------------------
variable "cp_name" {
  type        = string
  description = "The name of the CodePipeline"
}

variable "cp_repo_name" {
  type        = string
  description = "The name of the repo that will be used as a source repo to trigger builds"
}

variable "cp_branch_name" {
  type        = string
  description = "The branch of the repo that will be watched and used to trigger deployment"
  default     = "development"
}

variable "cp_artifact_bucketname" {
  type        = string
  description = "Name of the artifact bucket where artifacts are stored."
  default     = "Codepipeline-artifacts-s3"
}

variable "cp_run_role" {
  type        = string
  description = "ARN of the IAM role the pipeline runs as."
}

variable "cp_artifact_type" {
  type        = string
  description = "The artifact store type (Default: S3)"
  default     = "S3"
}

variable "cp_poll_sources" {
  description = "Trigger that lets CodePipeline know that it needs to trigger a build on change"
  type        = bool
  default     = false
}

variable "cp_ouput_format" {
  type        = string
  description = "Output artifact format that is used to save the outputs"
  default     = "CODE_ZIP"
}

variable "stages" {
  type = list(object({
    name = string
    action = list(object({
      name                   = string
      owner                  = string
      version                = string
      category               = string
      provider               = string
      run_order              = number
      namespace              = string
      region                 = string
      input_artifacts        = list(string)
      output_artifacts       = list(string)
      repository_name        = string
      ProjectName            = string
      branch_name            = string
      poll_for_sourcechanges = bool
      output_format          = string
    }))
  }))
  description = "This list describes each stage of the build"
}

#---------------------------------------------------------------------------------------------------
# ENVIRONMENT VARIABLES
#---------------------------------------------------------------------------------------------------
variable "env" {
  type        = string
  description = "The environment to deploy resources (dev | test | prod | sbx)"
  default     = "dev"
}

variable "tenant" {
  type        = string
  description = "The Service Tenant in which the IaC is being deployed"
  default     = "dummytenant"
}

variable "project" {
  type        = string
  description = "The Project Name or Acronym. (Note: You should consider setting this in your Environment Variables.)"
}

#---------------------------------------------------------------------------------------------------
# Parameter Store Variables
#---------------------------------------------------------------------------------------------------
variable "bucketlocation" {
  type        = string
  description = "Location within the S3 bucket where the state file resides"
}
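(Side note for anyone landing here on a newer Terraform: from 1.3 onward, object type attributes can be marked optional(), which lets list elements omit keys without tripping "all list elements must have the same type". A minimal sketch, trimmed to two attributes:)

variable "stages" {
  type = list(object({
    name = string
    action = list(object({
      name      = string
      run_order = optional(number) # may be omitted per action on Terraform >= 1.3
    }))
  }))
}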
Part 2
That is the main makeup of the pipeline. Below is the module call I created as a test to make sure it works. This is where I am getting the error.
main.tf
module "test_code" {
  source = "../dynamic_pipeline"

  cp_name        = "dynamic-actions"
  project        = "my_project"
  bucketlocation = var.backend_bucket_target_name
  cp_run_role    = "arn:aws:iam::xxxxxxxxx:role/cp-deploy-service-role"
  cp_repo_name   = var.repo

  stages = [
    {
      name = "part 1"
      action = [{
        name                   = "Source"
        owner                  = "AWS"
        version                = "1"
        category               = "Source"
        provider               = "CodeCommit"
        run_order              = 1
        repository_name        = "my_target_repo"
        branch_name            = "main"
        poll_for_sourcechanges = true
        output_artifacts       = ["CodeWorkspace"]
        ouput_format           = var.cp_ouput_format
      }]
    },
    {
      name = "part 2"
      action = [{
        run_order        = 1
        name             = "Combine_Binaries"
        owner            = "AWS"
        version          = "1"
        category         = "Build"
        provider         = "CodeBuild"
        namespace        = "BIN"
        input_artifacts  = ["CodeWorkspace"]
        output_artifacts = ["CodeSource"]
        ProjectName      = "test_runner"
      }]
    },
  ]
}
The variables file associated with the test run:
variables.tf
#---------------------------------------------------------------------------------------------------
# CODEPIPELINE VARIABLES
#---------------------------------------------------------------------------------------------------
variable "cp_branch_name" {
  type        = string
  description = "The branch of the repo that will be watched and used to trigger deployment"
  default     = "development"
}

variable "cp_poll_sources" {
  description = "Trigger that lets CodePipeline know that it needs to trigger a build on change"
  type        = bool
  default     = false
}

variable "cp_ouput_format" {
  type        = string
  description = "Output artifact format that is used to save the outputs. Values can be CODEBUILD_CLONE_REF or CODE_ZIP"
  default     = "CODE_ZIP"
}

variable "backend_bucket_target_name" {
  type        = string
  description = "The folder name where the state file is stored for the pipeline"
  default     = "dynamic-test-pl"
}

variable "repo" {
  type        = string
  description = "Name of the repo the pipeline is managing"
  default     = "my_target_repo"
}
I know this is my first attempt, and I am not very good with lists and maps in Terraform, but I am fairly certain it has to do with the way I am passing the value in. Any help or guidance would be appreciated.
After some time, I finally found the answer to this issue. Special thanks to this thread on GitHub, which put me in the right direction. A couple of takeaways: variable declaration is the essential part of a dynamic pipeline. I worked with several examples that yielded great results for stages and actions, but they all crashed once the configuration environment variables came into play. The root problem, as far as I can tell, is that you cannot build dynamic actions with environment variables and expect Terraform to perform the JSON translation for you. In some cases it would work, but it required every action to contain similar elements, which led to character constraints and errors like the one my post called out.
My best guess is that Terraform has hard limits on variables and their character lengths. The solution: declare the resource content dynamically, which appears to be subject to different limits than traditional variables within a resource. The approach taken here makes the entire Terraform resource a dynamic attribute, which I suspect Terraform treats differently in its entirety, with fewer limits (an assumption). I say that because I tried four methods of dynamic stages and actions. Those methods worked up until I introduced the environment variables (which force a JSON conversion on a specific resource type), and then I would get various errors, all pointing at either a missing or unsupported attribute, or a variable exceeding Terraform's character limits.
What worked was creating the entire resource as a dynamic resource, which I could pass in as a map attribute that includes the EnvironmentVariables. See the examples below.
Final Dynamic Pipeline
resource "aws_codepipeline" "codepipeline" {
for_each = var.code_pipeline
name = "${local.name_prefix}-${var.AppName}"
role_arn = each.value["code_pipeline_role_arn"]
tags = {
Pipeline_Key = each.key
}
artifact_store {
type = lookup(each.value, "artifact_store", null) == null ? "" : lookup(each.value.artifact_store, "type", "S3")
location = lookup(each.value, "artifact_store", null) == null ? null : lookup(each.value.artifact_store, "artifact_bucket", null)
}
dynamic "stage" {
for_each = lookup(each.value, "stages", {})
iterator = stage
content {
name = lookup(stage.value, "name")
dynamic "action" {
for_each = lookup(stage.value, "actions", {}) //[stage.key]
iterator = action
content {
name = action.value["name"]
category = action.value["category"]
owner = action.value["owner"]
provider = action.value["provider"]
version = action.value["version"]
run_order = action.value["run_order"]
input_artifacts = lookup(action.value, "input_artifacts", null)
output_artifacts = lookup(action.value, "output_artifacts", null)
configuration = action.value["configuration"]
namespace = lookup(action.value, "namespace", null)
}
}
}
}
}
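One piece not shown above is the module's input variable declaration. For the resource to accept the loosely structured map below, it has to be permissively typed; presumably something along these lines (names taken from the resource above):

variable "code_pipeline" {
  description = "Map of pipeline definitions; each entry carries its own stages, actions, and configuration maps"
  type        = any # deliberately loose so nested stages/actions can vary per pipeline
}

variable "AppName" {
  type        = string
  description = "Application name appended to the pipeline name"
}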
Calling Dynamic Pipeline
module "code_pipeline" {
source = "../module-aws-codepipeline" #using module locally
#source = "your-github-repository/aws-codepipeline" #using github repository
AppName = "My_new_pipeline"
code_pipeline = local.code_pipeline
}
Sample local pipeline variable
locals {
  /*
  DECLARE environment variables. Note: not every action requires environment variables.
  */
  action_second_stage_variables = [
    {
      name  = "PIPELINE_EXECUTION_ID"
      type  = "PLAINTEXT"
      value = "#{codepipeline.PipelineExecutionId}"
    },
    {
      name  = "NamespaceVariable"
      type  = "PLAINTEXT"
      value = "some_value"
    },
  ]

  action_third_stage_variables = [
    {
      name  = "PL_VARIABLE_1"
      type  = "PLAINTEXT"
      value = "VALUE1"
    },
    {
      name  = "PL_VARIABLE_2"
      type  = "PLAINTEXT"
      value = "VALUE2"
    },
    {
      name  = "PL_VARIABLE_3"
      type  = "PLAINTEXT"
      value = "VALUE3"
    },
    {
      name  = "PL_VARIABLE_4"
      type  = "PLAINTEXT"
      value = "#{BLD.NamespaceVariable}"
    },
  ]

  /*
  BUILD YOUR STAGES
  */
  code_pipeline = {
    codepipeline-configs = {
      code_pipeline_role_arn = "arn:aws:iam::aws_account_name:role/role_name"

      artifact_store = {
        type            = "S3"
        artifact_bucket = "your-aws-bucket-name"
      }

      stages = {
        stage_1 = {
          name = "Download"
          actions = {
            action_1 = {
              run_order        = 1
              category         = "Source"
              name             = "First_Stage"
              owner            = "AWS"
              provider         = "CodeCommit"
              version          = "1"
              output_artifacts = ["download_output"]
              configuration = {
                RepositoryName       = "Codecommit_target_repo"
                BranchName           = "main"
                PollForSourceChanges = true
                OutputArtifactFormat = "CODE_ZIP"
              }
            }
          }
        }
        stage_2 = {
          name = "Build"
          actions = {
            action_1 = {
              run_order        = 2
              category         = "Build"
              name             = "Second_Stage"
              owner            = "AWS"
              provider         = "CodeBuild"
              version          = "1"
              namespace        = "BLD"
              input_artifacts  = ["download_output"]
              output_artifacts = ["build_outputs"]
              configuration = {
                ProjectName          = "codebuild_project_name_for_second_stage"
                EnvironmentVariables = jsonencode(local.action_second_stage_variables)
              }
            }
          }
        }
        stage_3 = {
          name = "Validation"
          actions = {
            action_1 = {
              run_order        = 1
              name             = "Third_Stage"
              category         = "Build"
              owner            = "AWS"
              provider         = "CodeBuild"
              version          = "1"
              input_artifacts  = ["build_outputs"]
              output_artifacts = ["validation_outputs"]
              configuration = {
                ProjectName          = "codebuild_project_name_for_third_stage"
                EnvironmentVariables = jsonencode(local.action_third_stage_variables)
              }
            }
          }
        }
      }
    }
  }
}
The trick is building your CodePipeline resource, with its stages and actions, at the local level. In your local.tf you build out the pipeline variable with all of its stages, actions, and EnvironmentVariables. The EnvironmentVariables are converted with jsonencode and passed directly inside the configuration map, so everything arrives as a single variable type. A sample explaining this approach can be found in this GitHub repository, where I consolidated and documented the findings so others can leverage this method.
I'm using Terraform with GCP ... I have a groups variable that I have not been able to get to work. Here are the definitions:
resource "google_compute_instance_group" "vm_group" {
name = "vm-group"
zone = "us-central1-c"
project = "myproject-dev"
instances = [google_compute_instance.east_vm.id, google_compute_instance.west_vm.id]
named_port {
name = "http"
port = "8080"
}
named_port {
name = "https"
port = "8443"
}
lifecycle {
create_before_destroy = true
}
}
data "google_compute_image" "debian_image" {
family = "debian-9"
project = "debian-cloud"
}
resource "google_compute_instance" "west_vm" {
name = "west-vm"
project = "myproject-dev"
machine_type = "e2-micro"
zone = "us-central1-c"
boot_disk {
initialize_params {
image = data.google_compute_image.debian_image.self_link
}
}
network_interface {
network = "default"
}
}
resource "google_compute_instance" "east_vm" {
name = "east-vm"
project = "myproject-dev"
machine_type = "e2-micro"
zone = "us-central1-c"
boot_disk {
initialize_params {
image = data.google_compute_image.debian_image.self_link
}
}
network_interface {
network = "default"
}
}
And here are the variables:
http_forward   = true
https_redirect = true
create_address = true
project        = "myproject-dev"

backends = {
  "yobaby" = {
    description             = "my app"
    enable_cdn              = false
    security_policy         = ""
    custom_request_headers  = null
    custom_response_headers = null

    iap_config = {
      enable               = false
      oauth2_client_id     = ""
      oauth2_client_secret = ""
    }

    log_config = {
      enable      = false
      sample_rate = 0
    }

    groups = [{ group = "google_compute_instance_group.vm_group.id" }]
  }
}
... this is my latest attempt to get a group value that works, but this one won't work for me either; I still get
Error 400: Invalid value for field 'resource.backends[0].group': 'google_compute_instance_group.vm_group.id'. The URL is malformed., invalid
I've tried this with DNS FQDNs and variations on the syntax above; still no go.
Thanks much for any advice whatsoever!
There are a couple of clues pointing in this direction in the error message reported by Terraform (Error 400: Invalid value for field 'resource.backends[0].group': 'google_compute_instance_group.vm_group.id'. The URL is malformed., invalid):
Error code 400 means the request was actually sent to the server, which rejected it as malformed (HTTP 400 is a client-side error). This implies that Terraform itself had no problem with the syntax; the configuration file is correct and actionable from Terraform's point of view.
The value of the field resource.backends[0].group is reported as the literal string 'google_compute_instance_group.vm_group.id', which strongly suggests that a variable substitution did not take place.
The quotes around the reference make it a literal string value instead of a resource reference. The solution is to change this:
groups = [{group = "google_compute_instance_group.vm_group.id"}]
To this:
groups = [{group = google_compute_instance_group.vm_group.id}]
I gave up on Terraform and used gcloud scripts to do what I needed to do, based on this posting.
I am attaching a tag template to a column of a BigQuery table. For this I am using Terraform, and I have essentially recreated the code from the Terraform documentation.
resource "google_data_catalog_entry" "entry" {
entry_group = google_data_catalog_entry_group.entry_group.id
entry_id = "my_entry"
user_specified_type = "my_custom_type"
user_specified_system = "SomethingExternal"
schema = <<EOF
{
"columns": [...]
}
EOF
}
resource "google_data_catalog_entry_group" "entry_group" {
entry_group_id = "my_entry_group"
}
resource "google_data_catalog_tag_template" "tag_template" {
tag_template_id = "my_template"
region = "us-central1"
display_name = "Demo Tag Template"
fields {
field_id = "source"
display_name = "Source of data asset"
type {
primitive_type = "STRING"
}
is_required = true
}
force_delete = "true"
}
resource "google_data_catalog_tag" "basic_tag" {
parent = google_data_catalog_entry.entry.id
template = google_data_catalog_tag_template.tag_template.id
fields {
field_name = "source"
string_value = "my-string"
}
column = "address"
}
Docs: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/data_catalog_tag
Unfortunately, whenever I run 'terraform apply' a second time, I get the following API error:
Error: Error updating Tag "projects/xxxx/locations/europe-west2/entryGroups/xxxx/entries/xxxx/tags/xxx": googleapi: Error 400: Unsupported field mask path: "column", supported field masks are:
fields
It is as if Terraform is unhappy updating this resource in place: on the second apply it sends an update whose field mask includes column, which the API rejects (only fields is supported). To avoid this I have used:
count = local.created_tag ? 1 : 0
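In context, that guard sits on the tag resource itself (created_tag is a boolean local I toggle; the name is my own):

resource "google_data_catalog_tag" "basic_tag" {
  count    = local.created_tag ? 1 : 0
  parent   = google_data_catalog_entry.entry.id
  template = google_data_catalog_tag_template.tag_template.id

  fields {
    field_name   = "source"
    string_value = "my-string"
  }

  column = "address"
}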
But I wonder what might be the cause and if there is a better way to solve the issue.
I am getting an error from my Terraform code while deploying a GCP Composer resource:
google_composer_environment.composer-beta: googleapi: Error 400: Property key must be of the form section-name. The section may not contain opening square brackets, closing square brackets or hyphens, and the name may not contain a semicolon or equals sign. The entire property key may not contain periods., badRequest
The issue arises while this GCP resource is being deployed: https://www.terraform.io/docs/providers/google/r/composer_environment.html
This is my code:
Variables.tf file:
variable "composer_airflow_version" {
type = "map"
default = {
image_version="composer-1.6.1-airflow-1.10.1"
}
}
variable "composer_python_version" {
type = "map"
default = {
python_version="3"
}
}
my-composer.tf file:
resource "google_composer_environment" "composer-beta" {
provider= "google-beta"
project = "my-proyect"
name = "${var.composer_name}"
region = "${var.region}"
config {
node_count = "${var.composer_node_count}"
node_config {
zone = "${var.zone}"
machine_type = "${var.composer_machine_type}"
network = "${google_compute_network.network.self_link}"
subnetwork = "${lookup(var.vpc_subnets_01[0], "subnet_name")}"
}
software_config {
airflow_config_overrides="${var.composer_airflow_version}",
airflow_config_overrides="${var.composer_python_version}",
}
}
depends_on = [
"google_service_account.comp-py3-dev-worker",
"google_compute_subnetwork.subnetwork",
]
}
According to the error message, the root cause seems to be the software_config section of the Terraform code. I understood the variables "composer_airflow_version" and "composer_python_version" to be of type "map", so I set them up in map format.
I'd really appreciate it if someone could identify the cause of the error and tell me what adjustment to apply. It is likely that I need to change the variables, but I don't know what the change is. :-(
Thanks in advance,
Jose
Based on the documentation, airflow_config_overrides, pypi_packages, env_variables, image_version, and python_version are separate arguments that sit directly under software_config; image_version and python_version are plain strings, not maps:
Variables.tf file:
variable "composer_airflow_version" {
default = "composer-1.6.1-airflow-1.10.1"
}
variable "composer_python_version" {
default = "3"
}
my-composer.tf file:
resource "google_composer_environment" "composer-beta" {
provider= "google-beta"
project = "my-proyect"
name = "${var.composer_name}"
region = "${var.region}"
config {
node_count = "${var.composer_node_count}"
node_config {
zone = "${var.zone}"
machine_type = "${var.composer_machine_type}"
network = "${google_compute_network.network.self_link}"
subnetwork = "${lookup(var.vpc_subnets_01[0], "subnet_name")}"
}
software_config {
image_version = "${var.composer_airflow_version}",
python_version = "${var.composer_python_version}",
}
}
depends_on = [
"google_service_account.comp-py3-dev-worker",
"google_compute_subnetwork.subnetwork",
]
}