I have a list of servers stored in a locals block:
locals {
  my_list = [
    "server1",
    "server2",
    "server3",
    "server4",
  ]
}
Can I fetch data for each server, such as the instance ID, using the locals above, without defining an individual data block for each server?
Can I then put those attributes in a list? Finally, how would I consume them later, for example in the snippet below, which is for just one server? (The example below is a CloudWatch alarm dimension.)
dimensions = {
  instanceid = data.aws_instance.server1.instance_id
}
You can use the aws_instances data source with an instance-id filter populated from my_list (assuming server1, server2, etc. are instance IDs):
data "aws_instances" "my_instances" {
filter {
name = "instance-id"
values = local.my_list
}
}
If my_list instead contains instance names (Name tags), you can use:
data "aws_instance" "my_instances" {
for_each = toset(local.my_list)
instance_tags = {
Name = each.key
}
}
and to get the list of instance IDs:
values(data.aws_instance.my_instances)[*].id
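If you then want a CloudWatch alarm per server, you can keep iterating with for_each instead of flattening to a list first. A minimal sketch, assuming the for_each-based data "aws_instance" lookup above (the metric, threshold, and alarm name here are placeholders, not from the question):
resource "aws_cloudwatch_metric_alarm" "cpu" {
  for_each = data.aws_instance.my_instances

  alarm_name          = "cpu-high-${each.key}"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  comparison_operator = "GreaterThanThreshold"
  statistic           = "Average"
  period              = 300
  evaluation_periods  = 2
  threshold           = 80

  dimensions = {
    InstanceId = each.value.id # the instance ID for this server
  }
}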
I have the following variable:
variable "mymap" {
type = map(string)
default = {
"key1" = "val1"
"key2" = "val2"
}
}
I am trying to expand this to create individual parameters in this resource:
resource "aws_elasticache_parameter_group" "default" {
name = "cache-params"
family = "redis2.8"
parameter {
name = "activerehashing"
value = "yes"
}
parameter {
name = "min-slaves-to-write"
value = "2"
}
}
My desired state for this example would be:
resource "aws_elasticache_parameter_group" "default" {
name = "cache-params"
family = "redis2.8"
parameter {
name = "key1"
value = "val1"
}
parameter {
name = "key2"
value = "val2"
}
}
I don't see this supported explicitly in the docs; am I even taking the correct approach?
(I'm mainly looking at leveraging the 'dynamic' and 'for_each' keywords, but haven't had any success.)
To achieve the desired state, you have to change a couple of things. One option is to use a dynamic block [1] with for_each [2]. The resource would change to the following:
resource "aws_elasticache_parameter_group" "default" {
name = "cache-params"
family = "redis2.8"
dynamic "parameter" {
for_each = var.mymap
content {
name = parameter.value.name
value = parameter.value.value
}
}
}
However, you would also have to adjust the variable:
variable "mymap" {
type = map(map(string))
description = "Map of parameters for Elasticache."
default = {
"parameter1" = {
"value" = "value1"
"name" = "name1"
}
}
}
Then, you can define the values for the variable mymap in a tfvars file (e.g., terraform.tfvars) like this:
mymap = {
  "parameter1" = {
    "name"  = "activerehashing"
    "value" = "yes"
  }
  "parameter2" = {
    "name"  = "min-slaves-to-write"
    "value" = "2"
  }
}
[1] https://developer.hashicorp.com/terraform/language/expressions/dynamic-blocks
[2] https://developer.hashicorp.com/terraform/language/meta-arguments/for_each
You can use a dynamic block to dynamically declare zero or more nested configuration blocks based on a collection.
resource "aws_elasticache_parameter_group" "default" {
name = "cache-params"
family = "redis2.8"
dynamic "parameter" {
for_each = var.mymap
content {
name = parameter.key
value = parameter.value
}
}
}
The above tells Terraform to generate one parameter block for each element of var.mymap, and to populate the name and value arguments of each generated block based on the key and value from each map element respectively.
The parameter symbol inside the content block represents the current element of the collection. This symbol is by default named after the block type being generated, which is why it was named parameter in this case. It's possible to override that generated name using an additional iterator argument in the dynamic block, but that's necessary only if you are generating multiple levels of nesting where a nested block type has the same name as its container.
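For illustration, a minimal sketch of that iterator override, equivalent to the block above (the name param is arbitrary):
resource "aws_elasticache_parameter_group" "default" {
  name   = "cache-params"
  family = "redis2.8"

  dynamic "parameter" {
    for_each = var.mymap
    iterator = param # rename the iteration symbol from "parameter" to "param"
    content {
      name  = param.key
      value = param.value
    }
  }
}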
I have a resource that creates multiple S3 access points depending on the input provided. The input is a map with the S3 URI as the key and the parsed bucket name as the value.
Example:
{
  "s3://my_bucket/model1.tar.gz" -> "my_bucket",
  "s3://my_bucket_2/model2.tar.gz" -> "my_bucket_2",
  "s3://my_bucket/model3.tar.gz" -> "my_bucket"
}
I then use for_each to iterate through each element in the map to create S3 access points. Unfortunately, there are two "my_bucket" values in the map, which means it will attempt to create an access point for that bucket twice and will error out with the message:
AccessPointAlreadyOwnedByYou: Your previous request to create the named accesspoint succeeded and you already own it.
How can I check that the access point exists first before creating the resource?
Code Example:
resource "aws_s3_access_point" "s3_access_point" {
for_each = var.create ? local.uri_bucket_map : {}
bucket = each.value
name = format("%s-%s", each.value, "access-point")
}
output "s3_access_point_arn" {
description = "The arn of the access point"
value = { for uri, ap in aws_s3_access_point.s3_access_point : uri => ap.arn }
}
Desired Output:
{
  "s3://my_bucket/model1.tar.gz" -> <access point uri>,
  "s3://my_bucket_2/model2.tar.gz" -> <access point uri>,
  "s3://my_bucket/model3.tar.gz" -> <access point uri>
}
I would invert your uri_bucket_map:
locals {
  uri_bucket_map_inverse = {
    # the trailing ... groups duplicate keys, so each bucket maps to a list of URIs
    for k, v in local.uri_bucket_map : v => k...
  }
}
giving:
{
  "my_bucket" = [
    "s3://my_bucket/model1.tar.gz",
    "s3://my_bucket/model3.tar.gz",
  ]
  "my_bucket_2" = [
    "s3://my_bucket_2/model2.tar.gz",
  ]
}
then just create access points as:
resource "aws_s3_access_point" "s3_access_point" {
for_each = var.create ? local.uri_bucket_map_inverse : {}
bucket = each.key
name = format("%s-%s", each.key, "access-point")
}
and the output would use both the access points and the inverted map to map each original URI back to its access point ARN:
output "s3_access_point_arn" {
description = "The arn of the access point"
value = merge([for bucket_name, ap in aws_s3_access_point.s3_access_point:
{ for uri in local.uri_bucket_map_inverse[bucket_name]:
uri => ap.arn
}
]...)
}
I have two conditions that need to be fulfilled:
- Grant users permission to a specific project ID based on env, for example my-project-{env} (env: stg/prd).
- Loop over the variables instead of writing a repetitive resource for each user.
Example:
variable "some_ext_users" {
  type = map(any)
  default = {
    user_1 = { email_id = "user_1#gmail.com" }
    user_2 = { email_id = "user_2#gmail.com" }
  }
}
To avoid a repetitive resource for each user (imagine 100+ users), I decided to list them in a variable as written above.
Then I'd like to assign these users GCS permissions, e.g.:
resource "google_storage_bucket_iam_member" "user_email_access" {
for_each = var.some_ext_users
count = var.env == "stg" ? 1 : 0
provider = google-beta
bucket = "my-bucketttt"
role = "roles/storage.objectViewer"
member = "user:${each.value.email_id}"
}
The error I'm getting is clear:
Error: Invalid combination of "count" and "for_each" on
../../../modules/my-tf.tf line 54, in resource
"google_storage_bucket_iam_member" "user_email_access": 54:
for_each = var.some_ext_users The "count" and "for_each"
meta-arguments are mutually-exclusive, only one should be used to be
explicit about the number of resources to be created.
My question is, what is the workaround in order to satisfy the requirements above if count and for_each can't be used together?
You could control the user list according to the environment, rather than trying to control the resource. So, something like this:
resource "google_storage_bucket_iam_member" "user_email_access" {
for_each = var.env == "stg" ? var.some_ext_users : {}
provider = google-beta
bucket = "my-bucketttt"
role = "roles/storage.objectViewer"
member = "user:${each.value.email_id}"
}
The rule for for_each is to assign it a map that has one element per instance you want to declare, so the best way to think about your requirement here is that you need to write an expression that produces a map with zero elements when your condition doesn't hold.
The usual way to project and filter collections in Terraform is for expressions, and indeed we can use a for expression with an if clause to conditionally filter out unwanted elements, which in this particular case will be all of the elements:
resource "google_storage_bucket_iam_member" "user_email_access" {
for_each = {
for name, user in var.some_ext_users : name => user
if var.env == "stg"
}
# ...
}
Another possible way to structure this would be to include the environment keywords as part of the data structure, which would keep all of the information in one spot and potentially allow you to have entries that apply to more than one environment at once:
variable "some_ext_users" {
type = map(object({
email_id = string
environments = set(string)
}))
default = {
user_1 = {
email_id = "user_1#gmail.com"
environments = ["stg"]
}
user_2 = {
email_id = "user_2#gmail.com"
environments = ["stg", "prd"]
}
}
}
resource "google_storage_bucket_iam_member" "user_email_access" {
for_each = {
for name, user in var.some_ext_users : name => user
if contains(user.environments, var.env)
}
# ...
}
This is a variation of the example in the "Filtering Elements" section of the for expressions documentation, which uses an is_admin flag in order to declare different resources for admin users vs. non-admin users. In this case, notice that the if clause refers to the symbols declared in the for expression, which means we can now get a different result for each element of the map, whereas the first example either kept all elements or no elements.
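For reference, a minimal sketch of that documentation pattern, assuming a hypothetical is_admin attribute on each user (not part of the question's variable):
variable "users" {
  type = map(object({
    is_admin = bool
  }))
}

locals {
  # split the map so admin and non-admin users can feed different resources
  admin_users = {
    for name, user in var.users : name => user
    if user.is_admin
  }
  regular_users = {
    for name, user in var.users : name => user
    if !user.is_admin
  }
}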
Given the data source definition:
data "aws_ami" "my_ami" {
filter {
name = "name"
values = ["my_ami_name"]
}
}
How does one add a second filter only if a condition is true?
Example pseudo code of what I want:
data "aws_ami" "my_ami" {
filter {
name = "name"
values = ["my_ami_name"]
}
var.state ? filter {
name = "state"
values = [var.state]
} : pass
}
The second filter would only be used if the state variable has content.
Note that I don't want to use an 'N/A' value to always apply the second filter regardless of whether it's needed.
You can use dynamic blocks. The exact condition depends on how var.state is defined (it isn't shown, so I don't know its type), but in general you can do:
data "aws_ami" "my_ami" {
filter {
name = "name"
values = ["my_ami_name"]
}
dynamic "filter" {
for_each = var.state ? [1] : []
content {
name = "state"
values = [var.state]
}
}
}
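If var.state is a string that is empty when unset (an assumption, since its definition isn't shown), write the condition as an explicit comparison, because Terraform only converts the strings "true" and "false" to booleans:
data "aws_ami" "my_ami" {
  filter {
    name   = "name"
    values = ["my_ami_name"]
  }

  dynamic "filter" {
    # produce one filter block when var.state is non-empty, none otherwise
    for_each = var.state != "" ? [1] : []
    content {
      name   = "state"
      values = [var.state]
    }
  }
}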
I have a Kinesis Firehose configuration in Terraform which reads data from a Kinesis stream as JSON, converts it to Parquet using Glue, and writes to S3.
There is something wrong with the data format conversion and I am getting the below error (with some details removed):
{"attemptsMade":1,"arrivalTimestamp":1624541721545,"lastErrorCode":"DataFormatConversion.InvalidSchema","lastErrorMessage":"The
schema is invalid. The specified table has no columns.","attemptEndingTimestamp":1624542026951,"rawData":"xx","sequenceNumber":"xx","subSequenceNumber":null,"dataCatalogTable":{"catalogId":null,"databaseName":"db_name","tableName":"table_name","region":null,"versionId":"LATEST","roleArn":"xx"}}
The Terraform configuration for the Glue table I am using is as follows:
resource "aws_glue_catalog_table" "stream_format_conversion_table" {
name = "${var.resource_prefix}-parquet-conversion-table"
database_name = aws_glue_catalog_database.stream_format_conversion_db.name
table_type = "EXTERNAL_TABLE"
parameters = {
EXTERNAL = "TRUE"
"parquet.compression" = "SNAPPY"
}
storage_descriptor {
location = "s3://${element(split(":", var.bucket_arn), 5)}/"
input_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
output_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"
ser_de_info {
name = "my-stream"
serialization_library = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
parameters = {
"serialization.format" = 1
}
}
columns {
name = "metadata"
type = "struct<tenantId:string,env:string,eventType:string,eventTimeStamp:timestamp>"
}
columns {
name = "eventpayload"
type = "struct<operation:string,timestamp:timestamp,user_name:string,user_id:int,user_email:string,batch_id:string,initiator_id:string,initiator_email:string,payload:string>"
}
}
}
What needs to change here?
I faced the "The schema is invalid. The specified table has no columns" with the following combination:
avro schema in Glue schema registry,
glue table created through console using "Add table from existing schema"
kinesis data firehose configured with Parquet conversion and referencing the glue table created from the schema registry.
It turns out that KDF is unable to read the table's schema if the table is created from an existing schema. The table has to be created from scratch (as opposed to "Add table from existing schema"). This isn't documented ... for now.
In addition to the answer from mberchon I found that the default generated policy for the Kinesis Delivery Stream did not include the necessary IAM permissions to actually read the schema.
I had to manually modify the IAM policy to include glue:GetSchema and glue:GetSchemaVersion.
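As a rough sketch of the kind of statement I added (the resource name and role reference here are assumptions, not the actual generated policy; scope the Resource down to your own registry and schema ARNs):
resource "aws_iam_role_policy" "firehose_glue_schema_access" {
  name = "glue-schema-access"          # hypothetical name
  role = aws_iam_role.firehose_role.id # assumes the IAM role attached to the delivery stream

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["glue:GetSchema", "glue:GetSchemaVersion"]
        Resource = "*" # narrow this to your Glue registry/schema ARNs
      }
    ]
  })
}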
Frustrated by having to manually define columns, I wrote a little Python tool that takes a Pydantic class (it could be made to work with JSON Schema too) and generates a JSON file that can be used with Terraform to create the table.
https://github.com/nanit/j2g
from pydantic import BaseModel
from typing import List

class Bar(BaseModel):
    name: str
    age: int

class Foo(BaseModel):
    nums: List[int]
    bars: List[Bar]
    other: str
gets converted to
{
  "nums": "array<int>",
  "bars": "array<struct<name:string,age:int>>",
  "other": "string"
}
and can be used in Terraform like so:
locals {
  columns = jsondecode(file("${path.module}/glue_schema.json"))
}

resource "aws_glue_catalog_table" "table" {
  name          = "table_name"
  database_name = "db_name"

  storage_descriptor {
    dynamic "columns" {
      for_each = local.columns
      content {
        name = columns.key
        type = columns.value
      }
    }
  }
}
Thought I'd post here as I was facing the same problem and found a workaround that appears to work.
As stated above, AWS does not allow you to use tables generated from an existing schema for Firehose data format conversion. That said, if you are using Terraform you can create the first table from the existing schema, then use that table's columns attribute to create a second table, and point the Firehose data format conversion configuration at the second table. I can confirm this works.
The Terraform for the tables:
resource "aws_glue_catalog_table" "aws_glue_catalog_table_from_schema" {
name = "first_table"
database_name = "foo"
storage_descriptor {
schema_reference {
schema_id {
schema_arn = aws_glue_schema.your_glue_schema.arn
}
schema_version_number = aws_glue_schema.your_glue_schema.latest_schema_version
}
}
}
resource "aws_glue_catalog_table" "aws_glue_catalog_table_from_first_table" {
name = "second_table"
database_name = "foo"
storage_descriptor {
dynamic "columns" {
for_each = aws_glue_catalog_table.aws_glue_catalog_table_from_schema.storage_descriptor[0].columns
content {
name = columns.value.name
type = columns.value.type
}
}
}
}
The Firehose data format conversion configuration:
data_format_conversion_configuration {
  output_format_configuration {
    serializer {
      parquet_ser_de {}
    }
  }

  input_format_configuration {
    deserializer {
      hive_json_ser_de {}
    }
  }

  schema_configuration {
    database_name = aws_glue_catalog_table.aws_glue_catalog_table_from_first_table.database_name
    role_arn      = aws_iam_role.firehose_role.arn
    table_name    = aws_glue_catalog_table.aws_glue_catalog_table_from_first_table.name
  }
}
}