I'm trying to write an internal module using Terraform 0.13 that allows for the caller to choose one or more prewritten policy documents at call time. What I'd like to do is define each policy as a data.iam_policy_document, and conditionally include/merge them into the resulting policy as multiple statements. None of the examples I've found seem to quite do this, and most of the IAM related modules in the registry just rely on the parent module passing the complete policy statement, but my goal is for the user of the module to not need to understand how to write proper IAM policies.
My thought was that the easiest way would be to merge the .json versions of the policy documents and pass the result to the aws_iam_policy resource, but that didn't seem to work well with having each policy document controlled via a count ternary, and I realize this is maybe the wrong approach entirely.
The desired result of using the module is the creation of a single role, with an appropriate trust policy, that has access to the chosen group of services, without creating any unused and unneeded resources (extra policies that remain unattached, etc.).
The aws_iam_policy_document data source is primarily for defining entirely new policies. For this sort of task of wrangling existing policies (which may or may not have been created with aws_iam_policy_document, I suppose), I think it would be easier to decode the policy JSON using jsondecode and then work with the resulting data structures before merging the result back together again.
That could get complicated if the policies can be interdependent or conflict with one another, but if you can assume that all of the policies are independent of each other then you can simply concatenate the Statement arrays from each document.
For example:
variable "iam_policies_json" {
type = list(string)
}
locals {
iam_policies = [for src in var.iam_policies_json : jsondecode(src)]
iam_policy_statements = flatten([
for policy in local.iam_policies : policy.Statement
])
merged_policy = jsonencode({
Version = "2012-10-17"
Statement = local.iam_policy_statements
})
}
The above just unconditionally merges all of them together, but once you have data structures like the local.iam_policy_statements here you can potentially use other Terraform expression constructs, such as for expressions with if clauses, to conditionally filter out any policies you don't want to include in the result.
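For illustration, here is a minimal sketch of that filtering idea. It assumes a hypothetical iam_policy_documents map (policy name => policy JSON) in place of the list above, plus a hypothetical enabled_policies set naming the policies the caller wants included; neither variable shape comes from the original question.

variable "iam_policy_documents" {
  # Assumed shape: policy name => policy document JSON
  type = map(string)
}

variable "enabled_policies" {
  # Hypothetical: the subset of policy names the caller wants merged
  type = set(string)
}

locals {
  # Keep only the requested policies, then concatenate their statements.
  enabled_policy_statements = flatten([
    for name, src in var.iam_policy_documents : jsondecode(src).Statement
    if contains(var.enabled_policies, name)
  ])

  merged_enabled_policy = jsonencode({
    Version   = "2012-10-17"
    Statement = local.enabled_policy_statements
  })
}

local.merged_enabled_policy can then feed the policy argument of a single aws_iam_policy resource, so only the statements that were actually requested end up attached to the role.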
I have a requirement to create AWS Lambda functions dynamically based on some input parameters like name, Docker image, etc.
I have been able to build this using Terraform (triggered from GitLab pipelines).
Now the problem is that for every unique name I want a new Lambda function to be created/updated, i.e. if I trigger the pipeline 5 times with 5 different names then there should be 5 Lambda functions; instead, what I get is the older function being destroyed and a new one being created.
How do I achieve this?
I am using Resource: aws_lambda_function
Terraform code
resource "aws_lambda_function" "executable" {
function_name = var.RUNNER_NAME
image_uri = var.DOCKER_PATH
package_type = "Image"
role = role.arn
architectures = ["x86_64"]
}
I think there is a misunderstanding of how Terraform works.
Terraform maps one resource to one item in state, and the state file is used to manage all created resources.
The reason your function keeps getting destroyed and recreated with the new values is that you have only one resource in your Terraform configuration.
This is the correct and expected behavior of Terraform.
Now, as others have mentioned, you could use count or for_each to add new Lambda functions without deleting the previous ones, as long as you keep track of the previously passed values (always adding the new values to the list).
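As a rough sketch of the for_each route (the runners variable and the aws_iam_role.runner reference are assumptions, not from the question):

variable "runners" {
  # Hypothetical: one entry per function, keyed by function name, value = image URI
  type = map(string)
}

resource "aws_lambda_function" "executable" {
  for_each = var.runners

  function_name = each.key
  image_uri     = each.value
  package_type  = "Image"
  role          = aws_iam_role.runner.arn # assumes this role is declared elsewhere
  architectures = ["x86_64"]
}

Each map key becomes its own instance in state (aws_lambda_function.executable["some-name"]), so adding a new name adds a new function without replacing the existing ones, but the pipeline still has to pass the full, ever-growing map on every run.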
Or, if there is no need to keep track of the state of the Lambda functions you have created, Terraform may not be the best tool for the job. The result you are looking for can easily be implemented in Python, or even in shell with AWS CLI commands.
I have simple Terraform code that manages an application's version code in S3.
I want to manage multiple versions of this code in S3.
My code is as follows:
main.tf
resource "aws_s3_bucket" "caam_test_bucket" {
bucket = "caam-test-bucket"
versioning {
enabled = true
}
}
resource "aws_s3_bucket_object" "caam_test_bucket_obj" {
bucket = aws_s3_bucket.caam_test_bucket.id
key = "${var.env}/v-${var.current_version}/app.zip"
source = "app.zip"
}
Every time I update the code, I export it to app.zip, increment the variable current_version and push the Terraform code.
The issue here is that instead of keeping multiple version folders in the S3 bucket, it deletes the existing one and creates another.
I want Terraform to keep any paths and files created and not delete them.
For example, if the path dev/v-1.0/app.zip already exists and I increment the current version to 2.0 and push the code, I want Terraform to keep dev/v-1.0/app.zip and also add dev/v-2.0/app.zip to the bucket.
Is there a way to do that ?
TF deletes your object, because that is how it works:
Destroy resources that exist in the state but no longer exist in the configuration.
One way to overcome this is to keep all your objects in the configuration, through for_each. This way you would keep adding new versions to a map of existing objects, rather than keep replacing them. This can be problematic if you are creating lots of versions, as you have to keep them all.
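A rough sketch of that, assuming a hypothetical versions variable mapping each version string to the zip built for it (this variable is not from the question):

variable "versions" {
  # Hypothetical: version string => path to the zip for that version
  type = map(string)
}

resource "aws_s3_bucket_object" "caam_test_bucket_obj" {
  for_each = var.versions

  bucket = aws_s3_bucket.caam_test_bucket.id
  key    = "${var.env}/v-${each.key}/app.zip"
  source = each.value
}

Every version listed in the map stays in state (and in the bucket); removing an entry from the map is what deletes it.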
Probably the easier way is to use local-exec, which can call the AWS CLI to upload the object. This happens "outside" of TF, thus TF will not delete pre-existing objects, as TF won't be aware of them.
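A minimal sketch of that approach, reusing the bucket and variables from the question and assuming the AWS CLI is configured wherever Terraform runs:

resource "null_resource" "upload_app" {
  # Re-runs the upload whenever current_version changes; Terraform only tracks
  # this null_resource, not the uploaded objects, so old versions are left alone.
  triggers = {
    version = var.current_version
  }

  provisioner "local-exec" {
    command = "aws s3 cp app.zip s3://${aws_s3_bucket.caam_test_bucket.id}/${var.env}/v-${var.current_version}/app.zip"
  }
}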
Is there a way I can use a Terraform data source for a bucket (perhaps created and managed in a different state file) and then, in the event the data source finds nothing, create the resource by setting a count?
I've been doing some experiments and continually get the following:
Error: Failed getting S3 bucket (example_random_bucket_name): NotFound: Not Found
status code: 404, request id: <ID here>, host id: <host ID here>
Sample code to test (this has been modified from the original code which generated this error):
variable "bucket_name" {
default = "example_random_bucket_name"
}
data "aws_s3_bucket" "new" {
bucket = var.bucket_name
}
resource "aws_s3_bucket" "s3_bucket" {
count = try(1, data.aws_s3_bucket.new.id == "" ? 1 : 0 )
bucket = var.bucket_name
}
I feel like rather than generating an error I should get an empty result, but that's not the case.
Terraform is a desired-state system, so you can only describe what result you want, not the steps/conditions to get there.
If Terraform did allow you to decide whether to declare a bucket based on whether there is already a bucket of that name, you would create a configuration that could never converge: on the first run, it would not exist and so your configuration would declare it. But on the second run, the bucket would then exist and therefore your configuration would not declare it anymore, and so Terraform would plan to destroy it. On the third run, it would propose to create it again, and so on.
Instead, you must decide as part of your system design which Terraform configuration (or other system) is responsible for managing each object:
If you decide that a particular Terraform configuration is responsible for managing this S3 bucket then you can declare it with an unconditional aws_s3_bucket resource.
If you decide that some other system ought to manage the bucket then you'll write your configuration to somehow learn about the bucket name from elsewhere, such as by an input variable or using the aws_s3_bucket data source.
Sadly you can't do this. Data sources must exist, otherwise they error out. There is no built-in way in TF to check whether a resource exists or not; there is nothing in between, in the sense that a resource either exists or it does not.
If you require such functionality, you have to program it yourself using an External Data Source. Or, maybe simpler, provide an input variable bucket_exist, so that you explicitly set it during apply.
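A small sketch of that input-variable approach (the variable name comes from the suggestion above; the resource names are reused from the question):

variable "bucket_exist" {
  type    = bool
  default = false
}

# Read the bucket only when the caller says it already exists...
data "aws_s3_bucket" "existing" {
  count  = var.bucket_exist ? 1 : 0
  bucket = var.bucket_name
}

# ...and create it only when the caller says it does not.
resource "aws_s3_bucket" "s3_bucket" {
  count  = var.bucket_exist ? 0 : 1
  bucket = var.bucket_name
}

You then set it explicitly at apply time, e.g. terraform apply -var="bucket_exist=true".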
Data sources are designed to fail this way.
However, if you use a state file from an external configuration, it's possible to declare an output in that external configuration, based on whether the S3 bucket is managed by that state, and use it as a condition on the s3_bucket resource.
For example, the output in the external state could be an empty string (not managed) or the value of whatever property is useful to you; a boolean is another choice. Delete the data source from this configuration and add a condition to the resource based on that output.
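Sketched out, assuming the external configuration uses an S3 backend (the state bucket, key and region here are placeholders) and exposes a bucket_name output that is an empty string when it does not manage the bucket:

data "terraform_remote_state" "shared" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"        # placeholder state bucket
    key    = "shared/terraform.tfstate"  # placeholder state key
    region = "us-east-1"
  }
}

resource "aws_s3_bucket" "s3_bucket" {
  # Create the bucket only if the external configuration does not manage one.
  count  = data.terraform_remote_state.shared.outputs.bucket_name == "" ? 1 : 0
  bucket = var.bucket_name
}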
It's your call if any such workarounds complicate or simplify your configuration.
In GCP, when using Terraform, I see I can use the name attribute as well as self_link. So I am wondering if there are cases where I must use one of them in particular.
For example:
resource "google_compute_ssl_policy" "custom_ssl_policy" {
name = "my-ssl-policy"
profile = "MODERN"
min_tls_version = "TLS_1_1"
}
This object can then be referred to as:
ssl_policy = google_compute_ssl_policy.custom_ssl_policy.name
and
ssl_policy = google_compute_ssl_policy.custom_ssl_policy.self_link
I know that object.name returns the resource's name field, and object.self_link returns the GCP resource's URI.
I have tried with several objects, and it works with both attributes, so I want to know whether the choice is trivial or there are situations where I should use one over the other.
Here is the definition from the official documentation:
Nearly every GCP resource will have a name field. They are used as a short way to identify resources, and a resource's display name in the Cloud Console will be the one defined in the name field.
When linking resources in a Terraform config though, you'll primarily want to use a different field, the self_link of a resource. Like name, nearly every resource has a self_link. They look like:
https://www.googleapis.com/compute/v1/projects/foo/zones/us-central1-c/instances/terraform-instance
A resource's self_link is a unique reference to that resource. When linking two resources in Terraform, you can use Terraform interpolation to avoid typing out the self link!
Reference: https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/getting_started
For example, I can deploy two Cloud Functions with the same name in the same project but in different regions. In that case, if you had to reference both resources in Terraform code, you would be better off using the self_link, since it is a unique URI.
I have a Lambda function invoked by S3 put events, which in turn needs to process the objects and write to a database on RDS. I want to test things out in my staging stack, which means I have a separate bucket, different database endpoint on RDS, and separate IAM roles.
I know how to configure the lambda function's event source and IAM stuff manually (in the Console), and I've read about lambda aliases and versions, but I don't see any support for providing operational parameters (like the name of the destination database) on a per-alias basis. So when I make a change to the function, right now it looks like I need a separate copy of the function for staging and production, and I would have to keep them in sync manually. All of the logic in the code would be the same, and while I get the source bucket and key as a parameter to the function when it's invoked, I don't currently have a way to pass in the destination stuff.
For the destination DB information, I could have a switch statement in the function body that checks the originating S3 bucket and makes a decision, but I hate making every function have to keep that mapping internally. That wouldn't work for the DB credentials or IAM policies, though.
I suppose I could automate all or most of this with the SDK. Has anyone set something like this up for a continuous integration-style deployment with Lambda, or is there a simpler way to do it that I've missed?
I found a workaround using Lambda function aliases. Given the context object, I can get the invoked_function_arn property, which has the alias (if any) at the end.
# The alias (if the function was invoked via one) is the last segment of the ARN.
arn_string = context.invoked_function_arn
alias = arn_string.split(':')[-1]
Then I just use the alias as an index into a dict in my config.py module, and I'm good to go.
config[alias].host
config[alias].database
One thing I'm not crazy about is that I have to invoke my function from an alias every time, and now I can't use aliases for any other purpose without affecting this scheme. It would be nice to have explicit support for user parameters in the context object.