Terraform looping a module - amazon-web-services

I have a module within my Terraform file that creates some database servers and does a few things.
First, it creates an auto scaling group using a specific image, then it creates some EBS volumes and attaches them, and then it adds some Lambda code so that on launch the instances get registered in Route 53. In all, it's about 80 lines of text.
Extract
module "systemt-sql-db01" {
  source               = "localmodules/tf-aws-asg"
  name                 = "${var.envname}-sys-db01"
  envname              = "${var.envname}"
  service              = "dbpx"
  ami_id               = "${data.aws_ami.app_sqlproxy.id}"
  user_data            = "${data.template_cloudinit_config.config-enforcement-sqlproxy.rendered}"
  #subnets             = ["${module.subnets-enforcement.web_private_subnets}"]
  subnets              = ["${element(module.subnets-enforcement.web_private_subnets, 1)}"]
  security_groups      = ["${aws_security_group.unfiltered-egress-sg.id}", "${aws_security_group.sysopssg.id}", "${aws_security_group.system-sqlproxy.id}"]
  key_name             = "${var.keypair}"
  load_balancers       = ["${var.envname}-enf-dbpx-int-elb"]
  iam_instance_profile = "${module.iam_profile_generic.profile_arn}"
  instance_type        = "${var.enforcement_instancesize_dbpx}"
  min                  = 0
  max                  = 0
}
I then have two parameter files: one that I call when launching to pre-production and one that I call when launching to production. I don't want these to contain anything other than variables.
The problem is that for pre-production I need to call the module twice, but for production I need it called three times.
People talk about a count parameter for modules, but I don't think this is possible yet. Can anyone suggest any other ways to do this? What I would like is to be able, in my parameter file, to set a list variable of all the DB ASG names, and then loop through this, calling the module each time.
I hope that makes sense?
Thank you.

EDIT Looping in modules is in beta for Terraform 0.13 (https://discuss.hashicorp.com/t/terraform-0-13-beta-released/9555).
This is a highly requested feature in Terraform and, as mentioned, it is not yet supported. Later releases of Terraform v0.12 will introduce this feature (https://www.hashicorp.com/blog/hashicorp-terraform-0-12-preview-for-and-for-each).
I had a similar problem where I had to create multiple KMS keys for multiple accounts from a base KMS module. I ended up creating a second module that uses the core KMS module; this second module had many instances of the core module, but only required me to input the account details once.
This is still not ideal, but it worked well enough without over-complicating things.
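For reference, once module-level for_each is available (the Terraform 0.13 beta linked above), the original goal of driving the module from a list of DB ASG names in a parameter file could look roughly like the sketch below; the variable name db_names and the trimmed-down argument list are assumptions, not the asker's actual code.
# Hypothetical Terraform 0.13+ sketch; db_names would come from the per-environment
# parameter file, e.g. two entries for pre-production and three for production.
variable "db_names" {
  type = set(string)
}

module "sql_db" {
  source   = "localmodules/tf-aws-asg"
  for_each = var.db_names

  name    = "${var.envname}-${each.value}"
  envname = var.envname
  service = "dbpx"
  # ...remaining arguments as in the extract above...
}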

Related

Terraform handle multiple lambda functions

I have a requirement for creating AWS Lambda functions dynamically based on some input parameters like name, Docker image, etc.
I have been able to build this using Terraform (triggered using GitLab pipelines).
Now the problem is that for every unique name I want a new Lambda function to be created/updated, i.e. if I trigger the pipeline 5 times with 5 names then there should be 5 Lambda functions; instead, what I get is the older function being destroyed and a new one being created.
How do I achieve this?
I am using Resource: aws_lambda_function
Terraform code
resource "aws_lambda_function" "executable" {
  function_name = var.RUNNER_NAME
  image_uri     = var.DOCKER_PATH
  package_type  = "Image"
  role          = aws_iam_role.lambda_role.arn # assumes an IAM role resource defined elsewhere
  architectures = ["x86_64"]
}
I think there is a misunderstanding of how Terraform works.
Terraform maps one resource to one item in state, and the state file is used to manage all created resources.
The reason your function keeps getting destroyed and recreated with the new values is that you have only one resource in your Terraform configuration.
This is the correct and expected behavior from Terraform.
Now, as mentioned by some people above, you could use count or for_each to add new Lambda functions without deleting the previous ones, as long as you keep track of the previously passed values (always adding the new values to the list).
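As a rough sketch of the for_each approach (the runners variable, its shape, and the aws_iam_role reference are assumptions for illustration, not the asker's configuration):
# Map of function name => container image URI; the pipeline adds new entries
# to this map instead of replacing the single pair of values, so older
# functions are kept in state rather than destroyed.
variable "runners" {
  type = map(string)
}

resource "aws_lambda_function" "executable" {
  for_each = var.runners

  function_name = each.key
  image_uri     = each.value
  package_type  = "Image"
  role          = aws_iam_role.lambda_role.arn # assumed IAM role defined elsewhere
  architectures = ["x86_64"]
}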
Or, if there is no need to keep track of the state of the Lambda functions you have created, Terraform may not be the best solution for your needs. The result you are looking for can easily be implemented in Python, or even in shell with AWS CLI commands.

Why does Terraform plan recreate blocks when the configuration has not changed?

I have a simple site set up on AWS and have a terraform script working to deploy it (at least from my local machine).
When I have a successful deployment through terraform apply, quite often if I then run terraform plan again (immediately after the apply) I will see changes like this:
  # aws_route53_record.alias_route53_record_portal will be updated in-place
  ~ resource "aws_route53_record" "alias_route53_record_portal" {
        fqdn    = "mysite.co.uk"
        id      = "Z12345678UR1K1IFUBA_mysite.co.uk_A"
        name    = "mysite.co.uk"
        records = []
        ttl     = 0
        type    = "A"
        zone_id = "Z12345678UR1K1IFUBA"

      - alias {
          - evaluate_target_health = false -> null
          - name                   = "d12345mkpmx9ii.cloudfront.net" -> null
          - zone_id                = "Z2FDTNDATAQYW2" -> null
        }
      + alias {
          + evaluate_target_health = true
          + name                   = "d12345mkpmx9ii.cloudfront.net"
          + zone_id                = "Z2FDTNDATAQYW2"
        }
    }
Why is terraform saying that some parts of resources need recreating when nothing has changed?
EDIT My actual tf resource...
resource "aws_route53_record" "alias_route53_record_portal" {
  zone_id = data.aws_route53_zone.sds_zone.zone_id
  name    = "mysite.co.uk"
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.s3_distribution.domain_name
    zone_id                = aws_cloudfront_distribution.s3_distribution.hosted_zone_id
    evaluate_target_health = true
  }
}
You have changed evaluate_target_health from false to true. Terraform will just update the fields that have changed. The reason it shows the whole alias block like this is that AWS often doesn't provide separate APIs for each field. Since Terraform is showing that this resource will be updated in-place, it will touch the minimum number of resources needed to make this change.
The "plan" operation in Terraform first synchronizes the Terraform state with remote objects (by making calls to the remote API), and then it compares the updated state with the configuration.
Terraform (or, more accurately, the relevant Terraform provider) then generates a planned update or replace for any case where the state and the configuration disagree.
If you see a planned update for a resource whose configuration you know you haven't changed, then by process of elimination that suggests that the remote system is what has changed.
Sometimes that can happen if some other process (or a human in the admin console) changes an object that Terraform believes itself to be responsible for. In that case, the typical resolution is to ensure that each object is only managed by one system and that no-one is routinely making changes to Terraform-managed objects outside of Terraform.
One way to diagnose this would be to consult the remote system and see whether its current settings agree with your Terraform configuration. If not, that would suggest that something other than Terraform has changed the value.
A less common reason this can arise is due to a bug in the provider itself. There are two variations of this class of bug:
When creating the object, the provider doesn't correctly translate the given configuration to a remote API call, and so it ends up creating an object that doesn't match the configuration. A subsequent Terraform plan will then notice that inconsistency and plan an update to fix it. If the provider's update operation has a similar bug then this will never converge, causing the provider to repeatedly plan the same update.
Conversely, the create/update may be implemented correctly but the "refresh" operation (updating the state to match the remote system) may inaccurately translate the remote object data back to Terraform state data, causing the state to not accurately reflect the remote system. In that case, the provider will probably then find that the configuration doesn't match the state anymore, even though the state was correct after the initial create.
Both of these bugs are typically nicknamed "permadiff" by provider developers, because the symptom is Terraform seeming to plan the same change indefinitely, no matter how many times you apply it. If you think you've encountered a "permadiff" bug then usually the path forward is to report a bug in the provider's development repository so that the maintainers can investigate.
One specific variation of "permadiff" is a situation where the remote system does some sort of normalization of your given values which the provider doesn't take into account. For example, some remote systems will accept strings containing uppercase letters but will convert them to lowercase for permanent storage. If a provider doesn't take that into account, it will probably incorrectly plan to change the value back to the one containing uppercase letters again in order to try to make the state match the configuration. This subclass of bug is a normalization permadiff, which provider teams will typically address by re-implementing the remote system's normalization logic in the provider itself.
If you find a normalization permadiff then you can often work around it until the bug is fixed by figuring out what normalization the remote system expects and then manually normalizing your configuration to match it, so that the provider will then see the configuration as matching the remote system.
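As a small illustration of that workaround, suppose a (hypothetical) remote API lowercases a name on storage; pre-normalizing the value in configuration keeps the plan clean. The variable and resource here are only for the sake of the example:
variable "raw_name" {
  type = string
}

locals {
  # Mirror the remote system's normalization (lowercasing, in this made-up case)
  # so the provider sees the configuration and the refreshed state as identical.
  normalized_name = lower(var.raw_name)
}

resource "aws_s3_bucket" "example" {
  bucket = local.normalized_name
}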

How can I run terraform code in sequence?

I am trying to set up some automation around AWS infrastructure and have just bumped into an issue with module dependencies. Since there is no "include" type of option in Terraform, it's becoming a little difficult to achieve my goal.
Here is the short description of scenario:
In my root directory I have a file main.tf
which consists of multiple module blocks,
e.g.:
module "mytest1" {
  source = "mymod/dev"
}

module "mytest2" {
  source = "mymod2/prod"
}
Each of dev and prod has lots of .tf files.
A few of my .tf files inside the prod directory need some output from the resources that exist inside the dev directory.
Since modules have no explicit dependency mechanism, I was wondering if there is any way to run modules in sequence, or if there are any other ideas?
Not entirely sure about your use case for having prod and dev needing to interact in the way you've stated.
I would expect you to maybe have something like the below folder structure:
Folder 1: Dev (Contains modules for dev)
Folder 2: Prod (Contains modules for prod)
Folder 3: Resources (Contains generic resource blocks that both the dev and prod modules utilise)
Then when you run terraform apply for Folder 1, it will create your dev infrastructure by passing the variables from your modules to the resources (in Folder 3).
And when you run terraform apply for Folder 2, it will create your prod infrastructure by passing the variables from your modules to the resources (in Folder 3).
If you can't do that for some reason, then Output Variables or Data Sources can potentially help you retrieve the information you need.
There is no reason for you to have different modules for different envs. Usually, the difference between lower envs and prod is the number and the tier of each resource, and you can just use variables to pass that into the modules.
To deal with this, you can use terraform workspaces and create one workspace for each env, e.g:
terraform workspace new staging
This will create a completely new workspace, with its own state. If you need to define the number of resources to be created, you can use variables or the terraform workspace name itself, e.g.:
# Your EC2 module
resource "aws_instance" "example" {
  count = "${terraform.workspace == "prod" ? 3 : 1}"
}

# or
resource "aws_instance" "example" {
  count = "${length(var.subnets)}" # you are likely to have more subnets for prod
}

# Your module
module "instances" {
  source  = "./modules/ec2"
  subnets = "my subnets list"
}
And that is it, you can have all your modules working for any environment just by creating workspaces and changing the variables for each one on your pipeline and applying the plan each time.
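Another sketch of the same idea, looking the count up from a per-workspace map instead of using a ternary (the variable name and the default values are assumptions):
variable "instance_count" {
  type = "map"

  default = {
    prod    = 3
    staging = 1
  }
}

resource "aws_instance" "example" {
  # Fall back to 1 instance for any workspace not listed in the map;
  # ami, instance_type, etc. would be set as usual.
  count = "${lookup(var.instance_count, terraform.workspace, 1)}"
}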
You can read more about workspaces here
I'm not too sure about your requirement of having the production environment depend on the development environment, but putting the specifics aside, the idiomatic way to create sequencing between resources and between modules in Terraform is to use reference expressions.
You didn't say what aspect of the development environment is consumed by the production environment, but for the sake of example let's say that the production environment needs the id of a VPC created in the development environment. In that case, the development module would export that VPC id as an output value:
# (this goes within a file in your mymod/dev directory)
output "vpc_id" {
  value = "${aws_vpc.example.id}"
}
Then your production module conversely would have an input variable to specify this:
# (this goes within a file in your mymod2/prod directory)
variable "vpc_id" {
  type = "string"
}
With these in place, your parent module can then pass the value between the two to establish the dependency you are looking for:
module "dev" {
  source = "./mymod/dev"
}

module "prod" {
  source = "./mymod2/prod"
  vpc_id = "${module.dev.vpc_id}"
}
This works because it creates the following dependency chain:
module.prod's input variable vpc_id depends on
module.dev's output value vpc_id, which depends on
module.dev's aws_vpc.example resource
You can then use var.vpc_id anywhere inside your production module to obtain that VPC id, which creates another link in that dependency chain, telling Terraform that it must wait until the VPC is created before taking any action that depends on the VPC to exist.
In particular, notice that it's the individual variables and outputs that participate in the dependency chain, not the module as a whole. This means that if you have any resources in the prod module that don't need the VPC to exist then Terraform can get started on creating them immediately, without waiting for the development module to be fully completed first, while still ensuring that the VPC creation completes before taking any actions that do need it.
There is some more information on this pattern in the documentation section Module Composition. It's written with Terraform v0.12 syntax and features in mind, but the general pattern is still applicable to earlier versions if you express it instead using the v0.11 syntax and capabilities, as I did in the examples above.

Terragrunt v0.14.9, Terraform v0.11.7 reading AWS VPC ID from second environment

I have used Terragrunt to orchestrate the creation of a non-default AWS VPC.
I've got S3/DynamoDB state mgmt, and the VPC code is a module. I have the 'VPC environment' terraform.tfvars code checked into a second repo as per the terragrunt README.md.
I created a second module which will eventually create hosts in this VPC but for now just aims to output its ID. I have created a separate 'hosts environment' / terraform.tfvars for the instantiation of this module.
1. I run terragrunt apply in the VPC environment directory - VPC created.
2. I run terragrunt apply a second time in the hosts environment directory - the output directive doesn't work (no error, but incorrect; see below).
This is a precursor to one day running a terragrunt apply-all in the parent directory of the VPC/hosts environment directories; my reading of the docs suggest using a terraform_remote_state data source to expose the VPC ID, so I specified access like this in the data.tf file of the hosts module:
data "terraform_remote_state" "vpc" {
  backend = "s3"

  config {
    bucket = "myBucket"
    key    = "keyToMy/vpcEnvironment.tfstate"
    region = "stateRegion"
  }
}
Then, in the hosts module outputs.tf, I specified an output to check assignment:
output "mon_vpc" {
  value = "${data.terraform_remote_state.vpc.id}"
}
When I run (2) above, it exits with:
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
mon_vpc = 2018-06-02 23:14:42.958848954 +0000 UTC
Questions:
1. Where am I going wrong in setting up the code so that the hosts environment correctly acquires the VPC ID from the already-existing VPC (Terraform state file)? Any advice on what to change here would be appreciated.
2. It looks like I've managed to acquire the date the VPC was created rather than its ID, which, given the code, is perplexing - does anyone know why?
I'm not using community modules - all hand rolled.
EDIT: In response to Brandon Miller, here is a bit more. In my VPC module, I have an outputs.tf containing among other outputs:
output "aws_vpc.mv.id-op" {
  value = "${aws_vpc.mv.id}"
}
and the vpc.tf contains
resource "aws_vpc" "mv" {
  cidr_block           = "${var.vpcCidr}"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "mv-vpc-${var.aws_region}"
  }
}
As this cfg results in a vpc being created, and as most of the parameters are <computed>, I assumed state would contain sufficient data for other modules to refer to by consulting state (I assumed at first that terraform used the AWS API for this under the bonnet, rather than consulting a different state key).
EDIT 2: Read all of #brendan-miller's answer and following comments first.
Use of periods causes a problem as it confuses terraform (see Brendan's answer for the specification format below):
Error: output 'mon_vpc': unknown resource 'data.aws_vpc.mv-ds' referenced in variable data.aws_vpc.mv-ds.vpc.id
You named your output aws_vpc.mv.id-op, but when you retrieve it you are retrieving just id. You could try
data.terraform_remote_state.vpc.aws_vpc.mv.id
but I'm not sure if Terraform will complain about the additional periods. However, the format should always be
data.terraform_remote_state.<name of the remote state module>.<name of the output>
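For example, renaming the VPC module's output to avoid the periods and then reading it through the remote state data source would look roughly like this (the output name vpc_id is an assumption):
# In the VPC environment's outputs.tf
output "vpc_id" {
  value = "${aws_vpc.mv.id}"
}

# In the hosts module, alongside the data "terraform_remote_state" "vpc" block above
output "mon_vpc" {
  value = "${data.terraform_remote_state.vpc.vpc_id}"
}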
You mentioned wanting to be able to get this info with the AWS API. That is also possible by using the aws_vpc data source. Their example uses id, but you can also use any tag you used on your vpc.
Like this:
data "aws_vpc" "default" {
  filter {
    name   = "tag:Name"
    values = ["example-vpc-name"]
  }
}
Then you can use this for the id
${data.aws_vpc.default.id}
In addition this retrieves all tags set, for example:
${data.aws_vpc.default.tags.Name}
And the cidr block
${data.aws_vpc.default.cidr_block}
As well as some other info. This can be very useful for storing and retrieving things about your VPC.

How, in Terraform using the Google platform provider, do I get instance information from an instance created with google_compute_region_instance_group?

I am creating a Terraform file so I can set up some VMs in GCP to build my own Kubernetes platform (yes, Google has its own engine, but I want to use some custom items). I have been able to create the .tf file to create the whole stack, just like the other setups in the Kubespray project - something like what you do to terraform VMs on AWS.
The last part I need to automate is the creation of the host file for Ansible.
I create the masters and workers using a resource called google_compute_region_instance_group, which places each instance in a different AZ within GCP. Now I need to get the hostname and IP given to these instances. The problem I have is that they are dynamically created resources, so to pull this information out I use a data source to grab the info.
Here is what I have now.
data.google_compute_region_instance_group.data_masters.instances
[
  {
    "instance"    = "https://www.googleapis.com/compute/v1/projects/appportablityphase2/zones/us-east1-c/instances/k8-masters-4r2f"
    "named_ports" = []
    "status"      = "RUNNING"
  },
  {
    "instance"    = "https://www.googleapis.com/compute/v1/projects/appportablityphase2/zones/us-east1-d/instances/k8-masters-qh64"
    "named_ports" = []
    "status"      = "RUNNING"
  },
  {
    "instance"    = "https://www.googleapis.com/compute/v1/projects/appportablityphase2/zones/us-east1-b/instances/k8-masters-w9c8"
    "named_ports" = []
    "status"      = "RUNNING"
  },
]
As you can see, the output is a mix of a list and maps. I am able to get just the instance self URL with this line:
lookup(data.google_compute_region_instance_group.data_masters.instances[0], "instance")
https://www.googleapis.com/compute/v1/projects/appportablityphase2/zones/us-east1-c/instances/k8-masters-4r2f
which I can then split to get the instance name. This is the hard part that I cannot figure out with Terraform: in the line above I have to use [0] to pick out one instance's information, and I then need to iterate through all of the instances, of which there may be more or fewer than three.
I cannot find a way to do this with this data source type. I have tried count.index, but it is only supported in a resource, not a data source. I have also tried splat syntax and it has not worked.
I don't think generating the inventory manually is the right approach, although it is possible.
You could give GCP dynamic inventory a try; it generates the inventory from running instances based on their network tags.
For instance, if instance A has the tag foo, and instance B has the tags foo and bar, the generated inventory will be:
[tag_foo]
A
B
[tag_bar]
B
Script is available at this address: https://github.com/ansible/ansible/blob/devel/contrib/inventory/gce.py
Configuration file here: https://github.com/ansible/ansible/blob/devel/contrib/inventory/gce.ini
And usage is ansible-playbook -i gce.py site.yml
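For completeness, if you did still want to derive the names from Terraform itself, a for expression (Terraform 0.12+) gives the iteration the asker was missing; this is only a sketch, separate from the dynamic inventory approach above, and the output name is illustrative:
# Pull the last path segment (the instance name) out of each self link returned
# by the google_compute_region_instance_group data source.
output "master_names" {
  value = [
    for inst in data.google_compute_region_instance_group.data_masters.instances :
    element(split("/", inst.instance), length(split("/", inst.instance)) - 1)
  ]
}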