Terraform, get count of length of data source - amazon-web-services

Folks,
I'm trying to create one subnet for each availability zone in an AWS region.
data "aws_availability_zones" "azs" {
  depends_on = [aws_vpc.k3s_vpc]
  state      = "available"
}
locals {
  azs = "${data.aws_availability_zones.azs.names}"
}
resource "aws_subnet" "private_subnets" {
  count             = length(data.aws_availability_zones.azs.names)
  vpc_id            = aws_vpc.k3s_vpc.id
  cidr_block        = var.private_subnets_cidr[count.index]
  availability_zone = local.azs[count.index]
}
I'm getting the error below:
Error: Invalid count argument
The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.
Any ideas?

Inside your data "aws_availability_zones" "azs" block you've written depends_on = [aws_vpc.k3s_vpc], which means that Terraform can't look up the availability zones until the VPC has been created. The VPC doesn't exist yet during planning, so the list of zone names is still unknown at that point, and that is why you see this error.
The availability zones for a particular region in your account don't vary based on the creation of VPCs, so it's not clear to me why you included that dependency. If you remove it, Terraform should be able to resolve that data source during the planning phase and thus determine how many subnets to create.
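As a minimal sketch of that change, here is the configuration from the question with only the depends_on line removed. It assumes var.private_subnets_cidr is still a list with at least as many entries as there are available zones:

```hcl
# Sketch: same lookup as in the question, minus depends_on,
# so the zone names can be read during planning.
data "aws_availability_zones" "azs" {
  state = "available"
}

resource "aws_subnet" "private_subnets" {
  count             = length(data.aws_availability_zones.azs.names)
  vpc_id            = aws_vpc.k3s_vpc.id
  cidr_block        = var.private_subnets_cidr[count.index]
  availability_zone = data.aws_availability_zones.azs.names[count.index]
}
```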
However, I would still suggest caution with this approach: if the result of that lookup were to change in future in a way that introduces availability zones anywhere except the end of the list then your existing subnets would get reassigned to new availability zones, and thus the provider will plan to replace them. Instead, it might be better to use the availability zone names themselves as the identifiers for the subnets, so that it won't matter what order they appear in the resulting list:
data "aws_availability_zones" "azs" {
  state = "available"
}
locals {
  azs = toset(data.aws_availability_zones.azs.names)
}
resource "aws_subnet" "private_subnets" {
  for_each          = local.azs
  vpc_id            = aws_vpc.k3s_vpc.id
  cidr_block        = var.private_subnets_cidr[each.value]
  availability_zone = each.value
}
Notice that under this approach you'd also need to change variable "private_subnets_cidr" to be a map instead of a list, with the availability zone names as keys, so that the CIDR ranges are also assigned directly to AZs and won't get reassigned if new zones appear in your account later.
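For illustration, the map-typed variable that the subnet resource indexes (var.private_subnets_cidr in the configuration above) might look roughly like this. The zone names and CIDR ranges are hypothetical placeholders, not values from the question:

```hcl
# Hypothetical zone names and CIDR ranges, for illustration only.
variable "private_subnets_cidr" {
  type = map(string)
  default = {
    "us-east-1a" = "10.0.1.0/24"
    "us-east-1b" = "10.0.2.0/24"
    "us-east-1c" = "10.0.3.0/24"
  }
}
```

One consequence worth keeping in mind: every zone name returned by the data source needs a matching key in this map, otherwise the lookup in cidr_block fails with an invalid index error.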

Related

Does Terraform allow overriding resources from modules?

I have a Terraform module where I want to refactor the EFS so that it is managed in the module rather than in the Terraform configuration I presently have. Presently my Terraform includes two VPCs sharing an EFS over a VPC peering connection.
Eventually I want to get rid of the old VPC, but the EFS is still held by the old VPC. EFS does not allow creating an aws_efs_mount_target in a different VPC.
resource "aws_efs_mount_target" "main" {
  for_each       = toset(module.docker-swarm.subnets)
  file_system_id = aws_efs_file_system.main.id
  subnet_id      = each.key
}
So I was wondering, is it possible to set something along the lines of
disable {
  module.common.aws_efs_mount_target.main
}
or
module "common" {
  exclude = [ "aws_efs_mount_target.main " ]
}
This is how I solved it: create a new variable and use it as a condition to produce an empty set.
resource "aws_efs_mount_target" "main" {
  # toset([]) keeps both branches of the conditional the same type (a set).
  for_each       = var.skip_efs_mount_target ? toset([]) : toset(module.docker-swarm.subnets)
  file_system_id = aws_efs_file_system.main.id
  subnet_id      = each.key
}
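The answer doesn't show the variable itself. A minimal sketch of how it could be declared inside the module and set by the caller follows. The default, the description and the module source path are assumptions made for illustration:

```hcl
# Inside the module: assumed declaration of the switch used above.
variable "skip_efs_mount_target" {
  type        = bool
  default     = false
  description = "When true, create no EFS mount targets."
}

# In the calling configuration: the source path here is hypothetical.
module "common" {
  source                = "./modules/common"
  skip_efs_mount_target = true
}
```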

How to pick distinct subnets from all the available zones in terraform

When trying to create an ELB (Classic Load Balancer) in AWS via Terraform, I pass a list of public subnet IDs that were created by another module. In this case I have 4 subnets spread across 3 AZs, and 2 of them are in az-1a. When I run Terraform, I get an error saying the same AZ can't be used twice for an ELB.
resource "aws_elb" "loadbalancer" {
  name    = "loadbalancer-terraform"
  subnets = var.public_subnets
  listener {
    instance_port     = 80
    instance_protocol = "http"
    lb_port           = 80
    lb_protocol       = "http"
  }
  depends_on = [aws_autoscaling_group.private_ec2]
}
Is there any way to select subnets from the given list so that I only get subnet IDs from distinct AZs?
subnetid1 -- az1-a
subnetid2 -- az1-b
subnetid3 -- az1-c
subnetid4 -- az1-a
Now I need to get as output either subnets 1, 2 and 3 or subnets 2, 3 and 4.
It sounds like this problem decomposes into two smaller problems:
1. Determine the availability zone of each of the subnets.
2. For each distinct availability zone, choose any one of the subnets that belongs to it. (I'm assuming here that there is no reason to prefer one subnet over another if both are in the same AZ.)
For step one, if we don't already have the subnets in question managed by the current configuration (which seems to be the case here -- you are receiving them from an input variable) then we can use the aws_subnet data source to read information about a subnet given its ID. Because you have more than one subnet here, we'll use resource for_each to look up each one.
data "aws_subnet" "public" {
  for_each = toset(var.public_subnets)
  id       = each.key
}
The above will make data.aws_subnet.public appear as a map from subnet id to subnet object, and the subnet objects each have availability_zone attributes specifying which zone each subnet belongs to. For our second step it's more convenient to invert that mapping, so that the keys are availability zones and the values are subnet ids:
locals {
  availability_zone_subnets = {
    for s in data.aws_subnet.public : s.availability_zone => s.id...
  }
}
The above is a for expression, which in this case is using the ... suffix to activate grouping mode, because we're expecting to find more than one subnet per availability zone. As a result of this, local.availability_zone_subnets will be a map from availability zone name to a list of one or more subnet ids, like this:
{
  "az1-a" = ["subnetid1", "subnetid4"]
  "az1-b" = ["subnetid2"]
  "az1-c" = ["subnetid3"]
}
This gets us the information we need to implement the second part of the problem: choosing any one of the elements from each of those lists. The easiest definition of "any one" is to take the first one, by using [0] to take the first element.
resource "aws_elb" "loadbalancer" {
  depends_on = [aws_autoscaling_group.private_ec2]
  name       = "loadbalancer-terraform"
  subnets    = [for subnet_ids in local.availability_zone_subnets : subnet_ids[0]]
  listener {
    instance_port     = 80
    instance_protocol = "http"
    lb_port           = 80
    lb_protocol       = "http"
  }
}
There are some caveats of the above solution which are important to consider:
Taking the first element of each list of subnet ids means that the configuration could potentially be sensitive to the order of elements in var.public_subnets, but this particular combination above implicitly avoids that with the toset(var.public_subnets) in the initial for_each, which discards the original ordering of var.public_subnets and causes all of the downstream expressions to order the results by a lexical sort of the subnet ids. In other words, this will choose the subnet whose id is the "lowest" when doing a lexical sort.
I don't really like it when that sort of decision is left implicit, because it can be confusing to future maintainers who might change the design and be surprised to see it now choosing a different subnet for each availability zone. I can see a couple different ways to mitigate that, and I'd probably do both if I were writing a long-lived module:
Make sure variable "public_subnets" has type = set(string) for its type constraint, rather than type = list(string), to be explicit that this module discards the ordering of the subnets as given by the caller. If you do this, you can change toset(var.public_subnets) to just var.public_subnets, because it will already be a set.
In the final for expression to choose the first subnet for each availability zone, include an explicit call to sort. This call is redundant with how the rest of this is implemented in my example, but I think it's a good clue to a future reader that it's using a lexical sort to decide which of the subnets to use:
subnets = [
  for subnet_ids in local.availability_zone_subnets : sort(subnet_ids)[0]
]
Neither of those changes will actually affect the behavior immediately, but additions like this can be helpful to future maintainers as they read a module they might not be previously familiar with, so they don't need to read the entire module to understand a smaller part of it.
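The first of those two suggestions could be sketched as follows. The description text is my own wording. The rest follows the names already used in this answer:

```hcl
# Declaring the input as a set makes it explicit that ordering is discarded.
variable "public_subnets" {
  type        = set(string)
  description = "IDs of the public subnets; the order is not significant."
}

# With a set-typed variable, toset() is no longer needed here.
data "aws_subnet" "public" {
  for_each = var.public_subnets
  id       = each.key
}
```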

How do I specify different regions per resource for Terraform GCP module?

The documentation for the Terraform google provider module lists a global option to set a region:
region - (Optional) The region to operate under, if not specified by a
given resource. This can also be specified using any of the following
environment variables (listed in order of precedence):
GOOGLE_REGION
GCLOUD_REGION
CLOUDSDK_COMPUTE_REGION
However, I found no way to specify a region for a google_compute_instance or a google_compute_disk resource. How do I create multiple instances/disks in different regions within the same project?
OP's phrasing of the answer:
Both of these resource types are located within a single zone, they have a zone field accordingly to specify where to provision them. Since a zone is located in a single region, specifying the requested zone for the resource is enough because it implicitly specifies the region as well. There is no option to specify the region for these resource types because that would be redundant along with specifying the zone, and specifying only the region would not be enough.
Original answer provided:
Both of the resources you linked have the zone tag, which is where instances and VM disks need to be located, as they are not region-wide. Zones are located within a region, and usually there are two or three zones for each region.
For example, taking the region us-west1, in this list you can see that it has the zones a, b and c, which when specified in the zone tag need to be written as us-west1-a, us-west1-b or us-west1-c.
Edit:
Here is an example Terraform configuration file that creates two Compute Engine VM instances in two different zones, located in two different regions:
provider "google" {
  project = "YOUR-PROJECT"   # Project ID
  region  = "europe-west2"   # Default resource region
  zone    = "europe-west2-b" # Default resource zone
}
/*
 * Create instance in region Europe West 1, zone b
 */
resource "google_compute_instance" "europe_instance" {
  name         = "europe-instance-1"
  machine_type = "n1-standard-1"
  zone         = "europe-west1-b"
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-9"
    }
  }
  network_interface {
    network = "default"
  }
}
/*
 * Create instance in US West 1, zone c
 */
resource "google_compute_instance" "us_instance" {
  name         = "us-instance-2"
  machine_type = "n1-standard-1"
  zone         = "us-west1-c"
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-9"
    }
  }
  network_interface {
    network = "default"
  }
}
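The question also asked about google_compute_disk, and the same idea should carry over: the disk takes a zone argument, and the zone implies the region. A minimal sketch, where the disk name, type and size are illustrative values rather than anything from the original answer:

```hcl
# Sketch only: a zonal disk placed in a zone other than the provider default.
resource "google_compute_disk" "us_disk" {
  name = "us-disk-1"
  type = "pd-ssd"
  size = 10 # size in GB
  zone = "us-west1-c"
}
```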

Why do I have to specify two values when identifying an AWS resource?

I don't understand why I have to specify two values when identifying an AWS resource in Terraform. For example,
resource "aws_instance" "test"
I understand that "aws_instance" is the resource type but what about the other one?
I'm not a Terraform expert, but my understanding is that the second value is the "Logical ID" of the instance, much like in CloudFormation, i.e. this is the name it will be referred to by inside Terraform. That means that if you create that instance and then want to export its IP somewhere else, you can access the resource's properties through the second value, like so:
"${aws_instance.test.private_ip}"
The second parameter you give is the "NAME" of the resource you are creating, and it must be set. Its importance becomes clear when the attributes of one resource are used as input for creating another resource.
While the resource name ("test" in your example) is not useful in a simple configuration with only one or two resources, an important feature of Terraform is using the attributes of one resource to populate another.
A common example of this in AWS is creating VPC and subnet objects:
variable "app_name" {}
variable "env_name" {}
resource "aws_vpc" "main" {
  cidr_block = "10.1.0.0/16"
  tags = {
    Name = "${var.app_name}-${var.env_name}"
  }
}
resource "aws_subnet" "a" {
  vpc_id            = "${aws_vpc.main.id}"
  cidr_block        = "${cidrsubnet(aws_vpc.main.cidr_block, 4, 1)}"
  availability_zone = "us-west-2a"
  tags = {
    Name = "${var.app_name}-${var.env_name}-usw2a"
  }
}
resource "aws_subnet" "b" {
  vpc_id            = "${aws_vpc.main.id}"
  cidr_block        = "${cidrsubnet(aws_vpc.main.cidr_block, 4, 2)}"
  availability_zone = "us-west-2b"
  tags = {
    Name = "${var.app_name}-${var.env_name}-usw2b"
  }
}
In this example, the name "main" of the "aws_vpc" resource is used as part of references from the two subnets back to the VPC. This allows Terraform to populate the subnet vpc_id even though its value won't be known until the VPC is created. It also avoids duplicating the VPC's base CIDR block in the subnets, instead calculating a new subnet prefix dynamically.
Notice that the resource names are different than the tag Name on each object, because they have a different scope: the Terraform resource names are required to be unique only within a single module, and so they will usually have short names that just distinguish any resources of the same type within that one module. The Name tags -- and, for some other resource types, the unique resource name -- must instead be unique either within an entire AWS region or possibly across a whole AWS partition (in the case of S3, for example).
The different purpose of these Terraform-specific names becomes particularly important for more complicated systems where the same module is instantiated multiple times in different configurations, such as creating similar infrastructure across different environments. In this case the Terraform-specific names will be the same across all uses of the module -- since the module source code is identical -- but they will need to have distinct names within AWS itself, e.g. qualified by the environment name they belong to. The usual way to achieve that is to add variables to your modules to specify a subsystem and an environment and then use that to produce a consistent naming scheme for the objects in AWS, while Terraform itself just uses its local names for references in configuration.
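To make that concrete, here is a rough sketch of instantiating one module twice. It assumes the VPC and subnet configuration above has been saved as a module at ./modules/network, which is a hypothetical path, and the app_name and env_name values are placeholders:

```hcl
# Hypothetical module path and values, for illustration only.
module "network_staging" {
  source   = "./modules/network"
  app_name = "example"
  env_name = "staging"
}

module "network_production" {
  source   = "./modules/network"
  app_name = "example"
  env_name = "production"
}
```

Inside both instances the VPC is still referred to as aws_vpc.main, while the Name tags in AWS come out as example-staging and example-production.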

Insufficient capacity in availability zone on AWS

I got the following error from AWS today.
"We currently do not have sufficient m3.large capacity in the Availability Zone you requested (us-east-1a). Our system will be working on provisioning additional capacity. You can currently get m3.large capacity by not specifying an Availability Zone in your request or choosing us-east-1e, us-east-1b."
What does this mean exactly? It sounds like AWS doesn't have the physical resources to allocate me the virtual resources that I need. That seems unbelievable though.
What's the solution? Is there an easy way to change the availability zone of an instance?
Or do I need to create an AMI and restore it in a new availability zone?
This is not a new issue. You cannot change the availability zone. Best option is to create an AMI and relaunch the instance in new AZ, as you have already said. You would have everything in place. If you want to go across regions, see this - http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/CopyingAMIs.html
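You can also try Reserved Instances. Note that only Reserved Instances scoped to a specific Availability Zone reserve capacity in that zone; regional ones provide a billing discount but no capacity reservation.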
I fixed this error by correcting my aws_region and availability_zone values. Once I added aws_subnet_ids, the error message showed me exactly which zone my EC2 instance was being created in.
variable "availability_zone" {
  default = "ap-southeast-2c"
}
variable "aws_region" {
  description = "EC2 Region for the VPC"
  # A region name has no zone letter suffix.
  default     = "ap-southeast-2"
}
data "aws_vpc" "default" {
  default = true
}
data "aws_subnet_ids" "all" {
  vpc_id = "${data.aws_vpc.default.id}"
}
resource "aws_instance" "ec2" {
  ....
  subnet_id         = "${element(data.aws_subnet_ids.all.ids, 0)}"
  availability_zone = "${var.availability_zone}"
}