While reading the open-source Terraform module chgangaraju/terraform-aws-cloudfront-s3-website, I found that they use count in a data source, but I couldn't find any documentation about this.
What does it mean in this place?
data "aws_acm_certificate" "acm_cert" {
count = var.use_default_domain ? 0 : 1
domain = coalesce(var.acm_certificate_domain, "*.${var.hosted_zone}")
provider = aws.aws_cloudfront
//CloudFront uses certificates from US-EAST-1 region only
statuses = [
"ISSUED",
]
}
It means the same as it does for a resource. The pattern you have here is a conditional expression:
count = var.use_default_domain ? 0 : 1
Expressions like this with count are often used for optional resources or data sources. Specifically, in your case, if you set use_default_domain to false, the CloudFront distribution created by this Terraform configuration will use your own custom domain and SSL certificate. For that to happen, the acm_cert data source fetches information about the SSL certificate for your acm_certificate_domain from ACM.
In contrast, when use_default_domain is true, you are going to use the default CloudFront domain and SSL certificate. In that case you don't need any SSL certificate in ACM, so Terraform will not fetch one.
Technically, if use_default_domain is true, then count is 0 and data "aws_acm_certificate" "acm_cert" is not evaluated at all. But if count is 1 (when use_default_domain is false), the data source will run and fetch information about your custom SSL certificate.
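One consequence of using count is that the rest of the module can no longer refer to the data source as a single object; it has to index into the (possibly empty) list of instances. A minimal sketch of a common pattern (the local name is just for illustration, not necessarily what this module does):

# The splat expression yields a 0- or 1-element list, and join() collapses
# it to either the certificate ARN or an empty string.
locals {
  acm_certificate_arn = join("", data.aws_acm_certificate.acm_cert.*.arn)
}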
A data source is a query: it is used to get data from the outside world and make it available to your Terraform configuration.
Terraform example code:
provider "aws" {
region = "us-west-2"
}
data "aws_ami" "example" {
count = 2 // look at this line!
most_recent = true
owners = ["self"]
}
output "amis" {
value = "${data.aws_ami.example.*.id}"
}
When you specify count in a data source block, Terraform creates that many instances of the data source, and each instance performs the same query. You can loosely think of each instance as running a query like this:
select * from data_source
where owners in ('self')
order by most_recent desc
limit 1; /** <= one row per instance! **/
Response
Because both instances run the same query, the data source returns the same record twice and sends both to the output.
You will see the following results in your terminal.
amis = [
  "ami-0345c0186ced78ce6",
  "ami-0345c0186ced78ce6",
]
Terraform Documentation: Multiple Resource Instances
Data resources support count and for_each meta-arguments as defined for managed resources, with the same syntax and behavior.
As with managed resources, when count or for_each is present it is important to distinguish the resource itself from the multiple resource instances it creates. Each instance will separately read from its data source with its own variant of the constraint arguments, producing an indexed result.
I am struggling to find a way to include all load balancers with a certain tag value (e.g. Shield protection = ON) in an AWS account.
Currently I have a map of ARNs in a variable and loop over it with for_each. This method works, but not efficiently, since I have to add the ARN of every new load balancer manually.
resource "aws_shield_protection" "this" {
for_each = var.listofarn
name = "shield protection".each.key
resource_arn = each.key
}
variable listofarn {
type = map(string)
default = {
appx_alb="arn::xxxxx"
appy_alb="arn:yyyyy"
}
}
Is there a way to use the aws_lb data source for this?
Thanks.
Using a data source wouldn't help much here. The aws_lb data source can only return one ALB, so you can't use it to get information about all of your ALBs. You would have to run the aws_lb data source in a for_each loop with tags or ALB identifiers you already know.
But you could overcome your issue by developing an external data source. Since it's a fully custom data source, it can return information about all your ALBs in whatever form you want.
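For illustration only, a rough sketch of what that could look like (the list-albs.sh script and its output shape are assumptions; the script would have to print a JSON object of string values, for example built from AWS CLI output):

# Hypothetical external data source: runs a local script that prints a JSON
# object mapping ALB names to ARNs, e.g. {"appx_alb": "arn:aws:elasticloadbalancing:..."}
data "external" "protected_albs" {
  program = ["bash", "${path.module}/list-albs.sh"] # script name is an assumption
}

resource "aws_shield_protection" "this" {
  for_each     = data.external.protected_albs.result
  name         = "shield protection ${each.key}"
  resource_arn = each.value
}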
I have a simple site set up on AWS and have a terraform script working to deploy it (at least from my local machine).
When I have a successful deployment through terraform apply, quite often if I then run terraform plan again (immediately after the apply) I will see changes like this:
  # aws_route53_record.alias_route53_record_portal will be updated in-place
  ~ resource "aws_route53_record" "alias_route53_record_portal" {
        fqdn    = "mysite.co.uk"
        id      = "Z12345678UR1K1IFUBA_mysite.co.uk_A"
        name    = "mysite.co.uk"
        records = []
        ttl     = 0
        type    = "A"
        zone_id = "Z12345678UR1K1IFUBA"

      - alias {
          - evaluate_target_health = false -> null
          - name                   = "d12345mkpmx9ii.cloudfront.net" -> null
          - zone_id                = "Z2FDTNDATAQYW2" -> null
        }
      + alias {
          + evaluate_target_health = true
          + name                   = "d12345mkpmx9ii.cloudfront.net"
          + zone_id                = "Z2FDTNDATAQYW2"
        }
    }
Why is Terraform saying that some parts of the resource need changing when nothing has changed?
EDIT My actual tf resource...
resource "aws_route53_record" "alias_route53_record_portal" {
zone_id = data.aws_route53_zone.sds_zone.zone_id
name = "mysite.co.uk"
type = "A"
alias {
name = aws_cloudfront_distribution.s3_distribution.domain_name
zone_id = aws_cloudfront_distribution.s3_distribution.hosted_zone_id
evaluate_target_health = true
}
}
You have changed evaluate_target_health from false to true. Terraform will just update the fields that have changed. The plan is displayed this way because AWS often doesn't provide a separate API call for each field; since Terraform shows the resource will be updated in-place, it will make the minimum set of changes needed to apply it.
The "plan" operation in Terraform first synchronizes the Terraform state with remote objects (by making calls to the remote API), and then it compares the updated state with the configuration.
Terraform (or, more accurately, the relevant Terraform provider) then generates a planned update or replace for any case where the state and the configuration disagree.
If you see a planned update for a resource whose configuration you know you haven't changed, then by process of elimination that suggests that the remote system is what has changed.
Sometimes that can happen if some other process (or a human in the admin console) changes an object that Terraform believes itself to be responsible for. In that case, the typical resolution is to ensure that each object is only managed by one system and that no-one is routinely making changes to Terraform-managed objects outside of Terraform.
One way to diagnose this would be to consult the remote system and see whether its current settings agree with your Terraform configuration. If not, that would suggest that something other than Terraform has changed the value.
A less common reason this can arise is due to a bug in the provider itself. There are two variations of this class of bug:
When creating the object, the provider doesn't correctly translate the given configuration to a remote API call, and so it ends up creating an object that doesn't match the configuration. A subsequent Terraform plan will then notice that inconsistency and plan an update to fix it. If the provider's update operation has a similar bug then this will never converge, causing the provider to repeatedly plan the same update.
Conversely, the create/update may be implemented correctly but the "refresh" operation (updating the state to match the remote system) may inaccurately translate the remote object data back to Terraform state data, causing the state to not accurately reflect the remote system. In that case, the provider will probably then find that the configuration doesn't match the state anymore, even though the state was correct after the initial create.
Both of these bugs are typically nicknamed "permadiff" by provider developers, because the symptom is Terraform seeming to plan the same change indefinitely, no matter how many times you apply it. If you think you've encountered a "permadiff" bug then usually the path forward is to report a bug in the provider's development repository so that the maintainers can investigate.
One specific variation of "permadiff" is a situation where the remote system does some sort of normalization of your given values which the provider doesn't take into account. For example, some remote systems will accept strings containing uppercase letters but will convert them to lowercase for permanent storage. If a provider doesn't take that into account, it will probably incorrectly plan to change the value back to the one containing uppercase letters again in order to try to make the state match the configuration. This subclass of bug is a normalization permadiff, which provider teams will typically address by re-implementing the remote system's normalization logic in the provider itself.
If you find a normalization permadiff then you can often work around it until the bug is fixed by figuring out what normalization the remote system expects and then manually normalizing your configuration to match it, so that the provider will then see the configuration as matching the remote system.
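As a contrived illustration of that workaround (this assumes, hypothetically, a remote system that lowercases the name you give it): write the value in its already-normalized form, or normalize it yourself in the configuration, so the refreshed state matches what you wrote.

# Hypothetical example: if the remote API stores this name lowercased,
# supplying the lowercased form up front means the state read back from
# the remote system matches the configuration and the diff disappears.
resource "aws_s3_bucket" "example" {
  bucket = lower("My-Example-Bucket") # stored and read back as "my-example-bucket"
}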
I need to upload a folder to an S3 bucket. When I apply for the first time, it uploads fine, but I have two problems:
The uploaded version outputs as null. I would expect some version_id like 1, 2, 3.
When running terraform apply again, it says Apply complete! Resources: 0 added, 0 changed, 0 destroyed. I would expect it to upload every time I run terraform apply and to create a new version.
What am I doing wrong? Here is my Terraform config:
resource "aws_s3_bucket" "my_bucket" {
bucket = "my_bucket_name"
versioning {
enabled = true
}
}
resource "aws_s3_bucket_object" "file_upload" {
bucket = "my_bucket"
key = "my_bucket_key"
source = "my_files.zip"
}
output "my_bucket_file_version" {
value = "${aws_s3_bucket_object.file_upload.version_id}"
}
Terraform only makes changes to the remote objects when it detects a difference between the configuration and the remote object attributes. In the configuration as you've written it so far, the configuration includes only the filename. It includes nothing about the content of the file, so Terraform can't react to the file changing.
To make subsequent changes, there are a few options:
You could use a different local filename for each new version.
You could use a different remote object path for each new version.
You can use the object etag to let Terraform recognize when the content has changed, regardless of the local filename or object path.
The last of these seems closest to what you want in this case. To do that, add the etag argument and set it to be an MD5 hash of the file:
resource "aws_s3_bucket_object" "file_upload" {
bucket = "my_bucket"
key = "my_bucket_key"
source = "${path.module}/my_files.zip"
etag = "${filemd5("${path.module}/my_files.zip")}"
}
With that extra argument in place, Terraform will detect when the MD5 hash of the file on disk is different than that stored remotely in S3 and will plan to update the object accordingly.
(I'm not sure what's going on with version_id. It should work as long as versioning is enabled on the bucket.)
The preferred solution is now to use the source_hash property. Note that aws_s3_bucket_object has been replaced by aws_s3_object.
locals {
  object_source = "${path.module}/my_files.zip"
}

resource "aws_s3_object" "file_upload" {
  bucket      = "my_bucket"
  key         = "my_bucket_key"
  source      = local.object_source
  source_hash = filemd5(local.object_source)
}
Note that etag can have issues when encryption (for example SSE-KMS) is used, because S3 then no longer reports the plain MD5 of the object.
You shouldn't be using Terraform to do this. Terraform is meant to orchestrate and provision your infrastructure and its configuration, not files. That said, Terraform is not aware of changes to your files; unless you change their names, Terraform will not update the state.
Also, it is better to use local-exec to do that. Something like:
resource "aws_s3_bucket" "my-bucket" {
# ...
provisioner "local-exec" {
command = "aws s3 cp path_to_my_file ${aws_s3_bucket.my-bucket.id}"
}
}
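Note that provisioners only run when the resource is created, so the copy above would not re-run when the file changes. A common workaround (a sketch; the file path and trigger value are assumptions about your setup) is to attach the local-exec to a null_resource whose triggers include a hash of the file:

resource "null_resource" "upload" {
  # Re-create (and therefore re-run the provisioner) whenever the file changes.
  triggers = {
    file_hash = "${filemd5("${path.module}/my_files.zip")}"
  }

  provisioner "local-exec" {
    command = "aws s3 cp ${path.module}/my_files.zip s3://${aws_s3_bucket.my-bucket.id}/"
  }
}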
I have used Terragrunt to orchestrate the creation of a non-default AWS VPC.
I've got S3/DynamoDB state mgmt, and the VPC code is a module. I have the 'VPC environment' terraform.tfvars code checked into a second repo as per the terragrunt README.md.
I created a second module which will eventually create hosts in this VPC but for now just aims to output its ID. I have created a separate 'hosts environment' / terraform.tfvars for the instantiation of this module.
1. I run terragrunt apply in the VPC environment directory - VPC created.
2. I run terragrunt apply a second time in the hosts environment directory - the output directive doesn't work (no error, but incorrect, see below).
This is a precursor to one day running a terragrunt apply-all in the parent directory of the VPC/hosts environment directories; my reading of the docs suggest using a terraform_remote_state data source to expose the VPC ID, so I specified access like this in the data.tf file of the hosts module:
data "terraform_remote_state" "vpc" {
backend = "s3"
config {
bucket = "myBucket"
key = "keyToMy/vpcEnvironment.tfstate"
region = "stateRegion"
}
}
Then, in the hosts module outputs.tf, I specified an output to check assignment:
output "mon_vpc" {
value = "${data.terraform_remote_state.vpc.id}"
}
When I run (2) above, it exits with:
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
mon_vpc = 2018-06-02 23:14:42.958848954 +0000 UTC
Questions:
1. Where am I going wrong in setting up the code so that the hosts environment correctly acquires the VPC ID from the already-existing VPC (Terraform state file)? Any advice on what to change here would be appreciated.
2. It looks like I've managed to acquire the date the VPC was created rather than its ID, which, given the code, is perplexing - does anyone know why?
I'm not using community modules - all hand rolled.
EDIT: In response to Brendan Miller, here is a bit more. In my VPC module, I have an outputs.tf containing, among other outputs:
output "aws_vpc.mv.id-op" {
value = "${aws_vpc.mv.id}"
}
and the vpc.tf contains
resource "aws_vpc" "mv" {
cidr_block = "${var.vpcCidr}"
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = "mv-vpc-${var.aws_region}"
}
}
As this config results in a VPC being created, and as most of the parameters are <computed>, I assumed the state would contain sufficient data for other modules to refer to (I assumed at first that Terraform used the AWS API for this under the bonnet, rather than consulting a different state key).
EDIT 2: Read all of #brendan-miller's answer and following comments first.
Use of periods in the output name causes a problem, as it confuses Terraform (see Brendan's answer below for the reference format):
Error: output 'mon_vpc': unknown resource 'data.aws_vpc.mv-ds' referenced in variable data.aws_vpc.mv-ds.vpc.id
You named your output aws_vpc.mv.id-op, but when you retrieve it you are retrieving just id. You could try
data.terraform_remote_state.vpc.aws_vpc.mv.id
but I'm not sure whether Terraform will complain about the extra periods. In any case, the format should always be
data.terraform_remote_state.<name of the remote state module>.<name of the output>
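In other words, if the VPC module's output were renamed to avoid periods (the name vpc_id below is just an example), the hosts environment could read it back like this:

# In the VPC environment: an output name with no periods in it
output "vpc_id" {
  value = "${aws_vpc.mv.id}"
}

# In the hosts environment: retrieve it through the remote state data source
output "mon_vpc" {
  value = "${data.terraform_remote_state.vpc.vpc_id}"
}

(On Terraform 0.12 and later, remote state outputs are nested under outputs, i.e. data.terraform_remote_state.vpc.outputs.vpc_id.)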
You mentioned wanting to be able to get this info with the AWS API. That is also possible by using the aws_vpc data source. The documentation example uses id, but you can also filter by any tag you put on your VPC.
Like this:
data "aws_vpc" "default" {
filter {
name = "tag:Name"
values = ["example-vpc-name"]
}
}
Then you can use this for the id
${data.aws_vpc.default.id}
In addition this retrieves all tags set, for example:
${data.aws_vpc.default.tags.Name}
And the cidr block
${data.aws_vpc.default.cidr_block}
As well as some other info. This can be very useful for storing and retrieving things about your VPC.
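For example, a quick sketch of feeding that ID into another resource (the subnet and CIDR here are arbitrary):

resource "aws_subnet" "example" {
  vpc_id     = "${data.aws_vpc.default.id}"
  cidr_block = "10.0.1.0/24"
}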
The Terraform Data Sources documentation tells me what a data source is, but I do not quite understand it. Can somebody give me a use case for a data source? What is the difference between using one and configuring something with variables?
Data sources can be used for a number of reasons, but their goal is to do something and then give you data.
Let's take the example from their documentation:
# Find the latest available AMI that is tagged with Component = web
data "aws_ami" "web" {
  filter {
    name   = "state"
    values = ["available"]
  }

  filter {
    name   = "tag:Component"
    values = ["web"]
  }

  most_recent = true
}
This uses the aws_ami data source - this is different from a resource! It will just give you information, and not create anything. This example in particular calls the describe-images AWS API, passes in a few --filter options as specified, and returns an object that you can get information from - take a look at these attributes:
name
owner_id
description
image_id
... The list goes on. This is really useful if, let's say, I always want to pull the latest AMI matching some tags and keep a launch configuration up to date with it. I could use this data source rather than always having to update a variable or hard-code the ID.
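For instance, a rough sketch of wiring that AMI into a launch configuration (the resource name and instance type are arbitrary):

resource "aws_launch_configuration" "web" {
  name_prefix   = "web-"
  image_id      = "${data.aws_ami.web.id}" # always the latest AMI matching the filters
  instance_type = "t2.micro"

  lifecycle {
    create_before_destroy = true
  }
}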
Data sources can be used for other reasons as well; one of my favorites is the template provider.
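For example, a small sketch of the template_file data source from that provider (the template file and variable are made up, and this provider has since been superseded by the built-in templatefile function):

# Renders init.tpl with the given variables; the result is available
# as data.template_file.init.rendered
data "template_file" "init" {
  template = "${file("${path.module}/init.tpl")}"

  vars {
    consul_address = "10.0.0.1"
  }
}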
Good luck!
Data sources provide information about entities that are not managed by the current Terraform configuration.
This may include:
Configuration data from Consul
Information about the state of manually-configured infrastructure components
In other words, data sources are read-only views into the state of pre-existing components external to our configuration.
Once you have defined a data source, you can use the data elsewhere in your Terraform configuration.
For example, let's suppose we want to create a Terraform configuration for a new AWS EC2 instance. We want to use an AMI image that was created and uploaded by a Jenkins job using the AWS CLI, and is not managed by Terraform. As part of the configuration for our Jenkins job, this AMI image will always have a name with the prefix app-.
In this case, we can use the aws_ami data source to obtain information about the most recent AMI image that has the name prefix app-.
data "aws_ami" "app_ami" {
most_recent = true
filter {
name = "name"
values = ["app-*"]
}
}
Data sources export attributes, just like resources do. We can interpolate these attributes using the syntax data.TYPE.NAME.ATTR. In our example, we can interpolate the value of the AMI ID as data.aws_ami.app_ami.id, and pass it as the ami argument for our aws_instance resource.
resource "aws_instance" "app" {
ami = "${data.aws_ami.app_ami.id}"
instance_type = "t2.micro"
}
Data sources are most powerful when retrieving information about dynamic entities - those whose properties change value often. For example, the next time Terraform fetches data for our aws_ami data source, the value of the exported attributes may be different (we might have built and pushed a new AMI).
Variables, by contrast, are used for static values - those that rarely change, such as your access and secret keys, or a standard list of sudoers for your servers.
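For contrast, a value like the instance type would typically come from a plain variable rather than a data source (a sketch reusing the aws_instance example above):

variable "instance_type" {
  default = "t2.micro"
}

resource "aws_instance" "app" {
  ami           = "${data.aws_ami.app_ami.id}" # dynamic: looked up each run
  instance_type = "${var.instance_type}"       # static: supplied by a variable
}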
Good examples up there!
The main differences between a Terraform data source, resource and variable are:
Resource: provisions resources/infra on our platform. Create, update and delete!
Variable: provides predefined values as variables in our IaC. Used by resources for provisioning.
Data source: fetches values from our infra/provider and provides data for our resources to provision infra.
The examples are well explained above :)
Data sources are used to fetch data from the provider end so that it can be used as configuration in .tf files instead of hardcoding it. Example: the code below fetches an AWS AMI ID and uses it to launch an AWS instance.
data "aws_ami" "std_ami" {
most_recent = true
owners = ["amazon"]
filter {
name = "root-device-type"
values = ["ebs"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
}
resource "aws_instance" "myec2" {
ami = data.aws_ami.std_ami.id
instance_type = "t2.micro"
}