I'm creating three EKS clusters using this module. Everything works fine, except that when I try to add the ConfigMap to the clusters using map_roles, I run into an issue.
My configuration, which I have in all three clusters, looks like this:
map_roles = [
  {
    rolearn  = "arn:aws:iam::${var.account_no}:role/argo-${var.environment}-${var.aws_region}"
    username = "system:node:{{EC2PrivateDNSName}}"
    groups   = ["system:bootstrappers", "system:nodes"]
  },
  {
    rolearn  = "arn:aws:sts::${var.account_no}:assumed-role/${var.assumed_role_1}"
    username = "admin"
    groups   = ["system:masters", "system:nodes", "system:bootstrappers"]
  },
  {
    rolearn  = "arn:aws:sts::${var.account_no}:assumed-role/${var.assumed_role_2}"
    username = "admin"
    groups   = ["system:masters", "system:nodes", "system:bootstrappers"]
  }
]
The problem occurs while applying the template. It says:
configmaps "aws-auth" already exists
When I looked into the error further, I realised that when applying the template the module creates three ConfigMap resources with the same name, like these:
resource "kubernetes_config_map" "aws_auth" {
# ...
}
resource "kubernetes_config_map" "aws_auth" {
# ...
}
resource "kubernetes_config_map" "aws_auth" {
# ...
}
This obviously is a problem. How do I fix this issue?
The aws-auth ConfigMap is created by EKS when you create a managed node group; it holds the configuration required for nodes to register with the control plane. If you want to control the contents of the ConfigMap with Terraform, you have two options: either make sure you create the ConfigMap before the managed node group resource, or import the existing ConfigMap into the Terraform state manually.
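A minimal sketch of the first option, assuming a plain aws_eks_node_group resource rather than the module's node groups (the resource names, the mapRoles content and the node role variable are illustrative, not taken from the question):
resource "kubernetes_config_map" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = yamlencode([
      {
        rolearn  = var.node_role_arn # hypothetical variable holding the node IAM role ARN
        username = "system:node:{{EC2PrivateDNSName}}"
        groups   = ["system:bootstrappers", "system:nodes"]
      }
    ])
  }
}

resource "aws_eks_node_group" "workers" {
  # ... cluster_name, node_role_arn, scaling_config, etc.

  # Create the ConfigMap first so EKS updates it instead of creating its own.
  depends_on = [kubernetes_config_map.aws_auth]
}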
I've now tested my solution, which expands on #pst's "import aws-auth" answer, and it looks like this: break the terraform apply operation in your main EKS project into three steps that completely isolate the EKS resources from the Kubernetes resources, so that you can manage the aws-auth ConfigMap from Terraform workflows.
terraform apply -target=module.eks
This creates just the eks cluster and anything else the module creates.
The eks module design now guarantees this will NOT include anything from the kubernetes provider.
terraform import kubernetes_config_map.aws-auth kube-system/aws-auth
This brings the aws-auth map, generated by the creation of the eks cluster in the previous step, into the remote terraform state.
This is only necessary when the map isn't already in the state, so we first check, with something like:
if terraform state show kubernetes_config_map.aws-auth ; then
echo "aws-auth ConfigMap already exists in Remote Terraform State."
else
echo "aws-auth ConfigMap does not exist in Remote Terraform State. Importing..."
terraform import -var-file="${TFVARS_FILE}" kubernetes_config_map.aws-auth kube-system/aws-auth
fi
terraform apply
This is a "normal" apply which acts exactly like before, but will have nothing to do for module.eks. Most importantly, this call will not encounter the "aws-auth ConfigMap already exists" error since terraform is aware of its existence, and instead the proposed plan will update aws-auth in place.
NB:
Using a Makefile to wrap your terraform workflows makes this simple to implement.
Using a monolithic root module with -target is a little ugly, and as your use of the kubernetes provider grows, it makes sense to break out all the kubernetes terraform objects into a separate project. But the above gets the job done.
The separation of eks/k8s resources is best practice anyway, and is advised to prevent known race conditions between aws and k8s providers. Follow the trail from here for details.
I know it's late, but I'm sharing a solution that I found.
We should use kubernetes_config_map_v1_data instead of kubernetes_config_map_v1. This resource allows Terraform to manage the data within a pre-existing ConfigMap.
Example:
resource "kubernetes_config_map_v1_data" "aws_auth" {
metadata {
name = "aws-auth"
namespace = "kube-system"
}
data = {
"mapRoles" = data.template_file.aws_auth_template.rendered
}
force = true
}
data "template_file" "aws_auth_template" {
template = "${file("${path.module}/aws-auth-template.yml")}"
vars = {
cluster_admin_arn = "${local.accounts["${var.env}"].cluster_admin_arn}"
}
}
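Note that the template_file data source comes from the now-deprecated hashicorp/template provider; a roughly equivalent sketch using the built-in templatefile() function (assuming the same aws-auth-template.yml and template variable) would be:
resource "kubernetes_config_map_v1_data" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    # Render the template directly, without the extra data source.
    mapRoles = templatefile("${path.module}/aws-auth-template.yml", {
      cluster_admin_arn = local.accounts[var.env].cluster_admin_arn
    })
  }

  force = true
}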
Overview
Currently, dashboards are being deployed via Terraform using values from a dictionary in locals.tf:
resource "aws_cloudwatch_dashboard" "my_alb" {
for_each = local.env_mapping[var.env]
dashboard_name = "${each.key}_alb_web_operational"
dashboard_body = templatefile("templates/alb_ops.tpl", {
environment = each.value.env,
account = each.value.account,
region = each.value.region,
alb = each.value.alb
tg = each.value.alb_tg
}
This leads to fragility because the values of AWS infrastructure resources like the ALB and ALB target group are hard coded. Sometimes when applying updates AWS resources are destroyed and recreated.
Question
What's the best approach to get these values dynamically? For example, this could be achieved by writing a Python/Boto3 Lambda, which looks up these values and then passes them to Terraform as env variables. Are there any other recommended ways to achieve the same?
It depends on how dynamic your environment is, but it sounds like Terraform data sources are what you are looking for.
Usually, load balancer names are either fixed or generated by some rule, and should be known before creating the dashboard.
Let's suppose the names are fixed:
variable "loadbalancers" {
type = object
default = {
alb01 = "alb01",
alb02 = "alb02"
}
}
In this case the load balancers can be looked up with:
data "aws_lb" "albs" {
for_each = var.loadbalancers
name = each.value # or each.key
}
After that you will be able to access the dynamically generated attributes:
data.aws_lb.albs["alb01"].id
data.aws_lb.albs["alb01"].arn
etc.
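Those attributes can then replace the hard coded values in the dashboard resource from the question, for example (a sketch that passes only the ALB value; arn_suffix is the aws_lb attribute CloudWatch uses as a metric dimension, and the remaining template variables are omitted for brevity):
resource "aws_cloudwatch_dashboard" "my_alb" {
  for_each       = var.loadbalancers
  dashboard_name = "${each.key}_alb_web_operational"

  # Other template variables (environment, account, region, tg) omitted here;
  # a data "aws_lb_target_group" lookup could supply tg the same way.
  dashboard_body = templatefile("templates/alb_ops.tpl", {
    alb = data.aws_lb.albs[each.key].arn_suffix
  })
}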
If the load balancer names are generated by some rule, you should use the AWS CLI or AWS CDK to fetch all the names, or generate the names by the same rule that was used inside the AWS environment and pass them into a Terraform variable.
Note: terraform plan (and apply, destroy) will raise an error if you pass a non-existent name, so you should check that a load balancer with the provided name exists.
I have a Terraform configuration that provisions a Kubernetes Deployment and a few ClusterRoles and ClusterRoleBindings via Helm.
Now I need to edit one of the provisioned ClusterRoles via Terraform and add another block of permissions. Is there a way to do this, or would I need to recreate a similar resource from scratch?
This is my block that creates the deployment for the efs-csi-driver:
resource "helm_release" "aws-efs-csi-driver" {
name = "aws-efs-csi-driver"
chart = "aws-efs-csi-driver"
repository = "https://kubernetes-sigs.github.io/aws-efs-csi-driver/"
version = "2.x.x"
namespace = "kube-system"
timeout = 3600
values = [
file("${path.module}/config/values.yaml"),
]
}
Somehow I need to modify https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/45c5e752d2256558170100138de835b82d54b8af/deploy/kubernetes/base/controller-serviceaccount.yaml#L11 by adding a couple more permission blocks. Is there a way I can patch it (or completely overlay it)?
I have a microservice deployed in a Docker container to manage and execute Terraform commands to create infrastructure on AWS. The supported Terraform template is as follows:
provider "aws" {
profile = "default"
region = "us-east-1"
}
resource "aws_default_vpc" "default" {
tags = {
Name = "Default VPC"
}
}
resource "aws_security_group" "se_security_group" {
name = "test-sg"
description = "secure soft edge ports"
vpc_id = aws_default_vpc.default.id
tags = {
Name = "test-sg"
}
}
resource "aws_instance" "web" {
ami = "ami-*********"
instance_type = "t3.micro"
tags = {
Name = "test"
}
depends_on = [
aws_security_group.se_security_group,
]
}
With this system in place, if the Docker container crashes while the Terraform process is executing (creating an EC2 instance), the state file would not have an entry for the EC2 resource being created. On container restart, if the Terraform process is restarted on the same state file, it would end up creating a whole new EC2 instance, resulting in a resource leak.
How is the crash scenario in terraform commonly handled?
Is there a way to rollback the previous transaction without the state file having the EC2 entry?
Please help me with this issue. Thanks
How is the crash scenario in terraform commonly handled?
It depends on when the crash happened. Some plausible scenarios are:
Most likely, your state file will remain locked, as long as your backend supports locking. In this case nothing will be created after the restart, because Terraform won't be able to acquire a lock on the state file and will throw an error. We will have to force-unlock the state (see the example after this list).
We managed to unlock the state file, or the state file was not locked at all. In this case we can have the following scenarios:
The state file has an entry with an identifier for the resource, even if there was a crash while the resource was provisioning. In this case Terraform will refresh the state and show in the plan whether there are any changes to be made. Nevertheless, we should read the plan and decide whether we want to apply it or make some manual adjustments first.
Terraform isn't able to identify a resource which already exists, so it will try to provision it. Again, we should read the state file and decide ourselves what to do. We can either import the already existing resource or terminate it and let Terraform attempt to create it again.
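For the first scenario, the lock can be released with terraform force-unlock, passing the lock ID printed in the error message (LOCK_ID below is a placeholder):
terraform force-unlock LOCK_ID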
Is there a way to rollback the previous transaction without the state file having the EC2 entry?
No, there is no way to roll back the previous transaction. Terraform will attempt to provision whatever is in the .tf files. What we could do is check out a previous version of our code from source control and apply that.
I've been looking for a way to deploy to multiple AWS accounts simultaneously in Terraform and coming up dry. AWS has the concept of doing this with Stacks, but I'm not sure if there is a way to do this in Terraform. If so, what would be some solutions?
You can read more about the Cloudformation solution here.
You can define multiple provider aliases which can be used to run actions in different regions or even different AWS accounts.
So to perform some actions in your default region (or be prompted for it if not defined in environment variables or ~/.aws/config) and also in US East 1 you'd have something like this:
provider "aws" {
# ...
}
# Cloudfront ACM certs must exist in US-East-1
provider "aws" {
alias = "cloudfront-acm-certs"
region = "us-east-1"
}
You'd then refer to them like so:
data "aws_acm_certificate" "ssl_certificate" {
provider = aws.cloudfront-acm-certs
...
}
resource "aws_cloudfront_distribution" "cloudfront" {
...
viewer_certificate {
acm_certificate_arn = data.aws_acm_certificate.ssl_certificate.arn
...
}
}
So if you want to do things across multiple accounts at the same time then you could assume a role in the other account with something like this:
provider "aws" {
# ...
}
# Assume a role in the DNS account so we can add records in the zone that lives there
provider "aws" {
alias = "dns"
assume_role {
role_arn = "arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME"
session_name = "SESSION_NAME"
external_id = "EXTERNAL_ID"
}
}
And refer to it like so:
data "aws_route53_zone" "selected" {
provider = aws.dns
name = "test.com."
}
resource "aws_route53_record" "www" {
provider = aws.dns
zone_id = data.aws_route53_zone.selected.zone_id
name = "www.${data.aws_route53_zone.selected.name"
...
}
Alternatively, you can provide credentials for different AWS accounts in a number of other ways, such as hardcoding them in the provider, using different Terraform variables, using AWS SDK specific environment variables, or using a configured profile.
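For example, a provider alias that uses a named profile for a second account might look like this (the alias and profile name are illustrative):
provider "aws" {
  alias   = "other_account"
  region  = "us-east-1"
  profile = "other-account-profile" # profile configured in ~/.aws/config for the second account
}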
I would recommend also combining your solution with Terraform workspaces:
Named workspaces allow conveniently switching between multiple
instances of a single configuration within its single backend. They
are convenient in a number of situations, but cannot solve all
problems.
A common use for multiple workspaces is to create a parallel, distinct
copy of a set of infrastructure in order to test a set of changes
before modifying the main production infrastructure. For example, a
developer working on a complex set of infrastructure changes might
create a new temporary workspace in order to freely experiment with
changes without affecting the default workspace.
Non-default workspaces are often related to feature branches in
version control. The default workspace might correspond to the
"master" or "trunk" branch, which describes the intended state of
production infrastructure. When a feature branch is created to develop
a change, the developer of that feature might create a corresponding
workspace and deploy into it a temporary "copy" of the main
infrastructure so that changes can be tested without affecting the
production infrastructure. Once the change is merged and deployed to
the default workspace, the test infrastructure can be destroyed and
the temporary workspace deleted.
AWS S3 is in the list of the supported backends.
It is very easy to use (similar to working with git branches) and to combine with the selected AWS account.
terraform workspace list
dev
* prod
staging
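One common pattern for combining workspaces with the selected AWS account is to key the provider configuration off terraform.workspace, for example (a sketch; the profile names are illustrative):
locals {
  # Map each workspace to the AWS profile (account) it should deploy into.
  aws_profiles = {
    dev     = "my-dev-account"
    staging = "my-staging-account"
    prod    = "my-prod-account"
  }
}

provider "aws" {
  region  = "us-east-1"
  profile = local.aws_profiles[terraform.workspace]
}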
A few references regarding configuring the AWS provider to work with multiple accounts:
https://terragrunt.gruntwork.io/docs/features/work-with-multiple-aws-accounts/
https://assets.ctfassets.net/hqu2g0tau160/5Od5r9RbuEYueaeeycUIcK/b5a355e684de0a842d6a3a483a7dc7d3/devopscon-V2.1.pdf