Unable to register EC2 instance into ECS cluster using Terraform

I am unable to register EC2 instances into my ECS cluster. I have created the cluster and the service and registered the task definition, and the instances are provisioned, but they never register with the cluster. I pass user data that should register each instance with the cluster, but it has no effect. My configuration is split into modules; the relevant files are below, and a screenshot is attached at the end of the question.
Autoscaling:
resource "aws_launch_configuration" "ec2" {
image_id = var.image_id
instance_type = var.instance_type
name = "ec2-${terraform.workspace}"
user_data = <<EOF
#!/bin/bash
echo 'ECS_CLUSTER=${var.cluster_name.name}' >> /etc/ecs/ecs.config
echo 'ECS_DISABLE_PRIVILEGED=true' >> /etc/ecs/ecs.config
EOF
key_name = var.key_name
iam_instance_profile = var.instance_profile
security_groups = [aws_security_group.webserver.id]
}
resource "aws_autoscaling_group" "asg" {
vpc_zone_identifier = var.public_subnet
desired_capacity = 2
max_size = 2
min_size = 2
health_check_grace_period = 300
launch_configuration = aws_launch_configuration.ec2.name
target_group_arns = [var.tg.arn]
}
resource "aws_security_group" "webserver" {
name = "webserver-${terraform.workspace}"
description = "Allow internet traffic"
vpc_id = var.vpc_id
ingress {
description = "incoming for ec2-instance"
from_port = 0
to_port = 0
protocol = -1
security_groups = [var.alb_sg]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "webserver-sg"
}
}
output "ec2_sg" {
value = aws_security_group.webserver.id
}
Cluster:
resource "aws_ecs_cluster" "cluster" {
name = "wordpress-${terraform.workspace}"
}
output "cluster" {
value = aws_ecs_cluster.cluster.id
}
output "cluster1" {
value = aws_ecs_cluster.cluster
}
Service:
resource "aws_ecs_service" "wordpress" {
name = "Wordpress-${terraform.workspace}"
cluster = var.cluster
task_definition = var.task.id
desired_count = 2
scheduling_strategy = "REPLICA"
load_balancer {
target_group_arn = var.tg.arn
container_name = "wordpress"
container_port = 80
}
deployment_controller {
type = "ECS"
}
}
Task:
data "template_file" "init" {
template = "${file("${path.module}/template/containerdef.json")}"
vars = {
rds_endpoint = "${var.rds_endpoint}"
name = "${var.name}"
username = "${var.username}"
password = "${var.password}"
}
}
resource "aws_ecs_task_definition" "task" {
family = "wordpress"
container_definitions = "${data.template_file.init.rendered}"
network_mode = "bridge"
requires_compatibilities = ["EC2"]
memory = "1GB"
cpu = "1 vCPU"
task_role_arn = var.task_execution.arn
}
main.tf
data "aws_availability_zones" "azs" {}
data "aws_ssm_parameter" "name" {
name = "Dbname"
}
data "aws_ssm_parameter" "password" {
name = "db_password"
}
module "my_vpc" {
source = "./modules/vpc"
vpc_cidr = var.vpc_cidr
public_subnet = var.public_subnet
private_subnet = var.private_subnet
availability_zone = data.aws_availability_zones.azs.names
}
module "db" {
source = "./modules/rds"
ec2_sg = "${module.autoscaling.ec2_sg}"
allocated_storage = var.db_allocated_storage
storage_type = var.db_storage_type
engine = var.db_engine
engine_version = var.db_engine_version
instance_class = var.db_instance_class
name = data.aws_ssm_parameter.name.value
username = data.aws_ssm_parameter.name.value
password = data.aws_ssm_parameter.password.value
vpc_id = "${module.my_vpc.vpc_id}"
public_subnet = "${module.my_vpc.public_subnets_ids}"
}
module "alb" {
source = "./modules/alb"
vpc_id = "${module.my_vpc.vpc_id}"
public_subnet = "${module.my_vpc.public_subnets_ids}"
}
module "task" {
source = "./modules/task"
name = data.aws_ssm_parameter.name.value
username = data.aws_ssm_parameter.name.value
password = data.aws_ssm_parameter.password.value
rds_endpoint = "${module.db.rds_endpoint}"
task_execution = "${module.role.task_execution}"
}
module "autoscaling" {
source = "./modules/autoscaling"
vpc_id = "${module.my_vpc.vpc_id}"
#public_subnet = "${module.my_vpc.public_subnets_ids}"
tg = "${module.alb.tg}"
image_id = var.image_id
instance_type = var.instance_type
alb_sg = "${module.alb.alb_sg}"
public_subnet = "${module.my_vpc.public_subnets_ids}"
instance_profile = "${module.role.instance_profile}"
key_name = var.key_name
cluster_name = "${module.cluster.cluster1}"
}
module "role" {
source = "./modules/Iam_role"
}
module "cluster" {
source = "./modules/Ecs-cluster"
}
module "service" {
source = "./modules/services"
cluster = "${module.cluster.cluster}"
tg = "${module.alb.tg}"
task = "${module.task.task}"
}
ec2-instance role:
resource "aws_iam_role" "container_instance" {
name = "container_instance-${terraform.workspace}"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Effect": "Allow"
}
]
}
EOF
tags = {
tag-key = "tag-value"
}
}
resource "aws_iam_instance_profile" "ec2_instance_role" {
name = "iam_instance_profile-${terraform.workspace}"
role = aws_iam_role.container_instance.name
}
resource "aws_iam_role_policy_attachment" "ec2_instance" {
role = aws_iam_role.container_instance.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}
Screenshot: [image not available]

Based on the chat discussion, the issue was most likely caused by passing the wrong value as the instance profile. The launch configuration should reference the profile's name:
iam_instance_profile = var.instance_profile.name
With this change, the two instances now register with the cluster correctly.
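A minimal sketch of how the pieces might fit together. The question does not show the role module's outputs or the autoscaling module's variables, so the `output` and `variable` blocks here are assumptions: they suppose the role module exports the whole `aws_iam_instance_profile` resource, which is why `.name` is needed in the launch configuration.

```hcl
# modules/Iam_role/outputs.tf (assumed; this file is not shown in the question)
output "instance_profile" {
  value = aws_iam_instance_profile.ec2_instance_role
}

# modules/autoscaling/variables.tf (assumed)
variable "instance_profile" {
  description = "The aws_iam_instance_profile object exported by the role module"
  type        = any
}

# modules/autoscaling/main.tf
resource "aws_launch_configuration" "ec2" {
  name            = "ec2-${terraform.workspace}"
  image_id        = var.image_id
  instance_type   = var.instance_type
  key_name        = var.key_name
  security_groups = [aws_security_group.webserver.id]

  # Pass the profile's name (a string), not the whole resource object.
  iam_instance_profile = var.instance_profile.name

  user_data = <<EOF
#!/bin/bash
echo 'ECS_CLUSTER=${var.cluster_name.name}' >> /etc/ecs/ecs.config
EOF
}
```

An alternative under the same assumptions would be to export only the name from the role module (`value = aws_iam_instance_profile.ec2_instance_role.name`) and keep `iam_instance_profile = var.instance_profile` unchanged.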

Related

use terraform to create an aws codedeploy ecs infrastructure

I tried to use Terraform to set up an AWS CodeDeploy ECS infrastructure. I followed the AWS documentation to understand CodeDeploy blue/green deployments: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-blue-green.html , read this post for an example (it uses EC2 instances): https://hiveit.co.uk/techshop/terraform-aws-vpc-example/02-create-the-vpc/ and used the Terraform reference documentation: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/codedeploy_deployment_group
The problem is that when I create a deployment from AWS CodeDeploy, it gets stuck in the install phase.
Here is the Terraform configuration I have written:
# main.tf
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 4.0"
}
}
}
provider "aws" {
# defined in AWS_REGION env
# defined in AWS_ACCESS_KEY_ID env
# defined in AWS_SECRET_ACCESS_KEY env
}
# create repository to store docker image
resource "aws_ecr_repository" "repository" {
name = "test-repository"
}
# network.tf
resource "aws_vpc" "vpc" {
cidr_block = "10.0.0.0/16"
tags = {
Name = "terraform-example-vpc"
}
}
resource "aws_internet_gateway" "gateway" {
vpc_id = aws_vpc.vpc.id
tags = {
Name = "terraform-example-internet-gateway"
}
}
resource "aws_route" "route" {
route_table_id = aws_vpc.vpc.main_route_table_id
destination_cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.gateway.id
}
resource "aws_subnet" "main" {
count = length(data.aws_availability_zones.available.names)
vpc_id = aws_vpc.vpc.id
cidr_block = "10.0.${count.index}.0/24"
map_public_ip_on_launch = true
availability_zone = element(data.aws_availability_zones.available.names, count.index)
tags = {
Name = "public-subnet-${element(data.aws_availability_zones.available.names, count.index)}"
}
}
# loadbalancer.tf
resource "aws_security_group" "lb_security_group" {
name = "terraform_lb_security_group"
vpc_id = aws_vpc.vpc.id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "terraform-example-lb-security-group"
}
}
resource "aws_lb" "lb" {
name = "terraform-example-lb"
security_groups = [aws_security_group.lb_security_group.id]
subnets = aws_subnet.main.*.id
tags = {
Name = "terraform-example-lb"
}
}
resource "aws_lb_target_group" "group1" {
name = "terraform-example-lb-target1"
port = 80
protocol = "HTTP"
vpc_id = aws_vpc.vpc.id
target_type = "ip"
}
resource "aws_lb_listener" "listener_http" {
load_balancer_arn = aws_lb.lb.arn
port = "80"
protocol = "HTTP"
default_action {
target_group_arn = aws_lb_target_group.group1.arn
type = "forward"
}
}
# cluster.tf
resource "aws_ecs_cluster" "cluster" {
name = "terraform-example-cluster"
tags = {
Name = "terraform-example-cluster"
}
}
resource "aws_iam_role" "ecsTaskExecutionRole" {
name = "ecsTaskExecutionRole"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
"Sid" : "",
"Effect" : "Allow",
"Principal" : {
"Service" : "ecs-tasks.amazonaws.com"
},
"Action" : "sts:AssumeRole"
}
]
})
}
resource "aws_ecs_task_definition" "task_definition" {
family = "deployment-app"
network_mode = "awsvpc"
requires_compatibilities = ["FARGATE"]
cpu = 256
memory = 512
execution_role_arn = aws_iam_role.ecsTaskExecutionRole.arn
container_definitions = jsonencode([
{
"name" : "app",
"image" : "httpd:2.4",
"portMappings" : [
{
"containerPort" : 80,
"hostPort" : 80,
"protocol" : "tcp"
}
],
"essential" : true
}
])
}
resource "aws_ecs_service" "service" {
cluster = aws_ecs_cluster.cluster.id
name = "terraform-example-service"
task_definition = "deployment-app"
launch_type = "FARGATE"
scheduling_strategy = "REPLICA"
platform_version = "LATEST"
desired_count = 1
load_balancer {
target_group_arn = aws_lb_target_group.group1.arn
container_name = "app"
container_port = 80
}
deployment_controller {
type = "CODE_DEPLOY"
}
network_configuration {
assign_public_ip = true
security_groups = [aws_security_group.lb_security_group.id]
subnets = aws_subnet.main.*.id
}
lifecycle {
ignore_changes = [desired_count, task_definition, platform_version]
}
}
# codedeploy.tf
resource "aws_codedeploy_app" "codedeploy_app" {
name = "example-codedeploy-app"
compute_platform = "ECS"
}
resource "aws_lb_target_group" "group2" {
name = "terraform-example-lb-target2"
port = 80
protocol = "HTTP"
vpc_id = aws_vpc.vpc.id
target_type = "ip"
}
resource "aws_codedeploy_deployment_group" "codedeploy_group" {
app_name = aws_codedeploy_app.codedeploy_app.name
deployment_group_name = "deployment_group_name"
service_role_arn = "###"
deployment_config_name = "CodeDeployDefault.ECSAllAtOnce"
auto_rollback_configuration {
enabled = true
events = ["DEPLOYMENT_FAILURE"]
}
blue_green_deployment_config {
deployment_ready_option {
action_on_timeout = "CONTINUE_DEPLOYMENT"
wait_time_in_minutes = 0
}
terminate_blue_instances_on_deployment_success {
action = "TERMINATE"
termination_wait_time_in_minutes = 1
}
}
deployment_style {
deployment_option = "WITH_TRAFFIC_CONTROL"
deployment_type = "BLUE_GREEN"
}
load_balancer_info {
target_group_pair_info {
target_group {
name = aws_lb_target_group.group1.name
}
target_group {
name = aws_lb_target_group.group2.name
}
prod_traffic_route {
listener_arns = [aws_lb_listener.listener_http.arn]
}
}
}
ecs_service {
cluster_name = aws_ecs_cluster.cluster.name
service_name = aws_ecs_service.service.name
}
}
# datasource.tf
data "aws_availability_zones" "available" {}
Note: replace ### with the ARN of the AWSCodeDeployRoleForECS role: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/codedeploy_IAM_role.html . I have not added this role to the Terraform configuration yet.
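For reference, the role could probably be managed in Terraform as well. The following is only a sketch: the resource names and the role name `example-codedeploy-ecs-role` are hypothetical, and it assumes the AWS managed policy `AWSCodeDeployRoleForECS` is sufficient for this setup.

```hcl
# Trust policy that lets CodeDeploy assume the role.
data "aws_iam_policy_document" "codedeploy_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["codedeploy.amazonaws.com"]
    }
  }
}

# Hypothetical role name; pick whatever fits your naming scheme.
resource "aws_iam_role" "codedeploy" {
  name               = "example-codedeploy-ecs-role"
  assume_role_policy = data.aws_iam_policy_document.codedeploy_assume.json
}

# AWS managed policy for CodeDeploy blue/green deployments on ECS.
resource "aws_iam_role_policy_attachment" "codedeploy_ecs" {
  role       = aws_iam_role.codedeploy.name
  policy_arn = "arn:aws:iam::aws:policy/AWSCodeDeployRoleForECS"
}
```

With this in place, the deployment group could use `service_role_arn = aws_iam_role.codedeploy.arn` instead of a hard-coded ARN.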
After running
terraform plan
terraform apply
the whole stack is created, and I can reach the httpd "It works!" page through the load balancer DNS name.
My problem is that when I push a new image to the repository, update the task definition, and create a new deployment, the deployment gets stuck in Step 1 without any error.
For this example, I tried to deploy an nginx image instead of httpd:
aws ecs register-task-definition \
--family=deployment-app \
--network-mode=awsvpc \
--cpu=256 \
--memory=512 \
--execution-role-arn=arn:aws:iam::__AWS_ACCOUNT__:role/ecsTaskExecutionRole \
--requires-compatibilities='["FARGATE"]' \
--container-definitions='[{"name": "app","image": "nginx:latest","portMappings": [{"containerPort": 80,"hostPort": 80,"protocol": "tcp"}],"essential": true}]'
I am using the AWS console to create the deployment, with this YAML AppSpec:
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        TaskDefinition: "arn:aws:ecs:eu-west-3:__AWS_ACCOUNT__:task-definition/deployment-app:9"
        LoadBalancerInfo:
          ContainerName: "app"
          ContainerPort: 80
        PlatformVersion: "LATEST"
Can anyone help me understand my mistake?
Thanks!
I didn't know where to find CodeDeploy logs that would show the problem. In the end, I just needed to open the ECS service and look at the task in the provisioning state; shortly afterwards the task failed with an error message.
The problem came from my ecsTaskExecutionRole: it did not have sufficient ECR permissions to pull the image I had built.
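The answer does not say which permissions were added. One common way to give the execution role ECR pull rights, sketched here as an assumption rather than the author's actual fix, is to attach the AWS managed policy `AmazonECSTaskExecutionRolePolicy` to the role defined in cluster.tf. The attachment's resource name is hypothetical.

```hcl
# Grants the task execution role permission to pull images from ECR
# and to write container logs to CloudWatch Logs.
resource "aws_iam_role_policy_attachment" "ecs_task_execution" {
  role       = aws_iam_role.ecsTaskExecutionRole.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
```

The original task definition used the public `httpd:2.4` image, which needs no ECR permissions; that would explain why the problem only appeared once an image from the private repository was deployed.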

AWS EC2 instance not joining ECS cluster

I am quite desperate with an issue very similar to the one described in this thread:
https://github.com/OpenDroneMap/opendronemap-ecs/issues/14#issuecomment-432004023
When I attach the network interface to my EC2 instance, so that my custom VPC is used instead of the default one, the EC2 instance no longer joins the ECS cluster.
This is my terraform definition.
provider "aws" {}
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_support = true
enable_dns_hostnames = true
assign_generated_ipv6_cidr_block = true
}
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
}
resource "aws_subnet" "main" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.0.0/16"
availability_zone = "us-west-2a"
map_public_ip_on_launch = true
}
resource "aws_route_table" "main" {
vpc_id = aws_vpc.main.id
}
resource "aws_route_table_association" "rta1" {
subnet_id = aws_subnet.main.id
route_table_id = aws_route_table.main.id
}
resource "aws_route_table_association" "rta2" {
gateway_id = aws_internet_gateway.main.id
route_table_id = aws_route_table.main.id
}
resource "aws_security_group" "sg-jenkins" {
name = "sg_jenkins"
description = "Allow inbound traffic for Jenkins instance"
vpc_id = aws_vpc.main.id
ingress = [
{
description = "inbound all"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
self = null
prefix_list_ids = null
security_groups = null
}
]
egress = [
{
description = "outbound all"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
self = null
prefix_list_ids = null
security_groups = null
}
]
}
resource "aws_network_interface" "main" {
subnet_id = aws_subnet.main.id
security_groups = [aws_security_group.sg-jenkins.id]
}
resource "aws_instance" "ec2_instance" {
ami = "ami-07764a7d8502d36a2"
instance_type = "t2.micro"
iam_instance_profile = "ecsInstanceRole"
key_name = "fran"
network_interface {
device_index = 0
network_interface_id = aws_network_interface.main.id
}
user_data = <<EOF
#!/bin/bash
echo ECS_CLUSTER=cluster >> /etc/ecs/ecs.config
EOF
depends_on = [aws_internet_gateway.main]
}
### Task definition
resource "aws_ecs_task_definition" "jenkins-task" {
family = "namespace"
container_definitions = jsonencode([
{
name = "jenkins"
image = "cnservices/jenkins-master"
cpu = 10
memory = 512
essential = true
portMappings = [
{
containerPort = 8080
hostPort = 8080
}
]
}
])
# network_mode = "awsvpc"
volume {
name = "service-storage"
host_path = "/ecs/service-storage"
}
placement_constraints {
type = "memberOf"
expression = "attribute:ecs.availability-zone in [us-west-2a]"
}
}
### Cluster
resource "aws_ecs_cluster" "cluster" {
name = "cluster"
setting {
name = "containerInsights"
value = "enabled"
}
}
### Service
resource "aws_ecs_service" "jenkins-service" {
name = "jenkins-service"
cluster = aws_ecs_cluster.cluster.id
task_definition = aws_ecs_task_definition.jenkins-task.arn
desired_count = 1
# iam_role = aws_iam_role.foo.arn
# depends_on = [aws_iam_role_policy.foo]
# network_configuration {
# security_groups = [aws_security_group.sg-jenkins.id]
# subnets = [aws_subnet.main.id]
# }
ordered_placement_strategy {
type = "binpack"
field = "cpu"
}
placement_constraints {
type = "memberOf"
expression = "attribute:ecs.availability-zone in [us-west-2a]"
}
}
You haven't created a route to your IGW, so your instance can't reach the ECS service to register with your cluster. Remove rta2 and add a route:
# Not needed; to be removed.
# resource "aws_route_table_association" "rta2" {
#   gateway_id     = aws_internet_gateway.main.id
#   route_table_id = aws_route_table.main.id
# }

# Add the missing default route to the IGW.
resource "aws_route" "r" {
  route_table_id         = aws_route_table.main.id
  gateway_id             = aws_internet_gateway.main.id
  destination_cidr_block = "0.0.0.0/0"
}

ECS with Terraform

Is there a good, definitive reference or course for managing an ECS service with Terraform? I have followed this reference, which creates the ECS service, but I can't get to a state where my task runs on the cluster.
Here is what I have for now:
# create the VPC
resource "aws_vpc" "vpc" {
cidr_block = var.cidr_vpc
instance_tenancy = var.instanceTenancy
enable_dns_support = var.dnsSupport
enable_dns_hostnames = var.dnsHostNames
tags = {
Name = "tdemo"
}
}
# Create the Internet Gateway
resource "aws_internet_gateway" "igw" {
vpc_id = "${aws_vpc.vpc.id}"
tags = {
Name = "tdemo"
}
}
# Create the Public subnet
resource "aws_subnet" "subnet_public1" {
vpc_id = "${aws_vpc.vpc.id}"
cidr_block = var.cidr_pubsubnet1
map_public_ip_on_launch = "true"
availability_zone = var.availability_zone1
tags = {
Name = "tdemo"
}
}
resource "aws_subnet" "subnet_public2" {
vpc_id = "${aws_vpc.vpc.id}"
cidr_block = var.cidr_pubsubnet2
map_public_ip_on_launch = "true"
availability_zone = var.availability_zone2
tags = {
Name = "tdemo"
}
}
# Route table to connect to Internet Gateway
resource "aws_route_table" "rta_public" {
vpc_id = "${aws_vpc.vpc.id}"
route {
cidr_block = "0.0.0.0/0"
gateway_id = "${aws_internet_gateway.igw.id}"
}
tags = {
Name = "tdemo"
}
}
# Create Route Table Association to make the subnet public over the internet
resource "aws_route_table_association" "rta_subnet_public" {
subnet_id = "${aws_subnet.subnet_public1.id}"
route_table_id = "${aws_route_table.rta_public.id}"
}
# Configure Security Group inbound and outbound rules
resource "aws_security_group" "sg_22" {
name = "sg_22"
vpc_id = "${aws_vpc.vpc.id}"
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 0
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "tdemo"
}
}
###############################################################################
resource "aws_iam_role" "ecs-service-role" {
name = "tdemo-ecs-service-role"
path = "/"
assume_role_policy = "${data.aws_iam_policy_document.ecs-service-policy.json}"
}
resource "aws_iam_role_policy_attachment" "ecs-service-role-attachment" {
role = "${aws_iam_role.ecs-service-role.name}"
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceRole"
}
data "aws_iam_policy_document" "ecs-service-policy" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["ecs.amazonaws.com"]
}
}
}
resource "aws_iam_role" "ecs-instance-role" {
name = "tdemo-ecs-instance-role"
path = "/"
assume_role_policy = "${data.aws_iam_policy_document.ecs-instance-policy.json}"
}
data "aws_iam_policy_document" "ecs-instance-policy" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["ec2.amazonaws.com"]
}
}
}
resource "aws_iam_role_policy_attachment" "ecs-instance-role-attachment" {
role = "${aws_iam_role.ecs-instance-role.name}"
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}
resource "aws_iam_instance_profile" "ecs-instance-profile" {
name = "tdemo-ecs-instance-profile"
path = "/"
roles = ["${aws_iam_role.ecs-instance-role.id}"]
provisioner "local-exec" {
command = "ping 127.0.0.1 -n 11 > nul"
}
}
resource "aws_launch_configuration" "ecs-launch-configuration" {
name = "tdemo-ecs-launch-configuration"
image_id = var.amiid
instance_type = "t2.xlarge"
iam_instance_profile = "${aws_iam_instance_profile.ecs-instance-profile.id}"
root_block_device {
volume_type = "standard"
volume_size = 100
delete_on_termination = true
}
lifecycle {
create_before_destroy = true
}
security_groups = ["${aws_security_group.sg_22.id}"]
associate_public_ip_address = "true"
key_name = "${var.ecs_public_keyname}"
user_data = <<-EOF
#! /bin/bash
echo ECS_CLUSTER=your_cluster_name >> /etc/ecs/ecs.config
sudo sysctl -w vm.max_map_count=524288
sudo apt-get update
sudo apt-get install -y apache2
sudo systemctl start apache2
sudo systemctl enable apache2
echo "<h1>Deployed via Terraform</h1>" | sudo tee /var/www/html/index.html
EOF
}
resource "aws_ecs_cluster" "ecs-cluster" {
name = var.ecs_cluster
}
###############################################################################
data "aws_ecs_task_definition" "ecs_task_definition" {
task_definition = "${aws_ecs_task_definition.ecs_task_definition.family}"
}
resource "aws_ecs_task_definition" "ecs_task_definition" {
family = "hello_world"
container_definitions = <<DEFINITION
[
{
"name": "hello-world",
"image": "nginx:latest",
"essential": true,
"portMappings": [
{
"containerPort": 80,
"hostPort": 80
}
],
"memory": 500,
"cpu": 10
}
]
DEFINITION
}
resource "aws_alb" "ecs-load-balancer" {
name = "ecs-load-balancer"
security_groups = ["${aws_security_group.sg_22.id}"]
subnets = ["${aws_subnet.subnet_public1.id}", "${aws_subnet.subnet_public2.id}"]
tags = {
Name = "ecs-load-balancer"
}
}
resource "aws_alb_target_group" "ecs-target-group" {
name = "ecs-target-group"
port = "80"
protocol = "HTTP"
vpc_id = "${aws_vpc.vpc.id}"
health_check {
healthy_threshold = "5"
unhealthy_threshold = "2"
interval = "30"
matcher = "200"
path = "/"
port = "traffic-port"
protocol = "HTTP"
timeout = "5"
}
tags = {
Name = "ecs-target-group"
}
}
resource "aws_alb_listener" "alb-listener" {
load_balancer_arn = "${aws_alb.ecs-load-balancer.arn}"
port = "80"
protocol = "HTTP"
default_action {
target_group_arn = "${aws_alb_target_group.ecs-target-group.arn}"
type = "forward"
}
}
resource "aws_autoscaling_group" "ecs-autoscaling-group" {
name = "ecs-autoscaling-group"
max_size = "${var.max_instance_size}"
min_size = "${var.min_instance_size}"
desired_capacity = "${var.desired_capacity}"
vpc_zone_identifier = ["${aws_subnet.subnet_public1.id}", "${aws_subnet.subnet_public2.id}"]
launch_configuration = "${aws_launch_configuration.ecs-launch-configuration.name}"
health_check_type = "ELB"
}
resource "aws_ecs_service" "ecs-service" {
name = "tdemo-ecs-service"
iam_role = "${aws_iam_role.ecs-service-role.name}"
cluster = "${aws_ecs_cluster.ecs-cluster.id}"
task_definition = "${aws_ecs_task_definition.ecs_task_definition.family}:${max("${aws_ecs_task_definition.ecs_task_definition.revision}", "${data.aws_ecs_task_definition.ecs_task_definition.revision}")}"
desired_count = 1
load_balancer {
target_group_arn = "${aws_alb_target_group.ecs-target-group.arn}"
container_port = 80
container_name = "hello-world"
}
}
Thanks,
One thing that stands out, and that may be the source of the issue (or at least one of them), is this line:
echo ECS_CLUSTER=your_cluster_name >> /etc/ecs/ecs.config
However, your cluster name is var.ecs_cluster, so the line should be:
echo ECS_CLUSTER=${var.ecs_cluster} >> /etc/ecs/ecs.config
Note that there could be other issues that are hard to spot without actually deploying your Terraform configuration.
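In context, the change might look like the sketch below. It trims the launch configuration to the attributes relevant here, and it references the cluster resource's `name` attribute rather than the variable. That is a design choice, not something the answer prescribes: the value is the same as `var.ecs_cluster`, but the reference also makes Terraform aware that the launch configuration depends on the cluster.

```hcl
resource "aws_launch_configuration" "ecs-launch-configuration" {
  name                 = "tdemo-ecs-launch-configuration"
  image_id             = var.amiid
  instance_type        = "t2.xlarge"
  iam_instance_profile = aws_iam_instance_profile.ecs-instance-profile.id
  security_groups      = [aws_security_group.sg_22.id]

  # Terraform interpolates the cluster name before the script reaches the instance.
  user_data = <<-EOF
    #!/bin/bash
    echo ECS_CLUSTER=${aws_ecs_cluster.ecs-cluster.name} >> /etc/ecs/ecs.config
  EOF
}
```

The ECS agent reads `/etc/ecs/ecs.config` at startup and falls back to a cluster named `default` if `ECS_CLUSTER` is not set, so a wrong name here means the instances try to join a cluster other than the one the service runs in.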

Terraform dial tcp dns error while creating AWS ALB ingress with EKS cluster

I am trying to use Terraform to create an AWS EKS cluster with an ALB load balancer and kubernetes ingress.
I have been using this git repo and this blog to guide me.
The deploy fails with the following errors immediately after the cluster has been created.
Error: Post "https://E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com/api/v1/namespaces/kube-system/configmaps": dial tcp: lookup E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com on 8.8.8.8:53: no such host
on modules/alb/alb_ingress_controller.tf line 1, in resource "kubernetes_config_map" "aws_auth":
1: resource "kubernetes_config_map" "aws_auth" {
Error: Post "https://E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com/apis/rbac.authorization.k8s.io/v1/clusterroles": dial tcp: lookup E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com on 8.8.8.8:53: no such host
on modules/alb/alb_ingress_controller.tf line 20, in resource "kubernetes_cluster_role" "alb-ingress":
20: resource "kubernetes_cluster_role" "alb-ingress" {
Error: Post "https://E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com/apis/rbac.authorization.k8s.io/v1/clusterrolebindings": dial tcp: lookup E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com on 8.8.8.8:53: no such host
on modules/alb/alb_ingress_controller.tf line 41, in resource "kubernetes_cluster_role_binding" "alb-ingress":
41: resource "kubernetes_cluster_role_binding" "alb-ingress" {
Error: Post "https://E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com/api/v1/namespaces/kube-system/serviceaccounts": dial tcp: lookup E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com on 8.8.8.8:53: no such host
on modules/alb/alb_ingress_controller.tf line 62, in resource "kubernetes_service_account" "alb-ingress":
62: resource "kubernetes_service_account" "alb-ingress" {
Error: Failed to create Ingress 'default/main-ingress' because: Post "https://E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com/apis/extensions/v1beta1/namespaces/default/ingresses": dial tcp: lookup E8475B1B3693C979073BF0D721D876A7.sk1.ap-southeast-1.eks.amazonaws.com on 8.8.8.8:53: no such host
on modules/alb/kubernetes_ingress.tf line 1, in resource "kubernetes_ingress" "main":
1: resource "kubernetes_ingress" "main" {
Error: Post "https://641480DEC80EB445C6CBBEDC9D1F0234.yl4.ap-southeast-1.eks.amazonaws.com/api/v1/namespaces/kube-system/configmaps": dial tcp 10.0.21.192:443: connect: no route to host
on modules/eks/allow_nodes.tf line 22, in resource "kubernetes_config_map" "aws_auth":
22: resource "kubernetes_config_map" "aws_auth" {
Here is my Terraform code:
provider "aws" {
region = var.aws_region
version = "~> 2.65.0"
ignore_tags {
keys = ["kubernetes.io/role/internal-elb", "app.kubernetes.io/name"]
key_prefixes = ["kubernetes.io/cluster/", "alb.ingress.kubernetes.io/"]
}
}
resource "kubernetes_config_map" "aws_auth" {
metadata {
name = "aws-auth"
namespace = "kube-system"
}
data = {
mapRoles = <<EOF
- rolearn: ${var.iam_role_node}
  username: system:node:{{EC2PrivateDNSName}}
  groups:
    - system:bootstrappers
    - system:nodes
EOF
}
depends_on = [
var.eks_cluster_name
]
}
resource "kubernetes_cluster_role" "alb-ingress" {
metadata {
name = "alb-ingress-controller"
labels = {
"app.kubernetes.io/name" = "alb-ingress-controller"
}
}
rule {
api_groups = ["", "extensions"]
resources = ["configmaps", "endpoints", "events", "ingresses", "ingresses/status", "services"]
verbs = ["create", "get", "list", "update", "watch", "patch"]
}
rule {
api_groups = ["", "extensions"]
resources = ["nodes", "pods", "secrets", "services", "namespaces"]
verbs = ["get", "list", "watch"]
}
}
resource "kubernetes_cluster_role_binding" "alb-ingress" {
metadata {
name = "alb-ingress-controller"
labels = {
"app.kubernetes.io/name" = "alb-ingress-controller"
}
}
role_ref {
api_group = "rbac.authorization.k8s.io"
kind = "ClusterRole"
name = "alb-ingress-controller"
}
subject {
kind = "ServiceAccount"
name = "alb-ingress-controller"
namespace = "kube-system"
}
}
resource "kubernetes_service_account" "alb-ingress" {
metadata {
name = "alb-ingress-controller"
namespace = "kube-system"
labels = {
"app.kubernetes.io/name" = "alb-ingress-controller"
}
}
automount_service_account_token = true
}
resource "kubernetes_deployment" "alb-ingress" {
metadata {
name = "alb-ingress-controller"
labels = {
"app.kubernetes.io/name" = "alb-ingress-controller"
}
namespace = "kube-system"
}
spec {
selector {
match_labels = {
"app.kubernetes.io/name" = "alb-ingress-controller"
}
}
template {
metadata {
labels = {
"app.kubernetes.io/name" = "alb-ingress-controller"
}
}
spec {
volume {
name = kubernetes_service_account.alb-ingress.default_secret_name
secret {
secret_name = kubernetes_service_account.alb-ingress.default_secret_name
}
}
container {
# This is where you change the version when Amazon comes out with a new version of the ingress controller
image = "docker.io/amazon/aws-alb-ingress-controller:v1.1.7"
name = "alb-ingress-controller"
args = [
"--ingress-class=alb",
"--cluster-name=${var.eks_cluster_name}",
"--aws-vpc-id=${var.vpc_id}",
"--aws-region=${var.aws_region}"]
}
service_account_name = "alb-ingress-controller"
}
}
}
}
########################################################################################
# setup provider for kubernetes
//data "external" "aws_iam_authenticator" {
// program = ["sh", "-c", "aws-iam-authenticator token -i ${var.cluster_name} | jq -r -c .status"]
//}
data "aws_eks_cluster_auth" "tf_eks_cluster" {
name = aws_eks_cluster.tf_eks_cluster.name
}
provider "kubernetes" {
host = aws_eks_cluster.tf_eks_cluster.endpoint
cluster_ca_certificate = base64decode(aws_eks_cluster.tf_eks_cluster.certificate_authority.0.data)
//token = data.external.aws_iam_authenticator.result.token
token = data.aws_eks_cluster_auth.tf_eks_cluster.token
load_config_file = false
version = "~> 1.9"
}
# Allow worker nodes to join cluster via config map
resource "kubernetes_config_map" "aws_auth" {
metadata {
name = "aws-auth"
namespace = "kube-system"
}
data = {
mapRoles = <<EOF
- rolearn: ${aws_iam_role.tf-eks-node.arn}
  username: system:node:{{EC2PrivateDNSName}}
  groups:
    - system:bootstrappers
    - system:nodes
EOF
}
depends_on = [aws_eks_cluster.tf_eks_cluster, aws_autoscaling_group.tf_eks_cluster]
}
resource "kubernetes_ingress" "main" {
metadata {
name = "main-ingress"
annotations = {
"alb.ingress.kubernetes.io/scheme" = "internet-facing"
"kubernetes.io/ingress.class" = "alb"
"alb.ingress.kubernetes.io/subnets" = var.app_subnet_stringlist
"alb.ingress.kubernetes.io/certificate-arn" = "${data.aws_acm_certificate.api.arn}, ${data.aws_acm_certificate.gitea.arn}"
"alb.ingress.kubernetes.io/listen-ports" = <<JSON
[
{"HTTP": 80},
{"HTTPS": 443}
]
JSON
"alb.ingress.kubernetes.io/actions.ssl-redirect" = <<JSON
{
"Type": "redirect",
"RedirectConfig": {
"Protocol": "HTTPS",
"Port": "443",
"StatusCode": "HTTP_301"
}
}
JSON
}
}
spec {
rule {
host = "api.xactpos.com"
http {
path {
backend {
service_name = "ssl-redirect"
service_port = "use-annotation"
}
path = "/*"
}
path {
backend {
service_name = "app-service1"
service_port = 80
}
path = "/service1"
}
path {
backend {
service_name = "app-service2"
service_port = 80
}
path = "/service2"
}
}
}
rule {
host = "gitea.xactpos.com"
http {
path {
backend {
service_name = "ssl-redirect"
service_port = "use-annotation"
}
path = "/*"
}
path {
backend {
service_name = "api-service1"
service_port = 80
}
path = "/service3"
}
path {
backend {
service_name = "api-service2"
service_port = 80
}
path = "/graphq4"
}
}
}
}
}
resource "aws_security_group" "eks-alb" {
name = "eks-alb-public"
description = "Security group allowing public traffic for the eks load balancer."
vpc_id = var.vpc_id
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = map(
"Name", "terraform-eks-alb",
"kubernetes.io/cluster/tf-eks-cluster", "owned"
)
}
resource "aws_security_group_rule" "eks-alb-public-https" {
description = "Allow eks load balancer to communicate with public traffic securely."
cidr_blocks = ["0.0.0.0/0"]
from_port = 443
protocol = "tcp"
security_group_id = aws_security_group.eks-alb.id
to_port = 443
type = "ingress"
}
resource "aws_security_group_rule" "eks-alb-public-http" {
description = "Allow eks load balancer to communicate with public traffic."
cidr_blocks = ["0.0.0.0/0"]
from_port = 80
protocol = "tcp"
security_group_id = aws_security_group.eks-alb.id
to_port = 80
type = "ingress"
}
resource "aws_eks_cluster" "tf_eks_cluster" {
name = var.cluster_name
role_arn = aws_iam_role.tf-eks-cluster.arn
vpc_config {
security_group_ids = [aws_security_group.tf-eks-cluster.id]
subnet_ids = var.app_subnet_ids
endpoint_private_access = true
endpoint_public_access = false
}
depends_on = [
aws_iam_role_policy_attachment.tf-eks-cluster-AmazonEKSClusterPolicy,
aws_iam_role_policy_attachment.tf-eks-cluster-AmazonEKSServicePolicy,
]
}
# Setup for IAM role needed to setup an EKS cluster
resource "aws_iam_role" "tf-eks-cluster" {
name = "tf-eks-cluster"
assume_role_policy = <<POLICY
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "eks.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
POLICY
}
resource "aws_iam_role_policy_attachment" "tf-eks-cluster-AmazonEKSClusterPolicy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
role = aws_iam_role.tf-eks-cluster.name
}
resource "aws_iam_role_policy_attachment" "tf-eks-cluster-AmazonEKSServicePolicy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSServicePolicy"
role = aws_iam_role.tf-eks-cluster.name
}
########################################################################################
# Setup IAM role & instance profile for worker nodes
resource "aws_iam_role" "tf-eks-node" {
name = "tf-eks-node"
assume_role_policy = <<POLICY
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
POLICY
}
resource "aws_iam_instance_profile" "tf-eks-node" {
name = "tf-eks-node"
role = aws_iam_role.tf-eks-node.name
}
resource "aws_iam_role_policy_attachment" "tf-eks-node-AmazonEKSWorkerNodePolicy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
role = aws_iam_role.tf-eks-node.name
}
resource "aws_iam_role_policy_attachment" "tf-eks-node-AmazonEKS_CNI_Policy" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
role = aws_iam_role.tf-eks-node.name
}
resource "aws_iam_role_policy_attachment" "tf-eks-node-AmazonEC2ContainerRegistryReadOnly" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
role = aws_iam_role.tf-eks-node.name
}
resource "aws_iam_role_policy_attachment" "tf-eks-node-AmazonEC2FullAccess" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEC2FullAccess"
role = aws_iam_role.tf-eks-node.name
}
resource "aws_iam_role_policy_attachment" "tf-eks-node-alb-ingress_policy" {
policy_arn = aws_iam_policy.alb-ingress.arn
role = aws_iam_role.tf-eks-node.name
}
resource "aws_iam_policy" "alb-ingress" {
name = "alb-ingress-policy"
policy = file("${path.module}/alb_ingress_policy.json")
}
# Generate a kubeconfig as a local value so it can be saved to ~/.kube/config locally.
# Run 'terraform output eks_kubeconfig > config', then 'mv config ~/.kube/config' so kubectl can use it.
locals {
kubeconfig = <<KUBECONFIG
apiVersion: v1
clusters:
- cluster:
server: ${aws_eks_cluster.tf_eks_cluster.endpoint}
certificate-authority-data: ${aws_eks_cluster.tf_eks_cluster.certificate_authority.0.data}
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: aws
name: aws
current-context: aws
kind: Config
preferences: {}
users:
- name: aws
user:
exec:
apiVersion: client.authentication.k8s.io/v1alpha1
command: aws-iam-authenticator
args:
- "token"
- "-i"
- "${var.cluster_name}"
KUBECONFIG
}
########################################################################################
# Setup AutoScaling Group for worker nodes
# Setup data source to get amazon-provided AMI for EKS nodes
data "aws_ami" "eks-worker" {
filter {
name = "name"
values = ["amazon-eks-node-v*"]
}
most_recent = true
owners = ["602401143452"] # Amazon EKS AMI Account ID
}
# Is provided in demo code, no idea what it's used for though! TODO: DELETE
# data "aws_region" "current" {}
# EKS currently documents this required userdata for EKS worker nodes to
# properly configure Kubernetes applications on the EC2 instance.
# We utilize a Terraform local here to simplify Base64 encoding this
# information and writing it into the AutoScaling Launch Configuration.
# More information: https://docs.aws.amazon.com/eks/latest/userguide/launch-workers.html
locals {
tf-eks-node-userdata = <<USERDATA
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh --apiserver-endpoint '${aws_eks_cluster.tf_eks_cluster.endpoint}' --b64-cluster-ca '${aws_eks_cluster.tf_eks_cluster.certificate_authority.0.data}' '${var.cluster_name}'
USERDATA
}
resource "aws_launch_configuration" "tf_eks_cluster" {
associate_public_ip_address = true
iam_instance_profile = aws_iam_instance_profile.tf-eks-node.name
image_id = data.aws_ami.eks-worker.id
instance_type = var.instance_type
name_prefix = "tf-eks-spot"
security_groups = [aws_security_group.tf-eks-node.id]
user_data_base64 = base64encode(local.tf-eks-node-userdata)
lifecycle {
create_before_destroy = true
}
}
resource "aws_lb_target_group" "tf_eks_cluster" {
name = "tf-eks-cluster"
port = 31742
protocol = "HTTP"
vpc_id = var.vpc_id
target_type = "instance"
}
resource "aws_autoscaling_group" "tf_eks_cluster" {
desired_capacity = "2"
launch_configuration = aws_launch_configuration.tf_eks_cluster.id
max_size = "3"
min_size = 1
name = "tf-eks-cluster"
vpc_zone_identifier = var.app_subnet_ids
target_group_arns = [aws_lb_target_group.tf_eks_cluster.arn]
tag {
key = "Name"
value = "tf-eks-cluster"
propagate_at_launch = true
}
tag {
key = "kubernetes.io/cluster/${var.cluster_name}"
value = "owned"
propagate_at_launch = true
}
}
resource "aws_security_group" "tf-eks-cluster" {
name = "terraform-eks-cluster"
description = "Cluster communication with worker nodes"
vpc_id = var.vpc_id
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "terraform-eks"
}
}
resource "aws_security_group" "tf-eks-node" {
name = "terraform-eks-node"
description = "Security group for all nodes in the cluster"
vpc_id = var.vpc_id
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "terraform-eks"
}
}
# Allow inbound traffic from your local workstation's external IP
# to the Kubernetes API server. Set var.accessing_computer_ip to your
# real IP in CIDR form. Services like icanhazip.com can help you find it.
resource "aws_security_group_rule" "tf-eks-cluster-ingress-workstation-https" {
cidr_blocks = [var.accessing_computer_ip]
description = "Allow workstation to communicate with the cluster API Server"
from_port = 443
protocol = "tcp"
security_group_id = aws_security_group.tf-eks-cluster.id
to_port = 443
type = "ingress"
}
########################################################################################
# Setup worker node security group
resource "aws_security_group_rule" "tf-eks-node-ingress-self" {
description = "Allow node to communicate with each other"
from_port = 0
protocol = "-1"
security_group_id = aws_security_group.tf-eks-node.id
source_security_group_id = aws_security_group.tf-eks-node.id
to_port = 65535
type = "ingress"
}
resource "aws_security_group_rule" "tf-eks-node-ingress-cluster" {
description = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
from_port = 1025
protocol = "tcp"
security_group_id = aws_security_group.tf-eks-node.id
source_security_group_id = aws_security_group.tf-eks-cluster.id
to_port = 65535
type = "ingress"
}
# allow worker nodes to access EKS master
resource "aws_security_group_rule" "tf-eks-cluster-ingress-node-https" {
description = "Allow pods to communicate with the cluster API Server"
from_port = 443
protocol = "tcp"
security_group_id = aws_security_group.tf-eks-node.id
source_security_group_id = aws_security_group.tf-eks-cluster.id
to_port = 443
type = "ingress"
}
resource "aws_security_group_rule" "tf-eks-node-ingress-master" {
description = "Allow cluster control to receive communication from the worker Kubelets"
from_port = 443
protocol = "tcp"
security_group_id = aws_security_group.tf-eks-cluster.id
source_security_group_id = aws_security_group.tf-eks-node.id
to_port = 443
type = "ingress"
}
resource "aws_internet_gateway" "eks" {
vpc_id = aws_vpc.eks.id
tags = {
Name = "internet_gateway"
}
}
resource "aws_eip" "nat_gateway" {
count = var.subnet_count
vpc = true
}
resource "aws_nat_gateway" "eks" {
count = var.subnet_count
allocation_id = aws_eip.nat_gateway.*.id[count.index]
subnet_id = aws_subnet.gateway.*.id[count.index]
tags = {
Name = "nat_gateway"
}
depends_on = [aws_internet_gateway.eks]
}
resource "aws_route_table" "application" {
count = var.subnet_count
vpc_id = aws_vpc.eks.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.eks.*.id[count.index]
}
tags = {
Name = "eks_application"
}
}
resource "aws_route_table" "vpn" {
vpc_id = aws_vpc.eks.id
tags = {
Name = "eks_vpn"
}
}
resource "aws_route_table" "gateway" {
vpc_id = aws_vpc.eks.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.eks.id
}
tags = {
Name = "eks_gateway"
}
}
resource "aws_route_table_association" "application" {
count = var.subnet_count
subnet_id = aws_subnet.application.*.id[count.index]
route_table_id = aws_route_table.application.*.id[count.index]
}
resource "aws_route_table_association" "vpn" {
count = var.subnet_count
subnet_id = aws_subnet.vpn.*.id[count.index]
route_table_id = aws_route_table.vpn.id
}
resource "aws_route_table_association" "gateway" {
count = var.subnet_count
subnet_id = aws_subnet.gateway.*.id[count.index]
route_table_id = aws_route_table.gateway.id
}
data "aws_availability_zones" "available" {}
resource "aws_subnet" "gateway" {
count = var.subnet_count
availability_zone = data.aws_availability_zones.available.names[count.index]
cidr_block = "10.0.1${count.index}.0/24"
vpc_id = aws_vpc.eks.id
map_public_ip_on_launch = true
tags = {
Name = "eks_gateway"
}
}
resource "aws_subnet" "application" {
count = var.subnet_count
availability_zone = data.aws_availability_zones.available.names[count.index]
cidr_block = "10.0.2${count.index}.0/24"
vpc_id = aws_vpc.eks.id
map_public_ip_on_launch = true
tags = map(
"Name", "eks_application",
"kubernetes.io/cluster/${var.cluster_name}", "shared"
)
}
resource "aws_subnet" "vpn" {
count = var.subnet_count
availability_zone = data.aws_availability_zones.available.names[count.index]
cidr_block = "10.0.3${count.index}.0/24"
vpc_id = aws_vpc.eks.id
tags = {
Name = "eks_vpn"
}
}
resource "aws_vpc" "eks" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = map(
"Name", "eks-vpc",
"kubernetes.io/cluster/${var.cluster_name}", "shared"
)
}
I had tried to create the Kubernetes deployment in the same single, massive Terraform manifest as the cluster. I needed to move the Kubernetes deployment into a separate Terraform manifest, which I applied after updating the ~/.kube/config file.
The DNS errors occurred because this file was not current for the new cluster.
Additionally, I needed to ensure that endpoint_private_access = true is set in the EKS cluster resource.
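
As a rough illustration of the split described above, the second manifest might start like the sketch below. The file path is hypothetical, and it assumes the kubeconfig has already been regenerated from the first manifest's output before this one is applied.

```hcl
# kubernetes/main.tf (hypothetical path): a second, separate Terraform
# configuration. Apply it only after the EKS configuration has been applied
# and ~/.kube/config has been refreshed for the new cluster.

provider "kubernetes" {
  # Read cluster endpoint and credentials from the refreshed kubeconfig.
  config_path = "~/.kube/config"
}

# Kubernetes resources such as kubernetes_ingress.main are declared in this
# configuration rather than next to aws_eks_cluster.tf_eks_cluster, so the
# provider never tries to reach a cluster that does not exist yet.
```

Keeping the two configurations apart means the Kubernetes provider is only initialised once the cluster endpoint and kubeconfig are known to be valid.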

terraform - No Container Instances were found in your cluster

I deploy ECS using Terraform.
When I run terraform apply everything succeeds, but when I open the ECS service's Events tab I see this error:
service nginx-ecs-service was unable to place a task because no container instance met all of its requirements. Reason: No Container Instances were found in your cluster.
How do I fix that? What is missing in my terraform file?
locals {
name = "myapp"
environment = "prod"
# This is the convention we use to know what belongs to each other
ec2_resources_name = "${local.name}-${local.environment}"
}
resource "aws_iam_server_certificate" "lb_cert" {
name = "lb_cert"
certificate_body = "${file("./www.example.com/cert.pem")}"
private_key = "${file("./www.example.com/privkey.pem")}"
certificate_chain = "${file("./www.example.com/chain.pem")}"
}
resource "aws_security_group" "bastion-sg" {
name = "bastion-security-group"
vpc_id = "${module.vpc.vpc_id}"
ingress {
protocol = "tcp"
from_port = 22
to_port = 22
cidr_blocks = ["0.0.0.0/0"]
}
egress {
protocol = -1
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_instance" "bastion" {
depends_on = ["aws_security_group.bastion-sg"]
ami = "ami-0d5d9d301c853a04a"
key_name = "myapp"
instance_type = "t2.micro"
vpc_security_group_ids = ["${aws_security_group.bastion-sg.id}"]
associate_public_ip_address = true
subnet_id = "${element(module.vpc.public_subnets, 0)}"
tags = {
Name = "bastion"
}
}
# VPC Definition
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 2.0"
name = "my-vpc"
cidr = "10.1.0.0/16"
azs = ["us-east-2a", "us-east-2b", "us-east-2c"]
private_subnets = ["10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]
public_subnets = ["10.1.101.0/24", "10.1.102.0/24", "10.1.103.0/24"]
single_nat_gateway = true
enable_nat_gateway = true
enable_vpn_gateway = false
enable_dns_hostnames = true
public_subnet_tags = {
Name = "public"
}
private_subnet_tags = {
Name = "private"
}
public_route_table_tags = {
Name = "public-RT"
}
private_route_table_tags = {
Name = "private-RT"
}
tags = {
Environment = local.environment
Name = local.name
}
}
# ------------
resource "aws_ecs_cluster" "public-ecs-cluster" {
name = "myapp-${local.environment}"
lifecycle {
create_before_destroy = true
}
}
resource "aws_security_group" "ecs-vpc-secgroup" {
name = "ecs-vpc-secgroup"
description = "ecs-vpc-secgroup"
# vpc_id = "vpc-b8daecde"
vpc_id = "${module.vpc.vpc_id}"
ingress {
from_port = 0
to_port = 65535
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "ecs-security-group"
}
}
resource "aws_lb" "nginx-ecs-alb" {
name = "nginx-ecs-alb"
internal = false
load_balancer_type = "application"
subnets = module.vpc.public_subnets
security_groups = ["${aws_security_group.ecs-vpc-secgroup.id}"]
}
resource "aws_alb_target_group" "nginx-ecs-tg" {
name = "nginx-ecs-tg"
port = "80"
protocol = "HTTP"
vpc_id = "${module.vpc.vpc_id}"
health_check {
healthy_threshold = 3
unhealthy_threshold = 10
timeout = 5
interval = 10
path = "/"
}
depends_on = ["aws_lb.nginx-ecs-alb"]
}
resource "aws_alb_listener" "alb_listener" {
load_balancer_arn = "${aws_lb.nginx-ecs-alb.arn}"
port = "80"
protocol = "HTTP"
default_action {
target_group_arn = "${aws_alb_target_group.nginx-ecs-tg.arn}"
type = "forward"
}
}
resource "aws_ecs_task_definition" "nginx-image" {
family = "nginx-server"
network_mode = "bridge"
container_definitions = <<DEFINITION
[
{
"name": "nginx-web",
"image": "nginx:latest",
"essential": true,
"portMappings": [
{
"containerPort": 80,
"hostPort": 0,
"protocol": "tcp"
}
],
"memory": 128,
"cpu": 10
}
]
DEFINITION
}
data "aws_ecs_task_definition" "nginx-image" {
depends_on = ["aws_ecs_task_definition.nginx-image"]
task_definition = "${aws_ecs_task_definition.nginx-image.family}"
}
resource "aws_launch_configuration" "ecs-launch-configuration" {
name = "ecs-launch-configuration"
image_id = "ami-0d5d9d301c853a04a"
instance_type = "t2.micro"
iam_instance_profile = "ecsInstanceRole"
root_block_device {
volume_type = "standard"
volume_size = 35
delete_on_termination = true
}
security_groups = ["${aws_security_group.ecs-vpc-secgroup.id}"]
associate_public_ip_address = "true"
key_name = "myapp"
user_data = <<-EOF
#!/bin/bash
echo ECS_CLUSTER=${aws_ecs_cluster.public-ecs-cluster.name} >> /etc/ecs/ecs.config
EOF
}
resource "aws_autoscaling_group" "ecs-autoscaling-group" {
name = "ecs-autoscaling-group"
max_size = "1"
min_size = "1"
desired_capacity = "1"
# vpc_zone_identifier = ["subnet-5c66053a", "subnet-9cd1a2d4"]
vpc_zone_identifier = module.vpc.public_subnets
launch_configuration = "${aws_launch_configuration.ecs-launch-configuration.name}"
health_check_type = "EC2"
default_cooldown = "300"
lifecycle {
create_before_destroy = true
}
tag {
key = "Name"
value = "wizardet972_ecs-instance"
propagate_at_launch = true
}
tag {
key = "Owner"
value = "Wizardnet972"
propagate_at_launch = true
}
}
resource "aws_autoscaling_policy" "ecs-scale" {
name = "ecs-scale-policy"
policy_type = "TargetTrackingScaling"
autoscaling_group_name = "${aws_autoscaling_group.ecs-autoscaling-group.name}"
estimated_instance_warmup = 60
target_tracking_configuration {
predefined_metric_specification {
predefined_metric_type = "ASGAverageCPUUtilization"
}
target_value = "70"
}
}
resource "aws_ecs_service" "nginx-ecs-service" {
name = "nginx-ecs-service"
cluster = "${aws_ecs_cluster.public-ecs-cluster.id}"
task_definition = "${aws_ecs_task_definition.nginx-image.family}:${max(aws_ecs_task_definition.nginx-image.revision, data.aws_ecs_task_definition.nginx-image.revision)}"
launch_type = "EC2"
desired_count = 1
load_balancer {
target_group_arn = "${aws_alb_target_group.nginx-ecs-tg.arn}"
container_name = "nginx-web"
container_port = 80
}
depends_on = ["aws_ecs_task_definition.nginx-image"]
}
Update:
I created the Terraform stack you shared and was able to reproduce the issue.
The EC2 instance was unhealthy, so the Auto Scaling group kept terminating it and launching a new one.
The solution was to remove the following configuration. I think the volume_type of standard was causing the trouble.
root_block_device {
volume_type = "standard"
volume_size = 100
delete_on_termination = true
}
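
For reference, here is a sketch of the question's launch configuration with that block removed. It is rewritten in Terraform 0.12 expression syntax as an assumption about the version in use; every other value is copied from the question.

```hcl
resource "aws_launch_configuration" "ecs-launch-configuration" {
  name                 = "ecs-launch-configuration"
  image_id             = "ami-0d5d9d301c853a04a"
  instance_type        = "t2.micro"
  iam_instance_profile = "ecsInstanceRole"

  # The root_block_device block (volume_type = "standard") is removed, so the
  # instance falls back to the root volume settings defined by the AMI.

  security_groups             = [aws_security_group.ecs-vpc-secgroup.id]
  associate_public_ip_address = true
  key_name                    = "myapp"

  # Register the instance with the cluster when the ECS agent starts.
  user_data = <<-EOF
    #!/bin/bash
    echo ECS_CLUSTER=${aws_ecs_cluster.public-ecs-cluster.name} >> /etc/ecs/ecs.config
  EOF
}
```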
Also check that you have done the basic steps to prepare the EC2 instance: use an ECS-optimized AMI to create the instance, and attach the AmazonEC2ContainerServiceforEC2Role policy to the instance's IAM role.
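
A minimal sketch of those two preparation steps in Terraform follows. The resource names (ecs_ami, ecs_instance) and the role and profile names are hypothetical, and the sketch assumes the Amazon Linux 2 flavour of the ECS-optimized AMI is acceptable.

```hcl
# Look up the currently recommended ECS-optimized Amazon Linux 2 AMI
# from the public SSM parameter that AWS publishes.
data "aws_ssm_parameter" "ecs_ami" {
  name = "/aws/service/ecs/optimized-ami/amazon-linux-2/recommended/image_id"
}

# Role that EC2 instances assume (hypothetical name).
resource "aws_iam_role" "ecs_instance" {
  name = "ecs-instance-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# Managed policy that lets the ECS agent register the instance with a cluster.
resource "aws_iam_role_policy_attachment" "ecs_instance" {
  role       = aws_iam_role.ecs_instance.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}

# Instance profile to reference from the launch configuration (hypothetical name).
resource "aws_iam_instance_profile" "ecs_instance" {
  name = "ecs-instance-profile"
  role = aws_iam_role.ecs_instance.name
}

# In the launch configuration, these would then be used as:
#   image_id             = data.aws_ssm_parameter.ecs_ami.value
#   iam_instance_profile = aws_iam_instance_profile.ecs_instance.name
```

Resolving the AMI through the SSM parameter avoids hard-coding a region-specific AMI ID that may not be ECS-optimized.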
Reference:
AWS ECS Error when running task: No Container Instances were found in your cluster
setup instance role