Why does my AWS lb target group stay in a "draining" state? - amazon-web-services

I'm trying to deploy a docker image via terraform and AWS ECS using Fargate. Using terraform, I've created a VPC, two private and two public subnets, an ECR repository to store the image, an ECS cluster, ECS task, ECS service, and a load balancer with a target group.
These resources are created successfully, but the target group constantly:
- varies in the number of targets shown. For instance, refreshing will sometimes show 3 registered targets and sometimes 4.
- shows targets that usually have a status of "draining" and details that say "Target deregistration in progress". Sometimes one of them has a status of "initial" and details that say "Target registration in progress".
Additionally, visiting the URL of the load balancer returns a "503 Service Temporarily Unavailable"
I came across this post, which led me to this article. It helped me better understand how Fargate works, but I'm having trouble translating it into the terraform + AWS setup I'm trying to implement.
I suspect the issue could be in how the security groups allow or disallow traffic, but I'm still a novice with DevOps, so I appreciate in advance any help offered.
Here is the terraform main.tf that I've used to create the resources. Most of it is gathered from different tutorials and adjusted with updates whenever terraform screamed at me about a deprecation.
So, which part of the following configuration is wrong and is causing the targets in the target group to constantly be in a draining state?
Again, thanks in advance for any help or insights provided!
# ..terraform/main.tf
# START CREATE VPC
resource "aws_vpc" "vpc" {
cidr_block = "10.0.0.0/16"
instance_tenancy= "default"
enable_dns_hostnames = true
enable_dns_support = true
enable_classiclink = false
tags = {
Name = "vpc"
}
}
# END CREATE VPC
# START CREATE PRIVATE AND PUBLIC SUBNETS
resource "aws_subnet" "public_subnet_1" {
vpc_id = aws_vpc.vpc.id
cidr_block = "10.0.1.0/24"
map_public_ip_on_launch = true
availability_zone = "us-east-1a"
tags = {
Name = "public-subnet-1"
}
}
resource "aws_subnet" "public_subnet_2" {
vpc_id = aws_vpc.vpc.id
cidr_block = "10.0.2.0/24"
map_public_ip_on_launch = true
availability_zone = "us-east-1b"
tags = {
Name = "public-subnet-2"
}
}
resource "aws_subnet" "private_subnet_1" {
vpc_id = aws_vpc.vpc.id
cidr_block = "10.0.3.0/24"
map_public_ip_on_launch = false
availability_zone = "us-east-1a"
tags = {
Name = "private-subnet-1"
}
}
resource "aws_subnet" "private_subnet_2" {
vpc_id = aws_vpc.vpc.id
cidr_block = "10.0.4.0/24"
map_public_ip_on_launch = false
availability_zone = "us-east-1b"
tags = {
Name = "private-subnet-2"
}
}
# END CREATE PRIVATE AND PUBLIC SUBNETS
# START CREATE GATEWAY
resource "aws_internet_gateway" "vpc_gateway" {
vpc_id = aws_vpc.vpc.id
tags = {
Name = "vpc-gateway"
}
}
# END CREATE GATEWAY
# START CREATE ROUTE TABLE AND ASSOCIATIONS
resource "aws_route_table" "public_route_table" {
vpc_id = aws_vpc.vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.vpc_gateway.id
}
tags = {
Name = "public-route-table"
}
}
resource "aws_route_table_association" "route_table_association_1" {
subnet_id = aws_subnet.public_subnet_1.id
route_table_id = aws_route_table.public_route_table.id
}
resource "aws_route_table_association" "route_table_association_2" {
subnet_id = aws_subnet.public_subnet_2.id
route_table_id = aws_route_table.public_route_table.id
}
# END CREATE ROUTE TABLE AND ASSOCIATIONS
# START CREATE ECR REPOSITORY
resource "aws_ecr_repository" "api_ecr_repository" {
name = "api-ecr-repository"
}
# END CREATE ECR REPOSITORY
# START CREATE ECS CLUSTER
resource "aws_ecs_cluster" "api_cluster" {
name = "api-cluster"
}
# END CREATE ECS CLUSTER
# START CREATE ECS TASK AND DESIGNATE 'FARGATE'
resource "aws_ecs_task_definition" "api_cluster_task" {
family = "api-cluster-task"
container_definitions = <<DEFINITION
[
{
"name": "api-cluster-task",
"image": "${aws_ecr_repository.api_ecr_repository.repository_url}",
"essential": true,
"portMappings": [
{
"containerPort": 4000,
"hostPort": 4000
}
],
"memory": 512,
"cpu": 256
}
]
DEFINITION
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
memory = 512
cpu = 256
execution_role_arn = aws_iam_role.ecs_task_execution_role.arn
}
# END CREATE ECS TASK AND DESIGNATE 'FARGATE'
# START CREATE TASK POLICIES
data "aws_iam_policy_document" "assume_role_policy" {
version = "2012-10-17"
statement {
sid = ""
effect = "Allow"
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["ecs-tasks.amazonaws.com"]
}
}
}
resource "aws_iam_role" "ecs_task_execution_role" {
name = "ecs-take-execution-role"
assume_role_policy = data.aws_iam_policy_document.assume_role_policy.json
}
resource "aws_iam_role_policy_attachment" "ecs_task_execution_role_attachment" {
role = aws_iam_role.ecs_task_execution_role.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
# END CREATE TASK POLICIES
# START CREATE ECS SERVICE
resource "aws_ecs_service" "api_cluster_service" {
name = "api-cluster-service"
cluster = aws_ecs_cluster.api_cluster.id
task_definition = aws_ecs_task_definition.api_cluster_task.arn
launch_type = "FARGATE"
desired_count = 1
load_balancer {
target_group_arn = aws_lb_target_group.api_lb_target_group.arn
container_name = aws_ecs_task_definition.api_cluster_task.family
container_port = 4000
}
network_configuration {
security_groups = [aws_security_group.ecs_tasks.id]
subnets = [
aws_subnet.public_subnet_1.id,
aws_subnet.public_subnet_2.id
]
assign_public_ip = true
}
depends_on = [aws_lb_listener.api_lb_listener, aws_iam_role_policy_attachment.ecs_task_execution_role_attachment]
}
resource "aws_security_group" "api_cluster_security_group" {
vpc_id = aws_vpc.vpc.id
ingress {
from_port = 0
to_port = 0
protocol = -1
security_groups = [aws_security_group.load_balancer_security_group.id]
}
egress {
from_port = 0
to_port = 0
protocol = -1
cidr_blocks = ["0.0.0.0/0"]
}
}
# END CREATE ECS SERVICE
# CREATE LOAD BALANCER
resource "aws_alb" "api_load_balancer" {
name = "api-load-balancer"
load_balancer_type = "application"
subnets = [
aws_subnet.public_subnet_1.id,
aws_subnet.public_subnet_2.id
]
security_groups = [aws_security_group.load_balancer_security_group.id]
}
resource "aws_security_group" "load_balancer_security_group" {
name = "allow-load-balancer-traffic"
vpc_id = aws_vpc.vpc.id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
# END CREATE LOAD BALANCER
# CREATE ECS TASK SECURITY GROUP
resource "aws_security_group" "ecs_tasks" {
name = "ecs-tasks-sg"
description = "allow inbound access from the ALB only"
vpc_id = aws_vpc.vpc.id
ingress {
protocol = "tcp"
from_port = 4000
to_port = 4000
cidr_blocks = ["0.0.0.0/0"]
security_groups = [aws_security_group.load_balancer_security_group.id]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
# END ECS TASK SECURITY GROUP
# START CREATE LOAD BALANCER TARGET GROUP
resource "aws_lb_target_group" "api_lb_target_group" {
name = "api-lb-target-group"
vpc_id = aws_vpc.vpc.id
port = 80
protocol = "HTTP"
target_type = "ip"
health_check {
healthy_threshold= "3"
interval = "90"
protocol = "HTTP"
matcher = "200-299"
timeout = "20"
path = "/"
unhealthy_threshold = "2"
}
}
# END CREATE LOAD BALANCER TARGET GROUP
# START CREATE LOAD BALANCER LISTENER
resource "aws_lb_listener" "api_lb_listener" {
load_balancer_arn = aws_alb.api_load_balancer.arn
port = 80
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.api_lb_target_group.arn
}
}
# END CREATE LOAD BALANCER LISTENER

You are not using api_cluster_security_group anywhere in your setup, so its purpose is unclear. Also, your aws_security_group.ecs_tasks allows only port 4000. However, due to dynamic port mapping between the ALB and ECS services, you should allow all ports, not only 4000.
There could be other issues, which are not apparent yet.
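As a rough sketch of what this suggests (an illustration, not a verified fix), the task security group could accept any TCP port, but only from the load balancer's security group. The resource names come from the question. Dropping the `0.0.0.0/0` CIDR from the ingress rule is an assumption, made so that the rule matches the group's own description ("allow inbound access from the ALB only").

```hcl
resource "aws_security_group" "ecs_tasks" {
  name        = "ecs-tasks-sg"
  description = "allow inbound access from the ALB only"
  vpc_id      = aws_vpc.vpc.id

  # Accept any TCP port, but only when the traffic originates from the ALB.
  ingress {
    protocol        = "tcp"
    from_port       = 0
    to_port         = 65535
    security_groups = [aws_security_group.load_balancer_security_group.id]
  }

  # Allow all outbound traffic (needed, for example, to pull the image from ECR).
  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

The unused `aws_security_group.api_cluster_security_group` can then be removed, or attached deliberately if it was meant to serve a purpose. Whether the wider port range is actually required with Fargate's `awsvpc` networking, where the task is reached directly on its container port, is worth verifying before relying on it.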

Related

AWS Terraform - Code Build to Elastic Container Service

I'm trying to deploy an AWS CI/CD pipeline where the developer commits to AWS CodeCommit, and AWS CodeBuild takes care of building the Docker image and pushing it to Elastic Container Registry (ECR). The deployment should then go from ECR to ECS Fargate.
I have tried to do this, but I need help fetching the Docker image URL from CodeBuild to move it to ECR. The code is below:
resource "aws_codecommit_repository" "repo" {
repository_name = var.repository_name
description = var.description
default_branch = var.default_branch
# Tags
tags = var.tags
}
# Triggers
resource "aws_codecommit_trigger" "triggers" {
count = length(var.triggers)
repository_name = aws_codecommit_repository.repo.repository_name
trigger {
name = lookup(element(var.triggers, count.index), "name")
events = lookup(element(var.triggers, count.index), "events")
destination_arn = lookup(element(var.triggers, count.index), "destination_arn")
}
}
resource "aws_ecr_repository" "my_sample_ecr_repo" {
name = "my-sample-ecr-repo"
}
resource "aws_codebuild_project" "codebuild_project" {
name = "sample-code"
description = "Codebuild demo with Terraform"
build_timeout = "120"
artifacts {
type = "NO_ARTIFACTS"
}
source {
type = "CODECOMMIT"
# lookup() needs a map and a key; use the repository's HTTPS clone URL instead
location = aws_codecommit_repository.repo.clone_url_http
}
environment {
image = lookup(var.codebuild_params, "IMAGE")
type = lookup(var.codebuild_params, "TYPE")
compute_type = lookup(var.codebuild_params, "COMPUTE_TYPE")
image_pull_credentials_type = lookup(var.codebuild_params, "CRED_TYPE")
privileged_mode = true
dynamic "environment_variable" {
for_each = var.environment_variables
content {
name = environment_variable.key
value = environment_variable.value
}
}
}
logs_config {
cloudwatch_logs {
status = "DISABLED"
}
s3_logs {
status = "DISABLED"
}
}
}
resource "aws_ecs_cluster" "my_cluster" {
name = "my-cluster" # Naming the cluster
}
resource "aws_ecs_task_definition" "my_sample_task" {
family = "my-sample-task" # Naming the task
container_definitions = <<DEFINITION
[
{
"name": "my-sample-task",
"image": "${aws_ecr_repository.my_sample_ecr_repo.repository_url}",
"essential": true,
"portMappings": [
{
"containerPort": 3000,
"hostPort": 3000
}
],
"memory": 512,
"cpu": 256
}
]
DEFINITION
requires_compatibilities = ["FARGATE"] # Stating that we are using ECS Fargate
network_mode = "awsvpc" # Using awsvpc as our network mode as this is required for Fargate
memory = 512 # Specifying the memory our container requires
cpu = 256 # Specifying the CPU our container requires
execution_role_arn = "${aws_iam_role.ecsTaskExecutionRole.arn}"
}
resource "aws_iam_role" "ecsTaskExecutionRole" {
name = "ecsTaskExecutionRole"
assume_role_policy = "${data.aws_iam_policy_document.assume_role_policy.json}"
}
data "aws_iam_policy_document" "assume_role_policy" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["ecs-tasks.amazonaws.com"]
}
}
}
resource "aws_iam_role_policy_attachment" "ecsTaskExecutionRole_policy" {
role = "${aws_iam_role.ecsTaskExecutionRole.name}"
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
resource "aws_ecs_service" "my_sample_service" {
name = "my-sample-service" # Naming our first service
cluster = "${aws_ecs_cluster.my_cluster.id}" # Referencing our created Cluster
task_definition = "${aws_ecs_task_definition.my_sample_task.arn}" # Referencing the task our service will spin up
launch_type = "FARGATE"
desired_count = 1 # Setting the number of containers to 1
load_balancer {
target_group_arn = "${aws_lb_target_group.target_group.arn}" # Referencing our target group
container_name = "${aws_ecs_task_definition.my_sample_task.family}"
container_port = 3000 # Specifying the container port
}
network_configuration {
subnets = ["${aws_default_subnet.default_subnet_a.id}", "${aws_default_subnet.default_subnet_b.id}", "${aws_default_subnet.default_subnet_c.id}"]
assign_public_ip = true # Providing our containers with public IPs
security_groups = ["${aws_security_group.service_security_group.id}"] # Setting the security group
}
}
resource "aws_default_vpc" "default_vpc" {
}
# Providing a reference to our default subnets
resource "aws_default_subnet" "default_subnet_a" {
availability_zone = "ap-south-1c"
}
resource "aws_default_subnet" "default_subnet_b" {
availability_zone = "ap-south-1b"
}
resource "aws_default_subnet" "default_subnet_c" {
availability_zone = "ap-south-1a"
}
resource "aws_security_group" "service_security_group" {
ingress {
from_port = 0
to_port = 0
protocol = "-1"
# Only allowing traffic in from the load balancer security group
security_groups = ["${aws_security_group.load_balancer_security_group.id}"]
}
egress {
from_port = 0 # Allowing any outgoing port (range start)
to_port = 0 # Allowing any outgoing port
protocol = "-1" # Allowing any outgoing protocol
cidr_blocks = ["0.0.0.0/0"] # Allowing traffic out to all IP addresses
}
}
resource "aws_alb" "application_load_balancer" {
name = "test-lb-tf" # Naming our load balancer
load_balancer_type = "application"
subnets = [ # Referencing the default subnets
"${aws_default_subnet.default_subnet_a.id}",
"${aws_default_subnet.default_subnet_b.id}",
"${aws_default_subnet.default_subnet_c.id}"
]
# Referencing the security group
security_groups = ["${aws_security_group.load_balancer_security_group.id}"]
}
resource "aws_security_group" "load_balancer_security_group" {
ingress {
from_port = 80 # Allowing traffic in from port 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"] # Allowing traffic in from all sources
}
egress {
from_port = 0 # Allowing any outgoing port (range start)
to_port = 0 # Allowing any outgoing port
protocol = "-1" # Allowing any outgoing protocol
cidr_blocks = ["0.0.0.0/0"] # Allowing traffic out to all IP addresses
}
}
resource "aws_lb_target_group" "target_group" {
name = "target-group"
port = 80
protocol = "HTTP"
target_type = "ip"
vpc_id = "${aws_default_vpc.default_vpc.id}" # Referencing the default VPC
health_check {
matcher = "200,301,302"
path = "/"
}
}
resource "aws_lb_listener" "listener" {
load_balancer_arn = "${aws_alb.application_load_balancer.arn}" # Referencing our load balancer
port = "80"
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = "${aws_lb_target_group.target_group.arn}" # Referencing our target group
}
}

Solving conflicting route table issue on ALB

TL;DR
I can't access my service through the ALB DNS name. Trying to reach the URL times out.
I noticed that, between the IGW and the NAT, there is an isolated routed subnet (Public Subnet 2), and also a task that isn't being exposed through the ALB because it somehow got attached to a different subnet.
More general context
I have Terraform modules defining:
- the ECS cluster, service and task definition
- the ALB setup, including a target group and a listener
I have a couple of subnets and a security group for the ALB.
I have private subnets and a separate security group for ECS.
The target group port is already the same as the container port.
Using CodePipeline I get a task running, and I can see logs from my service, meaning it starts.
Some questions
- Can I have multiple IGWs associated with a single NAT within a single VPC?
- Tasks get attached to a couple of private subnets and a security group with permissions to the ALB security group. The tasks also need to access a Redis instance, so I'm also attaching to them a security group and a subnet where the ElastiCache node lives (shown in the Terraform module below). Any advice here?
ALB and networking resources
variable "vpc_id" {
type = string
default = "vpc-0af6233d57f7a6e1b"
}
variable "environment" {
type = string
default = "dev"
}
data "aws_vpc" "vpc" {
id = var.vpc_id
}
### Public subnets
resource "aws_subnet" "public_subnet_us_east_1a" {
vpc_id = data.aws_vpc.vpc.id
cidr_block = "10.0.10.0/24"
map_public_ip_on_launch = true
availability_zone = "us-east-1a"
tags = {
Name = "audible-blog-us-${var.environment}-public-subnet-1a"
}
}
resource "aws_subnet" "public_subnet_us_east_1b" {
vpc_id = data.aws_vpc.vpc.id
cidr_block = "10.0.11.0/24"
availability_zone = "us-east-1b"
map_public_ip_on_launch = true
tags = {
Name = "audible-blog-us-${var.environment}-public-subnet-1b"
}
}
### Private subnets
resource "aws_subnet" "private_subnet_us_east_1a" {
vpc_id = data.aws_vpc.vpc.id
cidr_block = "10.0.12.0/24"
map_public_ip_on_launch = true
availability_zone = "us-east-1a"
tags = {
Name = "audible-blog-us-${var.environment}-private-subnet-1a"
}
}
resource "aws_subnet" "private_subnet_us_east_1b" {
vpc_id = data.aws_vpc.vpc.id
cidr_block = "10.0.13.0/24"
availability_zone = "us-east-1b"
tags = {
Name = "audible-blog-us-${var.environment}-private-subnet-1b"
}
}
# Create a NAT gateway with an EIP for each private subnet to get internet connectivity
resource "aws_eip" "gw_a" {
vpc = true
}
resource "aws_eip" "gw_b" {
vpc = true
}
resource "aws_nat_gateway" "gw_a" {
subnet_id = aws_subnet.public_subnet_us_east_1a.id
allocation_id = aws_eip.gw_a.id
}
resource "aws_nat_gateway" "gw_b" {
subnet_id = aws_subnet.public_subnet_us_east_1b.id
allocation_id = aws_eip.gw_b.id
}
# Create a new route table for the private subnets
# And make it route non-local traffic through the NAT gateway to the internet
resource "aws_route_table" "private_a" {
vpc_id = data.aws_vpc.vpc.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.gw_a.id
}
}
resource "aws_route_table" "private_b" {
vpc_id = data.aws_vpc.vpc.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.gw_b.id
}
}
# Explicitly associate the newly created route tables to the private subnets (so they don't default to the main route table)
resource "aws_route_table_association" "private_a" {
subnet_id = aws_subnet.private_subnet_us_east_1a.id
route_table_id = aws_route_table.private_a.id
}
resource "aws_route_table_association" "private_b" {
subnet_id = aws_subnet.private_subnet_us_east_1b.id
route_table_id = aws_route_table.private_b.id
}
# This is the group you need to edit if you want to restrict access to your application
resource "aws_security_group" "alb_sg" {
name = "audible-blog-us-${var.environment}-lb-sg"
description = "Internet to ALB Security Group"
vpc_id = data.aws_vpc.vpc.id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
name = "audible-blog-us-${var.environment}-lb-sg"
}
}
# Traffic to the ECS Cluster should only come from the ALB
resource "aws_security_group" "ecs_tasks_sg" {
name = "audible-blog-us-${var.environment}-ecs-sg"
description = "ALB to ECS Security Group"
vpc_id = data.aws_vpc.vpc.id
ingress {
from_port = 8080
to_port = 8080
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
security_groups = [ aws_security_group.alb_sg.id ]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
name = "audible-blog-us-${var.environment}-ecs-sg"
}
}
resource "aws_alb" "alb" {
name = "audible-blog-us-${var.environment}-alb"
internal = false
load_balancer_type = "application"
subnets = [ aws_subnet.public_subnet_us_east_1a.id, aws_subnet.public_subnet_us_east_1b.id ]
security_groups = [ aws_security_group.alb_sg.id ]
tags = {
name = "audible-blog-us-${var.environment}-alb"
environment = var.environment
}
}
resource "aws_alb_target_group" "target_group" {
name = "audible-blog-us-${var.environment}-target-group"
port = "8080"
protocol = "HTTP"
vpc_id = data.aws_vpc.vpc.id
target_type = "ip"
health_check {
enabled = true
path = "/blog"
interval = 30
matcher = "200-304"
port = "traffic-port"
unhealthy_threshold = 5
}
depends_on = [aws_alb.alb]
}
resource "aws_alb_listener" "web_app_http" {
load_balancer_arn = aws_alb.alb.arn
port = 80
protocol = "HTTP"
depends_on = [aws_alb_target_group.target_group]
default_action {
target_group_arn = aws_alb_target_group.target_group.arn
type = "forward"
}
}
output "networking_details" {
value = {
load_balancer_arn = aws_alb.alb.arn
load_balancer_target_group_arn = aws_alb_target_group.target_group.arn
subnets = [
aws_subnet.private_subnet_us_east_1a.id,
aws_subnet.private_subnet_us_east_1b.id
]
security_group = aws_security_group.ecs_tasks_sg.id
}
}
ECS Fargate module
module "permissions" {
source = "./permissions"
environment = var.environment
}
resource "aws_ecs_cluster" "cluster" {
name = "adl-blog-us-${var.environment}"
}
resource "aws_cloudwatch_log_group" "logs_group" {
name = "/ecs/adl-blog-us-next-${var.environment}"
retention_in_days = 90
}
resource "aws_ecs_task_definition" "task" {
family = "adl-blog-us-task-${var.environment}"
container_definitions = jsonencode([
{
name = "adl-blog-us-next"
image = "536299334720.dkr.ecr.us-east-1.amazonaws.com/adl-blog-us:latest"
portMappings = [
{
containerPort = 8080
hostPort = 8080
},
{
containerPort = 6379
hostPort = 6379
}
]
environment: [
{
"name": "ECS_TASK_FAMILY",
"value": "adl-blog-us-task-${var.environment}"
}
],
logConfiguration: {
logDriver: "awslogs",
options: {
awslogs-group: "/ecs/adl-blog-us-next-${var.environment}",
awslogs-region: "us-east-1",
awslogs-stream-prefix: "ecs"
}
},
healthCheck: {
retries: 3,
command: [
"CMD-SHELL",
"curl -sf http://localhost:8080/blog || exit 1"
],
timeout: 5,
interval: 30,
startPeriod: null
}
}
])
cpu = 256
memory = 512
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
execution_role_arn = module.permissions.task_definition_execution_role_arn
task_role_arn = module.permissions.task_definition_execution_role_arn
}
resource "aws_ecs_service" "service" {
name = "adl-blog-us-task-service-${var.environment}"
cluster = aws_ecs_cluster.cluster.id
deployment_controller {
type = "ECS"
}
deployment_maximum_percent = 200
deployment_minimum_healthy_percent = 50
task_definition = aws_ecs_task_definition.task.family
desired_count = 3
launch_type = "FARGATE"
network_configuration {
subnets = concat(
var.public_alb_networking_details.subnets,
[ var.private_networking_details.subnet.id ]
)
security_groups = [
var.public_alb_networking_details.security_group,
var.private_networking_details.security_group.id
]
assign_public_ip = true
}
load_balancer {
target_group_arn = var.public_alb_networking_details.load_balancer_target_group_arn
container_name = "adl-blog-us-next"
container_port = 8080
}
force_new_deployment = true
lifecycle {
ignore_changes = [desired_count]
}
depends_on = [
module.permissions
]
}
variable "private_networking_details" {}
variable "public_alb_networking_details" {}
variable "environment" {
type = string
}
Your container ports are 8080 and 6379. However, your target group says it's 80, so you have to double-check which ports you actually use on Fargate and adjust your target group accordingly.
There could be other issues as well that aren't yet apparent. For example, you are opening port 443, but there is no listener for it, so any attempt to use HTTPS will fail.
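If HTTPS is meant to work, port 443 needs its own listener. Below is a minimal sketch, assuming an ACM certificate already exists. `var.certificate_arn` is a hypothetical variable that is not part of the original configuration, and the `ssl_policy` value is just one of the predefined ELB security policies.

```hcl
# Hypothetical input: ARN of an existing ACM certificate (not in the original module).
variable "certificate_arn" {
  type        = string
  description = "ARN of the ACM certificate to serve on the HTTPS listener"
}

# HTTPS listener forwarding to the same target group as the HTTP listener.
resource "aws_alb_listener" "web_app_https" {
  load_balancer_arn = aws_alb.alb.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-2016-08"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_alb_target_group.target_group.arn
  }
}
```

Alternatively, if HTTPS is not needed, the port 443 ingress rule on `alb_sg` can simply be dropped.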

How to apply security groups to aws_elasticache_replication_group

My Terraform script is as follows (everything is inside a VPC):
resource "aws_security_group" "cacheSecurityGroup" {
name = "${var.devname}-${var.namespace}-${var.stage}-RedisCache-SecurityGroup"
vpc_id = var.vpc.vpc_id
tags = var.default_tags
ingress {
protocol = "tcp"
from_port = 6379
to_port = 6379
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
}
resource "aws_elasticache_parameter_group" "usagemonitorCacheParameterGroup" {
name = "${var.devname}${var.namespace}${var.stage}-usagemonitor-cache-parameterGroup"
family = "redis6.x"
}
resource "aws_elasticache_subnet_group" "redis_subnet_group" {
name = "${var.devname}${var.namespace}${var.stage}-usagemonitor-cache-subnetGroup"
subnet_ids = var.vpc.database_subnets
}
resource "aws_elasticache_replication_group" "replication_group_usagemonitor" {
replication_group_id = "${var.devname}${var.namespace}${var.stage}-usagemonitor-cache"
replication_group_description = "Replication group for Usagemonitor"
node_type = "cache.t2.micro"
number_cache_clusters = 2
parameter_group_name = aws_elasticache_parameter_group.usagemonitorCacheParameterGroup.name
subnet_group_name = aws_elasticache_subnet_group.redis_subnet_group.name
#security_group_names = [aws_elasticache_security_group.bar.name]
automatic_failover_enabled = true
at_rest_encryption_enabled = true
port = 6379
}
If I uncomment the line
#security_group_names = [aws_elasticache_security_group.bar.name]
I get the following error:
Error: Error creating Elasticache Replication Group: InvalidParameterCombination: Use of cache security groups is not permitted along with cache subnet group and/or security group Ids.
status code: 400, request id: 4e70e86d-b868-45b3-a1d2-88ab652dc85e
I read that we don't have to use aws_elasticache_security_group if all resources are inside a VPC. What is the correct way to assign security groups to aws_elasticache_replication_group? Using subnets? How?
I do something like this; I believe it is the best way to assign the required configuration:
resource "aws_security_group" "redis" {
name_prefix = "${var.name_prefix}-redis-"
vpc_id = var.vpc_id
lifecycle {
create_before_destroy = true
}
}
resource "aws_elasticache_replication_group" "redis" {
...
engine = "redis"
subnet_group_name = aws_elasticache_subnet_group.redis.name
security_group_ids = concat(var.security_group_ids, [aws_security_group.redis.id])
}
Your subnet group basically includes all private or public subnets from your VPC where the elasticache replication group is going to be created.
In general, use security group ids instead of names.
I have written a Terraform module that works; if you are interested, it is available with examples at https://github.com/umotif-public/terraform-aws-elasticache-redis.
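Applied to the configuration in the question, this would look roughly like the sketch below. The commented-out `security_group_names` line is replaced with `security_group_ids` pointing at the VPC security group that the question already defines. All other arguments are copied from the question unchanged.

```hcl
resource "aws_elasticache_replication_group" "replication_group_usagemonitor" {
  replication_group_id          = "${var.devname}${var.namespace}${var.stage}-usagemonitor-cache"
  replication_group_description = "Replication group for Usagemonitor"
  node_type                     = "cache.t2.micro"
  number_cache_clusters         = 2
  parameter_group_name          = aws_elasticache_parameter_group.usagemonitorCacheParameterGroup.name
  subnet_group_name             = aws_elasticache_subnet_group.redis_subnet_group.name

  # VPC security groups are attached by ID; cache security group names
  # cannot be combined with a subnet group.
  security_group_ids = [aws_security_group.cacheSecurityGroup.id]

  automatic_failover_enabled = true
  at_rest_encryption_enabled = true
  port                       = 6379
}
```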

Terraform - can't reach web server - instance out of service

I'm running the Terraform code below to deploy an EC2 instance inside a VPC to work as a web server, but for some reason I can't reach the website and can't SSH to it. I believe I have set the ingress and egress rules properly:
########Provider########
provider "aws" {
region = "us-west-2"
access_key = "[redacted]"
secret_key = "[redacted]"
}
########VPC########
resource "aws_vpc" "vpc1" {
cidr_block = "10.1.0.0/16"
tags = {
Name = "Production"
}
}
########Internet GW########
resource "aws_internet_gateway" "gw" {
vpc_id = aws_vpc.vpc1.id
}
########Route table########
resource "aws_route_table" "rt" {
vpc_id = aws_vpc.vpc1.id
route {
cidr_block = "0.0.0.0/24"
gateway_id = aws_internet_gateway.gw.id
}
route {
ipv6_cidr_block = "::/0"
gateway_id = aws_internet_gateway.gw.id
}
}
########Sub Net########
resource "aws_subnet" "subnet1" {
vpc_id = aws_vpc.vpc1.id
cidr_block = "10.1.0.0/24"
availability_zone = "us-west-2a"
map_public_ip_on_launch = "true"
tags = {
Name = "prod-subnet-1"
}
}
########RT association########
resource "aws_route_table_association" "a" {
subnet_id = aws_subnet.subnet1.id
route_table_id = aws_route_table.rt.id
}
########Security Group########
resource "aws_security_group" "sec1" {
name = "allow_web"
description = "Allow web inbound traffic"
vpc_id = aws_vpc.vpc1.id
ingress {
description = "HTTP from VPC"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["10.1.0.0/16"]
}
#SSH access from anywhere
ingress {
description = "SSH from VPC"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "allow_web"
}
}
########Net Interface for the Instance########
#resource "aws_network_interface" "wsn" {
# subnet_id = aws_subnet.subnet1.id
# private_ips = ["10.0.1.50"]
# security_groups = [aws_security_group.sec1.id]
#}
########Load Balancer########
resource "aws_elb" "elb" {
name = "lb"
subnets = [aws_subnet.subnet1.id]
security_groups = [aws_security_group.sec1.id]
instances = [aws_instance.web1.id]
listener {
instance_port = 80
instance_protocol = "http"
lb_port = 80
lb_protocol = "http"
}
}
########EC2 Instance########
resource "aws_instance" "web1" {
ami = "ami-003634241a8fcdec0" #ubuntu 18.4
instance_type = "t2.micro"
availability_zone = "us-west-2a"
key_name = "main-key"
subnet_id = aws_subnet.subnet1.id
#network_interface {
# device_index = 0
# network_interface_id = aws_network_interface.wsn.id
#}
user_data = <<-EOF
#!/bin/bash
sudo apt update -y
sudo apt install apache2 -y
sudo systemctl start apache2
sudo bash -c 'echo Hello world!!! > /var/www/html/index.html'
EOF
tags = {
Name = "HelloWorld"
}
}
output "aws_elb_public_dns" {
value = aws_elb.elb.dns_name
}
The plan and the apply both run fine, but in the load balancer the instance is "OutOfService".
What could be wrong here?
Your instance is missing a security group: set vpc_security_group_ids.
As a result, you won't be able to SSH to it, and HTTP traffic from the outside won't be allowed either.
Also, your route to the IGW is incorrect. It should be:
cidr_block = "0.0.0.0/0"
The same applies to the security group for your ELB, which must allow traffic from the internet. It should be:
cidr_blocks = ["0.0.0.0/0"]
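Putting these points together, the affected parts of the configuration might look like the following sketch. Only the changed resources are shown. Reusing `sec1` for both the instance and the ELB follows the original layout and is an assumption rather than a recommendation.

```hcl
resource "aws_route_table" "rt" {
  vpc_id = aws_vpc.vpc1.id

  route {
    cidr_block = "0.0.0.0/0" # default route; was "0.0.0.0/24"
    gateway_id = aws_internet_gateway.gw.id
  }

  route {
    ipv6_cidr_block = "::/0"
    gateway_id      = aws_internet_gateway.gw.id
  }
}

# In aws_security_group.sec1, the HTTP ingress rule becomes:
#   ingress {
#     description = "HTTP from anywhere"
#     from_port   = 80
#     to_port     = 80
#     protocol    = "tcp"
#     cidr_blocks = ["0.0.0.0/0"] # was ["10.1.0.0/16"]
#   }

resource "aws_instance" "web1" {
  ami                    = "ami-003634241a8fcdec0"
  instance_type          = "t2.micro"
  availability_zone      = "us-west-2a"
  key_name               = "main-key"
  subnet_id              = aws_subnet.subnet1.id
  vpc_security_group_ids = [aws_security_group.sec1.id] # previously missing

  # user_data and tags stay as in the question
}
```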

AWS ECS Terraform: The requested configuration is currently not supported. Launching EC2 instance failed

I have been trying to spin up ECS using terraform. About two days ago it was working as expected, however today I tried to run terraform apply and I keep getting an error saying
"The requested configuration is currently not supported. Launching EC2 instance failed"
I have researched this issue a lot. I tried hardcoding the VPC tenancy to default, and I've tried changing the region and the instance type, but nothing seems to fix the issue.
This is my Terraform config:
provider "aws" {
region = var.region
}
data "aws_availability_zones" "available" {}
# Define a vpc
resource "aws_vpc" "motivy_vpc" {
cidr_block = var.motivy_network_cidr
tags = {
Name = var.motivy_vpc
}
enable_dns_support = "true"
instance_tenancy = "default"
enable_dns_hostnames = "true"
}
# Internet gateway for the public subnet
resource "aws_internet_gateway" "motivy_ig" {
vpc_id = aws_vpc.motivy_vpc.id
tags = {
Name = "motivy_ig"
}
}
# Public subnet 1
resource "aws_subnet" "motivy_public_sn_01" {
vpc_id = aws_vpc.motivy_vpc.id
cidr_block = var.motivy_public_01_cidr
availability_zone = data.aws_availability_zones.available.names[0]
tags = {
Name = "motivy_public_sn_01"
}
}
# Public subnet 2
resource "aws_subnet" "motivy_public_sn_02" {
vpc_id = aws_vpc.motivy_vpc.id
cidr_block = var.motivy_public_02_cidr
availability_zone = data.aws_availability_zones.available.names[1]
tags = {
Name = "motivy_public_sn_02"
}
}
# Routing table for public subnet 1
resource "aws_route_table" "motivy_public_sn_rt_01" {
vpc_id = aws_vpc.motivy_vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.motivy_ig.id
}
tags = {
Name = "motivy_public_sn_rt_01"
}
}
# Routing table for public subnet 2
resource "aws_route_table" "motivy_public_sn_rt_02" {
vpc_id = aws_vpc.motivy_vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.motivy_ig.id
}
tags = {
Name = "motivy_public_sn_rt_02"
}
}
# Associate the routing table to public subnet 1
resource "aws_route_table_association" "motivy_public_sn_rt_01_assn" {
subnet_id = aws_subnet.motivy_public_sn_01.id
route_table_id = aws_route_table.motivy_public_sn_rt_01.id
}
# Associate the routing table to public subnet 2
resource "aws_route_table_association" "motivy_public_sn_rt_02_assn" {
subnet_id = aws_subnet.motivy_public_sn_02.id
route_table_id = aws_route_table.motivy_public_sn_rt_02.id
}
# ECS Instance Security group
resource "aws_security_group" "motivy_public_sg" {
name = "motivys_public_sg"
description = "Test public access security group"
vpc_id = aws_vpc.motivy_vpc.id
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = [
"0.0.0.0/0"]
}
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = [
"0.0.0.0/0"]
}
ingress {
from_port = 5000
to_port = 5000
protocol = "tcp"
cidr_blocks = [
"0.0.0.0/0"]
}
ingress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = [
var.motivy_public_01_cidr,
var.motivy_public_02_cidr
]
}
egress {
# allow all outbound traffic to any destination
from_port = "0"
to_port = "0"
protocol = "-1"
cidr_blocks = [
"0.0.0.0/0"]
}
tags = {
Name = "motivy_public_sg"
}
}
data "aws_ecs_task_definition" "motivy_server" {
task_definition = aws_ecs_task_definition.motivy_server.family
}
resource "aws_ecs_task_definition" "motivy_server" {
family = "motivy_server"
container_definitions = file("task-definitions/service.json")
}
data "aws_ami" "latest_ecs" {
most_recent = true # get the latest version
filter {
name = "name"
values = [
"amzn2-ami-ecs-*"] # ECS optimized image
}
owners = [
"amazon" # Only official images
]
}
resource "aws_launch_configuration" "ecs-launch-configuration" {
name = "ecs-launch-configuration"
image_id = data.aws_ami.latest_ecs.id
instance_type = "t2.micro"
iam_instance_profile = aws_iam_instance_profile.ecs-instance-profile.id
root_block_device {
volume_type = "standard"
volume_size = 100
delete_on_termination = true
}
enable_monitoring = true
lifecycle {
create_before_destroy = true
}
security_groups = [aws_security_group.motivy_public_sg.id]
associate_public_ip_address = "true"
key_name = var.ecs_key_pair_name
user_data = <<EOF
#!/bin/bash
echo ECS_CLUSTER=${var.ecs_cluster} >> /etc/ecs/ecs.config
EOF
}
resource "aws_appautoscaling_target" "ecs_motivy_server_target" {
max_capacity = 2
min_capacity = 1
resource_id = "service/${aws_ecs_cluster.motivy_ecs_cluster.name}/${aws_ecs_service.motivy_server_service.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
depends_on = [ aws_ecs_service.motivy_server_service ]
}
resource "aws_iam_role" "ecs-instance-role" {
name = "ecs-instance-role"
path = "/"
assume_role_policy = data.aws_iam_policy_document.ecs-instance-policy.json
}
data "aws_iam_policy_document" "ecs-instance-policy" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["ec2.amazonaws.com"]
}
}
}
resource "aws_iam_role_policy_attachment" "ecs-instance-role-attachment" {
role = aws_iam_role.ecs-instance-role.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}
resource "aws_iam_instance_profile" "ecs-instance-profile" {
name = "ecs-instance-profile"
path = "/"
role = aws_iam_role.ecs-instance-role.id
provisioner "local-exec" {
command = "sleep 10"
}
}
resource "aws_autoscaling_group" "motivy-server-autoscaling-group" {
name = "motivy-server-autoscaling-group"
termination_policies = [
"OldestInstance" # When a “scale down” event occurs, which instances to kill first?
]
default_cooldown = 30
health_check_grace_period = 30
max_size = var.max_instance_size
min_size = var.min_instance_size
desired_capacity = var.desired_capacity
# Use this launch configuration to define “how” the EC2 instances are to be launched
launch_configuration = aws_launch_configuration.ecs-launch-configuration.name
lifecycle {
create_before_destroy = true
}
# Refer to vpc.tf for more information
# You could use the private subnets here instead,
# if you want the EC2 instances to be hidden from the internet
vpc_zone_identifier = [aws_subnet.motivy_public_sn_01.id, aws_subnet.motivy_public_sn_02.id]
tags = [{
key = "Name",
value = var.ecs_cluster,
# Make sure EC2 instances are tagged with this tag as well
propagate_at_launch = true
}]
}
resource "aws_alb" "motivy_server_alb_load_balancer" {
name = "motivy-alb-load-balancer"
security_groups = [aws_security_group.motivy_public_sg.id]
subnets = [aws_subnet.motivy_public_sn_01.id, aws_subnet.motivy_public_sn_02.id]
tags = {
Name = "motivy_server_alb_load_balancer"
}
}
resource "aws_alb_target_group" "motivy_server_target_group" {
name = "motivy-server-target-group"
port = 5000
protocol = "HTTP"
vpc_id = aws_vpc.motivy_vpc.id
deregistration_delay = "10"
health_check {
healthy_threshold = "2"
unhealthy_threshold = "6"
interval = "30"
matcher = "200,301,302"
path = "/"
protocol = "HTTP"
timeout = "5"
}
stickiness {
type = "lb_cookie"
}
tags = {
Name = "motivy-server-target-group"
}
}
resource "aws_alb_listener" "alb-listener" {
load_balancer_arn = aws_alb.motivy_server_alb_load_balancer.arn
port = "80"
protocol = "HTTP"
default_action {
target_group_arn = aws_alb_target_group.motivy_server_target_group.arn
type = "forward"
}
}
resource "aws_autoscaling_attachment" "asg_attachment_motivy_server" {
autoscaling_group_name = aws_autoscaling_group.motivy-server-autoscaling-group.id
alb_target_group_arn = aws_alb_target_group.motivy_server_target_group.arn
}
This is the exact error I get:
Error: "motivy-server-autoscaling-group": Waiting up to 10m0s: Need at least 2 healthy instances in ASG, have 0. Most recent activity: {
ActivityId: "a775c531-9496-fdf9-5157-ab2448626293",
AutoScalingGroupName: "motivy-server-autoscaling-group",
Cause: "At 2020-04-05T22:10:28Z an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 0 to 2.",
Description: "Launching a new EC2 instance. Status Reason: The requested configuration is currently not supported. Please check the documentation for supported configurations. Launching EC2 instance failed.",
Details: "{\"Subnet ID\":\"subnet-05de5fc0e994d05fe\",\"Availability Zone\":\"us-east-1a\"}",
EndTime: 2020-04-05 22:10:29 +0000 UTC,
Progress: 100,
StartTime: 2020-04-05 22:10:29.439 +0000 UTC,
StatusCode: "Failed",
StatusMessage: "The requested configuration is currently not supported. Please check the documentation for supported configurations. Launching EC2 instance failed."
}
I'm not sure why it worked two days ago, but recent Amazon ECS-optimized AMIs use gp2 as their root volume type.
You should set root_block_device.volume_type to gp2 instead of standard:
resource "aws_launch_configuration" "ecs-launch-configuration" {
# ...
root_block_device {
volume_type = "gp2"
volume_size = 100
delete_on_termination = true
}
# ...
}
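It may also help to narrow the AMI name filter so that it only matches the x86_64, EBS-backed images. The broader pattern amzn2-ami-ecs-* can also match other variants (for example arm64 or GPU images), which a t2.micro instance cannot launch:
data "aws_ami" "latest_ecs" {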
most_recent = true # get the latest version
filter {
name = "name"
values = ["amzn2-ami-ecs-hvm-*-x86_64-ebs"] # ECS optimized image
}
owners = [
"amazon" # Only official images
]
}
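As an alternative to filtering by name, AWS publishes the ID of the currently recommended ECS-optimized Amazon Linux 2 AMI as a public SSM parameter. The following is a minimal sketch of that approach, not something taken from the question: the data source name ecs_recommended_ami is hypothetical, and the launch configuration is abbreviated in the same way as above.

```hcl
# Look up the recommended ECS-optimized Amazon Linux 2 AMI (x86_64)
# from the public SSM parameter that AWS maintains.
# "ecs_recommended_ami" is an illustrative name, not part of the original config.
data "aws_ssm_parameter" "ecs_recommended_ami" {
  name = "/aws/service/ecs/optimized-ami/amazon-linux-2/recommended/image_id"
}

resource "aws_launch_configuration" "ecs-launch-configuration" {
  # ...
  # Use the AMI ID stored in the parameter instead of data.aws_ami.latest_ecs.id
  image_id = data.aws_ssm_parameter.ecs_recommended_ami.value
  # ...
}
```

With this lookup, the AMI no longer depends on how a name wildcard happens to sort, so an image for a different architecture should not be picked up by accident.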
For me, it worked when I used a t3-generation instance type instead of t2.
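That change could look roughly like the sketch below. The size t3.micro is an assumption, since only the instance generation is mentioned here, and the rest of the launch configuration is left out.

```hcl
resource "aws_launch_configuration" "ecs-launch-configuration" {
  # ...
  # Assumed size: only the t3 generation is stated, not the exact type.
  instance_type = "t3.micro"
  # ...
}
```

Because this resource sets a fixed name and create_before_destroy, changing instance_type forces a replacement whose name collides with the existing launch configuration; removing name or switching to name_prefix is one way around that.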