Why is my spring-cloud-config server on AWS intermittently not responding?

I am running a Spring Cloud Config server on AWS, which is just a Docker container running a Spring Boot app. It reads properties from a Git repo. Our client applications read config from the server on startup and intermittently at runtime. About a third of the time, the client apps will time out when pulling config at startup, causing the app to crash. At runtime, the apps seem to succeed 4 out of 5 times, though they will just use their existing config if a request fails.
I am using an EC2 instance behind an ALB, which handles SSL termination. I was originally using a t3.micro, but upgraded to an m5.large, guessing that the t3 class may not support continuous availability.
The ALB required two subnets, so I created a second one with nothing in it initially. I am unsure whether the ALB will attempt to route to the second subnet at some point, which could be causing the failures. The target group is using a health check which returns correctly, but I don't know enough about ALBs to rule out round-robining to an empty subnet. I attempted to create a second EC2 instance to parallel my first config server in the second subnet; however, I was unable to SSH into the second instance even though it uses the same security group and configuration as the first. I'm not sure why that failed, but I'm guessing there is something else wrong with my setup.
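For context on the routing concern: an ALB only forwards requests to targets that are registered in the target group and passing health checks, so an enabled subnet with no registered targets does not receive traffic on its own. Below is a minimal sketch of the same target group with the health check settings spelled out; the interval, thresholds, and deregistration_delay values are illustrative assumptions, not values from this setup.

resource "aws_alb_target_group" "alb_target_group" {
  name     = "config-tg"
  port     = 9000
  protocol = "HTTP"
  vpc_id   = aws_vpc.config-vpc.id

  # Illustrative values only; tighter checks surface flapping targets sooner.
  health_check {
    enabled             = true
    path                = "/actuator/health"
    port                = 9000
    protocol            = "HTTP"
    interval            = 15
    healthy_threshold   = 2
    unhealthy_threshold = 2
    matcher             = "200"
  }

  # Assumption: a shorter drain time makes intermittent failures easier to correlate.
  deregistration_delay = 30
}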
All infrastructure was deployed with Terraform, which I have included below.
resources.tf
provider "aws" {
region = "us-east-2"
version = ">= 2.38.0"
}
data "aws_ami" "amzn_linux" {
most_recent = true
filter {
name = "name"
values = ["amzn2-ami-hvm-2.0.*-x86_64-gp2"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
owners = ["137112412989"]
}
resource "aws_vpc" "config-vpc" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
}
resource "aws_security_group" "config_sg" {
name = "config-sg"
description = "http, https, and ssh"
vpc_id = aws_vpc.config-vpc.id
ingress {
from_port = 9000
to_port = 9000
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 8080
to_port = 8080
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_subnet" "subnet-alpha" {
cidr_block = cidrsubnet(aws_vpc.config-vpc.cidr_block, 3, 1)
vpc_id = aws_vpc.config-vpc.id
availability_zone = "us-east-2a"
}
resource "aws_subnet" "subnet-beta" {
cidr_block = cidrsubnet(aws_vpc.config-vpc.cidr_block, 3, 2)
vpc_id = aws_vpc.config-vpc.id
availability_zone = "us-east-2b"
}
resource "aws_internet_gateway" "config-vpc-ig" {
vpc_id = aws_vpc.config-vpc.id
}
resource "aws_route_table" "config-vpc-rt" {
vpc_id = aws_vpc.config-vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.config-vpc-ig.id
}
}
resource "aws_route_table_association" "subnet-association-alpha" {
subnet_id = aws_subnet.subnet-alpha.id
route_table_id = aws_route_table.config-vpc-rt.id
}
resource "aws_route_table_association" "subnet-association-beta" {
subnet_id = aws_subnet.subnet-beta.id
route_table_id = aws_route_table.config-vpc-rt.id
}
resource "aws_alb" "alb" {
name = "config-alb"
subnets = [aws_subnet.subnet-alpha.id, aws_subnet.subnet-beta.id]
security_groups = [aws_security_group.config_sg.id]
}
resource "aws_alb_target_group" "alb_target_group" {
name = "config-tg"
port = 9000
protocol = "HTTP"
vpc_id = aws_vpc.config-vpc.id
health_check {
enabled = true
path = "/actuator/health"
port = 9000
protocol = "HTTP"
}
}
resource "aws_instance" "config_server_alpha" {
ami = data.aws_ami.amzn_linux.id
instance_type = "m5.large"
vpc_security_group_ids = [aws_security_group.config_sg.id]
key_name = "config-ssh"
subnet_id = aws_subnet.subnet-alpha.id
associate_public_ip_address = true
}
resource "aws_instance" "config_server_beta" {
ami = data.aws_ami.amzn_linux.id
instance_type = "m5.large"
vpc_security_group_ids = [aws_security_group.config_sg.id]
key_name = "config-ssh"
subnet_id = aws_subnet.subnet-beta.id
associate_public_ip_address = true
}
resource "aws_alb_target_group_attachment" "config-target-alpha" {
target_group_arn = aws_alb_target_group.alb_target_group.arn
target_id = aws_instance.config_server_alpha.id
port = 9000
}
resource "aws_alb_target_group_attachment" "config-target-beta" {
target_group_arn = aws_alb_target_group.alb_target_group.arn
target_id = aws_instance.config_server_beta.id
port = 9000
}
resource "aws_alb_listener" "alb_listener_80" {
load_balancer_arn = aws_alb.alb.arn
port = 80
default_action {
type = "redirect"
redirect {
port = 443
protocol = "HTTPS"
status_code = "HTTP_301"
}
}
}
resource "aws_alb_listener" "alb_listener_8080" {
load_balancer_arn = aws_alb.alb.arn
port = 8080
default_action {
type = "redirect"
redirect {
port = 443
protocol = "HTTPS"
status_code = "HTTP_301"
}
}
}
resource "aws_alb_listener" "alb_listener_https" {
load_balancer_arn = aws_alb.alb.arn
port = 443
protocol = "HTTPS"
ssl_policy = "ELBSecurityPolicy-2016-08"
certificate_arn = "arn:..."
default_action {
target_group_arn = aws_alb_target_group.alb_target_group.arn
type = "forward"
}
}
config server
@SpringBootApplication
@EnableConfigServer
public class ConfigserverApplication {
    public static void main(String[] args) {
        SpringApplication.run(ConfigserverApplication.class, args);
    }
}
application.yml
spring:
  profiles:
    active: local
---
spring:
  profiles: local, default, cloud
  cloud:
    config:
      server:
        git:
          uri: ...
          searchPaths: '{application}/{profile}'
          username: ...
          password: ...
  security:
    user:
      name: admin
      password: ...
server:
  port: 9000
management:
  endpoint:
    health:
      show-details: always
  info:
    git:
      mode: FULL
bootstrap.yml
spring:
  application:
    name: config-server
encrypt:
  key: |
    -----BEGIN RSA PRIVATE KEY-----
    ...
    -----END RSA PRIVATE KEY-----

Related

User_data not working in Launch configuration. Is it a problem with my security group or my Launch configuration?

I am currently learning Terraform and I need help with the code below. I want to create a simple architecture: an auto scaling group of EC2 instances behind an Application Load Balancer. The setup completes, but when I try to access the application endpoint it times out. When I tried to access the EC2 instances, I was unable to (because the EC2 instances were in a security group allowing access from the ALB security group only). I changed the instance security group ingress values and ran the user_data script manually, after which I reverted the changes to the instance security group to complete my setup.
My question is: why is my setup not working with the code below? Is it because access is being restricted by the load balancer security group, or is my launch configuration block incorrect?
data "aws_ami" "amazon-linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-kernel-5.10-hvm-2.0.20220426.0-x86_64-gp2"]
}
}
data "aws_availability_zones" "available" {
state = "available"
}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "3.14.0"
name = "main-vpc"
cidr = "10.0.0.0/16"
azs = data.aws_availability_zones.available.names
public_subnets = ["10.0.4.0/24","10.0.5.0/24","10.0.6.0/24"]
enable_dns_hostnames = true
enable_dns_support = true
}
resource "aws_launch_configuration" "TestLC" {
name_prefix = "Lab-Instance-"
image_id = data.aws_ami.amazon-linux.id
instance_type = "t2.nano"
key_name = "CloudformationKeyPair"
user_data = file("./user_data.sh")
security_groups = [aws_security_group.TestInstanceSG.id]
lifecycle {
create_before_destroy = true
}
}
resource "aws_autoscaling_group" "TestASG" {
min_size = 1
max_size = 3
desired_capacity = 2
launch_configuration = aws_launch_configuration.TestLC.name
vpc_zone_identifier = module.vpc.public_subnets
}
resource "aws_lb_listener" "TestListener"{
load_balancer_arn = aws_lb.TestLB.arn
port = "80"
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.TestTG.arn
}
}
resource "aws_lb" "TestLB" {
name = "Lab-App-Load-Balancer"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.TestLoadBalanceSG.id]
subnets = module.vpc.public_subnets
}
resource "aws_lb_target_group" "TestTG" {
name = "LabTargetGroup"
port = "80"
protocol = "HTTP"
vpc_id = module.vpc.vpc_id
}
resource "aws_autoscaling_attachment" "TestAutoScalingAttachment" {
autoscaling_group_name = aws_autoscaling_group.TestASG.id
lb_target_group_arn = aws_lb_target_group.TestTG.arn
}
resource "aws_security_group" "TestInstanceSG" {
name = "LAB-Instance-SecurityGroup"
ingress{
from_port = 80
to_port = 80
protocol = "tcp"
security_groups = [aws_security_group.TestLoadBalanceSG.id]
}
ingress{
from_port = 22
to_port = 22
protocol = "tcp"
security_groups = [aws_security_group.TestLoadBalanceSG.id]
}
egress{
from_port = 0
to_port = 0
protocol = "-1"
security_groups = [aws_security_group.TestLoadBalanceSG.id]
}
vpc_id = module.vpc.vpc_id
}
resource "aws_security_group" "TestLoadBalanceSG" {
name = "LAB-LoadBalancer-SecurityGroup"
ingress{
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress{
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress{
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
vpc_id = module.vpc.vpc_id
}
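One detail in the configuration above that is easy to miss: the instance security group's egress is restricted to the load balancer security group, so the instances have no outbound internet access, and a user_data script that installs packages cannot reach the package repositories. Below is a minimal sketch of the instance security group with open egress, assuming the user_data script needs to download packages; ingress stays limited to the ALB security group.

resource "aws_security_group" "TestInstanceSG" {
  name   = "LAB-Instance-SecurityGroup"
  vpc_id = module.vpc.vpc_id

  # Only the load balancer can reach the instances over HTTP.
  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.TestLoadBalanceSG.id]
  }

  # Assumption: the instances need outbound internet access for user_data (package installs).
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}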

Solving conflicting route table issue on ALB

tl;dr:
I can't access my service through the ALB DNS name; trying to reach the URL times out.
I noticed that from the IGW and NAT there's an isolated routed subnet (Public Subnet 2), and also a task that's not being exposed through the ALB because somehow it got attached to a different subnet.
More general context
- Got Terraform modules defining the ECS cluster, service, and task definition, and the ALB setup, including a target group and a listener.
- Got a couple of subnets and a security group for the ALB.
- Got private subnets and their own SG for ECS.
- The target group port is already the same as the container port.
- Using CodePipeline I get a task running, and I can see logs of my service, meaning it starts.
Some questions
- Can I have multiple IGWs associated with a single NAT within a single VPC?
- Tasks get attached to a couple of private subnets and an SG with permissions to the ALB SG. Also, the tasks need to access a Redis instance, so I'm also attaching to them an SG and the subnet where the ElastiCache node lives (shown in the Terraform module below). Any advice here?
ALB and networking resources
variable "vpc_id" {
type = string
default = "vpc-0af6233d57f7a6e1b"
}
variable "environment" {
type = string
default = "dev"
}
data "aws_vpc" "vpc" {
id = var.vpc_id
}
### Public subnets
resource "aws_subnet" "public_subnet_us_east_1a" {
vpc_id = data.aws_vpc.vpc.id
cidr_block = "10.0.10.0/24"
map_public_ip_on_launch = true
availability_zone = "us-east-1a"
tags = {
Name = "audible-blog-us-${var.environment}-public-subnet-1a"
}
}
resource "aws_subnet" "public_subnet_us_east_1b" {
vpc_id = data.aws_vpc.vpc.id
cidr_block = "10.0.11.0/24"
availability_zone = "us-east-1b"
map_public_ip_on_launch = true
tags = {
Name = "audible-blog-us-${var.environment}-public-subnet-1b"
}
}
### Private subnets
resource "aws_subnet" "private_subnet_us_east_1a" {
vpc_id = data.aws_vpc.vpc.id
cidr_block = "10.0.12.0/24"
map_public_ip_on_launch = true
availability_zone = "us-east-1a"
tags = {
Name = "audible-blog-us-${var.environment}-private-subnet-1a"
}
}
resource "aws_subnet" "private_subnet_us_east_1b" {
vpc_id = data.aws_vpc.vpc.id
cidr_block = "10.0.13.0/24"
availability_zone = "us-east-1b"
tags = {
Name = "audible-blog-us-${var.environment}-private-subnet-1b"
}
}
# Create a NAT gateway with an EIP for each private subnet to get internet connectivity
resource "aws_eip" "gw_a" {
vpc = true
}
resource "aws_eip" "gw_b" {
vpc = true
}
resource "aws_nat_gateway" "gw_a" {
subnet_id = aws_subnet.public_subnet_us_east_1a.id
allocation_id = aws_eip.gw_a.id
}
resource "aws_nat_gateway" "gw_b" {
subnet_id = aws_subnet.public_subnet_us_east_1b.id
allocation_id = aws_eip.gw_b.id
}
# Create a new route table for the private subnets
# And make it route non-local traffic through the NAT gateway to the internet
resource "aws_route_table" "private_a" {
vpc_id = data.aws_vpc.vpc.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.gw_a.id
}
}
resource "aws_route_table" "private_b" {
vpc_id = data.aws_vpc.vpc.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.gw_b.id
}
}
# Explicitely associate the newly created route tables to the private subnets (so they don't default to the main route table)
resource "aws_route_table_association" "private_a" {
subnet_id = aws_subnet.private_subnet_us_east_1a.id
route_table_id = aws_route_table.private_a.id
}
resource "aws_route_table_association" "private_b" {
subnet_id = aws_subnet.private_subnet_us_east_1b.id
route_table_id = aws_route_table.private_b.id
}
# This is the group you need to edit if you want to restrict access to your application
resource "aws_security_group" "alb_sg" {
name = "audible-blog-us-${var.environment}-lb-sg"
description = "Internet to ALB Security Group"
vpc_id = data.aws_vpc.vpc.id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
name = "audible-blog-us-${var.environment}-lb-sg"
}
}
# Traffic to the ECS Cluster should only come from the ALB
resource "aws_security_group" "ecs_tasks_sg" {
name = "audible-blog-us-${var.environment}-ecs-sg"
description = "ALB to ECS Security Group"
vpc_id = data.aws_vpc.vpc.id
ingress {
from_port = 8080
to_port = 8080
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
security_groups = [ aws_security_group.alb_sg.id ]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
name = "audible-blog-us-${var.environment}-ecs-sg"
}
}
resource "aws_alb" "alb" {
name = "audible-blog-us-${var.environment}-alb"
internal = false
load_balancer_type = "application"
subnets = [ aws_subnet.public_subnet_us_east_1a.id, aws_subnet.public_subnet_us_east_1b.id ]
security_groups = [ aws_security_group.alb_sg.id ]
tags = {
name = "audible-blog-us-${var.environment}-alb"
environment = var.environment
}
}
resource "aws_alb_target_group" "target_group" {
name = "audible-blog-us-${var.environment}-target-group"
port = "8080"
protocol = "HTTP"
vpc_id = data.aws_vpc.vpc.id
target_type = "ip"
health_check {
enabled = true
path = "/blog"
interval = 30
matcher = "200-304"
port = "traffic-port"
unhealthy_threshold = 5
}
depends_on = [aws_alb.alb]
}
resource "aws_alb_listener" "web_app_http" {
load_balancer_arn = aws_alb.alb.arn
port = 80
protocol = "HTTP"
depends_on = [aws_alb_target_group.target_group]
default_action {
target_group_arn = aws_alb_target_group.target_group.arn
type = "forward"
}
}
output "networking_details" {
value = {
load_balancer_arn = aws_alb.alb.arn
load_balancer_target_group_arn = aws_alb_target_group.target_group.arn
subnets = [
aws_subnet.private_subnet_us_east_1a.id,
aws_subnet.private_subnet_us_east_1b.id
]
security_group = aws_security_group.ecs_tasks_sg.id
}
}
ECS Fargate module
module "permissions" {
source = "./permissions"
environment = var.environment
}
resource "aws_ecs_cluster" "cluster" {
name = "adl-blog-us-${var.environment}"
}
resource "aws_cloudwatch_log_group" "logs_group" {
name = "/ecs/adl-blog-us-next-${var.environment}"
retention_in_days = 90
}
resource "aws_ecs_task_definition" "task" {
family = "adl-blog-us-task-${var.environment}"
container_definitions = jsonencode([
{
name = "adl-blog-us-next"
image = "536299334720.dkr.ecr.us-east-1.amazonaws.com/adl-blog-us:latest"
portMappings = [
{
containerPort = 8080
hostPort = 8080
},
{
containerPort = 6379
hostPort = 6379
}
]
environment: [
{
"name": "ECS_TASK_FAMILY",
"value": "adl-blog-us-task-${var.environment}"
}
],
logConfiguration: {
logDriver: "awslogs",
options: {
awslogs-group: "/ecs/adl-blog-us-next-${var.environment}",
awslogs-region: "us-east-1",
awslogs-stream-prefix: "ecs"
}
},
healthCheck: {
retries: 3,
command: [
"CMD-SHELL",
"curl -sf http://localhost:8080/blog || exit 1"
],
timeout: 5,
interval: 30,
startPeriod: null
}
}
])
cpu = 256
memory = 512
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
execution_role_arn = module.permissions.task_definition_execution_role_arn
task_role_arn = module.permissions.task_definition_execution_role_arn
}
resource "aws_ecs_service" "service" {
name = "adl-blog-us-task-service-${var.environment}"
cluster = aws_ecs_cluster.cluster.id
deployment_controller {
type = "ECS"
}
deployment_maximum_percent = 200
deployment_minimum_healthy_percent = 50
task_definition = aws_ecs_task_definition.task.family
desired_count = 3
launch_type = "FARGATE"
network_configuration {
subnets = concat(
var.public_alb_networking_details.subnets,
[ var.private_networking_details.subnet.id ]
)
security_groups = [
var.public_alb_networking_details.security_group,
var.private_networking_details.security_group.id
]
assign_public_ip = true
}
load_balancer {
target_group_arn = var.public_alb_networking_details.load_balancer_target_group_arn
container_name = "adl-blog-us-next"
container_port = 8080
}
force_new_deployment = true
lifecycle {
ignore_changes = [desired_count]
}
depends_on = [
module.permissions
]
}
variable "private_networking_details" {}
variable "public_alb_networking_details" {}
variable "environment" {
type = string
}
Your container ports are 8080 and 6379. However, your target group says it's 80. So you have to double-check which ports you actually use on Fargate and adjust your TG accordingly.
There could be other issues as well which aren't yet apparent. For example, you are opening port 443, but there is no listener for that, so any attempt to use HTTPS will fail.
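A minimal sketch of those two points, reusing the resource names from the question: keep the target group port equal to the container port (8080 in the task definition) and give the already-open port 443 an HTTPS listener. The ssl_policy and certificate_arn values are placeholders, not values from the question.

# Target group port must match the containerPort used by the Fargate task (8080 here).
resource "aws_alb_target_group" "target_group" {
  name        = "audible-blog-us-${var.environment}-target-group"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = data.aws_vpc.vpc.id
  target_type = "ip"
  # health_check block from the question omitted for brevity
}

# The ALB security group opens 443, so add an HTTPS listener to go with it.
resource "aws_alb_listener" "web_app_https" {
  load_balancer_arn = aws_alb.alb.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-2016-08"   # placeholder policy
  certificate_arn   = "arn:aws:acm:..."             # placeholder ARN
  default_action {
    target_group_arn = aws_alb_target_group.target_group.arn
    type             = "forward"
  }
}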

Can't connect through Tableau on EC2 using HTTPS

I am running Tableau Server 2021.1.2 on an EC2 instance.
I can connect using the default public IP on port 80, and on port 8850 for the Tableau TSM UI, and the same using the hostname I defined. The only issue I have is that, despite following several guidelines, I can't connect using HTTPS.
I set up the ports on the security group, the load balancer, and the certificate. I waited for hours, since I had read that the SSL certificate could take more than half an hour, and still nothing.
I can connect using:
http://my_domain.domain
But not:
https://my_domain.domain
I receive the following error in the browser: Can't connect to the server https://my_domain.domain.
I run curl -i https://my_domain.domain
It returns:
curl: (7) Failed to connect to my_domain.domain port 443: Connection refused
The security group of my instance has the following ports open (you can see them in the tf too).
Here you have my tf setup.
I did the EC2 setup with:
resource "aws_instance" "tableau" {
ami = var.ami
instance_type = var.instance_type
associate_public_ip_address = true
key_name = var.key_name
subnet_id = compact(split(",", var.public_subnets))[0]
vpc_security_group_ids = [aws_security_group.tableau-sg.id]
root_block_device{
volume_size = var.volume_size
}
tags = {
Name = var.namespace
}
}
I created the load balancer setup using:
resource "aws_lb" "tableau-lb" {
name = "${var.namespace}-alb"
load_balancer_type = "application"
internal = false
subnets = compact(split(",", var.public_subnets))
security_groups = [aws_security_group.tableau-sg.id]
ip_address_type = "ipv4"
enable_cross_zone_load_balancing = true
lifecycle {
create_before_destroy = true
}
idle_timeout = 300
}
resource "aws_alb_listener" "https" {
depends_on = [aws_alb_target_group.target-group]
load_balancer_arn = aws_lb.tableau-lb.arn
protocol = "HTTPS"
port = "443"
ssl_policy = "my_ssl_policy"
certificate_arn = "arn:xxxx"
default_action {
target_group_arn = aws_alb_target_group.target-group.arn
type = "forward"
}
lifecycle {
ignore_changes = [
default_action.0.target_group_arn,
]
}
}
resource "aws_alb_target_group" "target-group" {
name = "${var.namespace}-group"
port = 80
protocol = "HTTP"
vpc_id = var.vpc_id
target_type = "instance"
health_check {
healthy_threshold = var.health_check_healthy_threshold
unhealthy_threshold = var.health_check_unhealthy_threshold
timeout = var.health_check_timeout
interval = var.health_check_interval
path = var.path
}
tags = {
Name = var.namespace
}
lifecycle {
create_before_destroy = false
}
depends_on = [aws_lb.tableau-lb]
}
resource "aws_lb_target_group_attachment" "tableau-attachment" {
target_group_arn = aws_alb_target_group.target-group.arn
target_id = aws_instance.tableau.id
port = 80
}
The security group:
resource "aws_security_group" "tableau-sg" {
name_prefix = "${var.namespace}-sg"
tags = {
Name = var.namespace
}
vpc_id = var.vpc_id
# HTTP from the load balancer
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# HTTP from the load balancer
ingress {
from_port = 8850
to_port = 8850
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# HTTP from the load balancer
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# 443 secure access from anywhere
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# Outbound internet access
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
lifecycle {
create_before_destroy = true
}
}
Also setup a hostname domain using:
resource "aws_route53_record" "tableau-record-dns" {
zone_id = var.route_53_zone_id
name = "example.hostname"
type = "A"
ttl = "300"
records = [aws_instance.tableau.public_ip]
}
resource "aws_route53_record" "tableau-record-dns-https" {
zone_id = var.route_53_zone_id
name = "asdf.example.hostname"
type = "CNAME"
ttl = "300"
records = ["asdf.acm-validations.aws."]
}
Finally solved the issue; it was related to the A record. I was assigning an IP there, and it's impossible to redirect to a specific IP with the load balancer that way. I redirected traffic to the ELB and it worked fine.
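For reference, a minimal sketch of that fix in Terraform, reusing the resource names from the question: an alias A record pointing at the load balancer instead of the instance's public IP.

resource "aws_route53_record" "tableau-record-dns" {
  zone_id = var.route_53_zone_id
  name    = "example.hostname"
  type    = "A"

  # Alias to the ALB instead of a static instance IP; no ttl/records with an alias record.
  alias {
    name                   = aws_lb.tableau-lb.dns_name
    zone_id                = aws_lb.tableau-lb.zone_id
    evaluate_target_health = true
  }
}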

Terraform - can't reach web server - instance out of service

I'm running the Terraform code below to deploy an EC2 instance inside a VPC to work as a web server, but for some reason I can't reach the website and can't SSH to it. I believe I have set the ingress and egress rules properly:
########Provider########
provider "aws" {
region = "us-west-2"
access_key = "[redacted]"
secret_key = "[redacted]"
}
########VPC########
resource "aws_vpc" "vpc1" {
cidr_block = "10.1.0.0/16"
tags = {
Name = "Production"
}
}
########Internet GW########
resource "aws_internet_gateway" "gw" {
vpc_id = aws_vpc.vpc1.id
}
########Route table########
resource "aws_route_table" "rt" {
vpc_id = aws_vpc.vpc1.id
route {
cidr_block = "0.0.0.0/24"
gateway_id = aws_internet_gateway.gw.id
}
route {
ipv6_cidr_block = "::/0"
gateway_id = aws_internet_gateway.gw.id
}
}
########Sub Net########
resource "aws_subnet" "subnet1" {
vpc_id = aws_vpc.vpc1.id
cidr_block = "10.1.0.0/24"
availability_zone = "us-west-2a"
map_public_ip_on_launch = "true"
tags = {
Name = "prod-subnet-1"
}
}
########RT assosiation########
resource "aws_route_table_association" "a" {
subnet_id = aws_subnet.subnet1.id
route_table_id = aws_route_table.rt.id
}
########Security Group########
resource "aws_security_group" "sec1" {
name = "allow_web"
description = "Allow web inbound traffic"
vpc_id = aws_vpc.vpc1.id
ingress {
description = "HTTP from VPC"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["10.1.0.0/16"]
}
#SSH access from anywhere
ingress {
description = "SSH from VPC"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "allow_web"
}
}
########Net Interface for the Instance########
#resource "aws_network_interface" "wsn" {
# subnet_id = aws_subnet.subnet1.id
# private_ips = ["10.0.1.50"]
# security_groups = [aws_security_group.sec1.id]
#}
########Load Balancer########
resource "aws_elb" "elb" {
name = "lb"
subnets = [aws_subnet.subnet1.id]
security_groups = [aws_security_group.sec1.id]
instances = [aws_instance.web1.id]
listener {
instance_port = 80
instance_protocol = "http"
lb_port = 80
lb_protocol = "http"
}
}
########EC2 Instance########
resource "aws_instance" "web1" {
ami = "ami-003634241a8fcdec0" #ubuntu 18.4
instance_type = "t2.micro"
availability_zone = "us-west-2a"
key_name = "main-key"
subnet_id = aws_subnet.subnet1.id
#network_interface {
# device_index = 0
# network_interface_id = aws_network_interface.wsn.id
#}
user_data = <<-EOF
#!/bin/bash
sudo apt update -y
sudo apt install apache2 -y
sudo systemctl start apache2
sudo bash -c 'echo Hello world!!! > /var/www/html/index.html'
EOF
tags = {
Name = "HelloWorld"
}
}
output "aws_elb_public_dns" {
value = aws_elb.elb.dns_name
}
The plan and the apply run fine, but in the load balancer the instance is "OutOfService".
What could be wrong here?
You are missing the security group on your instance: vpc_security_group_ids.
Consequently, you won't be able to SSH to it, nor will HTTP traffic be allowed from the outside.
Also, your route to the IGW is incorrect. It should be:
cidr_block = "0.0.0.0/0"
The same goes for the SG of your ELB, so that it allows traffic from the internet. It should be:
cidr_blocks = ["0.0.0.0/0"]
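Put together, a minimal sketch of those fixes against the resources from the question (the SG change is the one-line cidr_blocks edit above, and the user_data script is unchanged, so both are omitted here):

resource "aws_route_table" "rt" {
  vpc_id = aws_vpc.vpc1.id
  route {
    cidr_block = "0.0.0.0/0"   # was 0.0.0.0/24
    gateway_id = aws_internet_gateway.gw.id
  }
  route {
    ipv6_cidr_block = "::/0"
    gateway_id      = aws_internet_gateway.gw.id
  }
}

resource "aws_instance" "web1" {
  ami                    = "ami-003634241a8fcdec0"
  instance_type          = "t2.micro"
  availability_zone      = "us-west-2a"
  key_name               = "main-key"
  subnet_id              = aws_subnet.subnet1.id
  vpc_security_group_ids = [aws_security_group.sec1.id]   # attach the security group
  # user_data block from the question goes here, unchanged
  tags = {
    Name = "HelloWorld"
  }
}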

Error deleting Target Group: ResourceInUse when changing target ports in AWS through Terraform

I am currently working through the beta book "Terraform Up & Running, 2nd Edition". In chapter 2, I created an auto scaling group and a load balancer in AWS.
Now I made my backend server HTTP ports configurable. By default they listen on port 8080.
variable "server_port" {
…
default = 8080
}
resource "aws_launch_configuration" "example" {
…
user_data = <<-EOF
#!/bin/bash
echo "Hello, World" > index.html
nohup busybox httpd -f -p ${var.server_port} &
EOF
…
}
resource "aws_security_group" "instance" {
…
ingress {
from_port = var.server_port
to_port = var.server_port
…
}
}
The same port also needs to be configured in the application load balancer's target group.
resource "aws_lb_target_group" "asg" {
…
port = var.server_port
…
}
When my infrastructure is already deployed, for example with the configuration for the port set to 8080, and then I change the variable to 80 by running terraform apply --var server_port=80, the following error is reported:
> Error: Error deleting Target Group: ResourceInUse: Target group
> 'arn:aws:elasticloadbalancing:eu-central-1:…:targetgroup/terraform-asg-example/…'
> is currently in use by a listener or a rule status code: 400,
How can I refine my Terraform infrastructure definition to make this change possible? I suppose it might be related to a lifecycle option somewhere, but I didn't manage to figure it out yet.
For your reference I attach my whole infrastructure definition below:
provider "aws" {
region = "eu-central-1"
}
output "alb_location" {
value = "http://${aws_lb.example.dns_name}"
description = "The location of the load balancer"
}
variable "server_port" {
description = "The port the server will use for HTTP requests"
type = number
default = 8080
}
resource "aws_lb_listener_rule" "asg" {
listener_arn = aws_lb_listener.http.arn
priority = 100
condition {
field = "path-pattern"
values = ["*"]
}
action {
type = "forward"
target_group_arn = aws_lb_target_group.asg.arn
}
}
resource "aws_lb_target_group" "asg" {
name = "terraform-asg-example"
port = var.server_port
protocol = "HTTP"
vpc_id = data.aws_vpc.default.id
health_check {
path = "/"
protocol = "HTTP"
matcher = "200"
interval = 15
timeout = 3
healthy_threshold = 2
unhealthy_threshold = 2
}
}
resource "aws_lb_listener" "http" {
load_balancer_arn = aws_lb.example.arn
port = 80
protocol = "HTTP"
default_action {
type = "fixed-response"
fixed_response {
content_type = "text/plain"
message_body = "404: page not found"
status_code = 404
}
}
}
resource "aws_lb" "example" {
name = "terraform-asg-example"
load_balancer_type = "application"
subnets = data.aws_subnet_ids.default.ids
security_groups = [aws_security_group.alb.id]
}
resource "aws_security_group" "alb" {
name = "terraform-example-alb"
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_autoscaling_group" "example" {
launch_configuration = aws_launch_configuration.example.name
vpc_zone_identifier = data.aws_subnet_ids.default.ids
target_group_arns = [aws_lb_target_group.asg.arn]
health_check_type = "ELB"
min_size = 2
max_size = 10
tag {
key = "Name"
value = "terraform-asg-example"
propagate_at_launch = true
}
}
resource "aws_launch_configuration" "example" {
image_id = "ami-0085d4f8878cddc81"
instance_type = "t2.micro"
security_groups = [aws_security_group.instance.id]
user_data = <<-EOF
#!/bin/bash
echo "Hello, World" > index.html
nohup busybox httpd -f -p ${var.server_port} &
EOF
lifecycle {
create_before_destroy = true
}
}
resource "aws_security_group" "instance" {
name = "terraform-example-instance"
ingress {
from_port = var.server_port
to_port = var.server_port
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
data "aws_subnet_ids" "default" {
vpc_id = data.aws_vpc.default.id
}
data "aws_vpc" "default" {
default = true
}
From the issue link in the comment on Cannot rename ALB Target Group if Listener present:
Add a lifecycle rule to your target group so it becomes:
resource "aws_lb_target_group" "asg" {
name = "terraform-asg-example"
port = var.server_port
protocol = "HTTP"
vpc_id = data.aws_vpc.default.id
health_check {
path = "/"
protocol = "HTTP"
matcher = "200"
interval = 15
timeout = 3
healthy_threshold = 2
unhealthy_threshold = 2
}
lifecycle {
create_before_destroy = true
}
}
However, you will need to choose a method for changing the name of your target group as well. There are further discussions and suggestions on how to do this.
One possible solution is to simply use a GUID but ignore changes to the name:
resource "aws_lb_target_group" "asg" {
name = "terraform-asg-example-${substr(uuid(), 0, 3)}"
port = var.server_port
protocol = "HTTP"
vpc_id = data.aws_vpc.default.id
health_check {
path = "/"
protocol = "HTTP"
matcher = "200"
interval = 15
timeout = 3
healthy_threshold = 2
unhealthy_threshold = 2
}
lifecycle {
create_before_destroy = true
ignore_changes = [name]
}
}
Slightly simpler than @FGreg's solution: add a lifecycle policy and switch from name to name_prefix, which will prevent naming collisions.
resource "aws_lb_target_group" "asg" {
name_prefix = "terraform-asg-example"
port = var.server_port
protocol = "HTTP"
vpc_id = data.aws_vpc.default.id
lifecycle {
create_before_destroy = true
}
health_check {
path = "/"
protocol = "HTTP"
matcher = "200"
interval = 15
timeout = 3
healthy_threshold = 2
unhealthy_threshold = 2
}
}
No need for uuid or ignore_changes settings.
change name = "terraform-asg-example"
to
name_prefix = "asg-"
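Putting the two suggestions together, a minimal sketch: name_prefix plus create_before_destroy, so Terraform creates the replacement target group before detaching the old one (the provider limits name_prefix on a target group to six characters, hence the short prefix):

resource "aws_lb_target_group" "asg" {
  name_prefix = "asg-"
  port        = var.server_port
  protocol    = "HTTP"
  vpc_id      = data.aws_vpc.default.id

  lifecycle {
    create_before_destroy = true
  }

  health_check {
    path                = "/"
    protocol            = "HTTP"
    matcher             = "200"
    interval            = 15
    timeout             = 3
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
}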