I have a Terraform codebase that deploys a private EKS cluster, a bastion host, and other AWS services. I have also added a few security groups in Terraform. One of them, called bastionSG, allows inbound traffic from my home IP to the bastion host so that I can SSH onto that node, and that works fine.
However, I am initially unable to run kubectl from my bastion host, which is the node I use for Kubernetes development against the EKS cluster nodes. The reason is that my EKS cluster is private and only allows communication from nodes in the same VPC, so I need to add a security group rule that allows communication from my bastion host to the cluster control plane, which is where my security group bastionSG comes in.
So my routine now is: once Terraform has deployed everything, I find the automatically generated EKS security group and add bastionSG as an inbound rule to it through the AWS Console (UI), as shown in the image below.
I would like to NOT have to do this through the UI, as I am already using Terraform to deploy my entire infrastructure.
I know I can query an existing security group like this:
data "aws_security_group" "selectedSG" {
id = var.security_group_id
}
In this case, let's say selectedSG is the security group created by EKS once Terraform has completed the apply. I would then like to add bastionSG as an inbound rule to it without overwriting the rules that were added automatically.
UPDATE: EKS node group
resource "aws_eks_node_group" "flmd_node_group" {
cluster_name = var.cluster_name
node_group_name = var.node_group_name
node_role_arn = var.node_pool_role_arn
subnet_ids = [var.flmd_private_subnet_id]
instance_types = ["t2.small"]
scaling_config {
desired_size = 3
max_size = 3
min_size = 3
}
update_config {
max_unavailable = 1
}
remote_access {
ec2_ssh_key = "MyPemFile"
source_security_group_ids = [
var.allow_tls_id,
var.allow_http_id,
var.allow_ssh_id,
var.bastionSG_id
]
}
tags = {
"Name" = "flmd-eks-node"
}
}
As shown above, the EKS node group has the bastionSG security group in it, which I expected to allow the connection from my bastion host to the EKS control plane.
EKS Cluster
resource "aws_eks_cluster" "flmd_cluster" {
name = var.cluster_name
role_arn = var.role_arn
vpc_config {
subnet_ids =[var.flmd_private_subnet_id, var.flmd_public_subnet_id, var.flmd_public_subnet_2_id]
endpoint_private_access = true
endpoint_public_access = false
security_group_ids = [ var.bastionSG_id]
}
}
bastionSG_id is an output of the security group created below, which is passed into the code above as a variable.
BastionSG security group
resource "aws_security_group" "bastionSG" {
name = "Home to bastion"
description = "Allow SSH - Home to Bastion"
vpc_id = var.vpc_id
ingress {
description = "Home to bastion"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = [<MY HOME IP address>]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
tags = {
Name = "Home to bastion"
}
}
Let's start by creating a public security group.
################################################################################
# Create the Security Group
################################################################################
resource "aws_security_group" "public" {
vpc_id = local.vpc_id
name = format("${var.name}-${var.public_security_group_suffix}-SG")
description = format("${var.name}-${var.public_security_group_suffix}-SG")
dynamic "ingress" {
for_each = var.public_security_group_ingress
content {
cidr_blocks = lookup(ingress.value, "cidr_blocks", [])
ipv6_cidr_blocks = lookup(ingress.value, "ipv6_cidr_blocks", [])
from_port = lookup(ingress.value, "from_port", 0)
to_port = lookup(ingress.value, "to_port", 0)
protocol = lookup(ingress.value, "protocol", "-1")
}
}
dynamic "egress" {
for_each = var.public_security_group_egress
content {
cidr_blocks = lookup(egress.value, "cidr_blocks", [])
ipv6_cidr_blocks = lookup(egress.value, "ipv6_cidr_blocks", [])
from_port = lookup(egress.value, "from_port", 0)
to_port = lookup(egress.value, "to_port", 0)
protocol = lookup(egress.value, "protocol", "-1")
}
}
tags = merge(
{
"Name" = format(
"${var.name}-${var.public_security_group_suffix}-SG",
)
},
var.tags,
)
}
Next, create a private security group that allows inbound traffic from the public security group and outbound traffic to the ElastiCache and RDS security groups.
resource "aws_security_group" "private" {
vpc_id = local.vpc_id
name = format("${var.name}-${var.private_security_group_suffix}-SG")
description = format("${var.name}-${var.private_security_group_suffix}-SG")
ingress {
security_groups = [aws_security_group.public.id]
from_port = 0
to_port = 0
protocol = "-1"
}
dynamic "ingress" {
for_each = var.private_security_group_ingress
content {
cidr_blocks = lookup(ingress.value, "cidr_blocks", [])
ipv6_cidr_blocks = lookup(ingress.value, "ipv6_cidr_blocks", [])
from_port = lookup(ingress.value, "from_port", 0)
to_port = lookup(ingress.value, "to_port", 0)
protocol = lookup(ingress.value, "protocol", "-1")
}
}
dynamic "egress" {
for_each = var.private_security_group_egress
content {
cidr_blocks = lookup(egress.value, "cidr_blocks", [])
ipv6_cidr_blocks = lookup(egress.value, "ipv6_cidr_blocks", [])
from_port = lookup(egress.value, "from_port", 0)
to_port = lookup(egress.value, "to_port", 0)
protocol = lookup(egress.value, "protocol", "-1")
}
}
egress {
security_groups = [aws_security_group.elsaticache_private.id] # it communicates via network interfaces
from_port = 6379 # redis port
to_port = 6379
protocol = "tcp"
}
egress {
security_groups = [aws_security_group.rds_mysql_private.id]
from_port = 3306
to_port = 3306
protocol = "tcp"
}
tags = merge(
{
"Name" = format(
"${var.name}-${var.private_security_group_suffix}-SG"
)
},
var.tags,
)
depends_on = [aws_security_group.elsaticache_private, aws_security_group.rds_mysql_private]
}
The ElastiCache security group itself defines only an egress rule; its ingress from the private security group is added as a separate aws_security_group_rule resource, which resolves the circular dependency between the two groups. The same goes for the RDS security group.
resource "aws_security_group" "elsaticache_private" {
vpc_id = local.vpc_id
name = format("${var.name}-${var.private_security_group_suffix}-elasticache-SG")
description = format("${var.name}-${var.private_security_group_suffix}-elasticache-SG")
egress {
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
from_port = 0
to_port = 0
protocol = "-1"
}
tags = merge(
{
"Name" = format(
"${var.name}-${var.public_security_group_suffix}-elasticache-SG",
)
},
var.tags,
)
}
resource "aws_security_group_rule" "elsaticache_private_rule" {
type = "ingress"
from_port = 6379 # redis port
to_port = 6379
protocol = "tcp"
source_security_group_id = aws_security_group.private.id
security_group_id = aws_security_group.elsaticache_private.id
depends_on = [aws_security_group.private]
}
resource "aws_security_group" "rds_mysql_private" {
vpc_id = local.vpc_id
name = format("${var.name}-${var.private_security_group_suffix}-rds-mysql-SG")
description = format("${var.name}-${var.private_security_group_suffix}-rds-mysql-SG")
egress {
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
from_port = 0
to_port = 0
protocol = "-1"
}
tags = merge(
{
"Name" = format(
"${var.name}-${var.public_security_group_suffix}-rds-mysql-SG",
)
},
var.tags,
)
}
resource "aws_security_group_rule" "rds_mysql_private_rule" {
type = "ingress"
from_port = 3306 # mysql / aurora port
to_port = 3306
protocol = "tcp"
source_security_group_id = aws_security_group.private.id
security_group_id = aws_security_group.rds_mysql_private.id
depends_on = [aws_security_group.private]
}
There was a simpler solution.
Query AWS with a Terraform data source to get the ID of the security group, then use that ID to create an aws_security_group_rule resource in Terraform with the required inbound rule.
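A minimal sketch of that approach, reusing var.cluster_name and var.bastionSG_id from the question. The data source name flmd and the rule name bastion_to_cluster_api are made up for illustration, and port 443 assumes the bastion only needs to reach the Kubernetes API server over HTTPS:

```hcl
# Look up the cluster to find the security group that EKS created for it.
data "aws_eks_cluster" "flmd" {
  name = var.cluster_name
}

# Add one inbound rule to the EKS-managed security group.
# Assumption: kubectl only needs HTTPS (443) to the API server.
resource "aws_security_group_rule" "bastion_to_cluster_api" {
  description              = "Bastion host to EKS API server"
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = data.aws_eks_cluster.flmd.vpc_config[0].cluster_security_group_id
  source_security_group_id = var.bastionSG_id
}
```

Because aws_security_group_rule manages a single rule, the rules that EKS added to the group are left untouched. If the cluster is created in the same configuration, aws_eks_cluster.flmd_cluster.vpc_config[0].cluster_security_group_id should give the same ID without a data source.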
I have an Ubuntu EC2 instance in region eu-west-3, in a public subnet.
I am able to run curl against websites successfully, as well as ping IP addresses.
However, when I ping hostnames such as google.com, I get a DNS failure error.
Any idea what could be wrong in the configuration of my EC2 instance?
This is the content of /etc/resolv.conf
nameserver 127.0.0.53
options edns0
search eu-west-3.compute.internal
And the Terraform code:
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 2.21.0"
name = "${local.env_type}-vpc"
cidr = local.workspace["net_cidr"]
azs = ["eu-west-3a", "eu-west-3b"]
private_subnets = local.workspace["private_subnets"]
public_subnets = local.workspace["public_subnets"]
enable_nat_gateway = true
single_nat_gateway = true
reuse_nat_ips = false
enable_vpn_gateway = false
enable_dns_hostnames = true
create_database_subnet_group = true
enable_ipv6 = true
assign_ipv6_address_on_creation = true
private_subnet_assign_ipv6_address_on_creation = false
public_subnet_ipv6_prefixes = [0, 1]
private_subnet_ipv6_prefixes = [2, 3]
database_subnet_ipv6_prefixes = [4, 5]
database_subnets = local.workspace["database_subnets"]
tags = {
ManagedByTerraform = "true"
EnvironmentType = "${local.env_type}"
}
}
resource "aws_security_group" "public_instance" {
#vpc_id = module.vpc.vpc_id
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
# Note that opening to 0.0.0.0/0 can lead to security vulnerabilities.
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 8080
to_port = 8080
protocol = "tcp"
# Note that opening to 0.0.0.0/0 can lead to security vulnerabilities.
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_instance" "ubuntu_vm" {
ami = "ami-064736ff8301af3ee"
instance_type = "t3.medium"
key_name = "aws-key"
iam_instance_profile = aws_iam_instance_profile.instance_pcb_profile.id
security_groups = [
aws_security_group.public_instance.id
]
tags = {
Name = "pcb-${local.env_type}"
}
}
resource "aws_eip" "ip" {
vpc = true
instance = aws_instance.ubuntu_vm.id
}
I am currently learning Terraform and need help with the code below. I want to create a simple architecture: an Auto Scaling group of EC2 instances behind an Application Load Balancer. The setup completes, but when I try to access the application endpoint, it times out. When I tried to access the EC2 instances, I was unable to (because the EC2 instances were in a security group allowing access from the ALB security group only). I changed the instance security group ingress values and ran the user_data script manually, after which I reverted the changes to the instance security group to complete my setup.
My question is: why is my setup not working with the code below? Is it because access is being restricted by the load balancer security group, or is my launch configuration block incorrect?
data "aws_ami" "amazon-linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-kernel-5.10-hvm-2.0.20220426.0-x86_64-gp2"]
}
}
data "aws_availability_zones" "available" {
state = "available"
}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "3.14.0"
name = "main-vpc"
cidr = "10.0.0.0/16"
azs = data.aws_availability_zones.available.names
public_subnets = ["10.0.4.0/24","10.0.5.0/24","10.0.6.0/24"]
enable_dns_hostnames = true
enable_dns_support = true
}
resource "aws_launch_configuration" "TestLC" {
name_prefix = "Lab-Instance-"
image_id = data.aws_ami.amazon-linux.id
instance_type = "t2.nano"
key_name = "CloudformationKeyPair"
user_data = file("./user_data.sh")
security_groups = [aws_security_group.TestInstanceSG.id]
lifecycle {
create_before_destroy = true
}
}
resource "aws_autoscaling_group" "TestASG" {
min_size = 1
max_size = 3
desired_capacity = 2
launch_configuration = aws_launch_configuration.TestLC.name
vpc_zone_identifier = module.vpc.public_subnets
}
resource "aws_lb_listener" "TestListener"{
load_balancer_arn = aws_lb.TestLB.arn
port = "80"
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.TestTG.arn
}
}
resource "aws_lb" "TestLB" {
name = "Lab-App-Load-Balancer"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.TestLoadBalanceSG.id]
subnets = module.vpc.public_subnets
}
resource "aws_lb_target_group" "TestTG" {
name = "LabTargetGroup"
port = "80"
protocol = "HTTP"
vpc_id = module.vpc.vpc_id
}
resource "aws_autoscaling_attachment" "TestAutoScalingAttachment" {
autoscaling_group_name = aws_autoscaling_group.TestASG.id
lb_target_group_arn = aws_lb_target_group.TestTG.arn
}
resource "aws_security_group" "TestInstanceSG" {
name = "LAB-Instance-SecurityGroup"
ingress{
from_port = 80
to_port = 80
protocol = "tcp"
security_groups = [aws_security_group.TestLoadBalanceSG.id]
}
ingress{
from_port = 22
to_port = 22
protocol = "tcp"
security_groups = [aws_security_group.TestLoadBalanceSG.id]
}
egress{
from_port = 0
to_port = 0
protocol = "-1"
security_groups = [aws_security_group.TestLoadBalanceSG.id]
}
vpc_id = module.vpc.vpc_id
}
resource "aws_security_group" "TestLoadBalanceSG" {
name = "LAB-LoadBalancer-SecurityGroup"
ingress{
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress{
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress{
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
vpc_id = module.vpc.vpc_id
}
I am running Tableau Server 2021-1-2 on an EC2 instance.
I can connect using the default public IP on port 80, and also on port 8850 for the Tableau TSM UI. The same works using the hostname I defined. The only issue is that, despite following several guides, I can't connect using HTTPS.
I set up the ports on the security group, the load balancer, and the certificate. I waited for hours, since I read that the SSL certificate could take more than half an hour, and still nothing.
I can connect using:
http://my_domain.domain
But not:
https://my_domain.domain
I receive the following error in the browser: Can't connect to the server https://my_domain.domain.
When I run curl -i https://my_domain.domain, it returns:
curl: (7) Failed to connect to my_domain.domain port 443: Connection refused
The security group of my instance has the following ports (you can see them in the Terraform code too):
Here is my Terraform setup.
I did the EC2 setup with:
resource "aws_instance" "tableau" {
ami = var.ami
instance_type = var.instance_type
associate_public_ip_address = true
key_name = var.key_name
subnet_id = compact(split(",", var.public_subnets))[0]
vpc_security_group_ids = [aws_security_group.tableau-sg.id]
root_block_device{
volume_size = var.volume_size
}
tags = {
Name = var.namespace
}
}
I created the load balancer setup using:
resource "aws_lb" "tableau-lb" {
name = "${var.namespace}-alb"
load_balancer_type = "application"
internal = false
subnets = compact(split(",", var.public_subnets))
security_groups = [aws_security_group.tableau-sg.id]
ip_address_type = "ipv4"
enable_cross_zone_load_balancing = true
lifecycle {
create_before_destroy = true
}
idle_timeout = 300
}
resource "aws_alb_listener" "https" {
depends_on = [aws_alb_target_group.target-group]
load_balancer_arn = aws_lb.tableau-lb.arn
protocol = "HTTPS"
port = "443"
ssl_policy = "my_ssl_policy"
certificate_arn = "arn:xxxx"
default_action {
target_group_arn = aws_alb_target_group.target-group.arn
type = "forward"
}
lifecycle {
ignore_changes = [
default_action.0.target_group_arn,
]
}
}
resource "aws_alb_target_group" "target-group" {
name = "${var.namespace}-group"
port = 80
protocol = "HTTP"
vpc_id = var.vpc_id
target_type = "instance"
health_check {
healthy_threshold = var.health_check_healthy_threshold
unhealthy_threshold = var.health_check_unhealthy_threshold
timeout = var.health_check_timeout
interval = var.health_check_interval
path = var.path
}
tags = {
Name = var.namespace
}
lifecycle {
create_before_destroy = false
}
depends_on = [aws_lb.tableau-lb]
}
resource "aws_lb_target_group_attachment" "tableau-attachment" {
target_group_arn = aws_alb_target_group.target-group.arn
target_id = aws_instance.tableau.id
port = 80
}
The security group:
resource "aws_security_group" "tableau-sg" {
name_prefix = "${var.namespace}-sg"
tags = {
Name = var.namespace
}
vpc_id = var.vpc_id
# SSH access from anywhere
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# Tableau TSM UI access from anywhere
ingress {
from_port = 8850
to_port = 8850
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# HTTP access from anywhere
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# 443 secure access from anywhere
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# Outbound internet access
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
lifecycle {
create_before_destroy = true
}
}
I also set up a hostname using:
resource "aws_route53_record" "tableau-record-dns" {
zone_id = var.route_53_zone_id
name = "example.hostname"
type = "A"
ttl = "300"
records = [aws_instance.tableau.public_ip]
}
resource "aws_route53_record" "tableau-record-dns-https" {
zone_id = var.route_53_zone_id
name = "asdf.example.hostname"
type = "CNAME"
ttl = "300"
records = ["asdf.acm-validations.aws."]
}
Finally solved the issue: it was related to the A record. I was assigning the instance's IP there, so requests to the domain went straight to the instance and bypassed the load balancer, which is where the HTTPS listener lives. I pointed the record at the ELB instead and it worked fine.
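For illustration, an A record that targets the load balancer rather than the instance could be written as a Route 53 alias. This is a sketch that reuses the resource names from the question; whether the author used an alias or a CNAME is not stated:

```hcl
# A record that resolves to the load balancer instead of the instance IP.
resource "aws_route53_record" "tableau-record-dns" {
  zone_id = var.route_53_zone_id
  name    = "example.hostname"
  type    = "A"

  # An alias block takes the place of the ttl and records arguments.
  alias {
    name                   = aws_lb.tableau-lb.dns_name
    zone_id                = aws_lb.tableau-lb.zone_id
    evaluate_target_health = true
  }
}
```

Alias records take no ttl or records arguments; Route 53 resolves them to the load balancer's current addresses.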
My Terraform script is as follows (everything is in a VPC):
resource "aws_security_group" "cacheSecurityGroup" {
name = "${var.devname}-${var.namespace}-${var.stage}-RedisCache-SecurityGroup"
vpc_id = var.vpc.vpc_id
tags = var.default_tags
ingress {
protocol = "tcp"
from_port = 6379
to_port = 6379
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
}
resource "aws_elasticache_parameter_group" "usagemonitorCacheParameterGroup" {
name = "${var.devname}${var.namespace}${var.stage}-usagemonitor-cache-parameterGroup"
family = "redis6.x"
}
resource "aws_elasticache_subnet_group" "redis_subnet_group" {
name = "${var.devname}${var.namespace}${var.stage}-usagemonitor-cache-subnetGroup"
subnet_ids = var.vpc.database_subnets
}
resource "aws_elasticache_replication_group" "replication_group_usagemonitor" {
replication_group_id = "${var.devname}${var.namespace}${var.stage}-usagemonitor-cache"
replication_group_description = "Replication group for Usagemonitor"
node_type = "cache.t2.micro"
number_cache_clusters = 2
parameter_group_name = aws_elasticache_parameter_group.usagemonitorCacheParameterGroup.name
subnet_group_name = aws_elasticache_subnet_group.redis_subnet_group.name
#security_group_names = [aws_elasticache_security_group.bar.name]
automatic_failover_enabled = true
at_rest_encryption_enabled = true
port = 6379
}
If I uncomment the line
#security_group_names = [aws_elasticache_security_group.bar.name]
I get the following error:
Error: Error creating Elasticache Replication Group: InvalidParameterCombination: Use of cache security groups is not permitted along with cache subnet group and/or security group Ids.
status code: 400, request id: 4e70e86d-b868-45b3-a1d2-88ab652dc85e
I read that we don't have to use aws_elasticache_security_group if all resources are inside a VPC. What is the correct way to assign security groups to aws_elasticache_replication_group? Using subnets? How?
I do something like this; I believe it is the best way to assign the required configuration:
resource "aws_security_group" "redis" {
name_prefix = "${var.name_prefix}-redis-"
vpc_id = var.vpc_id
lifecycle {
create_before_destroy = true
}
}
resource "aws_elasticache_replication_group" "redis" {
...
engine = "redis"
subnet_group_name = aws_elasticache_subnet_group.redis.name
security_group_ids = concat(var.security_group_ids, [aws_security_group.redis.id])
}
Your subnet group basically includes all the private or public subnets from your VPC where the ElastiCache replication group is going to be created.
In general, use security group IDs instead of names.
I have written a Terraform module that definitely works; if you are interested, it is available with examples at https://github.com/umotif-public/terraform-aws-elasticache-redis.
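Applied to the replication group from the question, that might look like the sketch below. It assumes the cacheSecurityGroup resource defined there is the group to attach; security_group_ids replaces the commented-out security_group_names line, and the other arguments are copied unchanged from the question:

```hcl
resource "aws_elasticache_replication_group" "replication_group_usagemonitor" {
  replication_group_id          = "${var.devname}${var.namespace}${var.stage}-usagemonitor-cache"
  replication_group_description = "Replication group for Usagemonitor"
  node_type                     = "cache.t2.micro"
  number_cache_clusters         = 2
  parameter_group_name          = aws_elasticache_parameter_group.usagemonitorCacheParameterGroup.name
  subnet_group_name             = aws_elasticache_subnet_group.redis_subnet_group.name

  # Assumption: attach the VPC security group from the question by ID.
  security_group_ids = [aws_security_group.cacheSecurityGroup.id]

  automatic_failover_enabled = true
  at_rest_encryption_enabled = true
  port                       = 6379
}
```

The subnet group decides where the nodes are placed; the VPC security group IDs decide who may connect to them.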