I'm trying to use an ElastiCache Redis cluster from a Fargate service, but am getting address resolution issues: "ENOTFOUND" errors from "getaddrinfo" for the cluster endpoint in my code, and NXDOMAIN errors if I try to do an nslookup from inside the container.
The cluster is set up as follows:
resource "aws_elasticache_cluster" "redis_service" {
cluster_id = "${terraform.workspace}-${local.app_name}-redis-service"
engine = "redis"
node_type = "cache.t4g.micro"
num_cache_nodes = 1
parameter_group_name = "default.redis6.x"
engine_version = "6.2"
port = 6379
subnet_group_name = aws_elasticache_subnet_group.redis_subnet.name
security_group_ids = [aws_security_group.redis.id]
}
resource "aws_security_group" "redis" {
vpc_id = aws_vpc.ecs-vpc.id
name = "${terraform.workspace}-redis"
ingress {
from_port = 0
to_port = 0
protocol = "-1"
self = true
cidr_blocks = ["0.0.0.0/0"] # Originally had this restricting to the security group that my fargate service is in, but made it less restrictive trying to debug the issue
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
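For reference, the more restrictive rule I originally had looked roughly like this (a sketch; aws_security_group.fargate_service is a placeholder name for the service's security group, which isn't shown here):
resource "aws_security_group_rule" "redis_from_service" {
  type              = "ingress"
  from_port         = 6379
  to_port           = 6379
  protocol          = "tcp"
  security_group_id = aws_security_group.redis.id
  # Placeholder: the actual security group attached to the Fargate service
  source_security_group_id = aws_security_group.fargate_service.id
}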
### AZ 1 (and similar for other AZs)
resource "aws_subnet" "redis1" {
vpc_id = aws_vpc.ecs-vpc.id
cidr_block = "10.0.40.0/24"
availability_zone = "eu-west-2a"
}
resource "aws_route_table" "redis1" {
vpc_id = aws_vpc.ecs-vpc.id
}
resource "aws_route_table_association" "redis1" {
route_table_id = aws_route_table.redis1.id
subnet_id = aws_subnet.redis1.id
}
resource "aws_route" "redis1" {
destination_cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.subnet1.id
route_table_id = aws_route_table.redis1.id
}
The Fargate service I'm trying to connect from is in the same VPC.
I've checked through the AWS documentation on connecting to ElastiCache clusters, but haven't spotted anything my setup is missing.
The VPC Reachability Analyzer seems to confirm that the cluster should be reachable from the service (using the relevant ENIs).
Any ideas for fixes or debugging steps?
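One thing I still intend to rule out is the VPC's DNS attributes, since the cluster endpoint has to be resolved through the VPC's Amazon-provided resolver; with enable_dns_support switched off, lookups from inside the VPC can fail. A sketch against the aws_vpc.ecs-vpc resource referenced above (the CIDR is an assumption, as the resource isn't shown):
resource "aws_vpc" "ecs-vpc" {
  cidr_block           = "10.0.0.0/16" # assumption; not shown in the question
  enable_dns_support   = true          # needed for the Amazon-provided DNS resolver
  enable_dns_hostnames = true
}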
Related
Is it possible to launch multiple EC2 instances from Terraform using a single VPC? I'm building something that requires multiple instances to be launched in the same region, and I'm doing all of this with Terraform. But there's a limit in AWS: only 5 VPCs are allowed per region. Until now, each time I need to launch an instance I have created a separate VPC for it in Terraform. Below is the code for reference:
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 3.0"
}
}
}
# Configure the AWS Provider
provider "aws" {
region = "us-east-2"
access_key = "XXXXXXXXXXXXXXXXX"
secret_key = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
}
# 1. Create vpc
resource "aws_vpc" "prod-vpc" {
cidr_block = "10.0.0.0/16"
tags = {
Name = "production"
}
}
# 2. Create Internet Gateway
resource "aws_internet_gateway" "gw" {
vpc_id = aws_vpc.prod-vpc.id
}
# 3. Create Custom Route Table
resource "aws_route_table" "prod-route-table" {
vpc_id = aws_vpc.prod-vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.gw.id
}
route {
ipv6_cidr_block = "::/0"
gateway_id = aws_internet_gateway.gw.id
}
tags = {
Name = "Prod"
}
}
# 4. Create a Subnet
resource "aws_subnet" "subnet-1" {
vpc_id = aws_vpc.prod-vpc.id
cidr_block = "10.0.1.0/24"
availability_zone = "us-east-2a"
tags = {
Name = "prod-subnet"
}
}
# 5. Associate subnet with Route Table
resource "aws_route_table_association" "a" {
subnet_id = aws_subnet.subnet-1.id
route_table_id = aws_route_table.prod-route-table.id
}
# 6. Create Security Group to allow port 22,80,443
resource "aws_security_group" "allow_web" {
name = "allow_web_traffic"
description = "Allow Web inbound traffic"
vpc_id = aws_vpc.prod-vpc.id
ingress {
description = "HTTPS"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "self"
from_port = 8000
to_port = 8000
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "HTTP"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "SSH"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "allow_web"
}
}
# 7. Create a network interface with an ip in the subnet that was created in step 4
resource "aws_network_interface" "web-server-nic" {
subnet_id = aws_subnet.subnet-1.id
private_ips = ["10.0.1.50"]
security_groups = [aws_security_group.allow_web.id]
}
# 8. Assign an elastic IP to the network interface created in step 7
resource "aws_eip" "one" {
vpc = true
network_interface = aws_network_interface.web-server-nic.id
associate_with_private_ip = "10.0.1.50"
depends_on = [aws_internet_gateway.gw]
}
output "server_public_ip" {
value = aws_eip.one.public_ip
}
# 9. Create Ubuntu server and install/enable apache2
resource "aws_instance" "web-server-instance" {
ami = var.AMI_ID
instance_type = "g4dn.xlarge"
availability_zone = "us-east-2a"
key_name = "us-east-2"
network_interface {
device_index = 0
network_interface_id = aws_network_interface.web-server-nic.id
}
root_block_device {
volume_size = "200"
}
iam_instance_profile = aws_iam_instance_profile.training_profile.name
depends_on = [aws_eip.one]
user_data = <<-EOF
#!/bin/bash
python3 /home/ubuntu/setting_instance.py
EOF
tags = {
Name = var.INSTANCE_NAME
}
}
The only downside to this code is that it creates a separate VPC every time I create an instance. I read in a Stack Overflow post that we can import an existing VPC using the terraform import command (e.g. terraform import aws_vpc.prod-vpc <vpc-id>). Along with the VPC, I had to import the Internet Gateway and Route Table as well (it was throwing an error otherwise). But then I wasn't able to access the instance using SSH, and the commands in the user_data part didn't execute (setting_instance.py just sends a Firebase notification once the instance starts; that's its only purpose).
Beyond the VPC, I'd also like to know if I can reuse the other resources to the fullest extent possible.
I'm new to Terraform and AWS. Any suggestions on the above code are welcome.
EDIT: Instances are created one at a time as needed, i.e., whenever there is a need for a new instance I run this code. In the current scenario, if there are already 5 instances running in a region, I won't be able to use this code to create a 6th instance in that region when demand arises.
If, as you say, they would be exactly the same, the easiest way would be to use count, which indicates how many instances you want to have. For that you can introduce a new variable:
variable "number_of_instance" {
default = 1
}
and then
resource "aws_instance" "web-server-instance" {
count = var.number_of_instance
ami = var.AMI_ID
instance_type = "g4dn.xlarge"
availability_zone = "us-east-2a"
key_name = "us-east-2"
network_interface {
device_index = 0
network_interface_id = aws_network_interface.web-server-nic.id
}
root_block_device {
volume_size = "200"
}
iam_instance_profile = aws_iam_instance_profile.training_profile.name
depends_on = [aws_eip.one]
user_data = <<-EOF
#!/bin/bash
python3 /home/ubuntu/setting_instance.py
EOF
tags = {
Name = var.INSTANCE_NAME
}
}
All this must be managed by the same state file, not fully separate state files, as otherwise you will again end up with duplicates of the VPC. You then only change number_of_instance to what you want (e.g. terraform apply -var="number_of_instance=6"). For a more resilient solution, you would use an autoscaling group for the instances, as sketched below.
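A minimal sketch of that autoscaling approach, reusing the question's subnet and variables (the resource name web and the fixed sizing are assumptions):
resource "aws_launch_template" "web" {
  image_id      = var.AMI_ID
  instance_type = "g4dn.xlarge"
  key_name      = "us-east-2"
  # Launch templates expect user_data to be base64-encoded
  user_data = base64encode(<<-EOF
    #!/bin/bash
    python3 /home/ubuntu/setting_instance.py
  EOF
  )
}
resource "aws_autoscaling_group" "web" {
  desired_capacity    = var.number_of_instance
  min_size            = var.number_of_instance
  max_size            = var.number_of_instance
  vpc_zone_identifier = [aws_subnet.subnet-1.id]
  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
  tag {
    key                 = "Name"
    value               = var.INSTANCE_NAME
    propagate_at_launch = true
  }
}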
I'm a bit new to Terraform and need some help working out what the issue is here. It creates the corresponding resources, but when connecting to the endpoint, I get a timeout. I noticed the security group isn't actually being created, but I'm not sure why. Any help would be appreciated.
configuration:
provider "aws" {
region = "us-west-2"
}
resource "aws_elasticache_cluster" "example" {
cluster_id = "cluster-example"
engine = "redis"
node_type = "cache.m4.large"
num_cache_nodes = 1
parameter_group_name = "default.redis3.2"
engine_version = "3.2.10"
port = 6379
}
resource "aws_security_group" "example" {
name = "example"
description = "Used by the example Redis cluster"
vpc_id = "${aws_vpc.example.id}"
ingress {
description = "TLS from VPC"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = [aws_vpc.example.cidr_block]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
}
resource "aws_vpc" "example" {
cidr_block = "10.0.0.0/16"
tags = {
Name = "example"
}
}
resource "aws_subnet" "example" {
vpc_id = "${aws_vpc.example.id}"
cidr_block = "10.0.0.0/20"
tags = {
Name = "example"
}
}
resource "aws_elasticache_subnet_group" "example" {
name = "example"
description = "Example subnet group"
subnet_ids = ["${aws_subnet.example.id}"]
}
connection to endpoint:
import os
import redis
ENDPOINT = os.environ.get('REDIS_HOST')
client = redis.Redis(host=ENDPOINT, port=6379, db=0)
client.ping()
(passwordless cluster)
EDIT:
I call the endpoint in python on my local machine.
You can't access an ElastiCache cluster from outside of AWS directly, as it can only be accessed from within its VPC. You must use a VPN, Direct Connect, or an SSH tunnel if you want to connect from your home network.
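Separately, note that the aws_elasticache_cluster in the question never references the subnet group or the security group, so they go unused (and the SG only opens 443, while Redis listens on 6379). Wiring them up would look roughly like this:
resource "aws_elasticache_cluster" "example" {
  cluster_id           = "cluster-example"
  engine               = "redis"
  node_type            = "cache.m4.large"
  num_cache_nodes      = 1
  parameter_group_name = "default.redis3.2"
  engine_version       = "3.2.10"
  port                 = 6379
  # Place the cluster in the example VPC and attach the example SG
  subnet_group_name  = aws_elasticache_subnet_group.example.name
  security_group_ids = [aws_security_group.example.id]
}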
I'm trying to set up a dev environment: 1x private subnet, 1x public subnet in the dev VPC; a Postgres RDS instance in the private subnet; each subnet's resources in their own security group. The source RDS instance is in the prod VPC. I have created a peering connection, and the CIDRs of the two VPCs do not overlap.
I am getting
Error: Error creating DB Instance: InvalidParameterCombination: The DB instance and EC2 security group are in different VPCs. The DB instance is in prod-vpc and the EC2 security group is in dev-vpc
Here are my Terraform definitions. I have also added the other peer's relevant CIDRs to the route tables of each peer VPC. The source RDS and prod VPC were both created in a separate process and already exist outside of this Terraform process.
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "2.77.0"
name = "dev-vpc"
cidr = "192.168.0.0/16"
azs = ["us-west-2a"]
enable_dns_hostnames = true
enable_dns_support = true
}
module "keypair" {
source = "git::https://github.com/rhythmictech/terraform-aws-secretsmanager-keypair"
name_prefix = "ec2-ssh"
description = "SSH keypair for instances"
}
resource "aws_security_group" "dev-sg-pub" {
vpc_id = module.vpc.vpc_id
ingress {
from_port = 5432 # testing
to_port = 5432 # testing
protocol = "tcp"
cidr_blocks = ["192.168.1.0/28","192.168.2.0/24"]
self = true
}
egress {
from_port = 5432 # testing
to_port = 5432 # testing
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_security_group" "dev-sg-priv" {
vpc_id = module.vpc.vpc_id
ingress {
from_port = 5432 # testing
to_port = 5432 # testing
protocol = "tcp"
cidr_blocks = ["192.168.1.0/28", "192.168.2.0/24"]
security_groups = ["sg-xxxxxxxxxxxxxxx"] # the pub subnet's sg
self = true
}
egress {
from_port = 5432 # testing
to_port = 5432 # testing
protocol = "tcp"
cidr_blocks = ["192.168.1.0/28", "192.168.2.0/24"]
}
}
resource "aws_subnet" "dev-subnet-pub" {
vpc_id = module.vpc.vpc_id
cidr_block = "192.168.1.0/28"
tags = {
Name = "dev-subnet-pub"
Terraform = "true"
Environment = "dev"
}
}
resource "aws_subnet" "dev-subnet-priv" {
vpc_id = module.vpc.vpc_id
cidr_block = "192.168.2.0/24"
tags = {
Name = "dev-subnet-priv"
Terraform = "true"
Environment = "dev"
}
}
resource "aws_vpc_peering_connection" "dev-peer-conn" {
peer_vpc_id = "vpc-xxxxxxxxxxxxxxa"
vpc_id = module.vpc.vpc_id
auto_accept = true
}
resource "aws_db_instance" "dev-replica" {
name = "dev-replica"
identifier = "dev-replica"
replicate_source_db = "arn:aws:rds:us-west-2:9999999999:db:tf-xxxxxxxx"
instance_class = "db.t3.small"
apply_immediately = false
publicly_accessible = false
skip_final_snapshot = true
vpc_security_group_ids = [aws_security_group.dev-sg-priv.id, "sg-xxxxxxxxxxx"]
depends_on = [aws_vpc_peering_connection.dev-peer-conn]
}
You can't do this. Security groups are VPC-scoped, and your RDS instance must use a security group from the VPC it is located in.
Since you peered your VPCs, you can still reference the prod security group across VPCs, but only as a rule source inside aws_security_group.dev-sg-priv, not in the instance's vpc_security_group_ids.
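A sketch of what that looks like (keeping the question's sg-xxxxxxxxxxx placeholder for the prod security group):
resource "aws_security_group" "dev-sg-priv" {
  vpc_id = module.vpc.vpc_id
  ingress {
    from_port = 5432
    to_port   = 5432
    protocol  = "tcp"
    # The peer VPC's SG is valid here as a rule *source* over the peering
    security_groups = ["sg-xxxxxxxxxxx"]
    self            = true
  }
}
resource "aws_db_instance" "dev-replica" {
  # ...other arguments unchanged from the question...
  # Only security groups from the dev VPC may be attached
  vpc_security_group_ids = [aws_security_group.dev-sg-priv.id]
}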
I have a setup via Terraform which includes a VPC, a public subnet, and an EC2 instance with a security group. I am trying to ping the EC2 instance but get timeouts.
A few things I've tried to ensure:
the EC2 is in the subnet, and the subnet is routed to the internet via the gateway
the EC2 has a security group allowing all traffic both ways
the EC2 has an elastic IP
The VPC has an ACL that is attached to the subnet and allows all traffic both ways
I'm not sure what I missed here.
My tf file looks like (edited to reflect latest changes):
resource "aws_vpc" "foobar" {
cidr_block = "10.0.0.0/16"
}
resource "aws_internet_gateway" "foobar_gateway" {
vpc_id = aws_vpc.foobar.id
}
/*
Public subnet
*/
resource "aws_subnet" "foobar_subnet" {
vpc_id = aws_vpc.foobar.id
cidr_block = "10.0.1.0/24"
}
resource "aws_route_table" "foobar_routetable" {
vpc_id = aws_vpc.foobar.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.foobar_gateway.id
}
}
resource "aws_route_table_association" "foobar_routetable_assoc" {
subnet_id = aws_subnet.foobar_subnet.id
route_table_id = aws_route_table.foobar_routetable.id
}
/*
Web
*/
resource "aws_security_group" "web" {
name = "vpc_web"
vpc_id = aws_vpc.foobar.id
ingress {
protocol = -1
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
egress {
protocol = -1
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_network_acl" "main" {
vpc_id = aws_vpc.foobar.id
subnet_ids = [aws_subnet.foobar_subnet.id]
egress {
protocol = -1
rule_no = 100
action = "allow"
cidr_block = "0.0.0.0/0"
from_port = 0
to_port = 0
}
ingress {
protocol = -1
rule_no = 100
action = "allow"
cidr_block = "0.0.0.0/0"
from_port = 0
to_port = 0
}
}
resource "aws_instance" "web-1" {
ami = "ami-0323c3dd2da7fb37d"
instance_type = "t2.micro"
subnet_id = aws_subnet.foobar_subnet.id
associate_public_ip_address = true
}
resource "aws_eip" "web-1" {
instance = aws_instance.web-1.id
vpc = true
}
Why can I not ping my EC2 instance when I've set up the VPC and EC2 via Terraform?
Why are you adding the self parameter in your security group rule? The Terraform docs state that "If true, the security group itself will be added as a source to this ingress rule", which basically means that only that security group can access the instance. Please remove that and try again.
EDIT: see comments below for steps that fixed the problem
Allowing all traffic through the security group would not enable ping to the instance. You need to add a specific security rule allowing ICMP; the original answer showed a screenshot of that rule, and a Terraform sketch of it follows below.
Remember that AWS has made this rule separate to ensure that you know what you are doing. Being able to ping the instance from anywhere in the world leaves it discoverable by people scanning IP address ranges.
Hence, it is advisable to change this rule carefully.
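In Terraform terms, the rule from the missing screenshot would look roughly like this, attached to the question's aws_security_group.web:
resource "aws_security_group_rule" "allow_ping" {
  type              = "ingress"
  protocol          = "icmp"
  from_port         = -1 # all ICMP types
  to_port           = -1 # all ICMP codes
  cidr_blocks       = ["0.0.0.0/0"] # consider narrowing this, per the warning above
  security_group_id = aws_security_group.web.id
}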
I want to create two EC2 instances in a VPC via Terraform. With my code I can easily deploy the VPC, security group, and subnet, but the instance section fails with an error. Can you help me resolve this issue? Note that the terraform plan command executes successfully.
provider "aws" {
region = "us-west-2"
access_key = "xxxxxxxx"
secret_key = "xxxxxxxxxxxxx"
}
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
instance_tenancy = "default"
}
resource "aws_subnet" "public" {
vpc_id = "${aws_vpc.main.id}"
cidr_block = "10.0.1.0/24"
}
resource "aws_security_group" "allow_ssh" {
name = "allow_ssh"
description = "Allow ssh inbound traffic"
vpc_id = "${aws_vpc.main.id}"
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_instance" "instance"{
ami = "ami-0fc025e3171c5a1bf"
instance_type = "t2.micro"
vpc_security_group_ids = ["${aws_security_group.allow_ssh.id}"]
subnet_id = "${aws_subnet.public.id}"
}
output "vpc_id" {
value = "${aws_vpc.main.id}"
}
output "subnet_id" {
value = "${aws_subnet.public.id}"
}
output "vpc_security_group_id" {
value = "${aws_security_group.allow_ssh.id}"
}
Error (posted as a screenshot in the original; not reproduced here):
ami-0fc025e3171c5a1bf is for the arm64 architecture and will not work with a t2.micro. If you need an ARM platform, you will need to use an instance type in the a1 family.
Otherwise, you can use the x86-64 equivalent of Ubuntu Server 18.04 LTS (HVM), SSD Volume Type: ami-0d1cd67c26f5fca19, as sketched below.
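A sketch of the question's instance with the x86-64 AMI swapped in:
resource "aws_instance" "instance" {
  ami                    = "ami-0d1cd67c26f5fca19" # Ubuntu 18.04 LTS, x86-64
  instance_type          = "t2.micro"
  vpc_security_group_ids = ["${aws_security_group.allow_ssh.id}"]
  subnet_id              = "${aws_subnet.public.id}"
}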