AWS ElastiCache Redis cluster configuration

I'm a bit new to Terraform and need some help figuring out what's wrong with this. It creates the corresponding resources, but when connecting to the endpoint I get a timeout. I noticed the security group isn't actually being created, but I'm not sure why. Any help would be appreciated.
configuration:
provider "aws" {
region = "us-west-2"
}
resource "aws_elasticache_cluster" "example" {
cluster_id = "cluster-example"
engine = "redis"
node_type = "cache.m4.large"
num_cache_nodes = 1
parameter_group_name = "default.redis3.2"
engine_version = "3.2.10"
port = 6379
}
resource "aws_security_group" "example" {
name = "example"
description = "Used by the example Redis cluster"
vpc_id = "${aws_vpc.example.id}"
ingress {
description = "TLS from VPC"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = [aws_vpc.example.cidr_block]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
}
resource "aws_vpc" "example" {
cidr_block = "10.0.0.0/16"
tags = {
Name = "example"
}
}
resource "aws_subnet" "example" {
vpc_id = "${aws_vpc.example.id}"
cidr_block = "10.0.0.0/20"
tags = {
Name = "example"
}
}
resource "aws_elasticache_subnet_group" "example" {
name = "example"
description = "Example subnet group"
subnet_ids = ["${aws_subnet.example.id}"]
}
connection to endpoint:
import os
import redis
ENDPOINT = os.environ.get('REDIS_HOST')
client = redis.Redis(host=ENDPOINT, port=6379, db=0)
client.ping()
(passwordless cluster)
EDIT:
I call the endpoint in python on my local machine.

You can't access an ElastiCache cluster from outside of AWS directly, as it can only be reached from within its VPC. You must use a VPN, Direct Connect, or an SSH tunnel if you want to connect from your home network.
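For the SSH tunnel option, here is a minimal sketch of a bastion host in the same VPC. The AMI ID, key name, and the internet gateway / public subnet it relies on are all hypothetical here; the configuration above does not define them.

# Hypothetical jump host used only to tunnel into the VPC.
resource "aws_security_group" "bastion" {
  name   = "bastion-ssh"
  vpc_id = aws_vpc.example.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"] # restrict to your own IP in practice
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "bastion" {
  ami                         = "ami-xxxxxxxx" # placeholder: any small Linux AMI
  instance_type               = "t3.micro"
  subnet_id                   = aws_subnet.example.id
  associate_public_ip_address = true     # assumes an internet gateway and public route, not shown above
  key_name                    = "my-key" # hypothetical key pair
  vpc_security_group_ids      = [aws_security_group.bastion.id]
}

# The cluster's security group must also allow TCP 6379 from this bastion.
# Then, from your local machine, forward the Redis port through the bastion:
#   ssh -i my-key.pem -L 6379:<cluster-endpoint>:6379 ec2-user@<bastion-public-ip>
# and point the Python client at host "localhost", port 6379.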

Related

How to launch multiple AWS EC2 instances from a single VPC using Terraform?

Is it possible to launch multiple EC2 instances from Terraform using a single VPC? I'm building something that requires multiple instances launched in the same region, and I'm doing all of this with Terraform. But there's a limit in AWS: only 5 VPCs are allowed per region by default. What I've been doing until now is creating a separate VPC each time I need to launch an instance. Below is the code for reference:
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 3.0"
}
}
}
# Configure the AWS Provider
provider "aws" {
region = "us-east-2"
access_key = "XXXXXXXXXXXXXXXXX"
secret_key = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
}
# 1. Create vpc
resource "aws_vpc" "prod-vpc" {
cidr_block = "10.0.0.0/16"
tags = {
Name = "production"
}
}
# 2. Create Internet Gateway
resource "aws_internet_gateway" "gw" {
vpc_id = aws_vpc.prod-vpc.id
}
# 3. Create Custom Route Table
resource "aws_route_table" "prod-route-table" {
vpc_id = aws_vpc.prod-vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.gw.id
}
route {
ipv6_cidr_block = "::/0"
gateway_id = aws_internet_gateway.gw.id
}
tags = {
Name = "Prod"
}
}
# 4. Create a Subnet
resource "aws_subnet" "subnet-1" {
vpc_id = aws_vpc.prod-vpc.id
cidr_block = "10.0.1.0/24"
availability_zone = "us-east-2a"
tags = {
Name = "prod-subnet"
}
}
# 5. Associate subnet with Route Table
resource "aws_route_table_association" "a" {
subnet_id = aws_subnet.subnet-1.id
route_table_id = aws_route_table.prod-route-table.id
}
# 6. Create Security Group to allow port 22,80,443
resource "aws_security_group" "allow_web" {
name = "allow_web_traffic"
description = "Allow Web inbound traffic"
vpc_id = aws_vpc.prod-vpc.id
ingress {
description = "HTTPS"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "self"
from_port = 8000
to_port = 8000
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "HTTP"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "SSH"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "allow_web"
}
}
# 7. Create a network interface with an ip in the subnet that was created in step 4
resource "aws_network_interface" "web-server-nic" {
subnet_id = aws_subnet.subnet-1.id
private_ips = ["10.0.1.50"]
security_groups = [aws_security_group.allow_web.id]
}
# 8. Assign an elastic IP to the network interface created in step 7
resource "aws_eip" "one" {
vpc = true
network_interface = aws_network_interface.web-server-nic.id
associate_with_private_ip = "10.0.1.50"
depends_on = [aws_internet_gateway.gw]
}
output "server_public_ip" {
value = aws_eip.one.public_ip
}
# 9. Create Ubuntu server and install/enable apache2
resource "aws_instance" "web-server-instance" {
ami = var.AMI_ID
instance_type = "g4dn.xlarge"
availability_zone = "us-east-2a"
key_name = "us-east-2"
network_interface {
device_index = 0
network_interface_id = aws_network_interface.web-server-nic.id
}
root_block_device {
volume_size = "200"
}
iam_instance_profile = aws_iam_instance_profile.training_profile.name
depends_on = [aws_eip.one]
user_data = <<-EOF
#!/bin/bash
python3 /home/ubuntu/setting_instance.py
EOF
tags = {
Name = var.INSTANCE_NAME
}
}
The only downside to this code is that it creates a separate VPC every time I create an instance. I read in a Stack Overflow post that we can import an existing VPC using the terraform import command. Along with the VPC, I had to import the Internet Gateway and Route Table as well (it was throwing an error otherwise). But then I wasn't able to access the instance using SSH, and the commands in the user_data part didn't execute (setting_instance.py sends a Firebase notification once the instance starts; that's its only purpose).
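For reference, the import workflow described above boils down to mapping each existing AWS object to a resource address, either with repeated terraform import commands or, on Terraform 1.5 and later, declaratively with import blocks. The IDs below are hypothetical placeholders.

# CLI form:
#   terraform import aws_vpc.prod-vpc vpc-0123456789abcdef0
#   terraform import aws_internet_gateway.gw igw-0123456789abcdef0
#   terraform import aws_route_table.prod-route-table rtb-0123456789abcdef0
# Declarative form (Terraform 1.5+):
import {
  to = aws_vpc.prod-vpc
  id = "vpc-0123456789abcdef0" # hypothetical ID
}

import {
  to = aws_internet_gateway.gw
  id = "igw-0123456789abcdef0" # hypothetical ID
}

import {
  to = aws_route_table.prod-route-table
  id = "rtb-0123456789abcdef0" # hypothetical ID
}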
Beyond the VPC, I'd also like to know whether I can reuse the other resources to the fullest extent possible.
I'm new to terraform and AWS. Any suggestions in the above code are welcome.
EDIT: Instances are created one at a time as needed, i.e., whenever there is a need for a new instance I run this code. In the current scenario, if there are already 5 instances (and hence 5 VPCs) running in a region, I won't be able to use this code to create a 6th instance in that region when demand arises.
If, as you say, they would be exactly the same, the easiest way would be to use count, which indicates how many instances you want to have. For that you can introduce a new variable:
variable "number_of_instance" {
default = 1
}
and then
resource "aws_instance" "web-server-instance" {
count = var.number_of_instance
ami = var.AMI_ID
instance_type = "g4dn.xlarge"
availability_zone = "us-east-2a"
key_name = "us-east-2"
network_interface {
device_index = 0
network_interface_id = aws_network_interface.web-server-nic.id
}
root_block_device {
volume_size = "200"
}
iam_instance_profile = aws_iam_instance_profile.training_profile.name
depends_on = [aws_eip.one]
user_data = <<-EOF
#!/bin/bash
python3 /home/ubuntu/setting_instance.py
EOF
tags = {
Name = var.INSTANCE_NAME
}
}
All this must be managed by the same state file, not fully separate state files, as otherwise you will again end up with duplicates of the VPC. You then only change number_of_instance to however many you want. For a more resilient solution, you would use an Auto Scaling group for the instances.
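One caveat with the count sketch above: every instance still references the single aws_network_interface.web-server-nic, and a network interface can only be attached to one instance, so any count above 1 would fail at apply time. A hedged way around that, keeping the per-instance ENI and EIP from the original config, is to thread count and count.index through those resources as well (this replaces the ENI, EIP, and instance blocks rather than adding to them):

# One ENI per instance, each with its own private IP in the subnet
resource "aws_network_interface" "web-server-nic" {
  count           = var.number_of_instance
  subnet_id       = aws_subnet.subnet-1.id
  private_ips     = ["10.0.1.${50 + count.index}"]
  security_groups = [aws_security_group.allow_web.id]
}

# One Elastic IP per ENI
resource "aws_eip" "one" {
  count                     = var.number_of_instance
  vpc                       = true
  network_interface         = aws_network_interface.web-server-nic[count.index].id
  associate_with_private_ip = aws_network_interface.web-server-nic[count.index].private_ip
  depends_on                = [aws_internet_gateway.gw]
}

resource "aws_instance" "web-server-instance" {
  count             = var.number_of_instance
  ami               = var.AMI_ID
  instance_type     = "g4dn.xlarge"
  availability_zone = "us-east-2a"
  key_name          = "us-east-2"
  # iam_instance_profile, root_block_device, user_data and depends_on as in the block above

  network_interface {
    device_index         = 0
    network_interface_id = aws_network_interface.web-server-nic[count.index].id
  }

  tags = {
    Name = "${var.INSTANCE_NAME}-${count.index}"
  }
}

output "server_public_ips" {
  value = aws_eip.one[*].public_ip
}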

Can't resolve AWS ElastiCache redis cluster endpoint

I'm trying to use an ElastiCache Redis cluster from a Fargate service, but I'm getting address resolution issues: "ENOTFOUND" errors from "getaddrinfo" for the cluster endpoint in my code, and NXDOMAIN errors if I try to do an nslookup from inside the container.
The cluster is set up as
resource "aws_elasticache_cluster" "redis_service" {
cluster_id = "${terraform.workspace}-${local.app_name}-redis-service"
engine = "redis"
node_type = "cache.t4g.micro"
num_cache_nodes = 1
parameter_group_name = "default.redis6.x"
engine_version = "6.2"
port = 6379
subnet_group_name = aws_elasticache_subnet_group.redis_subnet.name
security_group_ids = [aws_security_group.redis.id]
}
resource "aws_security_group" "redis" {
vpc_id = aws_vpc.ecs-vpc.id
name = "${terraform.workspace}-redis"
ingress {
from_port = 0
to_port = 0
protocol = "-1"
self = true
cidr_blocks = ["0.0.0.0/0"] # Originally had this restricting to the security group that my fargate service is in, but made it less restrictive trying to debug the issue
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
### AZ 1 (and similar for other AZs)
resource "aws_subnet" "redis1" {
vpc_id = aws_vpc.ecs-vpc.id
cidr_block = "10.0.40.0/24"
availability_zone = "eu-west-2a"
}
resource "aws_route_table" "redis1" {
vpc_id = aws_vpc.ecs-vpc.id
}
resource "aws_route_table_association" "redis1" {
route_table_id = aws_route_table.redis1.id
subnet_id = aws_subnet.redis1.id
}
resource "aws_route" "redis1" {
destination_cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.subnet1.id
route_table_id = aws_route_table.redis1.id
}
The fargate service I'm trying to connect from is in the same vpc.
I've checked through the AWS documentation on connecting to ElastiCache clusters, but haven't seen anything that seems to be missing.
The VPC Reachability Analyzer seems to confirm that the cluster should be reachable from the service (using the relevant ENIs).
Any ideas for fixes or debugging steps?

Cannot reach YUM repos on terraform EC2 instances

background:
I used Terraform to build an AWS Auto Scaling group with a few instances spread across availability zones and linked by a load balancer. Everything is created properly, but the load balancer has no valid targets because there's nothing listening on port 80.
Fine, I thought. I'll install NGINX and throw up a basic config.
expected behavior
instances should be able to reach yum repos
actual behavior
I'm unable to ping anything or run any package manager commands; I get the following error:
Could not retrieve mirrorlist https://amazonlinux-2-repos-us-east-2.s3.dualstack.us-east-2.amazonaws.com/2/core/latest/x86_64/mirror.list error was
12: Timeout on https://amazonlinux-2-repos-us-east-2.s3.dualstack.us-east-2.amazonaws.com/2/core/latest/x86_64/mirror.list: (28, 'Failed to connect to amazonlinux-2-repos-us-east-2.s3.dualstack.us-east-2.amazonaws.com port 443 after 2700 ms: Connection timed out')
One of the configured repositories failed (Unknown),
and yum doesn't have enough cached data to continue. At this point the only
safe thing yum can do is fail. There are a few ways to work "fix" this:
1. Contact the upstream for the repository and get them to fix the problem.
2. Reconfigure the baseurl/etc. for the repository, to point to a working
upstream. This is most often useful if you are using a newer
distribution release than is supported by the repository (and the
packages for the previous distribution release still work).
3. Run the command with the repository temporarily disabled
yum --disablerepo=<repoid> ...
4. Disable the repository permanently, so yum won't use it by default. Yum
will then just ignore the repository until you permanently enable it
again or use --enablerepo for temporary usage:
yum-config-manager --disable <repoid>
or
subscription-manager repos --disable=<repoid>
5. Configure the failing repository to be skipped, if it is unavailable.
Note that yum will try to contact the repo. when it runs most commands,
so will have to try and fail each time (and thus. yum will be be much
slower). If it is a very temporary problem though, this is often a nice
compromise:
yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true
Cannot find a valid baseurl for repo: amzn2-core/2/x86_64
troubleshooting steps taken
I'm new to Terraform, and I'm still having issues getting the automated user_data provisioning to work, so I SSH'd into the instance. The instance is set up in a public subnet with an auto-provisioned public IP. Below is the code for the security groups.
resource "aws_security_group" "elb_webtrafic_sg" {
name = "elb-webtraffic-sg"
description = "Allow inbound web trafic to load balancer"
vpc_id = aws_vpc.main_vpc.id
ingress {
description = "HTTPS trafic from vpc"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "HTTP trafic from vpc"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "allow SSH"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
description = "all traffic out"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "elb-webtraffic-sg"
}
}
resource "aws_security_group" "instance_sg" {
name = "instance-sg"
description = "Allow traffic from load balancer to instances"
vpc_id = aws_vpc.main_vpc.id
ingress {
description = "web traffic from load balancer"
security_groups = [ aws_security_group.elb_webtrafic_sg.id ]
from_port = 80
to_port = 80
protocol = "tcp"
}
ingress {
description = "web traffic from load balancer"
security_groups = [ aws_security_group.elb_webtrafic_sg.id ]
from_port = 443
to_port = 443
protocol = "tcp"
}
ingress {
description = "ssh traffic from anywhere"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
description = "all traffic to load balancer"
security_groups = [ aws_security_group.elb_webtrafic_sg.id ]
from_port = 0
to_port = 0
protocol = "-1"
}
tags = {
Name = "instance-sg"
}
}
#this is a workaround for the cyclical security group id call
#I would like to figure out a way for Terraform to destroy this rule first
#it currently takes longer to destroy than to set up
#terraform hangs because of the dependency each SG has on the other,
#but will eventually work down to this rule and delete it, clearing the deadlock
resource "aws_security_group_rule" "elb_egress_to_webservers" {
security_group_id = aws_security_group.elb_webtrafic_sg.id
type = "egress"
source_security_group_id = aws_security_group.instance_sg.id
from_port = 80
to_port = 80
protocol = "tcp"
}
resource "aws_security_group_rule" "elb_tls_egress_to_webservers" {
security_group_id = aws_security_group.elb_webtrafic_sg.id
type = "egress"
source_security_group_id = aws_security_group.instance_sg.id
from_port = 443
to_port = 443
protocol = "tcp"
}
Since I was able to SSH into the machine, I tried to set up the web instance security group to allow direct connection from the internet to the instance. Same errors: cannot ping outside web addresses, same error on YUM commands.
I can ping the default gateway in each subnet. 10.0.0.1, 10.0.1.1, 10.0.2.1.
Here is the routing configuration I currently have set up.
resource "aws_vpc" "main_vpc" {
cidr_block = "10.0.0.0/16"
tags = {
Name = "production-vpc"
}
}
resource "aws_key_pair" "aws_key" {
key_name = "Tanchwa_pc_aws"
public_key = file(var.public_key_path)
}
#internet gateway
resource "aws_internet_gateway" "gw" {
vpc_id = aws_vpc.main_vpc.id
tags = {
Name = "internet-gw"
}
}
resource "aws_route_table" "route_table" {
vpc_id = aws_vpc.main_vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.gw.id
}
tags = {
Name = "production-route-table"
}
}
resource "aws_subnet" "public_us_east_2a" {
vpc_id = aws_vpc.main_vpc.id
cidr_block = "10.0.0.0/24"
availability_zone = "us-east-2a"
tags = {
Name = "Public-Subnet us-east-2a"
}
}
resource "aws_subnet" "public_us_east_2b" {
vpc_id = aws_vpc.main_vpc.id
cidr_block = "10.0.1.0/24"
availability_zone = "us-east-2b"
tags = {
Name = "Public-Subnet us-east-2b"
}
}
resource "aws_subnet" "public_us_east_2c" {
vpc_id = aws_vpc.main_vpc.id
cidr_block = "10.0.2.0/24"
availability_zone = "us-east-2c"
tags = {
Name = "Public-Subnet us-east-2c"
}
}
resource "aws_route_table_association" "a" {
subnet_id = aws_subnet.public_us_east_2a.id
route_table_id = aws_route_table.route_table.id
}
resource "aws_route_table_association" "b" {
subnet_id = aws_subnet.public_us_east_2b.id
route_table_id = aws_route_table.route_table.id
}
resource "aws_route_table_association" "c" {
subnet_id = aws_subnet.public_us_east_2c.id
route_table_id = aws_route_table.route_table.id
}

Can't connect to Terraform-created instance with Private Key, but CAN connect when I create instance in Console

I've created the following key pair and EC2 instance using Terraform. I'll leave the SG config out of it, but it allows SSH from the internet.
When I try to SSH into this instance I get the errors "Server Refused our Key" and "No supported authentication methods available (server sent: publickey)".
However, I am able to log in when I create a separate EC2 instance in the console and assign it the same key pair used in the TF script.
Has anyone seen this behavior? What causes it?
# Create Dev VPC
resource "aws_vpc" "dev_vpc" {
cidr_block = "10.0.0.0/16"
instance_tenancy = "default"
enable_dns_hostnames = "true"
tags = {
Name = "dev"
}
}
# Create an Internet Gateway Resource
resource "aws_internet_gateway" "igw" {
vpc_id = aws_vpc.dev_vpc.id
tags = {
Name = "dev-engineering-igw"
}
}
# Create a Route Table
resource "aws_route_table" " _dev_public_routes" {
vpc_id = aws_vpc. _dev.id
tags = {
name = " _dev_public_routes"
}
}
# Create a Route
resource "aws_route" " _dev_internet_access" {
route_table_id = aws_route_table. _dev_public_routes.id
destination_cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.igw.id
}
# Associate the Route Table to our Public Subnet
resource "aws_route_table_association" " _dev_public_subnet_assoc" {
subnet_id = aws_subnet. _dev_public.id
route_table_id = aws_route_table. _dev_public_routes.id
}
# Create public subnet for hosting customer-facing Django app
resource "aws_subnet" " _dev_public" {
vpc_id = aws_vpc. _dev.id
cidr_block = "10.0.0.0/17"
availability_zone = "us-west-2a"
tags = {
Env = "dev"
}
}
resource "aws_security_group" "allow_https" {
name = "allow_https"
description = "Allow http and https inbound traffic"
vpc_id = aws_vpc. _dev.id
ingress {
description = "HTTP and HTTPS into VPC"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "HTTP and HTTPS into VPC"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "SSH"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
description = "HTTP and HTTPS out of VPC for Session Manager"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "allow_https"
}
}
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu20.id
instance_type = "t3.micro"
subnet_id = aws_subnet. _dev_public.id
associate_public_ip_address = "true"
vpc_security_group_ids = ["${aws_security_group.allow_https.id}"]
key_name = "key_name"
metadata_options { #Enabling IMDSv2
http_endpoint = "disabled"
http_tokens = "required"
}
tags = {
Env = "dev"
}
}
As specified in the comments, removing the metadata_options block from the instance resource resolves the issue; with http_endpoint = "disabled" the instance metadata service is turned off, so cloud-init presumably never fetches the public key to install it on the instance.
The fix is to update the metadata_options to be:
metadata_options { #Enabling IMDSv2
http_endpoint = "enabled"
http_tokens = "required"
}
Looking at the Terraform documentation for metadata_options shows that:
http_endpoint = "disabled" means that the metadata service is unavailable.
http_tokens = "required" means that the metadata service requires session tokens (ie IMDSv2).
This is an invalid configuration, as specified in the AWS docs:
You can opt in to require that IMDSv2 is used when requesting instance metadata. Use the modify-instance-metadata-options CLI command and set the http-tokens parameter to required. When you specify a value for http-tokens, you must also set http-endpoint to enabled.

Want to create two EC2 instances under a VPC using Terraform

I want to create two EC2 instances under a VPC via Terraform. With my code I can easily deploy the VPC, security group, and subnet, but the instance section produces an error. 'terraform plan' does execute successfully, though. Can you help me resolve this issue?
provider "aws" {
region = "us-west-2"
access_key = "xxxxxxxx"
secret_key = "xxxxxxxxxxxxx"
}
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
instance_tenancy = "default"
}
resource "aws_subnet" "public" {
vpc_id = "${aws_vpc.main.id}"
cidr_block = "10.0.1.0/24"
}
resource "aws_security_group" "allow_ssh" {
name = "allow_ssh"
description = "Allow ssh inbound traffic"
vpc_id = "${aws_vpc.main.id}"
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_instance" "instance"{
ami = "ami-0fc025e3171c5a1bf"
instance_type = "t2.micro"
vpc_security_group_ids = ["${aws_security_group.allow_ssh.id}"]
subnet_id = "${aws_subnet.public.id}"
}
output "vpc_id" {
value = "${aws_vpc.main.id}"
}
output "subnet_id" {
value = "${aws_subnet.public.id}"
}
output "vpc_security_group_id" {
value = "${aws_security_group.allow_ssh.id}"
}
Error
ami-0fc025e3171c5a1bf is for the arm64 architecture and will not work with a t2.micro. If you need an arm platform, you will need to use an instance type in the a1 family.
Otherwise, you can use the x86-64 equivalent, Ubuntu Server 18.04 LTS (HVM), SSD Volume Type: ami-0d1cd67c26f5fca19.
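To avoid hard-coding an architecture-specific AMI, and to get the two instances the question asks for, one possible sketch uses an aws_ami data source filtered to x86_64 plus count. The Canonical owner ID and name pattern below are the commonly used ones for Ubuntu 18.04; verify them for your region before relying on this.

data "aws_ami" "ubuntu_1804_x86" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"] # t2.micro is x86-64 only
  }
}

resource "aws_instance" "instance" {
  count                  = 2
  ami                    = data.aws_ami.ubuntu_1804_x86.id
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.allow_ssh.id]
  subnet_id              = aws_subnet.public.id

  tags = {
    Name = "instance-${count.index}"
  }
}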