Correct design pattern for single server in AWS

I have custom cluster software that runs in a single AZ (subnet). One of the servers is the "controller"; only one of these can be running at a time. It needs to be registered in the local DNS, and it needs to rebuild itself automatically if it fails for any reason. I do not believe I need an ELB/ALB/NLB for this. However, when I set the system up with an Auto Scaling group, I am not able to get at the instance's private IP address to update the Route 53 record. Is there a correct design pattern for this in AWS?
Here is the stub code, which does work in rebuilding the server from scratch if it is stopped or becomes unhealthy.
resource "aws_launch_configuration" "example" {
image_id = "${lookup(var.AmiLinux, var.region)}"
instance_type = "t2.micro"
security_groups = [aws_security_group.ingress-all-test.id]
key_name = "akeyname"
lifecycle {
create_before_destroy = true
}
}
data "aws_availability_zones" "all" {}
resource "aws_autoscaling_group" "example" {
launch_configuration = aws_launch_configuration.example.id
min_size = 1
max_size = 1
health_check_grace_period = 60
vpc_zone_identifier = ["${aws_subnet.subnetTest.id}"]
tag {
key = "Name"
value = "tf-asg-example"
propagate_at_launch = true
}
}
I do like the above, as it maintains a single server in one AZ. However, the ASG makes it rather hard to get at the instance's IP. I am not looking for a user-data "hack" that changes the record on boot. Since the controller can only run in a single subnet (AZ), I cannot use an ELB. Thanks in advance for any design pattern for this type of setup.

Related

Google Cloud VM Autoscaling in Terraform - Updating Images

I can create a fully-functioning autoscaling group in GCP Compute via Terraform, but it's not clear to me how to update the ASG to use a new image.
Here's an example of a working ASG:
resource "google_compute_region_autoscaler" "default" {
name = "example-autoscaler"
region = "us-west1"
project = "my-project
target = google_compute_region_instance_group_manager.default.id
autoscaling_policy {
max_replicas = 10
min_replicas = 3
cooldown_period = 60
cpu_utilization {
target = 0.5
}
}
}
resource "google_compute_region_instance_group_manager" "default" {
name = "example-igm"
region = "us-west1"
version {
instance_template = google_compute_instance_template.default.id
name = "primary"
}
target_pools = [google_compute_target_pool.default.id]
base_instance_name = "example"
}
resource "google_compute_target_pool" "default" {
name = "example-pool"
}
resource "google_compute_instance_template" "default" {
name = "example-template"
machine_type = "e2-medium"
can_ip_forward = false
tags = ["my-tag"]
disk {
source_image = data.google_compute_image.default.id
}
network_interface {
subnetwork = "my-subnet"
}
}
data "google_compute_image" "default" {
name = "my-image"
}
My goal is to be able to create a new Image (out of band) and then update my infrastructure to utilize it. It doesn't appear possible to change a google_compute_instance_template while it's in use.
One option I can think of is to create two separate templates, and then adjust the google_compute_region_instance_group_manager to refer to a different google_compute_instance_template which references the new image.
Another possible option is to use the version block inside the instance group manager. You can use this similarly to the above to essentially toggle between two versions: you start with one "version" at 100% and the other at 0%. When you create a new image, you update the version that is at 0% to point to the new image and flip the split, so it gets 100% and the other gets 0%. I'm not actually sure this works, because you'd still have to update the template of the version that's at 0%, and it might actually be in use.
Regardless, both of those methods are incredibly bulky for a large-scale production environment where we have multiple autoscaling groups in multiple regions and update them frequently.
Ideally I'd be able to change a variable that represents the image, terraform apply and that's it. Is there any way to do what I'm describing?
You need to use the lifecycle block with create_before_destroy, and the name attribute of the google_compute_instance_template resource must also be omitted or replaced by the name_prefix attribute.
With this setup, Terraform generates a unique name for your instance template and can then update the instance group manager without conflict before destroying the previous instance template.
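A minimal sketch of what that looks like, assuming the template otherwise stays as in the question (only name_prefix and the lifecycle block change):
resource "google_compute_instance_template" "default" {
  # name is omitted; Terraform derives a unique name from name_prefix
  name_prefix    = "example-template-"
  machine_type   = "e2-medium"
  can_ip_forward = false
  tags           = ["my-tag"]

  disk {
    source_image = data.google_compute_image.default.id
  }

  network_interface {
    subnetwork = "my-subnet"
  }

  lifecycle {
    # build the replacement template first, repoint the group manager, then drop the old one
    create_before_destroy = true
  }
}
With this in place, changing the image behind data.google_compute_image.default forces a new template, and terraform apply updates google_compute_region_instance_group_manager.default to reference it without a naming conflict.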
References:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#using-with-instance-group-manager

Filter out Subnet IDs based on sufficient capacity in availability zones in Terraform

I'm trying to deploy an EKS cluster and everything seems to be fine except for one thing!
The facade module looks like this:
module "eks" {
source = "../../../../infrastructure_modules/eks"
## EKS ##
create_eks = var.create_eks
cluster_version = var.cluster_version
cluster_name = local.cluster_name
vpc_id = data.aws_vpc.this.id
subnets = data.aws_subnet_ids.this.ids
# note: either pass worker_groups or node_groups
# this is for (EKSCTL API) unmanaged node group
worker_groups = var.worker_groups
# this is for (EKS API) managed node group
node_groups = var.node_groups
## Common tag metadata ##
env = var.env
app_name = var.app_name
tags = local.eks_tags
region = var.region
}
The VPC ID is retrieved through the following block:
data "aws_vpc" "this" {
tags = {
Name = "tagName"
}
}
This is then used to retrieve the subnet IDs as follows:
data "aws_subnet_ids" "this" {
vpc_id = data.aws_vpc.this.id
}
Nevertheless, deploying this results in an error stating:
Error: error creating EKS Cluster (data-layer-eks):
UnsupportedAvailabilityZoneException: Cannot create cluster
'data-layer-eks' because us-east-1e, the targeted availability zone,
does not currently have sufficient capacity to support the cluster.
This is a well-known error that anybody can come across, even for plain EC2 instances.
I could solve it by simply hardcoding the subnet values, but that's really undesirable and hardly maintainable.
So the question is, how can I filter out subnet_IDs based on availability zones that have sufficient capacity?
First you need to collect the subnets with all of their attributes:
data "aws_subnets" "this" {
filter {
name = "vpc-id"
values = [data.aws_vpc.this.id]
}
}
data "aws_subnet" "this" {
for_each = toset(data.aws_subnets.this.ids)
id = each.value
}
data.aws_subnet.this is now a map(object) with all of the subnets and their attributes. You can now filter by availability zone accordingly:
subnets = [for subnet in data.aws_subnet.this : subnet.id if subnet.availability_zone != "us-east-1e"]
You can also filter by truthy conditionals if that condition is easier for you:
subnets = [for subnet in data.aws_subnet.this : subnet.id if contains(["us-east-1a", "us-east-1b", "us-east-1c", "us-east-1d"], subnet.availability_zone)]
It depends on your personal use case.
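Wired back into the facade module from the question, the filtered expression simply replaces the original data.aws_subnet_ids reference (a sketch; all other arguments stay as before):
module "eks" {
  source = "../../../../infrastructure_modules/eks"

  create_eks      = var.create_eks
  cluster_version = var.cluster_version
  cluster_name    = local.cluster_name
  vpc_id          = data.aws_vpc.this.id

  # keep only subnets in AZs that can host the cluster
  subnets = [for subnet in data.aws_subnet.this : subnet.id if subnet.availability_zone != "us-east-1e"]

  worker_groups = var.worker_groups
  node_groups   = var.node_groups

  env      = var.env
  app_name = var.app_name
  tags     = local.eks_tags
  region   = var.region
}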

Dynamically add resources in Terraform

I set up a Jenkins pipeline that runs Terraform to create a new EC2 instance in our VPC and register it in our private hosted zone on Route 53 (which is created at the same time) on every run.
I also managed to save the state in S3 so it doesn't fail when the hosted zone gets re-created.
The main issue I have is that on every run Terraform keeps replacing the previous instance with the new one instead of adding it to the pool of instances.
How can I avoid this?
Here's a snippet of my code:
terraform {
  backend "s3" {
    bucket = "<redacted>"
    key    = "<redacted>/terraform.tfstate"
    region = "eu-west-1"
  }
}

provider "aws" {
  region = "${var.region}"
}

data "aws_ami" "image" {
  # limit search criteria for performance
  most_recent = "${var.ami_filter_most_recent}"
  name_regex  = "${var.ami_filter_name_regex}"
  owners      = ["${var.ami_filter_name_owners}"]

  # filter on tag purpose
  filter {
    name   = "tag:purpose"
    values = ["${var.ami_filter_purpose}"]
  }

  # filter on tag os
  filter {
    name   = "tag:os"
    values = ["${var.ami_filter_os}"]
  }
}
resource "aws_instance" "server" {
# use extracted ami from image data source
ami = data.aws_ami.image.id
availability_zone = data.aws_subnet.most_available.availability_zone
subnet_id = data.aws_subnet.most_available.id
instance_type = "${var.instance_type}"
vpc_security_group_ids = ["${var.security_group}"]
user_data = "${var.user_data}"
iam_instance_profile = "${var.iam_instance_profile}"
root_block_device {
volume_size = "${var.root_disk_size}"
}
ebs_block_device {
device_name = "${var.extra_disk_device_name}"
volume_size = "${var.extra_disk_size}"
}
tags = {
Name = "${local.available_name}"
}
}
resource "aws_route53_zone" "private" {
name = var.hosted_zone_name
vpc {
vpc_id = var.vpc_id
}
}
resource "aws_route53_record" "record" {
zone_id = aws_route53_zone.private.zone_id
name = "${local.available_name}.${var.hosted_zone_name}"
type = "A"
ttl = "300"
records = [aws_instance.server.private_ip]
depends_on = [
aws_route53_zone.private
]
}
The outcome is that my previously created instance is destroyed and a new one is created. What I want is to keep adding instances with this code.
Thank you.
Your code declares only one instance, aws_instance.server, and since your backend is in S3 it acts as one shared, global state across pipeline runs, so any change to the instance's properties will only modify (or replace) that one instance. The same goes for aws_route53_record.record and anything else in your script.
If you want different pipeline runs to reuse the exact same script, you should either use different workspaces or create a separate TF state for each run. The other alternative is to redefine your TF script to take a map of instances as an input variable and use for_each to create a distinct instance for each entry.
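A minimal sketch of that for_each approach, reusing the resources from the question (the servers variable and its shape are assumptions, not part of the original script):
variable "servers" {
  # e.g. { "app-01" = "t3.micro", "app-02" = "t3.micro" }
  type = map(string)
}

resource "aws_instance" "server" {
  for_each = var.servers

  ami           = data.aws_ami.image.id
  subnet_id     = data.aws_subnet.most_available.id
  instance_type = each.value

  tags = {
    Name = each.key
  }
}

resource "aws_route53_record" "record" {
  # one record per instance, chained off the instance map
  for_each = aws_instance.server

  zone_id = aws_route53_zone.private.zone_id
  name    = "${each.key}.${var.hosted_zone_name}"
  type    = "A"
  ttl     = "300"
  records = [each.value.private_ip]
}
Adding a new key to var.servers on the next run then creates an extra instance and record instead of replacing the existing ones.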
If those instances are meant to be identical, you should instead manage their count with an aws_autoscaling_group and its desired capacity.
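And a rough sketch of that Auto Scaling alternative (the launch template values are assumptions based on the instance resource above):
resource "aws_launch_template" "server" {
  name_prefix   = "server-"
  image_id      = data.aws_ami.image.id
  instance_type = var.instance_type
  user_data     = base64encode(var.user_data)
}

resource "aws_autoscaling_group" "servers" {
  desired_capacity    = 3
  min_size            = 1
  max_size            = 5
  vpc_zone_identifier = [data.aws_subnet.most_available.id]

  launch_template {
    id      = aws_launch_template.server.id
    version = "$Latest"
  }
}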

terraform modules ec2 and vpc AWS

I have a question about how to use modules in Terraform.
See my code below.
module "aws_vpc"{
source = "../modules/vpc"
vpc_cidr_block = "192.168.0.0/16"
name_cidr = "ec2-eks"
name_subnet = "ec2-eks-subnet"
subnet_cidr = ["192.168.1.0/25"]
}
module "ec2-eks" {
source = "../modules/ec2"
ami_id = "ami-07c8bc5c1ce9598c3"
subnet_id = module.aws_vpc.aws_subnet[0]
count_server = 1
}
output "aws_vpc" {
value = module.aws_vpc.aws_subnet[0]
}
I'm creating a VPC and, as a next step, want to attach the EC2 instance to the subnet I created. But Terraform attaches it to the default VPC instead.
What do I need to do to attach the EC2 instance to my VPC (subnet)?
Thank you for your answers.
What do I need to do to attach the EC2 instance to my VPC (subnet)?
aws_instance has a subnet_id attribute. Thus, to place your instance in a given subnet, you have to set subnet_id.
Since you are using a module to create your VPC, the module will likely output the subnet IDs as well. Due to the lack of details about the module it's difficult to give an exact answer, but in a general scenario you would do something along these lines (example):
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
subnet_id = module.aws_vpc.subnet_id
tags = {
Name = "HelloWorld"
}
}
Obviously, the above depends on the implementation of your module.
Thank you.
I've now got the resources successfully created in AWS. I had forgotten to set the subnet_id parameter in the ec2 module.
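For completeness, a sketch of what the ec2 module needs in order to honor that parameter (the internal resource layout and instance type are assumptions; the real module may differ):
# ../modules/ec2 (assumed internals)
variable "ami_id" {
  type = string
}

variable "subnet_id" {
  type = string
}

variable "count_server" {
  type    = number
  default = 1
}

resource "aws_instance" "this" {
  count         = var.count_server
  ami           = var.ami_id
  subnet_id     = var.subnet_id  # without this, the instance lands in the default VPC's default subnet
  instance_type = "t3.micro"     # assumed; not given in the question
}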

How to format and mount an ephemeral disk with Terraform?

I'm in the process of writing Packer and Terraform code to create an immutable infrastructure on AWS. However, it does not seem very straightforward to format a disk as ext4 and mount it.
The steps seem simple:
Create the AMI with Packer on a t2.micro; it contains all the software and is to be used first on test and afterwards on production.
Launch a r3.4xlarge instance from this ami that has a 300GB ephemeral disk. Format this disk as ext4, mount it and redirect /var/lib/docker to the new filesystem for performance reasons.
Complete the rest of the application launching.
First of all:
Is it best practice to create the AMI with the same instance type you will use it on, or to have one 'generic' image and start multiple instance types from that?
What philosophy is the best?
packer(software versions) -> terraform(instance + mount disk) -> deploy?
packer(software versions) -> packer(instancetype specific mounts) -> terraform(instance) -> deploy?
packer(software versions, instance specific mounts) -> terraform -> deploy?
The latter is starting to look better and better but requires an ami per instance type.
What I have tried so far:
According to this answer it is better to use the user_data way of working instead of the provisioners way. So I'm going down that road.
This answer seemed promising but is so old it does not work anymore. I could update it but there might be a different, better way.
This answer also seemed promising but was complaining about the ${DEVICE}. I am wondering where that variable is coming from as there are no vars specified in the template_file. If I set my own DEVICE variable to xvdb then it runs, but does not produce a result because xvdb is visible in lsblk but not in blkid.
Here is my code. The format_disks.sh file is the same as the one mentioned above. Any help is greatly appreciated.
# Create a new instance of the latest Ubuntu 16.04 on an
# t2.micro node with an AWS Tag naming it "test1"
provider "aws" {
  region = "us-east-1"
}

data "template_file" "format-disks" {
  template = "${file("format_disk.sh")}"

  vars {
    DEVICE = "xvdb"
  }
}

resource "aws_instance" "test1" {
  ami                         = "ami-98181234"
  instance_type               = "r3.4xlarge"
  key_name                    = "keypair-1"       # This needs to be changed so multiple users can use this
  subnet_id                   = "subnet-a0aeb123" # maps to the vpc for the us production
  associate_public_ip_address = "true"
  vpc_security_group_ids      = ["sg-f3e91234"]   # backendservers
  user_data                   = "${data.template_file.format-disks.rendered}"

  tags {
    Name = "test1"
  }

  ephemeral_block_device {
    device_name  = "xvdb"
    virtual_name = "ephemeral0"
  }
}
Let me give you my thoughts on this topic.
I think cloud-init is the key on AWS, because it lets you build the machine you want dynamically.
First, write a global script that should run when your machine starts, then add that script as user data. I also suggest you play with EC2 Auto Scaling at the same time: if you change the cloud-init script, you can simply terminate the instance and another one will be created automatically.
My directory structure:
.
|____main.tf
|____templates
| |____cloud-init.tpl
main.tf
provider "aws" {
region = "us-east-1"
}
data "template_file" "cloud_init" {
template = file("${path.module}/templates/cloud-init.tpl")
}
data "aws_ami" "linux_ami" {
most_recent = "true"
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-hvm-2.0.????????.?-x86_64-gp2"]
}
}
resource "aws_instance" "test1" {
ami = data.aws_ami.linux_ami.image_id
instance_type = "r3.4xlarge"
key_name = "keypair-1"
subnet_id = "subnet-xxxxxx"
associate_public_ip_address = true
vpc_security_group_ids = ["sg-xxxxxxx"]
user_data = data.template_file.cloud_init.rendered
root_block_device {
delete_on_termination = true
encrypted = true
volume_size = 10
volume_type = "gp2"
}
ebs_block_device {
device_name = "ebs-block-device-name"
delete_on_termination = true
encrypted = true
volume_size = 10
volume_type = "gp2"
}
network_interface {
device_index = 0
network_interface_id = var.network_interface_id
delete_on_termination = true
}
tags = {
Name = "test1"
costCenter = "xxxxx"
owner = "xxxxx"
}
}
templates/cloud-init.tpl
#!/bin/bash -x
yum update -y
yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
systemctl enable amazon-ssm-agent
systemctl start amazon-ssm-agent
pip install aws-ssm-tunnel-agent
echo "[INFO] SSM agent has been installed!"
# More scripts here.
Would you like to have a temporary disk attached? Have you tried adding a root_block_device with delete_on_termination set to true? That way, after destroying the aws_instance resource, the disk will be deleted as well. It's a good way to save costs on AWS, but be careful: only use it if the data stored on the disk isn't important or has been backed up.
If you need to attach an external EBS disk to this instance, you can use the AWS API; just make sure the machine is in the same AZ as the disk so you can use it.
Let me know if you need a bash script for this, but it is straightforward to do.
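As a Terraform-flavoured alternative to calling the AWS API directly, a minimal sketch of attaching an extra EBS volume in the same AZ could look like this (names and sizes are assumptions):
resource "aws_ebs_volume" "extra" {
  # must be in the same AZ as the instance, or the attachment will fail
  availability_zone = aws_instance.test1.availability_zone
  size              = 300
  type              = "gp2"
}

resource "aws_volume_attachment" "extra" {
  device_name = "/dev/xvdf"
  volume_id   = aws_ebs_volume.extra.id
  instance_id = aws_instance.test1.id
}
The volume still has to be formatted and mounted from inside the instance, for example from the cloud-init script above.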