Insufficient capacity in availability zone on AWS

I got the following error from AWS today.
"We currently do not have sufficient m3.large capacity in the Availability Zone you requested (us-east-1a). Our system will be working on provisioning additional capacity. You can currently get m3.large capacity by not specifying an Availability Zone in your request or choosing us-east-1e, us-east-1b."
What does this mean exactly? It sounds like AWS doesn't have the physical resources to allocate me the virtual resources that I need. That seems unbelievable though.
What's the solution? Is there an easy way to change the availability zone of an instance?
Or do I need to create an AMI and restore it in a new availability zone?

This is not a new issue. You cannot change the availability zone of an existing instance. The best option is to create an AMI and relaunch the instance in a new AZ, as you have already suggested; that way everything on the instance stays in place. If you want to go across regions, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/CopyingAMIs.html
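A rough Terraform sketch of the AMI-and-relaunch approach follows. The original question does not use Terraform, so the instance ID, the resource names, and the AMI name are placeholders; the target zone is one of those the error message suggested.

```hcl
# Hypothetical ID of the instance that is stuck in us-east-1a.
variable "source_instance_id" {
  type    = string
  default = "i-0123456789abcdef0"
}

# Create an AMI from the existing instance.
resource "aws_ami_from_instance" "snapshot" {
  name               = "m3-large-relaunch" # placeholder name
  source_instance_id = var.source_instance_id
}

# Relaunch from that AMI in a zone that currently has capacity.
resource "aws_instance" "relaunched" {
  ami               = aws_ami_from_instance.snapshot.id
  instance_type     = "m3.large"
  availability_zone = "us-east-1b"
}
```

The relaunched instance gets a new instance ID and a new private IP, so anything that pointed at the old instance (an Elastic IP, DNS records) has to be moved over.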

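You can try getting Reserved Instances. Note that only zonal Reserved Instances (purchased for a specific Availability Zone) reserve capacity; regional Reserved Instances give a billing discount but no capacity guarantee.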

I fixed this error by correcting my aws_region and availability_zone values. Once I added aws_subnet_ids, the error message showed me exactly which zone my EC2 instance was being created in.
variable "availability_zone" {
  default = "ap-southeast-2c"
}
variable "aws_region" {
  description = "EC2 Region for the VPC"
  # A region name has no zone letter; "ap-southeast-2c" is an availability zone.
  default     = "ap-southeast-2"
}
data "aws_vpc" "default" {
  default = true
}
data "aws_subnet_ids" "all" {
  vpc_id = "${data.aws_vpc.default.id}"
}
resource "aws_instance" "ec2" {
  ....
  subnet_id         = "${element(data.aws_subnet_ids.all.ids, 0)}"
  availability_zone = "${var.availability_zone}"
}
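One way to keep the subnet and the zone from drifting apart is to look the subnet up by zone instead of taking the first subnet in the list. This is a minimal sketch, assuming the default VPC has a default subnet in the chosen zone; the ami_id variable, the instance type, and the in_zone name are placeholders that are not in the original answer.

```hcl
variable "availability_zone" {
  default = "ap-southeast-2c"
}

# Hypothetical input: the AMI to launch.
variable "ami_id" {
  type = string
}

data "aws_vpc" "default" {
  default = true
}

# Look up the default subnet that lives in the chosen zone.
data "aws_subnet" "in_zone" {
  vpc_id            = data.aws_vpc.default.id
  availability_zone = var.availability_zone
  default_for_az    = true
}

resource "aws_instance" "ec2" {
  ami           = var.ami_id
  instance_type = "t3.micro" # placeholder
  # The subnet determines the zone, so availability_zone is not set separately.
  subnet_id     = data.aws_subnet.in_zone.id
}
```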

Related

Attach existing EBS volumes to launch configuration in Terraform

I have 3 existing EBS volumes that I am trying to attach to instances created with Autoscaling groups. Below is Terraform code on how the EBS volumes are defined:
EBS Volumes
resource "aws_ebs_volume" "volumes" {
  count             = "${(var.enable ? 1 : 0) * var.number_of_zones}"
  availability_zone = "${element(var.azs, count.index)}"
  size              = "${var.volume_size}"
  type              = "${var.volume_type}"
  lifecycle {
    ignore_changes = [
      "tags",
    ]
  }
  tags {
    Name = "${var.cluster_name}-${count.index + 1}"
  }
}
My plan is to first use the Terraform import utility so the volumes can be managed by Terraform. Without this import, Terraform assumes I am trying to create new EBS volumes, which I do not want.
Additionally, I discovered this aws_volume_attachment resource to attach these volumes to instances. I'm struggling to determine what value to put as the instance_id in this resource:
Volume Attachment
resource "aws_volume_attachment" "volume_attachment" {
  # One attachment per volume; each attachment takes a single volume ID.
  count       = length(aws_ebs_volume.volumes)
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.volumes[count.index].id
  instance_id = "instance_id_from_autoscaling_group"
}
Additionally, the launch configuration has an ebs_block_device block. Do I need anything else in this block? Any advice on this matter would be helpful, as I am having some trouble.
ebs_block_device {
  device_name = "/dev/sdf"
  no_device   = true
}

I'm struggling to determine what value to put as the instance_id in this resource
If you create the ASG using TF, you don't have access to the instance IDs. The reason is that the ASG is treated as one entity, rather than as individual instances.
The only way to get the instance IDs from the created ASG would be through an external data source or a Lambda function.
But even if you could theoretically do it, instances in ASG should be treated as fully disposable, interchangeable and identical. You should not need to customize them, as they can be terminated and replaced at any time by AWS's AZ rebalancing, instance failures or scaling activities.

Terraform, get count of length of data source

Folks,
I'm trying to create a subnet in each AWS availability zone available in an AWS region.
data "aws_availability_zones" "azs" {
  depends_on = [aws_vpc.k3s_vpc]
  state      = "available"
}
locals {
  azs = "${data.aws_availability_zones.azs.names}"
}
resource "aws_subnet" "private_subnets" {
  count             = length(data.aws_availability_zones.azs.names)
  vpc_id            = aws_vpc.k3s_vpc.id
  cidr_block        = var.private_subnets_cidr[count.index]
  availability_zone = local.azs[count.index]
}
I'm getting the error below:
Error: Invalid count argument
The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.
Any ideas?

Inside your data "aws_availability_zones" "azs" block you've written depends_on = [aws_vpc.k3s_vpc], which means that Terraform can't look up the availability zones until the VPC has been created. The VPC doesn't exist yet during planning, so you see this error.
The availability zones for a particular region in your account don't vary based on the creation of VPCs, so it's not clear to me why you included that dependency. If you remove it, Terraform should be able to resolve that data source during the planning phase and thus determine how many subnets to create.
However, I would still suggest caution with this approach: if the result of that lookup were to change in future in a way that introduces availability zones anywhere except the end of the list then your existing subnets would get reassigned to new availability zones, and thus the provider will plan to replace them. Instead, it might be better to use the availability zone names themselves as the identifiers for the subnets, so that it won't matter what order they appear in the resulting list:
data "aws_availability_zones" "azs" {
  state = "available"
}
locals {
  azs = toset(data.aws_availability_zones.azs.names)
}
resource "aws_subnet" "private_subnets" {
  for_each          = local.azs
  vpc_id            = aws_vpc.k3s_vpc.id
  cidr_block        = var.private_subnets_cidr[each.value]
  availability_zone = each.value
}
Notice that under this approach you'd also need to change variable "private_subnets_cidr" to be a map instead of a list, with the availability zone names as keys, so that the CIDR ranges are also assigned directly to AZs and won't get reassigned if new zones appear in your account later.
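For illustration, the map-typed variable might look like the sketch below. The zone names and CIDR ranges are made-up example values, not taken from the question.

```hcl
variable "private_subnets_cidr" {
  type        = map(string)
  description = "CIDR block for the private subnet in each availability zone"

  # Example values only: keys are AZ names, values are the subnet CIDRs.
  default = {
    "us-east-1a" = "10.0.0.0/24"
    "us-east-1b" = "10.0.1.0/24"
    "us-east-1c" = "10.0.2.0/24"
  }
}
```

Every zone name returned by the data source needs a key in this map; indexing the map with a zone that has no entry is an error.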

Adding new AWS EBS Volume to ASG in same AZ

OK, so I am trying to attach an EBS volume, which I created using Terraform, to an ASG's instance using userdata. The issue is that the two are in different AZs, so the attachment fails. Below are the steps I am trying:
resource "aws_ebs_volume" "this" {
  for_each          = var.ebs_block_device
  size              = lookup(each.value, "volume_size", null)
  type              = lookup(each.value, "volume_type", null)
  iops              = lookup(each.value, "iops", null)
  encrypted         = lookup(each.value, "volume_encrypt", null)
  kms_key_id        = lookup(each.value, "kms_key_id", null)
  availability_zone = join(",", random_shuffle.az.result)
}
In the above resource, I am using the random provider to pick one AZ from a list of AZs, and the same list is provided to the ASG resource below:
resource "aws_autoscaling_group" "this" {
  desired_capacity          = var.desired_capacity
  launch_configuration      = aws_launch_configuration.this.id
  max_size                  = var.max_size
  min_size                  = var.min_size
  name                      = var.name
  vpc_zone_identifier       = var.subnet_ids // <------ HERE
  health_check_grace_period = var.health_check_grace_period
  load_balancers            = var.load_balancer_names
  target_group_arns         = var.target_group_arns
  tag {
    key                 = "Name"
    value               = var.name
    propagate_at_launch = true
  }
}
And here is userdata which I am using:
TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"`
instanceId=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id)
aws ec2 attach-volume --volume-id ${ebs_volume_id} --instance-id $instanceId --device /dev/nvme1n1
The above should attach the newly created volume, since I am passing the output ${ebs_volume_id} of the above resource.
But it's failing because the instance and the volume are in different AZs.
Can anyone help me find a better solution than hardcoding the AZ on both the ASG and the volume?

I'd have to understand more about what you're trying to do to solve this with just the aws provider and terraform. And honestly, most ideas are going to be a bit complex.
You could have an ASG per AZ. Otherwise, the ASG is going to select some AZ at each launch. And you'll have more instances in an AZ than you have volumes and volumes in other AZs with no instances to attach to.
So you could create a number of volumes per az and an ASG per AZ. Then the userdata should list all the volumes in the AZ that are not attached to an instance. Then pick the id of the first volume that is unattached. Then attach it. If all are attached, you should trigger your alerting because you have more instances than you have volumes.
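A minimal sketch of the one-ASG-per-AZ layout described above, assuming a map from AZ name to subnet ID and an existing launch configuration. The variable names, sizes, and the single volume per zone are illustrative choices, not from the question.

```hcl
# Hypothetical input: AZ name => ID of a subnet in that AZ.
variable "az_subnets" {
  type = map(string)
}

# Hypothetical input: name of an existing launch configuration.
variable "launch_configuration_name" {
  type = string
}

# One data volume per AZ (raise the count per AZ if the ASGs hold more instances).
resource "aws_ebs_volume" "data" {
  for_each          = var.az_subnets
  availability_zone = each.key
  size              = 100   # GiB, placeholder
  type              = "gp2" # placeholder
}

# One ASG per AZ, pinned to that AZ through its single subnet.
resource "aws_autoscaling_group" "per_az" {
  for_each             = var.az_subnets
  name                 = "stateful-${each.key}"
  vpc_zone_identifier  = [each.value]
  launch_configuration = var.launch_configuration_name
  min_size             = 1
  max_size             = 1
  desired_capacity     = 1
}
```

With each group limited to one subnet, every instance it launches lands in the same zone as that group's volumes, so the userdata only has to find an unattached volume in its own zone.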
Any attempt to do this with a single ASG is really an attempt at writing your own ASG but doing it in a way that fights with your actual ASG.
But there is a company that offers to manage this as a service. They also help you run the instances as spot instances to save cost: https://spot.io/
The elastigroup resource is an ASG managed by them, so you won't have an AWS ASG anymore. But they have some interesting stateful configurations.
We support instance persistence via the following configurations. all values are boolean. For more information on instance persistence please see: Stateful configuration
persist_root_device - (Optional) Boolean, should the instance maintain its root device volumes.
persist_block_devices - (Optional) Boolean, should the instance maintain its Data volumes.
persist_private_ip - (Optional) Boolean, should the instance maintain its private IP.
block_devices_mode - (Optional) String, determine the way we attach the data volumes to the data devices, possible values: "reattach" and "onLaunch" (default is onLaunch).
private_ips - (Optional) List of Private IPs to associate to the group instances.(e.g. "172.1.1.0"). Please note: This setting will only apply if persistence.persist_private_ip is set to true
stateful_deallocation {
  should_delete_images             = false
  should_delete_network_interfaces = false
  should_delete_volumes            = false
  should_delete_snapshots          = false
}
This allows you to have an autoscaler that preserves volumes and handles the complexities for you.

Is AWS ECS with Terraform broken?

I am trying to spin up an ECS cluster with Terraform, but can not make EC2 instances register as container instances in the cluster.
I first tried with the verified module from Terraform, but this seems outdated (ecs-instance-profile has the wrong path).
Then I tried with another module from anrim, but still no container instances. Here is the script I used:
provider "aws" {
  region = "us-east-1"
}
module "vpc" {
  source          = "terraform-aws-modules/vpc/aws"
  version         = "2.21.0"
  name            = "ecs-alb-single-svc"
  cidr            = "10.10.10.0/24"
  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.10.10.0/27", "10.10.10.32/27", "10.10.10.64/27"]
  public_subnets  = ["10.10.10.96/27", "10.10.10.128/27", "10.10.10.160/27"]
  tags = {
    Owner       = "user"
    Environment = "me"
  }
}
module "ecs_cluster" {
  source      = "../../modules/cluster"
  name        = "ecs-alb-single-svc"
  vpc_id      = module.vpc.vpc_id
  vpc_subnets = module.vpc.private_subnets
  tags = {
    Owner       = "user"
    Environment = "me"
  }
}
I then created a new ecs cluster (from the aws console) on the same VPC and carefully compared the differences in resources. I managed to find some small differences, fixed them and tried again. But still no container instances!
A fork of the module is available here.

Can you see instances being created in the autoscaling group? If so, I'd suggest SSHing to one of them (either directly or using a bastion host, eg. see this module) and checking ECS agent logs. In my experience those problems are usually related to IAM policies, and that's pretty visible in logs but YMMV.
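Since IAM is the usual suspect, here is a sketch of the instance role an ECS container instance typically needs. The resource and role names are placeholders, and the module in the question may already define something equivalent; the managed policy ARN is the AWS-provided one for ECS container instances.

```hcl
# Role that the EC2 container instances assume.
resource "aws_iam_role" "ecs_instance" {
  name = "ecs-instance-role" # placeholder name
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# AWS managed policy that lets the ECS agent register with the cluster.
resource "aws_iam_role_policy_attachment" "ecs_instance" {
  role       = aws_iam_role.ecs_instance.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}

# Instance profile to reference from the launch configuration.
resource "aws_iam_instance_profile" "ecs_instance" {
  name = "ecs-instance-profile" # placeholder name
  role = aws_iam_role.ecs_instance.name
}
```

Two other things are worth checking on the instance itself: that /etc/ecs/ecs.config sets ECS_CLUSTER to the cluster name, and, because the instances sit in private subnets, that they have an outbound route (a NAT gateway or VPC endpoints) to reach the ECS API. The agent log is at /var/log/ecs/ecs-agent.log.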

How to block Terraform from deleting an imported resource?

I'm brand new to Terraform so I'm sure I'm missing something, but the answers I'm finding don't seem to be asking the same question I have.
I have an AWS VPC/Security Group that we need our EC2 instances to be created under, and this VPC/SG is already created. To create an EC2 instance, Terraform requires that if I don't have a default VPC, I must import my own. But once I import and apply my plan, when I wish to destroy it, it's trying to destroy my VPC as well. How do I encapsulate my resources so that when I run "terraform apply" I can create an EC2 instance with my imported VPC, but when I run "terraform destroy" I only destroy my EC2 instance?
In case anyone wants to mention, I understand that:
lifecycle {
  prevent_destroy = true
}
is not what I'm looking for.
Here is my current practice code.
resource "aws_vpc" "my_vpc" {
  cidr_block = "xx.xx.xx.xx/24"
}
provider "aws" {
  region = "us-west-2"
}
data "aws_ami" "ubuntu" {
  most_recent = true
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"]
  }
  owners = ["099720109477"] # Canonical
}
resource "aws_instance" "web" {
  ami                    = "${data.aws_ami.ubuntu.id}"
  instance_type          = "t3.nano"
  vpc_security_group_ids = ["sg-0e27d851dxxxxxxxxxx"]
  subnet_id              = "subnet-0755c2exxxxxxxx"
  tags = {
    Name = "HelloWorld"
  }
}

Terraform should not require you to deploy or import a VPC in order to deploy an EC2 instance into it. You should be able to reference the VPC, subnets and security groups by ID so TF is aware of your existing network infrastructure, just like you've already done for the SGs and subnets. All you should need to deploy the EC2 instance ("aws_instance") is to give it an existing subnet ID in the existing VPC, like you already did. Why do you say deploying or importing a VPC is required by Terraform? What error or issue do you have when deploying without the VPC resource and just using the existing one?
You can protect the VPC through AWS if you really want to, but I don't think you want to import the VPC into your Terraform state and let Terraform manage it here. It sounds like you want the VPC to serve other resources, maybe applications deployed manually or through other TF stacks, and to live independently of any one application deployment.

To answer the original question, you can use a data source and match your VPC by ID or tag name:
data "aws_vpc" "main" {
  tags = {
    Name = "main_vpc"
  }
}
Or
data "aws_vpc" "main" {
  id = "vpc-nnnnnnnn"
}
Then refer to it with: data.aws_vpc.main
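As an illustration of referring to the data source, the sketch below creates a security group inside the existing VPC. The security group and its name are hypothetical and not part of the question.

```hcl
# Read-only lookup of the existing VPC; Terraform does not manage it.
data "aws_vpc" "main" {
  id = "vpc-nnnnnnnn"
}

# A resource that Terraform does manage, placed inside the existing VPC.
resource "aws_security_group" "web" {
  name   = "web-sg" # placeholder name
  vpc_id = data.aws_vpc.main.id
}
```

Because the VPC is only a data source here, terraform destroy removes the security group and leaves the VPC alone.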
Also, if you have already imported your VPC and want to remove it from your state without destroying it, you can do that with the terraform state command (for example, terraform state rm aws_vpc.my_vpc): https://www.terraform.io/docs/commands/state/index.html
