This is basically the same issue as in this question, but the answers there didn't get me to a solution.
My configuration is: 1 VPC, 1 subnet, 1 security group. My Lambda runs in the VPC/subnet/security group and tries to add a message to an SQS queue, but gets a timeout. I've double-checked the permissions granted to the lambda, the policy on the VPC Endpoint, the policy on the SQS queue, opened the rules on the Security Group, ensured the Network ACLs are open.
I successfully went through this tutorial, which sets up VPC/etc+EC2 with cloudformation, then demonstrates sending a message to SQS from EC2.
To reproduce my problem, I started with the cloudformation from that tutorial and added the following to it:
- the VPC Endpoint (rather than creating it through the console like in the tutorial)
- a Lambda (plus IAM role+policy) in the same VPC that tries to send a message to the SQS queue
The resulting cloudformation template is below.
I can reproduce the problem like this:
1. Create the cloudformation template (see below) (note that I had to make one small change to the template in the tutorial to get it to work in us-west-2).
2. SSH to the EC2 and run the command to send an SQS message (see step 5 from the tutorial). This succeeds.
3. In the console, go to the Lambda, paste the URL of the SQS queue into the code, deploy, and run the lambda. It times out.
4. In the console, edit the Lambda configuration to set VPC=None, then rerun the lambda. It succeeds.
So the SQS queue is accessible by the lambda outside the VPC, and by EC2 inside the VPC/subnet/sg, but not the lambda inside the VPC/subnet/sg.
Any idea what could be missing?
Cloudformation (from tutorial + my additions):
# Copied from this tutorial: https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-sending-messages-from-vpc.html
AWSTemplateFormatVersion: 2010-09-09
Description: CloudFormation Template for SQS VPC Endpoints Tutorial
Parameters:
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: 'AWS::EC2::KeyPair::KeyName'
    ConstraintDescription: must be the name of an existing EC2 KeyPair.
  SSHLocation:
    Description: The IP address range that can be used to SSH to the EC2 instance
    Type: String
    MinLength: '9'
    MaxLength: '18'
    Default: 0.0.0.0/0
    AllowedPattern: '(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})'
    ConstraintDescription: must be a valid IP CIDR range of the form x.x.x.x/x.
Conditions:
  IsT3Supported: !Equals [!Ref 'AWS::Region', eu-north-1]
Mappings:
  RegionMap:
    us-east-1:
      AMI: ami-428aa838
    us-east-2:
      AMI: ami-710e2414
    us-west-1:
      AMI: ami-4a787a2a
    us-west-2:
      AMI: ami-7f43f307
    ap-northeast-1:
      AMI: ami-c2680fa4
    ap-northeast-2:
      AMI: ami-3e04a450
    ap-southeast-1:
      AMI: ami-4f89f533
    ap-southeast-2:
      AMI: ami-38708c5a
    ap-south-1:
      AMI: ami-3b2f7954
    ca-central-1:
      AMI: ami-7549cc11
    eu-central-1:
      AMI: ami-1b2bb774
    eu-west-1:
      AMI: ami-db1688a2
    eu-west-2:
      AMI: ami-6d263d09
    eu-north-1:
      AMI: ami-87fe70f9
    eu-west-3:
      AMI: ami-5ce55321
    sa-east-1:
      AMI: ami-f1337e9d
Resources:
  VPC:
    Type: 'AWS::EC2::VPC'
    Properties:
      CidrBlock: 10.0.0.0/16
      EnableDnsSupport: 'true'
      EnableDnsHostnames: 'true'
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-VPC
  Subnet:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref VPC
      # I had to add (uncomment) this line to avoid using us-west-2d, which doesn't support the instance type
      # AvailabilityZone: us-west-2a
      CidrBlock: 10.0.0.0/24
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-Subnet
  InternetGateway:
    Type: 'AWS::EC2::InternetGateway'
    Properties:
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-InternetGateway
  VPCGatewayAttachment:
    Type: 'AWS::EC2::VPCGatewayAttachment'
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway
  RouteTable:
    Type: 'AWS::EC2::RouteTable'
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-RouteTable
  SubnetRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      RouteTableId: !Ref RouteTable
      SubnetId: !Ref Subnet
  InternetGatewayRoute:
    Type: 'AWS::EC2::Route'
    Properties:
      RouteTableId: !Ref RouteTable
      GatewayId: !Ref InternetGateway
      DestinationCidrBlock: 0.0.0.0/0
  SecurityGroup:
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupName: SQS VPCE Tutorial Security Group
      GroupDescription: Security group for SQS VPC endpoint tutorial
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: '-1'
          CidrIp: 10.0.0.0/16
        - IpProtocol: tcp
          FromPort: '22'
          ToPort: '22'
          CidrIp: !Ref SSHLocation
      SecurityGroupEgress:
        - IpProtocol: '-1'
          CidrIp: 10.0.0.0/16
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-SecurityGroup
  EC2Instance:
    Type: 'AWS::EC2::Instance'
    Properties:
      KeyName: !Ref KeyName
      InstanceType: !If [IsT3Supported, t3.micro, t2.micro]
      ImageId: !FindInMap
        - RegionMap
        - !Ref 'AWS::Region'
        - AMI
      NetworkInterfaces:
        - AssociatePublicIpAddress: 'true'
          DeviceIndex: '0'
          GroupSet:
            - !Ref SecurityGroup
          SubnetId: !Ref Subnet
      IamInstanceProfile: !Ref EC2InstanceProfile
      Tags:
        - Key: Name
          Value: SQS-VPCE-Tutorial-EC2Instance
  EC2InstanceProfile:
    Type: 'AWS::IAM::InstanceProfile'
    Properties:
      Roles:
        - !Ref EC2InstanceRole
      InstanceProfileName: !Sub 'EC2InstanceProfile-${AWS::Region}'
  EC2InstanceRole:
    Type: 'AWS::IAM::Role'
    Properties:
      RoleName: !Sub 'SQS-VPCE-Tutorial-EC2InstanceRole-${AWS::Region}'
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: 'sts:AssumeRole'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/AmazonSQSFullAccess'
  CFQueue:
    Type: 'AWS::SQS::Queue'
    Properties:
      VisibilityTimeout: 60
  # Stuff I added starting here:
  VPCEndpointForSQS:
    Type: 'AWS::EC2::VPCEndpoint'
    Properties:
      VpcEndpointType: 'Interface'
      PolicyDocument:
        Statement:
          - Action: '*'
            Effect: Allow
            Resource: '*'
            Principal: '*'
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.sqs'
      VpcId: !Ref VPC
      SubnetIds:
        - !Ref Subnet
      PrivateDnsEnabled: true
      SecurityGroupIds:
        - !Ref SecurityGroup
  LambdaRole:
    Type: 'AWS::IAM::Role'
    Properties:
      RoleName: !Sub 'SQS-VPCE-Tutorial-LambdaRole-${AWS::Region}'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/AmazonSQSFullAccess'
        - 'arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole'
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - 'sts:AssumeRole'
  LambdaPolicy:
    Type: 'AWS::IAM::Policy'
    Properties:
      PolicyName: !Sub 'SQS-VPCE-Tutorial-LambdaPolicy-${AWS::Region}'
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Action:
              - 'logs:CreateLogGroup'
            Resource: '*'
          - Effect: Allow
            Action:
              - logs:CreateLogStream
              - logs:PutLogEvents
            Resource: '*'
      Roles:
        - !Ref LambdaRole
  LambdaFunction:
    Type: 'AWS::Lambda::Function'
    Properties:
      FunctionName: 'SQS-VPCE-Tutorial-Lambda'
      Role: !GetAtt LambdaRole.Arn
      Runtime: 'python3.9'
      Handler: 'index.lambda_handler'
      Timeout: 20
      VpcConfig:
        SecurityGroupIds:
          - !Ref SecurityGroup
        SubnetIds:
          - !Ref Subnet
      Code:
        ZipFile: |
          import json
          import boto3
          from botocore.exceptions import ClientError
          sqs = boto3.resource('sqs')
          queue = sqs.Queue('<INSERT SQS QUEUE URL HERE>')
          def lambda_handler(event, context):
              print("before")
              queue.send_message(MessageBody='Hello from Amazon SQS.')
              print("after")
Of course, as soon as I posted this, I found this answer: there is a bug in boto3 that prevents it from using the VPC Endpoint for SQS by default. I tried the solution there and it solved the problem!
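For anyone landing here, the workaround is usually described as giving boto3 an explicit regional SQS endpoint, so the request goes to the `sqs.<region>.amazonaws.com` hostname instead of whatever boto3 picks by default. Below is a minimal sketch of that idea, not the exact code from the linked answer. It assumes the standard `aws` partition, and the `QUEUE_URL` environment variable and the helper names are made up for illustration.

```python
import os


def sqs_endpoint_url(region: str) -> str:
    """Build the regional SQS endpoint URL (assumes the standard 'aws' partition)."""
    return f"https://sqs.{region}.amazonaws.com"


def make_queue(queue_url: str, region: str):
    """Return an SQS Queue resource pinned to the regional endpoint."""
    # Imported lazily so sqs_endpoint_url() can be used without boto3 installed.
    import boto3

    sqs = boto3.resource(
        "sqs",
        region_name=region,
        endpoint_url=sqs_endpoint_url(region),
    )
    return sqs.Queue(queue_url)


def lambda_handler(event, context):
    # AWS_REGION is set by the Lambda runtime; QUEUE_URL is a hypothetical
    # environment variable you would add to the function yourself.
    region = os.environ.get("AWS_REGION", "us-west-2")
    queue = make_queue(os.environ["QUEUE_URL"], region)
    print("before")
    queue.send_message(MessageBody="Hello from Amazon SQS.")
    print("after")
```

Newer boto3 releases may no longer need this, so it is worth testing without `endpoint_url` after upgrading.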
Related
I am trying to launch a containerized gRPC application on AWS Fargate. I've tested the image locally and pushed it to ECR. I've created a task with a role that has permission to reach ECR, yet I am still getting an error pulling the container (error message shown below). I even tried launching the container in a public subnet that auto-assigns public IPs, with an internet gateway and a route table association, and the security group allows all outbound traffic.
The full cloudformation template is given below:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  TicketingAppTaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      ContainerDefinitions:
        - Essential: true
          Image: !Sub "${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/ticketing-app:latest"
          Name: ticketing-app
          PortMappings:
            - ContainerPort: 8080
      Cpu: "1 vCPU"
      ExecutionRoleArn: !Ref ExecutionRole
      Memory: "2 GB" #this is smallest for 1 vcpu .... could maybe decrease
      NetworkMode: awsvpc
      RuntimePlatform:
        CpuArchitecture: X86_64
        OperatingSystemFamily: LINUX
  ExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Action: [ sts:AssumeRole ]
            Effect: Allow
            Principal:
              Service: [ ecs-tasks.amazonaws.com ]
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
  TicketingEcsService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref TicketingEcsCluster
      LaunchType: FARGATE
      #TODO I think we eventually need to specify load balancers here
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED #TODO remove this when done, just seeing if this let's us grab image from ecr?
          SecurityGroups: [ !GetAtt TicketingServiceSecurityGroup.GroupId ]
          Subnets:
            - !Ref TicketingServicePrivateSubnet01
            - !Ref TicketingServicePrivateSubnet02
      TaskDefinition: !Ref TicketingAppTaskDefinition
  TicketingEcsCluster:
    Type: AWS::ECS::Cluster
  TicketingServiceSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: "A security group used for the ticketing app"
      VpcId: !Ref TicketingServiceVpc
  TicketingServiceVpc:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
  TicketingServicePrivateSubnet01:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone: !Sub "${AWS::Region}a"
      VpcId: !Ref TicketingServiceVpc
      CidrBlock: 10.0.0.0/18
  TicketingServicePrivateSubnet02:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone: !Sub "${AWS::Region}b"
      VpcId: !Ref TicketingServiceVpc
      CidrBlock: 10.0.64.0/18
  #TODO public subnets and NAT gateway?
  TicketingSecurityGroupHttpIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      #TODO I would probably use load balancer security group name here once we make one instead of opening up to any ip
      GroupId: !GetAtt TicketingServiceSecurityGroup.GroupId
      CidrIpv6: "::/0"
      FromPort: 8080
      IpProtocol: tcp
      ToPort: 8080
  TicketingSecurityGroupAllTrafficEgress:
    Type: AWS::EC2::SecurityGroupEgress
    Properties:
      GroupId: !GetAtt TicketingServiceSecurityGroup.GroupId
      IpProtocol: "-1" #-1 indicates all -- like a wildcard
      CidrIp: "0.0.0.0/0"
  TicketingServiceInternetGateway:
    Type: AWS::EC2::InternetGateway
    DependsOn: TicketingServiceVpc
  AttachGateway:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref TicketingServiceVpc
      InternetGatewayId: !Ref TicketingServiceInternetGateway
  TicketingAppRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref TicketingServiceVpc
  TicketingVPCRouteAllTrafficToInternetGateway:
    Type: AWS::EC2::Route
    DependsOn: AttachGateway
    Properties:
      RouteTableId: !Ref TicketingAppRouteTable
      DestinationCidrBlock: "0.0.0.0/0"
      GatewayId: !Ref TicketingServiceInternetGateway
Would anyone be able to point out a simple way to get this working? It is just for a PoC, so it is fine if it does not follow best practices (for example, using a public subnet instead of a private subnet and NAT gateway).
Thanks
I had pushed my image to ECR in a different region. I changed the region and it worked (face palm).
So check your region if anyone else is in the same spot as me.
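One way to check this before deploying is to ask ECR, in the region the stack deploys to, whether the repository and tag exist there. This is only a sketch: `image_exists` is a helper name invented here, and it relies on the boto3 ECR client's `describe_images` call and its `RepositoryNotFoundException` / `ImageNotFoundException` error classes.

```python
def image_exists(ecr_client, repository: str, tag: str) -> bool:
    """Return True if repository:tag exists in the region the client is bound to."""
    try:
        ecr_client.describe_images(
            repositoryName=repository,
            imageIds=[{"imageTag": tag}],
        )
    except (
        ecr_client.exceptions.RepositoryNotFoundException,
        ecr_client.exceptions.ImageNotFoundException,
    ):
        # Either the repository or the tag is missing in this region.
        return False
    return True


# Example usage (needs AWS credentials; the region is a placeholder):
#   import boto3
#   ecr = boto3.client("ecr", region_name="us-east-1")
#   print(image_exists(ecr, "ticketing-app", "latest"))
```

Because the template builds the image URI from `${AWS::Region}`, the task always pulls from the stack's own region, so that is the region to pass to the client.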
I've created a VPC and EKS cluster using CloudFormation, and when I try to create an AWS managed nodegroup through CloudFormation, it fails to create with the error message: Nodegroup test-ng failed to stabilize: [{Code: NodeCreationFailure,Message: Unhealthy nodes in the kubernetes cluster. I haven't been able to pinpoint the exact issue, but my setup is based on the AWS docs. For reference, I want a VPC with 3 public and 3 private subnets, with the managed nodegroup deployed to the private subnets. Here are the templates I've used to deploy everything:
VPC Template:
---
AWSTemplateFormatVersion: '2010-09-09'
Description: 'EKS VPC - Private and Public Subnets'
Parameters:
  VpcName:
    Type: String
    Default: EKS-VPC
    Description: The name of the VPC
  VpcBlock:
    Type: String
    Default: 10.0.0.0/16
    Description: The CIDR range for the VPC. This should be a valid private (RFC 1918) CIDR range.
  PrivateSubnet01Block:
    Type: String
    Default: 10.0.0.0/19
    Description: CidrBlock for private subnet 01 within the VPC
  PrivateSubnet02Block:
    Type: String
    Default: 10.0.32.0/19
    Description: CidrBlock for private subnet 02 within the VPC
  PrivateSubnet03Block:
    Type: String
    Default: 10.0.64.0/19
    Description: CidrBlock for private subnet 03 within the VPC
  PublicSubnet01Block:
    Type: String
    Default: 10.0.128.0/20
    Description: CidrBlock for public subnet 01 within the VPC
  PublicSubnet02Block:
    Type: String
    Default: 10.0.144.0/20
    Description: CidrBlock for public subnet 02 within the VPC
  PublicSubnet03Block:
    Type: String
    Default: 10.0.160.0/20
    Description: CidrBlock for public subnet 03 within the VPC
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      -
        Label:
          default: "Main"
        Parameters:
          - VpcName
      -
        Label:
          default: "Network Configuration"
        Parameters:
          - VpcBlock
          - PublicSubnet01Block
          - PublicSubnet02Block
          - PublicSubnet03Block
          - PrivateSubnet01Block
          - PrivateSubnet02Block
          - PrivateSubnet03Block
Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcBlock
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: !Ref VpcName
  InternetGateway:
    Type: "AWS::EC2::InternetGateway"
  VPCGatewayAttachment:
    Type: "AWS::EC2::VPCGatewayAttachment"
    Properties:
      InternetGatewayId: !Ref InternetGateway
      VpcId: !Ref VPC
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: Public Subnets RT
        - Key: Network
          Value: Public
  PrivateRouteTable01:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: Private Subnet 01 RT
        - Key: Network
          Value: Private
  PrivateRouteTable02:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: Private Subnet 02 RT
        - Key: Network
          Value: Private
  PrivateRouteTable03:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: Private Subnet 03 RT
        - Key: Network
          Value: Private
  PublicRoute:
    DependsOn: VPCGatewayAttachment
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway
  PrivateRoute01:
    DependsOn:
      - VPCGatewayAttachment
      - NatGateway01
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable01
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGateway01
  PrivateRoute02:
    DependsOn:
      - VPCGatewayAttachment
      - NatGateway02
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable02
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGateway02
  PrivateRoute03:
    DependsOn:
      - VPCGatewayAttachment
      - NatGateway03
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable03
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGateway03
  NatGateway01:
    DependsOn:
      - NatGatewayEIP1
      - PublicSubnet01
      - VPCGatewayAttachment
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt 'NatGatewayEIP1.AllocationId'
      SubnetId: !Ref PublicSubnet01
      Tags:
        - Key: Name
          Value: !Sub '${VpcName}-NatGateway01'
  NatGateway02:
    DependsOn:
      - NatGatewayEIP2
      - PublicSubnet02
      - VPCGatewayAttachment
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt 'NatGatewayEIP2.AllocationId'
      SubnetId: !Ref PublicSubnet02
      Tags:
        - Key: Name
          Value: !Sub '${VpcName}-NatGateway02'
  NatGateway03:
    DependsOn:
      - NatGatewayEIP3
      - PublicSubnet03
      - VPCGatewayAttachment
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt 'NatGatewayEIP3.AllocationId'
      SubnetId: !Ref PublicSubnet03
      Tags:
        - Key: Name
          Value: !Sub '${VpcName}-NatGateway03'
  NatGatewayEIP1:
    DependsOn:
      - VPCGatewayAttachment
    Type: 'AWS::EC2::EIP'
    Properties:
      Domain: vpc
  NatGatewayEIP2:
    DependsOn:
      - VPCGatewayAttachment
    Type: 'AWS::EC2::EIP'
    Properties:
      Domain: vpc
  NatGatewayEIP3:
    DependsOn:
      - VPCGatewayAttachment
    Type: 'AWS::EC2::EIP'
    Properties:
      Domain: vpc
  PublicSubnet01:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Public Subnet 01
    Properties:
      MapPublicIpOnLaunch: true
      AvailabilityZone:
        Fn::Select:
          - '0'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PublicSubnet01Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${VpcName}-PublicSubnet01"
        - Key: kubernetes.io/role/elb
          Value: 1
  PublicSubnet02:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Public Subnet 02
    Properties:
      MapPublicIpOnLaunch: true
      AvailabilityZone:
        Fn::Select:
          - '1'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PublicSubnet02Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${VpcName}-PublicSubnet02"
        - Key: kubernetes.io/role/elb
          Value: 1
  PublicSubnet03:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Public Subnet 03
    Properties:
      MapPublicIpOnLaunch: true
      AvailabilityZone:
        Fn::Select:
          - '2'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PublicSubnet03Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${VpcName}-PublicSubnet03"
        - Key: kubernetes.io/role/elb
          Value: 1
  PrivateSubnet01:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Private Subnet 01
    Properties:
      AvailabilityZone:
        Fn::Select:
          - '0'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PrivateSubnet01Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${VpcName}-PrivateSubnet01"
        - Key: kubernetes.io/role/internal-elb
          Value: 1
  PrivateSubnet02:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Private Subnet 02
    Properties:
      AvailabilityZone:
        Fn::Select:
          - '1'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PrivateSubnet02Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${VpcName}-PrivateSubnet02"
        - Key: kubernetes.io/role/internal-elb
          Value: 1
  PrivateSubnet03:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Private Subnet 03
    Properties:
      AvailabilityZone:
        Fn::Select:
          - '2'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PrivateSubnet03Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${VpcName}-PrivateSubnet03"
        - Key: kubernetes.io/role/internal-elb
          Value: 1
  PublicSubnet01RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet01
      RouteTableId: !Ref PublicRouteTable
  PublicSubnet02RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet02
      RouteTableId: !Ref PublicRouteTable
  PublicSubnet03RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet03
      RouteTableId: !Ref PublicRouteTable
  PrivateSubnet01RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnet01
      RouteTableId: !Ref PrivateRouteTable01
  PrivateSubnet02RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnet02
      RouteTableId: !Ref PrivateRouteTable02
  PrivateSubnet03RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnet03
      RouteTableId: !Ref PrivateRouteTable03
  ControlPlaneSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Cluster communication with worker nodes
      VpcId: !Ref VPC
  WorkerNodeSshSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: SG for ssh access to worker nodes in managed nodegroup
      VpcId: !Ref VPC
Outputs:
  PublicSubnetIds:
    Description: Public Subnets IDs in the VPC
    Value: !Join [ ", ", [ !Ref PublicSubnet01, !Ref PublicSubnet02, !Ref PublicSubnet03 ] ]
  PrivateSubnetIds:
    Description: Private Subnets IDs in the VPC
    Value: !Join [ ", ", [ !Ref PrivateSubnet01, !Ref PrivateSubnet02, !Ref PrivateSubnet03 ] ]
  ControlPlaneSecurityGroups:
    Description: Security group for the cluster control plane communication with worker nodes
    Value: !Join [ ",", [ !Ref ControlPlaneSecurityGroup ] ]
  WorkerNodeSshSecurityGroup:
    Description: SG for ssh access to worker nodes in managed nodegroup
    Value: !Ref WorkerNodeSshSecurityGroup
  VpcId:
    Description: The VPC Id
    Value: !Ref VPC
IAM Roles Template:
Mappings:
  ServicePrincipals:
    aws-cn:
      ec2: ec2.amazonaws.com.cn
    aws-us-gov:
      ec2: ec2.amazonaws.com
    aws:
      ec2: ec2.amazonaws.com
Resources:
  eksClusterRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - eks.amazonaws.com
            Action:
              - sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
  NodeInstanceRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - !FindInMap [ServicePrincipals, !Ref "AWS::Partition", ec2]
            Action:
              - "sts:AssumeRole"
      ManagedPolicyArns:
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy"
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
      Path: /
Outputs:
  eksClusterRoleArn:
    Description: The role that Amazon EKS will use to create AWS resources for Kubernetes clusters
    Value: !GetAtt eksClusterRole.Arn
  NodeInstanceRole:
    Description: The node instance role
    Value: !GetAtt NodeInstanceRole.Arn
EKS Cluster Template:
---
AWSTemplateFormatVersion: '2010-09-09'
Description: 'EKS Production-Grade Cluster'
Parameters:
  KubernetesVersion:
    Type: String
    Description: The EKS supported Kubernetes version for your cluster
    Default: 1.19
    AllowedValues:
      - 1.19
      - 1.18
      - 1.17
  ClusterName:
    Type: String
    Description: The name of the cluster
  ControlPlaneClusterRoleArn:
    Type: String
    Description: The eksClusterRole Arn to use for the eks cluster (control plane)
  ControlPlaneSubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
    Description: Register your public and private subnets with your EKS managed control plane
  SecurityGroupIds:
    Type: List<AWS::EC2::SecurityGroup::Id>
    Description: The security group(s) for the cross-account elastic network interfaces that Amazon EKS creates to use to allow communication between your nodes and the Kubernetes control plane.
Resources:
  EKSManagedControlPlane:
    Type: AWS::EKS::Cluster
    Properties:
      KubernetesNetworkConfig:
        ServiceIpv4Cidr: 172.16.0.0/12
      Name: !Ref ClusterName
      ResourcesVpcConfig:
        SecurityGroupIds: !Ref SecurityGroupIds
        SubnetIds: !Ref ControlPlaneSubnetIds
      RoleArn: !Ref ControlPlaneClusterRoleArn
      Version: !Ref KubernetesVersion
Managed Nodegroup Template:
---
AWSTemplateFormatVersion: '2010-09-09'
Description: 'EKS Production-Grade Nodegroup'
Parameters:
  KubernetesVersion:
    Type: String
    Description: The EKS supported Kubernetes version for your cluster
    Default: 1.19
    AllowedValues:
      - 1.19
      - 1.18
      - 1.17
  ClusterName:
    Type: String
    Description: The name of the cluster
  NodeInstanceRoleArn:
    Type: String
    Description: The NodeInstanceRole Arn to use for the eks nodegroup (managed data plane)
  DataPlanePrivateSubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
    Description: Private subnets for your Amazon EKS data plane nodes
  WorkerNodeGroupName:
    Type: String
    Description: The name of the node group for the worker nodes in the data plane. Right now we are only supporting 1 node group per cluster.
  WorkerNodesInstanceType:
    Type: String
    Description: The instance type for the worker nodes in the data plane. Right now we are only supporting 1 instance type for all worker nodes.
    Default: t3.medium
    AllowedValues:
      - t3.small
      - t3.medium
      - t3.large
      - m5.large
      - m5.xlarge
      - c5.large
      - c5.xlarge
  WorkerNodesEc2SshKey:
    Type: AWS::EC2::KeyPair::KeyName
    Description: The Amazon EC2 SSH key that provides access for SSH communication with the nodes in the managed node group
  SourceSecurityGroupsForWorkerNodes:
    Type: List<AWS::EC2::SecurityGroup::Id>
    Description: The security groups that are allowed SSH access (port 22) to the nodes. If you specify an Amazon EC2 SSH key but do not specify a source security group when you create a managed node group, then port 22 on the nodes is opened to the internet.
Resources:
  EKSManagedDataPlane:
    Type: AWS::EKS::Nodegroup
    Properties:
      AmiType: AL2_x86_64
      CapacityType: ON_DEMAND
      ClusterName: !Ref ClusterName
      ForceUpdateEnabled: false
      InstanceTypes:
        - !Ref WorkerNodesInstanceType
      NodegroupName: !Ref WorkerNodeGroupName
      NodeRole: !Ref NodeInstanceRoleArn
      RemoteAccess:
        Ec2SshKey: !Ref WorkerNodesEc2SshKey
        SourceSecurityGroups: !Ref SourceSecurityGroupsForWorkerNodes
      ScalingConfig:
        DesiredSize: 3
        MaxSize: 4
        MinSize: 3
      Subnets: !Ref DataPlanePrivateSubnetIds
      Version: !Ref KubernetesVersion
For the EKS Cluster template, I use the eksClusterRole ARN for the EKS cluster parameter, and I pass in all 6 subnet ids (public and private) from the VPC template output when creating the cluster. The SG id I pass in from the VPC ControlPlaneSecurityGroups output field as well.
For the Managed Nodegroup template I only pass it the private subnet ids from the VPC output, the ssh security group id from the VPC output, and the NodeInstanceRole ARN. I have ensured that the cluster name matches what I have given for the EKS Cluster template when creating that.
I plan on configuring the VPC CNI plugin to use IAM Roles for Service accounts once the cluster and managed nodegroup are setup, which is why I have left the policy for the CNI off of that IAM role.
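To make the wiring above concrete, here is a rough sketch of how the VPC and IAM stack outputs could be turned into the parameter list for the Managed Nodegroup stack. It is an illustration only. The helper names are invented, and the input dicts are assumed to have the shape of one entry of `Stacks` as returned by boto3's CloudFormation `describe_stacks`.

```python
def outputs_to_dict(stack_description: dict) -> dict:
    """Flatten a describe_stacks 'Stacks' entry into {OutputKey: OutputValue}."""
    return {
        output["OutputKey"]: output["OutputValue"]
        for output in stack_description.get("Outputs", [])
    }


def nodegroup_parameters(vpc_stack: dict, iam_stack: dict, cluster_name: str,
                         nodegroup_name: str, ssh_key_name: str) -> list:
    """Build the Parameters list for the Managed Nodegroup template."""
    vpc = outputs_to_dict(vpc_stack)
    iam = outputs_to_dict(iam_stack)
    # The VPC template joins subnet IDs with ", ", so strip the spaces here.
    private_subnets = ",".join(
        subnet.strip() for subnet in vpc["PrivateSubnetIds"].split(",")
    )
    values = {
        "ClusterName": cluster_name,
        "WorkerNodeGroupName": nodegroup_name,
        "WorkerNodesEc2SshKey": ssh_key_name,
        "NodeInstanceRoleArn": iam["NodeInstanceRole"],
        "DataPlanePrivateSubnetIds": private_subnets,
        "SourceSecurityGroupsForWorkerNodes": vpc["WorkerNodeSshSecurityGroup"],
    }
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in values.items()
    ]
```

The returned list has the shape that `create_stack` expects for its `Parameters` argument. Parameters left out, such as `KubernetesVersion`, fall back to the template defaults.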
I have a custom Lambda resource that initializes my DB and is then supposed to make the call to the presigned S3 URL when done. It initializes the DB correctly but times out when making the call to S3. My guess is that I did something wrong in my CloudFormation template, given my limited networking knowledge. I would appreciate any help. Thank you in advance!
Trimmed down YAML:
AWSTemplateFormatVersion: 2010-09-09
Transform: "AWS::Serverless-2016-10-31"
Resources:
  InternetGateway:
    Type: "AWS::EC2::InternetGateway"
    Properties:
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-InternetGateway
  VPC:
    Type: "AWS::EC2::VPC"
    Properties:
      CidrBlock: 10.0.0.0/16
      EnableDnsSupport: "true"
      EnableDnsHostnames: "true"
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-VPC
  VPCGatewayAttachment:
    Type: "AWS::EC2::VPCGatewayAttachment"
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway
  NATGateway:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt ElasticIPAddress.AllocationId
      SubnetId: !Ref PublicSubnet1
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-NATGateway
  ElasticIPAddress:
    Type: AWS::EC2::EIP
    Properties:
      Domain: VPC
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-EIP
  PublicRouteTable:
    Type: "AWS::EC2::RouteTable"
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: Public
  PublicRoute1:
    Type: "AWS::EC2::Route"
    Properties:
      RouteTableId: !Ref PublicRouteTable
      GatewayId: !Ref InternetGateway
      DestinationCidrBlock: 0.0.0.0/0
    DependsOn:
      - InternetGateway
  PublicSubnet1:
    Type: "AWS::EC2::Subnet"
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.0.0/24
      AvailabilityZone: !Select [0, !GetAZs ]
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-PublicSubnet1
  CreateRDSDatabaseLambdaSG:
    Type: "AWS::EC2::SecurityGroup"
    Properties:
      VpcId: !Ref VPC
      GroupDescription: Allow Lambda to access RDS in same VPC
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-CreateRDSDatabaseLambdaSG
  LambdaRDSCFNInit:
    Type: AWS::Serverless::Function
    DependsOn:
      - InternetGateway
      - VPC
      - VPCGatewayAttachment
      - NATGateway
      - ElasticIPAddress
      - PublicRouteTable
      - PublicRoute1
      - PublicSubnet1
      - CreateRDSDatabaseLambdaSG
    Properties:
      CodeUri: CreateRDSDatabase/
      Description: "Lambda function which will execute when this CFN template is created, updated or deleted"
      Handler: app.createRDSDatabase
      Runtime: nodejs12.x
      Timeout: 300
      VpcConfig:
        SecurityGroupIds:
          - !Ref CreateRDSDatabaseLambdaSG
        SubnetIds:
          - !Ref PublicSubnet1
      Environment:
        Variables:
          RDS_ENDPOINT: !GetAtt RDSCluster.Endpoint.Address
          RDS_DB_NAME: !Ref RDSDBName
          RDS_USERNAME: !Ref RDSUserName
          RDS_PASSWORD: !Ref RDSPassword
  LambdaRDSCFNTrigger:
    Type: Custom::ProvisionRDS
    DependsOn: LambdaRDSCFNInit
    Version: 1.0
    Properties:
      ServiceToken: !GetAtt LambdaRDSCFNInit.Arn
You are placing your Lambda in PublicSubnet1, so it will not have internet connectivity despite your NAT or internet gateway. You need to place your function in a private subnet and configure that private subnet to route through the NAT gateway. From the docs:
To access private resources, connect your function to private subnets. If your function needs internet access, use network address translation (NAT). Connecting a function to a public subnet doesn't give it internet access or a public IP address.
Alternatively, use an S3 VPC gateway endpoint and associate it with the route tables of your public subnet. This way your function will access S3 through the gateway endpoint rather than over the internet. There is no need for a NAT or a private subnet in this case.
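As a sketch of the gateway endpoint option, the snippet below generates the CloudFormation resource as JSON, which can be merged into the template's `Resources` section. The logical IDs `VPC` and `PublicRouteTable` come from the question's template. The resource name `S3GatewayEndpoint` and the helper are made up here, and it assumes `PublicRouteTable` is the route table actually associated with the function's subnet.

```python
import json


def s3_gateway_endpoint(vpc_logical_id: str, route_table_logical_ids: list) -> dict:
    """Build an AWS::EC2::VPCEndpoint resource for an S3 gateway endpoint."""
    return {
        "Type": "AWS::EC2::VPCEndpoint",
        "Properties": {
            "VpcEndpointType": "Gateway",
            "ServiceName": {"Fn::Sub": "com.amazonaws.${AWS::Region}.s3"},
            "VpcId": {"Ref": vpc_logical_id},
            # S3 traffic from subnets using these route tables goes via the endpoint.
            "RouteTableIds": [{"Ref": rt} for rt in route_table_logical_ids],
        },
    }


if __name__ == "__main__":
    resources = {"S3GatewayEndpoint": s3_gateway_endpoint("VPC", ["PublicRouteTable"])}
    print(json.dumps(resources, indent=2))
```

No endpoint policy is set, so the default full-access policy applies. It can be tightened later if needed.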
I have been following this guide to create a Kubernetes cluster via CloudFormation, but the NodeGroup never joins the cluster, and I never get an error or explanation about why it is not joining.
I can see that the autoscaling group and the EC2 machines are created, but EKS reports that there are no node groups.
If I create a new node group manually through the web admin tool, it works, but it assigns different security groups and uses a launch template instead of a launch configuration.
Same AMI, same IAM role, same machine type...
I am very new to both CloudFormation and EKS, and I don't know how to proceed to find out what the problem is.
Here is the template:
Description: >
  Kubernetes cluster
Parameters:
  EnvironmentName:
    Description: An environment name that will be prefixed to resource names
    Type: String
  KeyName:
    Description: The EC2 Key Pair to allow SSH access to the instances
    Type: AWS::EC2::KeyPair::KeyName
  VpcBlock:
    Type: String
    Default: 192.168.0.0/16
    Description: The CIDR range for the VPC. This should be a valid private (RFC 1918) CIDR range.
  Subnet01Block:
    Type: String
    Default: 192.168.64.0/18
    Description: CidrBlock for subnet 01 within the VPC
  Subnet02Block:
    Type: String
    Default: 192.168.128.0/18
    Description: CidrBlock for subnet 02 within the VPC
  Subnet03Block:
    Type: String
    Default: 192.168.192.0/18
    Description: CidrBlock for subnet 03 within the VPC. This is used only if the region has more than 2 AZs.
  NodeInstanceType:
    Description: EC2 instance type for the node instances
    Type: String
  NodeImageId:
    Type: AWS::EC2::Image::Id
    Description: AMI id for the node instances.
  NodeAutoScalingGroupMinSize:
    Type: Number
    Description: Minimum size of Node Group ASG.
    Default: 1
  NodeAutoScalingGroupMaxSize:
    Type: Number
    Description: Maximum size of Node Group ASG. Set to at least 1 greater than NodeAutoScalingGroupDesiredCapacity.
    Default: 3
  NodeAutoScalingGroupDesiredCapacity:
    Type: Number
    Description: Desired capacity of Node Group ASG.
    Default: 3
  BootstrapArguments:
    Description: Arguments to pass to the bootstrap script. See files/bootstrap.sh in https://github.com/awslabs/amazon-eks-ami
    Default: ""
    Type: String
Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcBlock
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Environment
          Value: !Ref EnvironmentName
  InternetGateway:
    Type: "AWS::EC2::InternetGateway"
    Properties:
      Tags:
        - Key: Environment
          Value: !Ref EnvironmentName
  VPCGatewayAttachment:
    Type: "AWS::EC2::VPCGatewayAttachment"
    Properties:
      InternetGatewayId: !Ref InternetGateway
      VpcId: !Ref VPC
  RouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Environment
          Value: !Ref EnvironmentName
  Route:
    DependsOn: VPCGatewayAttachment
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref RouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway
  Subnet01:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone: !Select [ 0, !GetAZs '' ]
      CidrBlock: !Ref Subnet01Block
      VpcId: !Ref VPC
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Environment
          Value: !Ref EnvironmentName
  Subnet02:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Subnet 02
    Properties:
      AvailabilityZone: !Select [ 1, !GetAZs '' ]
      CidrBlock: !Ref Subnet02Block
      VpcId: !Ref VPC
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Environment
          Value: !Ref EnvironmentName
  Subnet03:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Subnet 03
    Properties:
      AvailabilityZone: !Select [ 2, !GetAZs '' ]
      CidrBlock: !Ref Subnet03Block
      VpcId: !Ref VPC
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Environment
          Value: !Ref EnvironmentName
  Subnet01RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref Subnet01
      RouteTableId: !Ref RouteTable
  Subnet02RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref Subnet02
      RouteTableId: !Ref RouteTable
  Subnet03RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref Subnet03
      RouteTableId: !Ref RouteTable
  ControlPlaneSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Cluster communication with worker nodes
      VpcId: !Ref VPC
  ClusterRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub ${EnvironmentName}KubernetesClusterRole
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: eks.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonEKSServicePolicy
        - arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
      Tags:
        - Key: Environment
          Value: !Ref EnvironmentName
  Cluster:
    Type: AWS::EKS::Cluster
    Properties:
      Name: !Sub ${EnvironmentName}KubernetesCluster
      RoleArn: !GetAtt ClusterRole.Arn
      ResourcesVpcConfig:
        SecurityGroupIds:
          - !Ref ControlPlaneSecurityGroup
        SubnetIds:
          - !Ref Subnet01
          - !Ref Subnet02
          - !Ref Subnet03
  NodeRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub ${EnvironmentName}KubernetesNodeRole
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
        - arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess
      Path: /
      Tags:
        - Key: Environment
          Value: !Ref EnvironmentName
  NodeInstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Path: "/"
      Roles:
        - !Ref NodeRole
  NodeSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for all nodes in the cluster
      VpcId: !Ref VPC
      Tags:
        - Key: !Sub "kubernetes.io/cluster/${EnvironmentName}KubernetesCluster"
Value: 'owned'
- Key: Environment
Value: !Ref EnvironmentName
NodeSecurityGroupIngress:
Type: AWS::EC2::SecurityGroupIngress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow node to communicate with each other
GroupId: !Ref NodeSecurityGroup
SourceSecurityGroupId: !Ref NodeSecurityGroup
IpProtocol: '-1'
FromPort: 0
ToPort: 65535
NodeSecurityGroupFromControlPlaneIngress:
Type: AWS::EC2::SecurityGroupIngress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow worker Kubelets and pods to receive communication from the cluster control plane
GroupId: !Ref NodeSecurityGroup
SourceSecurityGroupId: !Ref ControlPlaneSecurityGroup
IpProtocol: tcp
FromPort: 1025
ToPort: 65535
ControlPlaneEgressToNodeSecurityGroup:
Type: AWS::EC2::SecurityGroupEgress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow the cluster control plane to communicate with worker Kubelet and pods
GroupId: !Ref ControlPlaneSecurityGroup
DestinationSecurityGroupId: !Ref NodeSecurityGroup
IpProtocol: tcp
FromPort: 1025
ToPort: 65535
NodeSecurityGroupFromControlPlaneOn443Ingress:
Type: AWS::EC2::SecurityGroupIngress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow pods running extension API servers on port 443 to receive communication from cluster control plane
GroupId: !Ref NodeSecurityGroup
SourceSecurityGroupId: !Ref ControlPlaneSecurityGroup
IpProtocol: tcp
FromPort: 443
ToPort: 443
ControlPlaneEgressToNodeSecurityGroupOn443:
Type: AWS::EC2::SecurityGroupEgress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow the cluster control plane to communicate with pods running extension API servers on port 443
GroupId: !Ref ControlPlaneSecurityGroup
DestinationSecurityGroupId: !Ref NodeSecurityGroup
IpProtocol: tcp
FromPort: 443
ToPort: 443
ClusterControlPlaneSecurityGroupIngress:
Type: AWS::EC2::SecurityGroupIngress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow pods to communicate with the cluster API Server
GroupId: !Ref ControlPlaneSecurityGroup
SourceSecurityGroupId: !Ref NodeSecurityGroup
IpProtocol: tcp
ToPort: 443
FromPort: 443
NodeGroup:
Type: AWS::AutoScaling::AutoScalingGroup
Properties:
DesiredCapacity: !Ref NodeAutoScalingGroupDesiredCapacity
LaunchConfigurationName: !Ref NodeLaunchConfig
MinSize: !Ref NodeAutoScalingGroupMinSize
MaxSize: !Ref NodeAutoScalingGroupMaxSize
VPCZoneIdentifier:
- !Ref Subnet01
- !Ref Subnet02
- !Ref Subnet03
Tags:
- Key: Name
Value: !Sub "${EnvironmentName}KubernetesCluster-Node"
PropagateAtLaunch: 'true'
- Key: !Sub 'kubernetes.io/cluster/${EnvironmentName}KubernetesCluster'
Value: 'owned'
PropagateAtLaunch: 'true'
UpdatePolicy:
AutoScalingRollingUpdate:
MaxBatchSize: '1'
MinInstancesInService: !Ref NodeAutoScalingGroupDesiredCapacity
PauseTime: 'PT5M'
NodeLaunchConfig:
Type: AWS::AutoScaling::LaunchConfiguration
Properties:
AssociatePublicIpAddress: 'true'
IamInstanceProfile: !Ref NodeInstanceProfile
ImageId: !Ref NodeImageId
InstanceType: !Ref NodeInstanceType
KeyName: !Ref KeyName
SecurityGroups:
- !Ref NodeSecurityGroup
BlockDeviceMappings:
- DeviceName: /dev/xvda
Ebs:
VolumeSize: 20
VolumeType: gp2
DeleteOnTermination: true
UserData:
Fn::Base64:
!Sub |
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh ${EnvironmentName}KubernetesCluster ${BootstrapArguments}
/opt/aws/bin/cfn-signal --exit-code $? \
--stack ${AWS::StackName} \
--resource NodeGroup \
--region ${AWS::Region}
Outputs:
KubernetesClusterName:
Description: Cluster name
Value: !Ref Cluster
Export:
Name: KubernetesClusterName
KubernetesClusterEndpoint:
Description: Cluster endpoint
Value: !GetAtt Cluster.Endpoint
Export:
Name: KubernetesClusterEndpoint
KubernetesNodeInstanceProfile:
Description: The name of the IAM profile for K8
Value: !GetAtt NodeInstanceProfile.Arn
Export:
Name: KubernetesNodeInstanceProfileArn
There are two ways of adding Worker nodes to your EKS cluster:
Launch and register workers on your own (https://docs.aws.amazon.com/eks/latest/userguide/launch-workers.html)
Use managed node groups (https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html)
As far as I can see from your template, you are currently using the first approach. With this approach, the EKS cluster must be ready and in the ACTIVE state before the worker nodes are launched. You can achieve this with the DependsOn attribute. If that does not resolve your issue, have a look at the cloud-init log on a node (/var/log/cloud-init-output.log) to check what happens while it tries to join the cluster.
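A minimal sketch of the DependsOn suggestion, assuming the logical IDs from the template in the question (NodeGroup and Cluster). CloudFormation does not start creating a resource until everything listed in its DependsOn has finished creating, so the Auto Scaling group would only be launched after the cluster resource is complete. Only the DependsOn line is new here; the rest is copied from the question:

```yaml
NodeGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  # Added: wait for the EKS cluster resource before creating the node ASG.
  DependsOn: Cluster
  Properties:
    DesiredCapacity: !Ref NodeAutoScalingGroupDesiredCapacity
    LaunchConfigurationName: !Ref NodeLaunchConfig
    MinSize: !Ref NodeAutoScalingGroupMinSize
    MaxSize: !Ref NodeAutoScalingGroupMaxSize
    VPCZoneIdentifier:
      - !Ref Subnet01
      - !Ref Subnet02
      - !Ref Subnet03
  # Tags and UpdatePolicy stay as in the question's template.
```

The explicit dependency is needed because the launch configuration builds the cluster name with !Sub from EnvironmentName rather than referencing the Cluster resource, so CloudFormation has no implicit ordering between the two.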
If you would like to use managed node groups instead, remove the Auto Scaling group and the launch configuration and use this resource type: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-eks-nodegroup.html
The benefit is that AWS creates the required resources (Auto Scaling group and launch template) in your account for you, and the node group is visible in the AWS Console.
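As a rough sketch of what that replacement could look like, assuming the Cluster, NodeRole, subnets and parameters from the question's template. The logical ID ManagedNodeGroup and the node group name are hypothetical placeholders, not taken from the question:

```yaml
ManagedNodeGroup:  # hypothetical logical ID
  Type: AWS::EKS::Nodegroup
  Properties:
    # !Ref on AWS::EKS::Cluster returns the cluster name and also makes
    # CloudFormation create the cluster before the node group.
    ClusterName: !Ref Cluster
    NodegroupName: !Sub ${EnvironmentName}KubernetesNodes  # hypothetical name
    NodeRole: !GetAtt NodeRole.Arn
    Subnets:
      - !Ref Subnet01
      - !Ref Subnet02
      - !Ref Subnet03
    InstanceTypes:
      - !Ref NodeInstanceType
    ScalingConfig:
      MinSize: !Ref NodeAutoScalingGroupMinSize
      MaxSize: !Ref NodeAutoScalingGroupMaxSize
      DesiredSize: !Ref NodeAutoScalingGroupDesiredCapacity
```

With this resource in place, NodeGroup, NodeLaunchConfig and NodeInstanceProfile from the original template would no longer be needed, since the managed node group takes the role ARN directly.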
I went with the managed node groups option (https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html) and it worked. But how do I define the autoscaling policy? It only lets me set the minimum and maximum node counts, not even a name.
I've recently started learning ECS from the AWS tutorial "Module Two - Deploy the Monolith | AWS".
Reading the YAML file for the CloudFormation stack, I see that it creates two EC2 instances in the cluster and also specifies two public subnets in the VPC. I'm new to VPCs: are the two public subnets needed because two EC2 instances are created?
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  DesiredCapacity:
    Type: Number
    Default: '2'
    Description: Number of instances to launch in your ECS cluster.
  MaxSize:
    Type: Number
    Default: '2'
    Description: Maximum number of instances that can be launched in your ECS cluster.
  InstanceType:
    Description: EC2 instance type
    Type: String
    Default: t2.micro
    AllowedValues: [t2.micro, t2.small, t2.medium, t2.large, m3.medium, m3.large,
      m3.xlarge, m3.2xlarge, m4.large, m4.xlarge, m4.2xlarge, m4.4xlarge, m4.10xlarge,
      c4.large, c4.xlarge, c4.2xlarge, c4.4xlarge, c4.8xlarge, c3.large, c3.xlarge,
      c3.2xlarge, c3.4xlarge, c3.8xlarge, r3.large, r3.xlarge, r3.2xlarge, r3.4xlarge,
      r3.8xlarge, i2.xlarge, i2.2xlarge, i2.4xlarge, i2.8xlarge]
    ConstraintDescription: Please choose a valid instance type.
Mappings:
  AWSRegionToAMI:
    us-east-1:
      AMIID: ami-eca289fb
    us-east-2:
      AMIID: ami-446f3521
    us-west-1:
      AMIID: ami-9fadf8ff
    us-west-2:
      AMIID: ami-7abc111a
    eu-west-1:
      AMIID: ami-a1491ad2
    eu-central-1:
      AMIID: ami-54f5303b
    ap-northeast-1:
      AMIID: ami-9cd57ffd
    ap-southeast-1:
      AMIID: ami-a900a3ca
    ap-southeast-2:
      AMIID: ami-5781be34
  SubnetConfig:
    VPC:
      CIDR: '10.0.0.0/16'
    PublicOne:
      CIDR: '10.0.0.0/24'
    PublicTwo:
      CIDR: '10.0.1.0/24'
Resources:
  # VPC into which stack instances will be placed
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      EnableDnsSupport: true
      EnableDnsHostnames: true
      CidrBlock: !FindInMap ['SubnetConfig', 'VPC', 'CIDR']
  PublicSubnetOne:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
        Fn::Select:
          - 0
          - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PublicOne', 'CIDR']
      MapPublicIpOnLaunch: true
  PublicSubnetTwo:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
        Fn::Select:
          - 1
          - Fn::GetAZs: {Ref: 'AWS::Region'}
      VpcId: !Ref 'VPC'
      CidrBlock: !FindInMap ['SubnetConfig', 'PublicTwo', 'CIDR']
      MapPublicIpOnLaunch: true
  InternetGateway:
    Type: AWS::EC2::InternetGateway
  GatewayAttachement:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref 'VPC'
      InternetGatewayId: !Ref 'InternetGateway'
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref 'VPC'
  PublicRoute:
    Type: AWS::EC2::Route
    DependsOn: GatewayAttachement
    Properties:
      RouteTableId: !Ref 'PublicRouteTable'
      DestinationCidrBlock: '0.0.0.0/0'
      GatewayId: !Ref 'InternetGateway'
  PublicSubnetOneRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetOne
      RouteTableId: !Ref PublicRouteTable
  PublicSubnetTwoRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetTwo
      RouteTableId: !Ref PublicRouteTable
  # ECS Resources
  ECSCluster:
    Type: AWS::ECS::Cluster
  EcsSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: ECS Security Group
      VpcId: !Ref 'VPC'
  EcsSecurityGroupHTTPinbound:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref 'EcsSecurityGroup'
      IpProtocol: tcp
      FromPort: '80'
      ToPort: '80'
      CidrIp: 0.0.0.0/0
  EcsSecurityGroupSSHinbound:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref 'EcsSecurityGroup'
      IpProtocol: tcp
      FromPort: '22'
      ToPort: '22'
      CidrIp: 0.0.0.0/0
  EcsSecurityGroupALBports:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref 'EcsSecurityGroup'
      IpProtocol: tcp
      FromPort: '31000'
      ToPort: '61000'
      SourceSecurityGroupId: !Ref 'EcsSecurityGroup'
  CloudwatchLogsGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Join ['-', [ECSLogGroup, !Ref 'AWS::StackName']]
      RetentionInDays: 14
  ECSALB:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Name: demo
      Scheme: internet-facing
      LoadBalancerAttributes:
        - Key: idle_timeout.timeout_seconds
          Value: '30'
      Subnets:
        - !Ref PublicSubnetOne
        - !Ref PublicSubnetTwo
      SecurityGroups: [!Ref 'EcsSecurityGroup']
  ECSAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      VPCZoneIdentifier:
        - !Ref PublicSubnetOne
        - !Ref PublicSubnetTwo
      LaunchConfigurationName: !Ref 'ContainerInstances'
      MinSize: '1'
      MaxSize: !Ref 'MaxSize'
      DesiredCapacity: !Ref 'DesiredCapacity'
    CreationPolicy:
      ResourceSignal:
        Timeout: PT15M
    UpdatePolicy:
      AutoScalingReplacingUpdate:
        WillReplace: 'true'
  ContainerInstances:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: !FindInMap [AWSRegionToAMI, !Ref 'AWS::Region', AMIID]
      SecurityGroups: [!Ref 'EcsSecurityGroup']
      InstanceType: !Ref 'InstanceType'
      IamInstanceProfile: !Ref 'EC2InstanceProfile'
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -xe
          echo ECS_CLUSTER=${ECSCluster} >> /etc/ecs/ecs.config
          yum install -y aws-cfn-bootstrap
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource ECSAutoScalingGroup --region ${AWS::Region}
  ECSServiceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: [ecs.amazonaws.com]
            Action: ['sts:AssumeRole']
      Path: /
      Policies:
        - PolicyName: ecs-service
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - 'elasticloadbalancing:DeregisterInstancesFromLoadBalancer'
                  - 'elasticloadbalancing:DeregisterTargets'
                  - 'elasticloadbalancing:Describe*'
                  - 'elasticloadbalancing:RegisterInstancesWithLoadBalancer'
                  - 'elasticloadbalancing:RegisterTargets'
                  - 'ec2:Describe*'
                  - 'ec2:AuthorizeSecurityGroupIngress'
                Resource: '*'
  EC2Role:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: [ec2.amazonaws.com]
            Action: ['sts:AssumeRole']
      Path: /
      Policies:
        - PolicyName: ecs-service
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - 'ecs:CreateCluster'
                  - 'ecs:DeregisterContainerInstance'
                  - 'ecs:DiscoverPollEndpoint'
                  - 'ecs:Poll'
                  - 'ecs:RegisterContainerInstance'
                  - 'ecs:StartTelemetrySession'
                  - 'ecs:Submit*'
                  - 'logs:CreateLogStream'
                  - 'logs:PutLogEvents'
                  - 'ecr:GetAuthorizationToken'
                  - 'ecr:BatchGetImage'
                  - 'ecr:GetDownloadUrlForLayer'
                Resource: '*'
  AutoscalingRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: [application-autoscaling.amazonaws.com]
            Action: ['sts:AssumeRole']
      Path: /
      Policies:
        - PolicyName: service-autoscaling
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - 'application-autoscaling:*'
                  - 'cloudwatch:DescribeAlarms'
                  - 'cloudwatch:PutMetricAlarm'
                  - 'ecs:DescribeServices'
                  - 'ecs:UpdateService'
                Resource: '*'
  EC2InstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Path: /
      Roles: [!Ref 'EC2Role']
Outputs:
  ClusterName:
    Description: The name of the ECS cluster, used by the deploy script
    Value: !Ref 'ECSCluster'
    Export:
      Name: !Join [':', [!Ref "AWS::StackName", "ClusterName" ]]
  Url:
    Description: The url at which the application is available
    Value: !Join ['', [!GetAtt 'ECSALB.DNSName']]
  ALBArn:
    Description: The ARN of the ALB, exported for later use in creating services
    Value: !Ref 'ECSALB'
    Export:
      Name: !Join [':', [!Ref "AWS::StackName", "ALBArn" ]]
  ECSRole:
    Description: The ARN of the ECS role, exports for later use in creating services
    Value: !GetAtt 'ECSServiceRole.Arn'
    Export:
      Name: !Join [':', [!Ref "AWS::StackName", "ECSRole" ]]
  VPCId:
    Description: The ID of the VPC that this stack is deployed in
    Value: !Ref 'VPC'
    Export:
      Name: !Join [':', [!Ref "AWS::StackName", "VPCId" ]]
In your example, two AZs are being used, which requires two subnets (one for each AZ). This is not related to the number of EC2 instances.
A typical best practice with AWS and other cloud vendors is to use multiple Availability Zones (AZs) for fault tolerance. On AWS a subnet lives in exactly one AZ, so each AZ needs its own subnet. Auto Scaling will attempt to keep the number of instances balanced across the AZs, and the load balancer spreads traffic across them.
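To illustrate that the subnet count follows the AZs and not the instance count, here is a sketch of the same Auto Scaling group from the question running four instances. The values '4' are hypothetical and replace the DesiredCapacity and MaxSize parameter references purely for illustration; the subnet list is unchanged:

```yaml
ECSAutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    # Still two subnets, one per Availability Zone.
    VPCZoneIdentifier:
      - !Ref PublicSubnetOne
      - !Ref PublicSubnetTwo
    LaunchConfigurationName: !Ref 'ContainerInstances'
    MinSize: '1'
    MaxSize: '4'          # hypothetical value
    DesiredCapacity: '4'  # hypothetical value; instances are spread over both AZs
```

With these values Auto Scaling would aim for two instances in each subnet, and no further subnets would be required.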
PS. If I were learning AWS, I would not start with this example. It is very complex, though realistic for a real-world deployment. There are lots of CloudFormation examples that are much easier to master to start with.