I am starting an ECS task with Fargate and the container ends up in a STOPPED state after being in PENDING for a few minutes. The Status gives the following error message:
CannotPullContainerError: context canceled
I am using PrivateLink to allow the ECS host to talk to the ECR registry without having to go via the public Internet and this is how it is configured (Serverless syntax augmenting CloudFormation):
Properties:
PrivateDnsEnabled: true
ServiceName: com.amazonaws.ap-southeast-2.ecr.dkr
SubnetIds:
- { Ref: frontendSubnet1 }
- { Ref: frontendSubnet2 }
VpcEndpointType: Interface
VpcId: { Ref: frontendVpc }
Any ideas as to what is causing the error?
did you also add an S3 endpoint?
Here is a working snippet of my template, I was able to solve the issue with the aws support:
EcrDkrEndpoint:
Type: 'AWS::EC2::VPCEndpoint'
Properties:
PrivateDnsEnabled: true
SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
ServiceName: !Sub 'com.amazonaws.${AWS::Region}.ecr.dkr'
SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
VpcEndpointType: Interface
VpcId: !Ref 'VPC'
For S3 you need to know that a route table is necessary - normally you would like to use the same as for the internet gateway, containing the route 0.0.0.0/0
S3Endpoint:
Type: 'AWS::EC2::VPCEndpoint'
Properties:
ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
VpcEndpointType: Gateway
VpcId: !Ref 'VPC'
RouteTableIds: [!Ref 'PrivateRouteTable']
Without an endpoint for cloudwatch you will get another failure, it is necessary too:
CloudWatchEndpoint:
Type: 'AWS::EC2::VPCEndpoint'
Properties:
PrivateDnsEnabled: true
SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
ServiceName: !Sub 'com.amazonaws.${AWS::Region}.logs'
SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
VpcEndpointType: Interface
VpcId: !Ref 'VPC'
EDIT: private route table:
PrivateRoute:
Type: AWS::EC2::Route
DependsOn: InternetGatewayAttachement
Properties:
RouteTableId: !Ref 'PublicRouteTable'
DestinationCidrBlock: '0.0.0.0/0'
GatewayId: !Ref 'InternetGateway'
I found I needed not only vpc endpoints for s3, aws logs and the two ecr endpoints as detailed in #graphik_ 's answer but I also needed to ensure that the security groups on the endpoints allowed ingress access to HTTPS from the security group on the Farscape containers.
The security group on the Farscape containers need egress access via HTTPS to the vpce endpoint security group and also to the pl-7ba54012 IP group which is s3.
This and the route to pl-7ba54012 in the route table seems to be the whole picture.
There are Policies on the vpce too, which I left as the default "All Access" but you could harden these up to only allow access from the Role running the Fargate containers.
Related
Aws dynamoDb only supports Gateway endpoint but I am getting error while deployment saying:
Subnet IDs are only supported for Interface and GatewayLoadBalancer type VPC Endpoints. (Service: AmazonEC2; Status Code: 400; Error Code: InvalidParameter)
Is this a issue with subnet?
VpcEndpointSubnetIds:
Type: "List<AWS::EC2::Subnet::Id>"
Description: Select the subnet to associate with the VPC endpoint
Default: 'subnet-039c1ac2c0925fe94,subnet-0e9267fe210b042da'
VPCEndpointGateway:
Type: AWS::EC2::VPCEndpoint
Properties:
VpcId: !Ref VpcId
ServiceName: !Ref dynamoDbEndPointServiceName
VpcEndpointType: Gateway
PrivateDnsEnabled: true
SubnetIds:
-
!Ref VpcEndpointSubnetIds
SecurityGroupIds:
-
!Ref cacheSecurityGroup
When crating VPCEndpoints the SubnetIds property is only supported by Interface and Gateway Load Balancer VPC endpoint types, for Gateway type you do not need this property, so you need to remove it:
VPCEndpointGateway:
Type: AWS::EC2::VPCEndpoint
Properties:
VpcId: !Ref VpcId
ServiceName: !Ref dynamoDbEndPointServiceName
VpcEndpointType: Gateway
PrivateDnsEnabled: true
SecurityGroupIds:
- !Ref cacheSecurityGroup
For more information check the AWS::EC2::VPCEndpoint doc
I have a minimal stack for creating a simple service with a listener. The listener gets created first and succeeds. The service gets initiated next but gets stuck on "CREATE_IN_PROGRESS". Now I have seen this issue on SO but that has a clear reason for it failing. In my occasion the Cloudtrail logs simple show the initiation and 10 minutes later (custom timeout) the delete but nothing in between. The Cloudformation dashboard events also just show initiation and delete thereafter.
The service does not get created during this time either. This I visually checked by going over to the services and having other services there but not my own.
I have trimmed down the cloudformation template to the bare (i.e. only listener and service with reference to existing resources) but it still gets stuck.
Apart from the usual cloudtrail and cloudformation logs, what could I do to identify the problem?
[EDIT]
Here is the template I use. The parameters are based on my current setup.
AWSTemplateFormatVersion: "2010-09-09"
Description: "The Script to configure the RDS services."
Parameters:
ClusterNameARN:
Default: "arn:aws:ecs:eu-central-1:<NR_HERE>:cluster/AmsCluster"
Type: String
StaLBARN:
Default: "arn:aws:elasticloadbalancing:eu-central-1:<NR_HERE>:loadbalancer/app/StaPostgrestLoadBalancer/<ID_HERE>"
Type: String
StaTargetGroupARN:
Default: "arn:aws:elasticloadbalancing:eu-central-1:<NR_HERE>:targetgroup/LBTargetGroupSta/<ID_HERE>"
Type: String
LoadBalancerSG:
Type: 'AWS::EC2::SecurityGroup::Id'
LoadBalancerSubnet1:
Description: Subnet instance.
Type: 'AWS::EC2::Subnet::Id'
LoadBalancerSubnet2:
Description: Subnet region B instance.
Type: 'AWS::EC2::Subnet::Id'
LoadBalancerSubnet3:
Description: Subnet region for public.
Type: 'AWS::EC2::Subnet::Id'
StaTaskDefinitionARN:
Default: "arn:aws:ecs:eu-central-1:<NR_HERE>:task-definition/RDSPostgrestFamily:2"
Type: String
CertificateARN:
Default: "arn:aws:acm:eu-central-1:<NR_HERE>:certificate/<ID_HERE>"
Type: String
Resources:
LBListenerSta:
Type: 'AWS::ElasticLoadBalancingV2::Listener'
Properties:
Certificates:
- CertificateArn: !Ref CertificateARN
DefaultActions:
- Type: forward
TargetGroupArn: !Ref StaTargetGroupARN
LoadBalancerArn: !Ref StaLBARN
Port: 443
Protocol: HTTPS
StaService:
Type: 'AWS::ECS::Service'
Properties:
Cluster: !Ref ClusterNameARN
DesiredCount: 2
LaunchType: 'FARGATE'
LoadBalancers:
- ContainerName: 'Postgrest'
ContainerPort: 3000
TargetGroupArn: !Ref StaTargetGroupARN
NetworkConfiguration:
AwsvpcConfiguration:
SecurityGroups:
- !Ref LoadBalancerSG
Subnets:
- !Ref LoadBalancerSubnet1
- !Ref LoadBalancerSubnet2
- !Ref LoadBalancerSubnet3
ServiceName: StaPostgrestService
TaskDefinition: !Ref StaTaskDefinitionARN
DependsOn:
- LBListenerSta
Outputs:
StaServices:
Description: "The ARN of the service for the STA tasks."
Value: !Ref StaService
Based on the comments.
The issue is with the StaService ECS service. To get more information of possible reason why it fails, one can go to:
ECS Console -> Cluster -> Service -> Events
Based on this, the Events showed that the role used for ECS has incorrect permissions.
Trying to create a ECS Service (on Fargate) with cloudformation but got error:
Invalid service in ARN (Service: AmazonECS; Status Code: 400; Error
Code: InvalidParameterException; Request ID: xxx).
According to error message seems some ARN is wrong, but I didn't find the reason, I checked ARN of IAM roles and its ok. The other ARN are passed with !Ref function (so not a typo error)
All Resources (including from all others nested templates, vpc, cluster, alb etc) are created, except the "Service" resouce (the ECS service).
Below is the template used (nested template). All parameters are ok (passed from root template). Parameters TaskExecutionRole and ServiceRole are ARNs from IAM roles created by ECS wizard:
Description: >
Deploys xxx ECS service, with load balancer listener rule,
target group, task definition, service definition and auto scaling
Parameters:
EnvironmentName:
Description: An environment name that will be prefixed to resource names
Type: String
EnvironmentType:
Description: See master template
Type: String
VpcId:
Type: String
PublicSubnet1:
Type: String
PublicSubnet2:
Type: String
ALBListener:
Description: ALB listener
Type: String
Cluster:
Description: ECS Cluster
Type: String
TaskExecutionRole:
Description: See master template
Type: String
ServiceRole:
Description: See master template
Type: String
ServiceName:
Description: Service name (used as a variable)
Type: String
Default: xxx
Cpu:
Description: Task size (CPU)
Type: String
Memory:
Description: Task size (memory)
Type: String
Conditions:
HasHttps: !Equals [!Ref EnvironmentType, production]
HasNotHttps: !Not [!Equals [!Ref EnvironmentType, production]]
Resources:
ServiceTargetGroup:
Type: AWS::ElasticLoadBalancingV2::TargetGroup
Properties:
Name: !Sub '${EnvironmentName}-${ServiceName}'
VpcId: !Ref VpcId
TargetType: ip
Port: 80
Protocol: HTTP
AlbListenerRule:
Type: AWS::ElasticLoadBalancingV2::ListenerRule
Properties:
Actions:
- Type: forward
TargetGroupArn: !Ref ServiceTargetGroup
Conditions:
- Field: host-header
Values: [www.mydomain.com] # test
ListenerArn: !Ref ALBListener
Priority: 1
TaskDefinition:
Type: AWS::ECS::TaskDefinition
Properties:
Family: !Sub '${EnvironmentName}-${ServiceName}-Task'
ContainerDefinitions:
- Name: !Ref ServiceName
Image: nginx
PortMappings:
- ContainerPort: 80
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group: !Ref EnvironmentName
awslogs-region: !Ref AWS::Region
awslogs-stream-prefix: !Ref ServiceName
NetworkMode: awsvpc
RequiresCompatibilities: [FARGATE]
Cpu: !Ref Cpu
Memory: !Ref Memory
ExecutionRoleArn: !Ref TaskExecutionRole
Service:
Type: AWS::ECS::Service
DependsOn: TaskDefinition
Properties:
Cluster: !Ref Cluster
ServiceName: !Ref ServiceName
TaskDefinition: !Ref TaskDefinition
LaunchType: FARGATE
DesiredCount: 1
LoadBalancers:
- ContainerName: !Ref ServiceName
ContainerPort: 80
TargetGroupArn: !Ref ServiceTargetGroup
NetworkConfiguration:
AwsvpcConfiguration:
AssignPublicIp: ENABLED
Subnets:
- !Ref PublicSubnet1
- !Ref PublicSubnet2
Role: !Ref ServiceRole
I lost a few hours in this and could not solve it, I reviewed a lot in the documentation but nothing, if someone knows how to help.
Thanks!
The error message is confusing because it does not explain which parameter is wrong. Amazon API expects resource ARNs in several parameters including Cluster, TaskDefinition and TargetGroup. The error happens when one of these parameters are wrong. Please check carefully these parameters and make sure they are valid ARNs.
I had exactly the same error and in my case I made a mistake and provided wrong Cluster value.
And I am posting an answer here because this was the first search result for this error message and it had no answer.
The problem for me was that the default AWS region was set to the wrong one. To fix that, run the following command (using the correct region).
$ aws configure set default.region us-west-2
I was wondering if it is possible to create a Resource in my CloudFormation file to create a VPC Endpoint for SQS. I was able to do this for SQS and DynamoDB, but I believe it is because they were Gateway endpoints.
For now I have defined my SQS resource as:
SQSEndpoint:
Type: AWS::EC2::VPCEndpoint
Properties:
PolicyDocument:
Version: 2012-10-17
Statement:
- Effect: Allow
Principal: '*'
Action:
- 'sqs:*'
Resource:
- '*'
ServiceName: !Join
- ''
- - com.amazonaws.
- !Ref 'AWS::Region'
- .sqs
SubnetIds:
- !Ref PrivateSubnet
- !Ref PublicSubnet
VpcId: !Ref 'VPC'
VpcEndpointType: Interface
Though, when I try to create the stack I get the error:
It seems like it is possible from reading this blog post from AWS. Though I can't find any examples or documentation. Any ideas?
So to sum it all up:
SQSEndpoint:
Type: AWS::EC2::VPCEndpoint
Properties:
ServiceName: !Join
- ''
- - com.amazonaws.
- !Ref 'AWS::Region'
- .sqs
SubnetIds:
- !Ref PrivateSubnet
VpcId: !Ref 'VPC'
VpcEndpointType: Interface
SecurityGroupIds:
- !Ref PrivateSubnetInstanceSG # has to allow traffic from your VPC
PrivateDnsEnabled: true
Background
I figured it out, for DynamoDB and S3, which use Gateway Endpoints, the PolicyDocument property has to be defined. For all other services, This doesn't need to be defined. So for SQS first I thought all that is needed is:
SQSEndpoint:
Type: AWS::EC2::VPCEndpoint
Properties:
ServiceName: !Join
- ''
- - com.amazonaws.
- !Ref 'AWS::Region'
- .sqs
SubnetIds:
- !Ref PrivateSubnet
- !Ref PublicSubnet
VpcId: !Ref 'VPC'
VpcEndpointType: Interface
But that still wasn't working, even though the interface endpoint was setup, I had to:
set the PrivateDnsEnabled property to true so that you can use the AWS CLI as the AWS CLI uses the public endpoint, and setting the PrivateDnsEnabled allows the private endpoint to automatically be mapped to the public one
set the SecurityGroupsIds to have a security group that allows Inbound traffic from your VPC. If this instance set, the default security group is used and it only allows inbound traffic from sources that have the default security group, meaning that SQS won't be able to send traffic back to your instance
I have a CloudFormation template that creates a custom VPC.
The template creates the following resources - a VPC, an Internet Gateway, attaches the IGW to the VPC, and creates a Public Subnet.
I want to add a route (destination 0.0.0.0/0, target IGW) to the Route Table that gets created as part of the VPC.
I have read through the cloudformation documentation for routes, route tables to figure out how to do this, but to no avail.
I can use the Fn::Ref function to refer to resources or parameters that are explicitly created as part of the template, but how do I refer to resources that get created inherently with the VPC?
Any insights on how to re-use the existing route table, NACL and Security Group are much appreciated.
Thanks,
Good job so far - you have your internet gateway, route table, and a public subnet. Now you need to create the route and attach the route table to the subnet if you haven't already done so. If you're using YAML it might look something like this:
InternetGateway:
Type: AWS::EC2::InternetGateway
Properties:
Tags:
- Key: Name
Value: !Ref EnvironmentName
InternetGatewayAttachment:
Type: AWS::EC2::VPCGatewayAttachment
Properties:
InternetGatewayId: !Ref InternetGateway
VpcId: !Ref VPC
PublicSubnet1:
Type: AWS::EC2::Subnet
Properties:
VpcId: !Ref VPC
AvailabilityZone: !Select [ 0, !GetAZs '' ]
CidrBlock: !Ref PublicSubnet1CIDR
MapPublicIpOnLaunch: true
Tags:
- Key: Name
Value: !Sub ${EnvironmentName} Public Subnet (AZ1)
PublicRouteTable:
Type: AWS::EC2::RouteTable
Properties:
VpcId: !Ref VPC
Tags:
- Key: Name
Value: !Sub ${EnvironmentName} Public Routes
DefaultPublicRoute:
Type: AWS::EC2::Route
DependsOn: InternetGatewayAttachment
Properties:
RouteTableId: !Ref PublicRouteTable
DestinationCidrBlock: 0.0.0.0/0
GatewayId: !Ref InternetGateway
PublicSubnet1RouteTableAssociation:
Type: AWS::EC2::SubnetRouteTableAssociation
Properties:
RouteTableId: !Ref PublicRouteTable
SubnetId: !Ref PublicSubnet1
Don't use the default route table (see https://serverfault.com/questions/588904/aws-vpc-default-route-table-in-cloudformation)
You can get default security group as per https://serverfault.com/questions/544439/aws-cloudformation-vpc-default-security-group
And finally you can also get the DefaultNetworkAcl in the same as DefaultSecurityGroup above. See also https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ec2-vpc.html)