Cloudformation service stuck without log

I have a minimal stack that creates a simple service with a listener. The listener is created first and succeeds. The service is initiated next but gets stuck on "CREATE_IN_PROGRESS". I have seen this issue on SO before, but there the failure had a clear cause. In my case the CloudTrail logs simply show the initiation and, 10 minutes later (a custom timeout), the delete, with nothing in between. The CloudFormation dashboard events likewise show only the initiation and the subsequent delete.
The service does not get created during this time either. I checked this visually in the list of services: other services are there, but not mine.
I have trimmed the CloudFormation template down to the bare minimum (i.e. only the listener and the service, with references to existing resources), but it still gets stuck.
Apart from the usual CloudTrail and CloudFormation logs, what could I do to identify the problem?
[EDIT]
Here is the template I use. The parameters are based on my current setup.
    AWSTemplateFormatVersion: "2010-09-09"
    Description: "The Script to configure the RDS services."
    Parameters:
      ClusterNameARN:
        Default: "arn:aws:ecs:eu-central-1:<NR_HERE>:cluster/AmsCluster"
        Type: String
      StaLBARN:
        Default: "arn:aws:elasticloadbalancing:eu-central-1:<NR_HERE>:loadbalancer/app/StaPostgrestLoadBalancer/<ID_HERE>"
        Type: String
      StaTargetGroupARN:
        Default: "arn:aws:elasticloadbalancing:eu-central-1:<NR_HERE>:targetgroup/LBTargetGroupSta/<ID_HERE>"
        Type: String
      LoadBalancerSG:
        Type: 'AWS::EC2::SecurityGroup::Id'
      LoadBalancerSubnet1:
        Description: Subnet instance.
        Type: 'AWS::EC2::Subnet::Id'
      LoadBalancerSubnet2:
        Description: Subnet region B instance.
        Type: 'AWS::EC2::Subnet::Id'
      LoadBalancerSubnet3:
        Description: Subnet region for public.
        Type: 'AWS::EC2::Subnet::Id'
      StaTaskDefinitionARN:
        Default: "arn:aws:ecs:eu-central-1:<NR_HERE>:task-definition/RDSPostgrestFamily:2"
        Type: String
      CertificateARN:
        Default: "arn:aws:acm:eu-central-1:<NR_HERE>:certificate/<ID_HERE>"
        Type: String
    Resources:
      LBListenerSta:
        Type: 'AWS::ElasticLoadBalancingV2::Listener'
        Properties:
          Certificates:
            - CertificateArn: !Ref CertificateARN
          DefaultActions:
            - Type: forward
              TargetGroupArn: !Ref StaTargetGroupARN
          LoadBalancerArn: !Ref StaLBARN
          Port: 443
          Protocol: HTTPS
      StaService:
        Type: 'AWS::ECS::Service'
        Properties:
          Cluster: !Ref ClusterNameARN
          DesiredCount: 2
          LaunchType: 'FARGATE'
          LoadBalancers:
            - ContainerName: 'Postgrest'
              ContainerPort: 3000
              TargetGroupArn: !Ref StaTargetGroupARN
          NetworkConfiguration:
            AwsvpcConfiguration:
              SecurityGroups:
                - !Ref LoadBalancerSG
              Subnets:
                - !Ref LoadBalancerSubnet1
                - !Ref LoadBalancerSubnet2
                - !Ref LoadBalancerSubnet3
          ServiceName: StaPostgrestService
          TaskDefinition: !Ref StaTaskDefinitionARN
        DependsOn:
          - LBListenerSta
    Outputs:
      StaServices:
        Description: "The ARN of the service for the STA tasks."
        Value: !Ref StaService

Based on the comments:
The issue is with the StaService ECS service. To find out why it fails, go to:
ECS Console -> Cluster -> Service -> Events
In this case, the Events tab showed that the role used for ECS had incorrect permissions.
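The same events can also be read without the console. Below is a minimal sketch, assuming boto3 and its ECS `describe_services` call; the helper name `recent_service_events` is made up for illustration, and the client is passed in so the function can be tried without AWS access.

```python
def recent_service_events(ecs_client, cluster, service, limit=10):
    """Return (createdAt, message) pairs for a service's events.

    ecs_client is assumed to behave like boto3.client("ecs").
    Events are returned in the order ECS reports them.
    """
    response = ecs_client.describe_services(cluster=cluster, services=[service])
    services = response.get("services", [])
    if not services:
        # Unknown services are reported under "failures" rather than raised.
        raise LookupError(f"service not found: {response.get('failures', [])}")
    events = services[0].get("events", [])
    return [(event.get("createdAt"), event["message"]) for event in events[:limit]]
```

With boto3 installed and credentials configured, this could be called as `recent_service_events(boto3.client("ecs"), "AmsCluster", "StaPostgrestService")`, using the cluster and service names from the template in the question.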

Related

Create AWS DC Proxy Target Group timeout

I want to create a simple RDS proxy using the attached CloudFormation template. However, AWS cannot create the resource "AWS::RDS::DBProxyTargetGroup". The error message is not enough for debugging: "Resource timed out waiting for completion". Can anyone provide an answer?
The target group was created, but its status was never updated in the CloudFormation events.
[Screenshots: rds_proxy_console, CF_event, failed event]
    Resources:
      RDSProxy:
        Type: "AWS::RDS::DBProxy"
        Properties:
          Auth:
            - AuthScheme: SECRETS
              IAMAuth: DISABLED
              SecretArn: !Sub "arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:${SecretsManagerName}"
          DBProxyName: !Ref ProxyName
          EngineFamily: !Ref ProxyEngineFamily
          RoleArn: !GetAtt SecretsManagerRole.Arn
          VpcSecurityGroupIds: !Ref ProxyVpcSecurityGroupIds
          VpcSubnetIds: !Ref ProxyVpcSubnetIds
      RDSProxyTargetGroup:
        Type: "AWS::RDS::DBProxyTargetGroup"
        Properties:
          DBClusterIdentifiers: !Ref ProxyTargetDBClusterIdentifiers
          DBProxyName: !Ref RDSProxy
          TargetGroupName: default

AWS CloudFormation Create-Stack Service Resource Hanging at 'CREATE_IN_PROGRESS'

I have the below cloudformation script that is running fine with my create-stack command other than the service resource hanging at 'CREATE_IN_PROGRESS.' Hoping you all can see some kind of glaring issue that I'm missing.
I'm not seeing any way to dig deeper into details on where it's at in the process other than the 'Events' page which just shows this hung status line, but happy to provide more info if I'm able.
    AWSTemplateFormatVersion: '2010-09-09'
    Description: container on ecs cluster
    Resources:
      # Defines container. This is a simple metadata description of what
      # container to run, and what resource requirements it has.
      Task:
        Type: AWS::ECS::TaskDefinition
        Properties:
          Family: apis
          Cpu: 256
          Memory: 512
          NetworkMode: awsvpc
          RequiresCompatibilities:
            - FARGATE
          ExecutionRoleArn: 'iamRoleHere'
          ContainerDefinitions:
            - Name: booksapi
              # this is the image name from our repo that we made early on: aws ecr describe-repositories
              Image: 'imageHere'
              Cpu: 256
              Memory: 512
              PortMappings:
                - ContainerPort: 50577
                  Protocol: tcp
      # The service. The service is a resource which allows you to run multiple
      # copies of a type of task, and gather up their logs and metrics, as well
      # as monitor the number of running tasks and replace any that have crashed.
      # defines how the task or container will be scheduled and deployed in the cluster and how the container instances will be registered with load balancer
      Service:
        Type: AWS::ECS::Service
        DependsOn: ListenerRule
        Properties:
          #if using param for servicename: !Ref 'ServiceName'
          ServiceName: booksapi
          TaskDefinition: !Ref 'Task'
          Cluster: !ImportValue 'ECSCluster'
          LaunchType: FARGATE
          DesiredCount: 2
          DeploymentConfiguration:
            MaximumPercent: 200
            MinimumHealthyPercent: 70
          NetworkConfiguration:
            AwsvpcConfiguration:
              AssignPublicIp: ENABLED
              Subnets:
                - 'subnet-abctyui'
                - 'subnet-poyfdha'
              SecurityGroups:
                - !ImportValue ContainerSecurityGroup
          LoadBalancers:
            - ContainerName: booksapi
              ContainerPort: 50577
              TargetGroupArn: !Ref TargetGroup
      # A target group. This is used for keeping track of all the tasks, and
      # what IP addresses / port numbers they have. You can query it yourself,
      # to use the addresses yourself, but most often this target group is just
      # connected to an application load balancer, or network load balancer, so
      # it can automatically distribute traffic across all the targets.
      # add 443 after POC. remove health check for now as it is buggy at the moment in our template
      TargetGroup:
        Type: AWS::ElasticLoadBalancingV2::TargetGroup
        Properties:
          Name: books-tg
          VpcId: 'vpc-ljhdfrr'
          Port: 80
          Protocol: HTTP
          Matcher:
            HttpCode: 200-299
          HealthCheckIntervalSeconds: 10
          HealthCheckPath: /stat
          HealthCheckProtocol: HTTP
          HealthCheckTimeoutSeconds: 5
          HealthyThresholdCount: 10
          TargetType: ip
      ListenerRule:
        Type: AWS::ElasticLoadBalancingV2::ListenerRule
        Properties:
          ListenerArn: !ImportValue Listener
          Priority: 2
          Conditions:
            - Field: path-pattern
              Values:
                - /v1/books*
          Actions:
            - TargetGroupArn: !Ref TargetGroup
              Type: forward
    Outputs:
      ApiEndpoint:
        Description: Tests API Endpoint
        Value: !Join ['', ['http://', !ImportValue DomainName, '/v1/books']]
        Export:
          Name: 'BooksApiEndpoint'

Ah, I was able to go to the service in ecs and look at the events tab there:
service booksapi failed to launch a task with (error ECS was unable to assume the role 'iamRoleHere' that was provided for this task. Please verify that the role being passed has the proper trust relationship and permissions and that your IAM user has permissions to pass this role.).
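One thing that event points at is the role's trust relationship: for ECS to assume a task or execution role, the role's trust policy has to allow the `ecs-tasks.amazonaws.com` service principal to call `sts:AssumeRole`. A rough sketch of that check on a trust policy document follows; the function names are invented here, and it deliberately ignores `Condition` blocks and wildcard actions.

```python
import json

ECS_TASKS_PRINCIPAL = "ecs-tasks.amazonaws.com"


def _as_list(value):
    """IAM accepts a single string where a list is allowed; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def trusts_ecs_tasks(trust_policy):
    """Return True if an Allow statement lets ECS tasks assume the role.

    trust_policy may be a dict or a JSON string (the role's
    AssumeRolePolicyDocument). Conditions and wildcards are not evaluated.
    """
    if isinstance(trust_policy, str):
        trust_policy = json.loads(trust_policy)
    for statement in _as_list(trust_policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        services = _as_list(principal.get("Service")) if isinstance(principal, dict) else []
        actions = _as_list(statement.get("Action"))
        if ECS_TASKS_PRINCIPAL in services and "sts:AssumeRole" in actions:
            return True
    return False
```

If this returns False for the role behind 'iamRoleHere', the trust relationship is the first thing to fix; if it returns True, the remaining suspects from the message are the role's permissions and whether the deploying user may pass the role.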

CannotPullContainerError: context canceled error when starting ECS task

I am starting an ECS task with Fargate and the container ends up in a STOPPED state after being in PENDING for a few minutes. The Status gives the following error message:
CannotPullContainerError: context canceled
I am using PrivateLink to allow the ECS host to talk to the ECR registry without having to go via the public Internet and this is how it is configured (Serverless syntax augmenting CloudFormation):
    Properties:
      PrivateDnsEnabled: true
      ServiceName: com.amazonaws.ap-southeast-2.ecr.dkr
      SubnetIds:
        - { Ref: frontendSubnet1 }
        - { Ref: frontendSubnet2 }
      VpcEndpointType: Interface
      VpcId: { Ref: frontendVpc }
Any ideas as to what is causing the error?

Did you also add an S3 endpoint?
Here is a working snippet from my template. I was able to solve the issue with the help of AWS Support:
    EcrDkrEndpoint:
      Type: 'AWS::EC2::VPCEndpoint'
      Properties:
        PrivateDnsEnabled: true
        SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
        ServiceName: !Sub 'com.amazonaws.${AWS::Region}.ecr.dkr'
        SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
        VpcEndpointType: Interface
        VpcId: !Ref 'VPC'
For S3, note that a route table is required. Normally you would use the same one as for the internet gateway, containing the route 0.0.0.0/0:
    S3Endpoint:
      Type: 'AWS::EC2::VPCEndpoint'
      Properties:
        ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
        VpcEndpointType: Gateway
        VpcId: !Ref 'VPC'
        RouteTableIds: [!Ref 'PrivateRouteTable']
Without an endpoint for CloudWatch Logs you will get another failure, so it is necessary too:
    CloudWatchEndpoint:
      Type: 'AWS::EC2::VPCEndpoint'
      Properties:
        PrivateDnsEnabled: true
        SecurityGroupIds: [!Ref 'FargateContainerSecurityGroup']
        ServiceName: !Sub 'com.amazonaws.${AWS::Region}.logs'
        SubnetIds: [!Ref 'PrivateSubnetOne', !Ref 'PrivateSubnetTwo']
        VpcEndpointType: Interface
        VpcId: !Ref 'VPC'
EDIT: private route table:
    PrivateRoute:
      Type: AWS::EC2::Route
      DependsOn: InternetGatewayAttachement
      Properties:
        RouteTableId: !Ref 'PublicRouteTable'
        DestinationCidrBlock: '0.0.0.0/0'
        GatewayId: !Ref 'InternetGateway'

I found I needed not only VPC endpoints for S3, CloudWatch Logs and the two ECR endpoints, as detailed in @graphik_'s answer, but I also needed to ensure that the security groups on the endpoints allowed HTTPS ingress from the security group on the Fargate containers.
The security group on the Fargate containers needs HTTPS egress to the VPC endpoint security group and also to the pl-7ba54012 prefix list, which is S3.
This, together with the route to pl-7ba54012 in the route table, seems to be the whole picture.
There are also policies on the VPC endpoints, which I left as the default "All Access", but you could harden these to only allow access from the role running the Fargate containers.
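To double-check the endpoint list from these answers against what a VPC actually has, something like the sketch below could help. It assumes the two ECR endpoints are `ecr.dkr` and `ecr.api`, and that the task uses the awslogs driver (otherwise the `logs` endpoint may not be needed); the helper name `missing_endpoints` is made up for illustration.

```python
# Endpoint service suffixes named in the answers above (assumed complete
# for a Fargate task pulling from ECR and logging to CloudWatch Logs).
REQUIRED_SUFFIXES = ("ecr.dkr", "ecr.api", "s3", "logs")


def missing_endpoints(region, existing_service_names):
    """Return the required endpoint service names that are not present.

    existing_service_names is an iterable of names such as
    'com.amazonaws.ap-southeast-2.ecr.dkr'.
    """
    required = {f"com.amazonaws.{region}.{suffix}" for suffix in REQUIRED_SUFFIXES}
    return sorted(required - set(existing_service_names))
```

For the setup in the question, `missing_endpoints("ap-southeast-2", ["com.amazonaws.ap-southeast-2.ecr.dkr"])` would report the `ecr.api`, `logs` and `s3` endpoints as missing. The existing names could be taken from the `ServiceName` fields of `aws ec2 describe-vpc-endpoints`. This only checks that the endpoints exist, not the security group and route table rules described above.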

AWS ECS: Invalid service in ARN (Service: AmazonECS; ...)

I am trying to create an ECS service (on Fargate) with CloudFormation but get this error:
Invalid service in ARN (Service: AmazonECS; Status Code: 400; Error Code: InvalidParameterException; Request ID: xxx).
According to the error message some ARN is wrong, but I could not find which one. I checked the ARNs of the IAM roles and they are OK. The other ARNs are passed with the !Ref function (so it is not a typo).
All resources (including those from the other nested templates: VPC, cluster, ALB, etc.) are created, except the "Service" resource (the ECS service).
Below is the template used (a nested template). All parameters are OK (passed from the root template). The parameters TaskExecutionRole and ServiceRole are ARNs of IAM roles created by the ECS wizard:
    Description: >
      Deploys xxx ECS service, with load balancer listener rule,
      target group, task definition, service definition and auto scaling
    Parameters:
      EnvironmentName:
        Description: An environment name that will be prefixed to resource names
        Type: String
      EnvironmentType:
        Description: See master template
        Type: String
      VpcId:
        Type: String
      PublicSubnet1:
        Type: String
      PublicSubnet2:
        Type: String
      ALBListener:
        Description: ALB listener
        Type: String
      Cluster:
        Description: ECS Cluster
        Type: String
      TaskExecutionRole:
        Description: See master template
        Type: String
      ServiceRole:
        Description: See master template
        Type: String
      ServiceName:
        Description: Service name (used as a variable)
        Type: String
        Default: xxx
      Cpu:
        Description: Task size (CPU)
        Type: String
      Memory:
        Description: Task size (memory)
        Type: String
    Conditions:
      HasHttps: !Equals [!Ref EnvironmentType, production]
      HasNotHttps: !Not [!Equals [!Ref EnvironmentType, production]]
    Resources:
      ServiceTargetGroup:
        Type: AWS::ElasticLoadBalancingV2::TargetGroup
        Properties:
          Name: !Sub '${EnvironmentName}-${ServiceName}'
          VpcId: !Ref VpcId
          TargetType: ip
          Port: 80
          Protocol: HTTP
      AlbListenerRule:
        Type: AWS::ElasticLoadBalancingV2::ListenerRule
        Properties:
          Actions:
            - Type: forward
              TargetGroupArn: !Ref ServiceTargetGroup
          Conditions:
            - Field: host-header
              Values: [www.mydomain.com] # test
          ListenerArn: !Ref ALBListener
          Priority: 1
      TaskDefinition:
        Type: AWS::ECS::TaskDefinition
        Properties:
          Family: !Sub '${EnvironmentName}-${ServiceName}-Task'
          ContainerDefinitions:
            - Name: !Ref ServiceName
              Image: nginx
              PortMappings:
                - ContainerPort: 80
              LogConfiguration:
                LogDriver: awslogs
                Options:
                  awslogs-group: !Ref EnvironmentName
                  awslogs-region: !Ref AWS::Region
                  awslogs-stream-prefix: !Ref ServiceName
          NetworkMode: awsvpc
          RequiresCompatibilities: [FARGATE]
          Cpu: !Ref Cpu
          Memory: !Ref Memory
          ExecutionRoleArn: !Ref TaskExecutionRole
      Service:
        Type: AWS::ECS::Service
        DependsOn: TaskDefinition
        Properties:
          Cluster: !Ref Cluster
          ServiceName: !Ref ServiceName
          TaskDefinition: !Ref TaskDefinition
          LaunchType: FARGATE
          DesiredCount: 1
          LoadBalancers:
            - ContainerName: !Ref ServiceName
              ContainerPort: 80
              TargetGroupArn: !Ref ServiceTargetGroup
          NetworkConfiguration:
            AwsvpcConfiguration:
              AssignPublicIp: ENABLED
              Subnets:
                - !Ref PublicSubnet1
                - !Ref PublicSubnet2
          Role: !Ref ServiceRole
I lost a few hours on this and could not solve it. I went through a lot of the documentation but found nothing. Any help would be appreciated.
Thanks!

The error message is confusing because it does not say which parameter is wrong. The Amazon API expects resource ARNs in several parameters, including Cluster, TaskDefinition and TargetGroup. The error happens when one of these parameters is wrong. Check these parameters carefully and make sure they are valid ARNs.
I had exactly the same error; in my case I had provided the wrong Cluster value.
I am posting an answer here because this was the first search result for this error message and it had no answer.

The problem for me was that the default AWS region was set to the wrong one. To fix that, run the following command (using the correct region).
$ aws configure set default.region us-west-2
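Since the error does not say which value is at fault, it can help to check each ARN you pass for the service and region you expect. Below is a small sketch based on the general ARN layout `arn:partition:service:region:account-id:resource`; the helper names are made up, and it is only meaningful for values you intend to be full ARNs.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(value):
    """Split 'arn:partition:service:region:account-id:resource' into parts."""
    # maxsplit=5 keeps colons inside the resource part (e.g. 'family:2').
    parts = value.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {value!r}")
    return Arn(*parts[1:])


def arn_problems(value, expected_service, expected_region=None):
    """Return a list of mismatches; an empty list means nothing looked wrong."""
    try:
        arn = parse_arn(value)
    except ValueError as error:
        return [str(error)]
    problems = []
    if arn.service != expected_service:
        problems.append(f"service is {arn.service!r}, expected {expected_service!r}")
    if expected_region and arn.region != expected_region:
        problems.append(f"region is {arn.region!r}, expected {expected_region!r}")
    return problems
```

For example, `arn_problems(cluster_value, "ecs", "us-west-2")` would flag a Cluster value that is really a load balancer ARN, or one that points at a different region than the one the stack is deployed in.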

How to use ECS placementConstraints in CloudFormation

I am trying to use placementConstraints in my service definition using CloudFormation, but it does not exist as property in the AWS::ECS::Service resource. Is there a workaround?
ECS Service: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/service_definition_paramters.html
CloudFormation ECS Service Resource: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ecs-service.html

PlacementConstraint exists as a property of AWS::ECS::Service at the time of this writing. Quoting from AWS docs:
PlacementConstraint is a property of the AWS::ECS::Service resource
that specifies the placement constraints for the tasks in the service
to associate with an Amazon EC2 Container Service (Amazon ECS)
service.
Ref:
[1] http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-service-placementconstraints-placementconstraint.html
[2] http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ecs-service.html#cfn-ecs-service-placementconstraints

This works for me. If you are using YAML, wrong indentation can lead to this error.
    ApplicationService:
      Type: AWS::ECS::Service
      DependsOn:
        - ALBListener
        - ApplicationTaskDefinition
        - ALBTargetGroup
      Properties:
        Cluster:
          'Fn::ImportValue': !Sub ecs-${EnvironmentName}-clustername
        DesiredCount: 15
        LoadBalancers:
          - ContainerName: !Sub ${AWS::StackName}-app
            ContainerPort: 8080
            TargetGroupArn: !Ref ALBTargetGroup
        Role:
          'Fn::ImportValue': !Sub ecs-${EnvironmentName}-servicerole-arn
        TaskDefinition: !Ref ApplicationTaskDefinition
        PlacementConstraints:
          - Expression: attribute:ecs.instance-type == t2.medium
            Type: memberOf
        PlacementStrategies:
          - Type: spread
            Field: attribute:ecs.availability-zone
If this is not working for you, please post your CloudFormation file here.
