How do I delete an ECS capacity provider which is in-use? - amazon-web-services

I'm using CDK to set up an application backed by ECS. My stack is created successfully, but when I run cdk destroy the tear-down fails with the following error:
7:06:03 AM | DELETE_FAILED | AWS::ECS::ClusterCapacityProviderAssociations | sd-cluster/sd-cluster
Resource handler returned message: "The specified capacity provider is in use and cannot be removed. (Service: AmazonECS; Status Code: 400; Error Code: ResourceInUseException; Request ID:
f76555d1-5dc0-47b7-ba65-b65529c8b999; Proxy: null)" (RequestToken: 00531a51-2d95-aef9-77cd-3e06714c78b3, HandlerErrorCode: null)
I assume this is because my capacity provider is backed by an ASG (Auto Scaling group) that has an instance running, and there is a task currently running in my cluster.
The ASG is also defined within the stack.
But then how can I tear down this stack without manually going to the console and scaling the running instances down to zero?

This is a known issue in CDK - here's the issue on GitHub.
I included a workaround in the comments; here it is in Python (CDK v2):
import jsii
import aws_cdk as cdk
from aws_cdk import aws_ecs as ecs
from constructs import IConstruct


@jsii.implements(cdk.IAspect)
class HotfixCapacityProviderDependencies:
    # Add a dependency from the capacity provider association to the cluster
    # and from each service to the capacity provider association
    def visit(self, node: IConstruct) -> None:
        if type(node) is ecs.Ec2Service:
            children = node.cluster.node.find_all()
            for child in children:
                if type(child) is ecs.CfnClusterCapacityProviderAssociations:
                    child.node.add_dependency(node.cluster)
                    node.node.add_dependency(child)
You would use it just like any other aspect:
# in the stack
Aspects.of(self).add(HotfixCapacityProviderDependencies())
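For context, here is a minimal sketch of a stack that applies the aspect while defining the cluster, the ASG-backed capacity provider, and a service. This is my own illustration under assumptions: CDK v2, and placeholder names such as MyStack, the t3.micro instance type, and the sample container image.

from aws_cdk import Stack, Aspects, aws_autoscaling as autoscaling, aws_ec2 as ec2, aws_ecs as ecs
from constructs import Construct

class MyStack(Stack):  # placeholder stack name
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Register the workaround aspect; aspects are applied at synthesis time.
        Aspects.of(self).add(HotfixCapacityProviderDependencies())

        vpc = ec2.Vpc(self, "Vpc", max_azs=2)
        cluster = ecs.Cluster(self, "sd-cluster", vpc=vpc)

        asg = autoscaling.AutoScalingGroup(
            self, "Asg",
            vpc=vpc,
            instance_type=ec2.InstanceType("t3.micro"),           # placeholder instance type
            machine_image=ecs.EcsOptimizedImage.amazon_linux2(),
        )
        capacity_provider = ecs.AsgCapacityProvider(
            self, "AsgCapacityProvider", auto_scaling_group=asg
        )
        cluster.add_asg_capacity_provider(capacity_provider)

        task_def = ecs.Ec2TaskDefinition(self, "TaskDef")
        task_def.add_container(
            "app",
            image=ecs.ContainerImage.from_registry("amazon/amazon-ecs-sample"),  # placeholder image
            memory_limit_mib=256,
        )
        ecs.Ec2Service(self, "Service", cluster=cluster, task_definition=task_def)

With the aspect in place, cdk destroy deletes the service before the capacity provider association, which avoids the ResourceInUseException.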

Related

Cannot delete EC2

I'm trying to delete my EC2 instances, which I believe were created when I used Amplify for my authentication. But every time I delete my EC2 instance, it spawns another one. I researched and found out that if it was created by an ELB, then I should delete the ELB first. So that's what I tried to do: delete the ELB. But even that causes an error and won't delete my ELB instance.
Now I am stuck, and I am being billed by AWS because of these running instances that I am not able to delete. Please advise.
ERROR
Stack deletion failed: The following resource(s) failed to delete:
[AWSEBSecurityGroup, AWSEBRDSDatabase, AWSEBLoadBalancerSecurityGroup].
ERROR
Deleting security group named: <...>
failed Reason: resource <...> has a dependent object (Service: AmazonEC2;
Status Code: 400;
Error Code: DependencyViolation;
Request ID: <...>;
Proxy: null)
ERROR
Deleting security group named: <...>-stack-AWSEBSecurityGroup-<...>
failed Reason: resource <...> has a dependent object (Service: AmazonEC2;
Status Code: 400;
Error Code: DependencyViolation;
Request ID: <...>;
Proxy: null)
Is it because I deleted the RDS before deleting the ELB?
Instances in Elastic Beanstalk run in an Auto Scaling group. That's why it spawns new ones when you delete them.
You should delete your EB environment. This will take care of deleting the Auto Scaling group along with the instances.
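If you prefer not to do it in the console, here is a minimal sketch of terminating the environment with boto3 (my own addition; "my-env" is a placeholder environment name):

import boto3

eb = boto3.client("elasticbeanstalk")
# Terminating the environment tears down the Auto Scaling group, the load
# balancer, and the instances it launched.
eb.terminate_environment(
    EnvironmentName="my-env",   # placeholder: your EB environment name
    TerminateResources=True,
)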

AWS EKS EBS mounting issue / ExpiredTokenException / Instance not found at the same time

AWS EKS cluster 1.18 with the AWS EBS CSI driver. Some pods had statically provisioned EBS volumes, and everything was working.
Then, at some point, all the pods using EBS volumes stopped responding, services waited indefinitely, and the proxy pod was killing connections because of the timeout.
The logs (CloudWatch) for kube-controller-manager were filled with messages like these:
kubernetes.io/csi: attachment for vol-00c1763<removed-by-me> failed:
rpc error:
code = NotFound desc = Instance "i-0c356612<removed-by-me>" not found
and
event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"podname-65df9bc5c4-2vtj8", UID:"ad4d30b7-<removed-by-me>", APIVersion:"v1", ResourceVersion:"4414194", FieldPath:""}):
type: 'Warning'
reason: 'FailedAttachVolume' AttachVolume.Attach failed for volume "ebs-volumename" :
rpc error: code = NotFound desc = Instance "i-0c356<removed-by-me>" not found
The instance is there; we checked it like 20 times. We tried killing the instance so that CloudFormation would create a new one for us, but the error persists, just with a different instance ID.
Next, we started deleting pods and unmounting volumes / deleting the SC/PVC/PV.
kubectl got stuck at the end of deleting the PV.
We were only able to get them out of this state by patching the finalizers to null on both the PVs and the VolumeAttachments.
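For reference, a minimal sketch of that finalizer patch with the official Kubernetes Python client (my own illustration; the object names are placeholders):

from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()
storage = client.StorageV1Api()

# Clearing the finalizers lets the stuck PV and VolumeAttachment objects
# actually be removed.
body = {"metadata": {"finalizers": None}}
core.patch_persistent_volume("my-pv", body)                      # placeholder PV name
storage.patch_volume_attachment("csi-0123456789abcdef0", body)   # placeholder VolumeAttachment name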
The logs contain the following:
csi_attacher.go:662] kubernetes.io/csi: detachment for VolumeAttachment for volume [vol-00c176<removed-by-me>] failed:
rpc error: code = Internal desc = Could not detach volume "vol-00c176<removed-by-me>" from node "i-0c3566<removed-by-me>":
error listing AWS instances:
"WebIdentityErr: failed to retrieve credentials\n
caused by: ExpiredTokenException: Token expired: current date/time 1617384213 must be before the expiration date/time 1616724855\n
\tstatus code: 400, request id: c1cf537f-a14d<removed-by-me>"
I've read about tokens for Kubernetes, but in our case everything is managed by EKS. Googling the ExpiredTokenException only brings up pages about how to solve the issue in your own applications; again, we manage everything on AWS using kubectl.

AWS CLI environment create error CREATE_FAILED, reason: resources failed to create

I used to deploy a Java web application to Elastic Beanstalk (EC2) as the root user without this problem. Now I'm using the recommended way of deploying as an IAM service user, and I get the following errors. I suspect it's because of a lack of permissions (policies), but I don't know what policies I should assign to the IAM user.
QUESTION: Could you help me in finding the right policies?
commands:
eb init --profile eb_admin
eb create --single
output of the 2nd command:
Printing Status:
2019-05-26 12:08:58 INFO createEnvironment is starting.
2019-05-26 12:08:59 INFO Using elasticbeanstalk-eu-central-1-726173845157 as Amazon S3 storage bucket for environment data.
2019-05-26 12:09:26 INFO Created security group named: awseb-e-ire9qdzahd-stack-AWSEBSecurityGroup-L5VUAQLDAA9F
2019-05-26 12:09:42 ERROR Stack named 'awseb-e-ire9qdzahd-stack' aborted operation. Current state: 'CREATE_FAILED' Reason: The following resource(s) failed to create: [MountTargetSecurityGroup, AWSEBEIP, sslSecurityGroupIngress, FileSystem].
2019-05-26 12:09:42 ERROR Creating security group failed Reason: The vpc ID 'vpc-7166611a' does not exist (Service: AmazonEC2; Status Code: 400; Error Code: InvalidVpcID.NotFound; Request ID: c1d0ce4d-830d-4b0c-9f84-85d8da4f7243)
2019-05-26 12:09:42 ERROR Creating EIP: 54.93.84.166 failed. Reason: Resource creation cancelled
2019-05-26 12:09:42 ERROR Creating security group ingress named: sslSecurityGroupIngress failed Reason: Resource creation cancelled
2019-05-26 12:09:44 INFO Launched environment: stack-overflow-dev. However, there were issues during launch. See event log for details.
Important!
I use a few .ebextensions scripts in order to initialize the environment:
nginx
https-instance-securitygroup
storage-efs-createfilesystem
storage-efs-mountfilesystem
After reviewing the logs, I also noticed that I forgot to create the VPC that is required for the EFS filesystem. Could it be that one failed script (storage-efs-createfilesystem) is the root cause of the subsequent failing operations?
Yes, the missing VPC caused the other resources to fail to create. Elastic Beanstalk and the storage-efs-createfilesystem extension use CloudFormation underneath.
The storage-efs-createfilesystem CloudFormation template creates the MountTargetSecurityGroup security group, and that failed due to the missing VPC. The creation of the AWSEBEIP, sslSecurityGroupIngress, and FileSystem resources was then cancelled.
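As a quick pre-flight check before the next eb create, here is a minimal sketch (my own addition; the region and VPC ID are the ones from the error log) that verifies the VPC the .ebextensions options point at actually exists:

import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2", region_name="eu-central-1")
try:
    # The VPC ID below is the one reported in the error message.
    ec2.describe_vpcs(VpcIds=["vpc-7166611a"])
    print("VPC exists; the EFS mount target security group can be created in it.")
except ClientError as err:
    if err.response["Error"]["Code"] == "InvalidVpcID.NotFound":
        print("VPC not found - create it (or fix the ID) before deploying.")
    else:
        raise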

Can't deploy Spring Boot app on Amazon AWS

I am deploying a simple Spring Boot app on Amazon Elastic Beanstalk.
It seems pretty simple.
I just created a war file and deployed it on Amazon.
However, I receive the following errors while creating the environment:
Creating Auto Scaling group named:
awseb-e-5zxuiqb7jh-stack-AWSEBAutoScalingGroup-1JVXAWPWCK3FK failed.
Reason: You have requested more instances (1) than your current
instance limit of 0 allows for the specified instance type. Please
visit http://aws.amazon.com/contact-us/ec2-request to request an
adjustment to this limit. Launching EC2 instance failed.
Stack named 'awseb-e-5zxuiqb7jh-stack' aborted operation. Current
state: 'CREATE_FAILED' Reason: The following resource(s) failed to
create: [AWSEBAutoScalingGroup].
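The error itself points at the fix: the account's limit for that instance type is 0, so you need to request a limit increase from AWS. As a hedged illustration (my own addition, not part of the question), you can inspect the account-wide instance limit that the EC2 API reports with boto3:

import boto3

ec2 = boto3.client("ec2")
# "max-instances" is the account-level attribute for the overall instance limit.
attrs = ec2.describe_account_attributes(AttributeNames=["max-instances"])
for attribute in attrs["AccountAttributes"]:
    for value in attribute["AttributeValues"]:
        print(attribute["AttributeName"], "=", value["AttributeValue"])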

Amazon Elastic Beanstalk TVM instance start fails

I have an identity TVM on Amazon Elastic Beanstalk that, when I try to start it, gives:
2014-07-07 15:29:46 UTC+0100 ERROR Stack named 'awseb-e-ybrpewdr7z-stack' aborted operation. Current state: 'CREATE_FAILED' Reason: The following resource(s) failed to create: AWSEBInstanceLaunchWaitCondition. (Service: AmazonCloudFormation; Status Code: 400; Error Code: OperationError; Request ID: null)
Now if I go to Logs and click Snapshot Logs, all that happens is that it shows "processing" for a while and then no logs appear. Does anyone have an idea what the problem is, so that I can either see the logs and/or sort out the startup problem?
This was a security issue.
Amazon EB uses a VPC, which may already be running, especially if an RDS database has been created first. The security groups on the EB instance can show the access rights required, but the VPC also has an underlying network ACL that is not group based. When an RDS database is created first and initially creates the VPC, the VPC only gets the rules needed to access the database, so it will not allow, for example, HTTP traffic through, and the instance setup cannot reach the EB environment to complete it.
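To check that last point, here is a minimal sketch (my own illustration; the VPC ID is a placeholder) that lists the network ACL entries for the VPC so you can confirm inbound HTTP is actually allowed:

import boto3

ec2 = boto3.client("ec2")
acls = ec2.describe_network_acls(
    Filters=[{"Name": "vpc-id", "Values": ["vpc-0123456789abcdef0"]}]   # placeholder VPC ID
)
for acl in acls["NetworkAcls"]:
    # Print each rule in evaluation order so blocked inbound ports stand out.
    for entry in sorted(acl["Entries"], key=lambda e: e["RuleNumber"]):
        direction = "egress" if entry["Egress"] else "ingress"
        print(acl["NetworkAclId"], direction, entry["RuleNumber"],
              entry["RuleAction"], entry.get("PortRange"))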