Difference between AWS Elastic Container Service's (ECS) ExecutionRole and TaskRole - amazon-web-services

I'm using AWS's CloudFormation, and I recently spent quite a bit of time trying to figure out why the role I had created and attached policies to was not enabling my ECS task to send a message to a Simple Queue Service (SQS) queue.
I realized that I was incorrectly attaching the SQS permissions policy to the Execution Role when I should have been attaching the policy to the Task Role. I cannot find good documentation that explains the difference between the two roles. CloudFormation documentation for the two of them are here: ExecutionRole and TaskRole

Referring to the documentation, you can see that the execution role is the IAM role used to execute ECS actions such as pulling the image and storing the application logs in CloudWatch.
The TaskRole, then, is the IAM role used by the task itself. For example, if your container wants to call other AWS services like S3, SQS, etc., then those permissions would need to be covered by the TaskRole.
Using a TaskRole serves the same purpose as putting access keys in a config file on the container instance, but without the drawbacks: using access keys that way is insecure and considered very bad practice. I include the comparison in this answer because many people reading it already understand access keys.
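To make the distinction concrete, here is a sketch of how the two roles sit side by side in a task definition, written as the parameters you might pass to boto3's register_task_definition. All names and ARNs are made-up placeholders:

```python
# Sketch of an ECS task definition illustrating the two distinct roles.
# All ARNs, names, and account IDs here are hypothetical placeholders.
task_definition = {
    "family": "my-app",
    # ExecutionRole: used by the ECS agent to pull the image from ECR and
    # ship logs to CloudWatch. SQS/S3 permissions do NOT belong here.
    "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    # TaskRole: assumed by the application code inside the container.
    # Permissions for S3, SQS, DynamoDB, etc. go on THIS role.
    "taskRoleArn": "arn:aws:iam::123456789012:role/myAppTaskRole",
    "containerDefinitions": [
        {
            "name": "app",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
            "logConfiguration": {
                # The awslogs driver relies on the *execution* role's permissions.
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/my-app",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "app",
                },
            },
        }
    ],
}
```

The same two properties exist in CloudFormation as ExecutionRoleArn and TaskRoleArn on AWS::ECS::TaskDefinition.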

The ECS task execution role covers the capabilities of the ECS agent (and container instance), e.g.:
Pulling a container image from Amazon ECR
Using the awslogs log driver
The ECS task role covers capabilities needed within the task itself, i.e.:
When your actual code runs
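So for the SQS case in the question above, the fix is to attach the SQS policy to the task role. A minimal sketch of such a policy document (the queue ARN is a placeholder):

```python
import json

# Hypothetical IAM policy granting the *task role* permission to send
# messages to a single SQS queue. The queue ARN is a placeholder.
sqs_send_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sqs:SendMessage",
            "Resource": "arn:aws:sqs:us-east-1:123456789012:my-queue",
        }
    ],
}

print(json.dumps(sqs_send_policy, indent=2))
```

Attaching this same document to the execution role instead (as in the question) compiles fine but never reaches the application code, which is why the symptom is so confusing.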

Related

AWS IAM policy to update specific ECS cluster through AWS console

We're running a staging environment in a separate ECS Fargate cluster. I'm trying to allow external developers to update tasks and services in this cluster through the AWS Console.
I've created a policy that looks OK to me based on the documentation. Updates through the AWS CLI work.
However, the AWS Console requires a lot of other, only loosely related permissions. Is there a way to find out which permissions are required? I'm looking at CloudTrail logs, but it takes 20 minutes until something shows up. I'd also like to avoid granting unrelated permissions, even if they are read-only.

AWS ECS service error: Task long arn format must be enabled for launching service tasks with ECS managed tags

I have an ECS service running in a cluster; it has 1 task. Upon a task update, the service suddenly died with this error:
'service my_service_name failed to launch a task with (error Task long arn format must be enabled for launching service tasks with ECS managed tags.)'
Currently running tasks are automatically drained, and the above message shows up every 6 hours in the "Events" tab of the service. Changes to the service config do not repair the issue, and rolling back the task update doesn't change anything either.
I believe I'm already using the long ARN format. Looking for help.
This turned out to be an AWS bug, now acknowledged by them. It was supposed to manifest after Jan 1, 2020, but appeared early because of a workflow fault at AWS.
The resources were created by an IAM user who was later deleted, hence the issue appears.
I simply removed the following from my task JSON input: propagateTags, enableECSManagedTags
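In other words (a sketch with made-up names), the service input went from including the two tag-related settings to omitting them:

```python
# Before: service input that triggered the error (names are placeholders).
service_before = {
    "cluster": "my-cluster",
    "serviceName": "my_service_name",
    "taskDefinition": "my-app:42",
    "desiredCount": 1,
    "enableECSManagedTags": True,
    "propagateTags": "TASK_DEFINITION",
}

# After: drop the two tag-related keys to work around the bug.
service_after = {
    k: v for k, v in service_before.items()
    if k not in ("enableECSManagedTags", "propagateTags")
}
```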
It seems like you are tagging your Amazon ECS resources but did not opt in to this feature, so you have to opt in first. Also, if your deployment mechanism uses regular expressions to parse the old-format ARNs or task IDs, this may be a breaking change.
Starting today you can opt in to a new Amazon Resource Name (ARN) and
resource ID format for Amazon ECS tasks, container instances, and
services. The new format enables the enhanced ability to tag resources
in your cluster, as well as tracking the cost of services and tasks
running in your cluster.
In most cases, you don’t need to change your system beyond opting in
to the new format. However, if your deployment mechanism uses regular
expressions to parse the old format ARNs or task IDs, this may be a
breaking change. It may also be a breaking change if you were storing
the old format ARNs and IDs in a fixed-width database field or data
structure.
After you have opted in, any new ECS services or tasks have the new
ARN and ID format. Existing resources do not receive the new format.
If you decide to opt out, any new resources that you later create then
use the old format.
You can check this post on the AWS Compute blog on migrating to the new ARN format:
migrating-your-amazon-ecs-deployment-to-the-new-arn-and-resource-id-format-2
Tagging Your Amazon ECS Resources
To help you manage your Amazon ECS tasks, services, task definitions,
clusters, and container instances, you can optionally assign your own
metadata to each resource in the form of tags. This topic describes
tags and shows you how to create them.
Important
To use this feature, it requires that you opt-in to the new Amazon
Resource Name (ARN) and resource identifier (ID) formats. For more
information, see Amazon Resource Names (ARNs) and IDs.
ecs-using-tags
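If you prefer to opt in programmatically rather than through the console, boto3's put_account_setting should do it (a sketch: the call itself needs AWS credentials, so only the parameters are built and the call is left to the caller; the three setting names are the ones documented for the long ARN opt-in):

```python
# Sketch: opting in to the new long ARN/ID formats with boto3.
# The actual API call requires valid AWS credentials, so it is wrapped
# in a function and not executed here.
opt_in_settings = [
    {"name": "serviceLongArnFormat", "value": "enabled"},
    {"name": "taskLongArnFormat", "value": "enabled"},
    {"name": "containerInstanceLongArnFormat", "value": "enabled"},
]

def opt_in(ecs_client):
    """Apply each long-ARN opt-in setting for the calling IAM identity."""
    for setting in opt_in_settings:
        ecs_client.put_account_setting(name=setting["name"],
                                       value=setting["value"])

# Usage (not run here):
# import boto3
# opt_in(boto3.client("ecs"))
```

Note that the opt-in applies per IAM identity, which is why resources created by a since-deleted IAM user (as in the answer above) can end up in an inconsistent state.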

How to create an AWS policy which allows the instances to launch only if it has tags created?

How to create an AWS policy which can restrict the users to create an instance unless they create tags while they try to launch the instance?
This is not possible using an IAM policy alone. The reason being that all EC2 instances are launched without EC2 tags. Tags are added to the EC2 instance after it has launched.
The AWS Management Console hides this from you, but it's a two-step process.
The best you can do is to stop and/or terminate your EC2 instances after-the-fact if they are missing the tags.
Thanks to recent AWS changes, you can launch an EC2 instance and apply tags, all in a single, atomic operation. You can therefore write IAM policies requiring tags at launch.
More details, and a sample IAM policy, can be found in the AWS blog post announcing the changes.
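The approach described in that post pairs ec2:RunInstances with a condition on the requested tags. A hedged sketch (the tag key "env" and its allowed values are examples, not anything mandated by AWS):

```python
# Sketch of an IAM policy that only allows launching EC2 instances when
# an "env" tag is supplied at launch time. The tag key and values are
# examples; aws:RequestTag and ec2:CreateAction are the documented
# condition keys for tag-on-create.
require_tags_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowRunWithEnvTag",
            "Effect": "Allow",
            "Action": "ec2:RunInstances",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "StringEquals": {"aws:RequestTag/env": ["dev", "prod"]}
            },
        },
        {
            "Sid": "AllowTaggingOnlyAtLaunch",
            "Effect": "Allow",
            "Action": "ec2:CreateTags",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "StringEquals": {"ec2:CreateAction": "RunInstances"}
            },
        },
    ],
}
```

In practice RunInstances also touches the AMI, subnet, security group, and volume resources, so a complete policy needs additional (condition-free) Allow statements for those resource types.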

Applying IAM roles to ECS instances

Is there a way to run ECS containers under certain IAM roles?
Basically, if you have code / a server that depends on IAM roles to access AWS resources (like S3 buckets or DynamoDB tables), what happens when you run that code / server as an ECS container? Can you control the roles per container?
Update 2: Roles are now supported on the task level
Update: Lyft has an open source tool called 'metadataproxy' which claims to solve this problem, but it has been met with some security concerns.
When you launch a container host (the instance that connects to your cluster), it is called the container instance.
This instance has an IAM role attached to it (in the guides, I think the name is ecsInstanceProfile).
The instance runs the ECS agent (and, subsequently, Docker). When tasks run, the actual containers make calls to/from AWS services, etc. This traffic is swallowed up by the host (agent), since it actually controls the network in and out of the Docker containers; the traffic in actuality comes from the agent.
So no, you cannot control the IAM role on a per-container basis; you would need to do that via the instances (agents) that join the cluster.
I.e.:
You join i-aaaaaaa, which has the ECS IAM policy + S3 read-only, to the cluster.
You join i-bbbbbbb, which has the ECS IAM policy + S3 read/write, to the cluster.
You launch a task 'c' that needs read/write to S3. You'd want to make sure it runs on i-bbbbbbb.
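With that per-instance approach, pinning task 'c' to the right instance can be expressed as a placement constraint. A sketch of the run_task parameters (the instance ID is the placeholder from the example above; ec2InstanceId is a built-in attribute of the ECS cluster query language):

```python
# Sketch: pinning a task to a specific container instance via a
# placement constraint. Cluster, task, and instance names are the
# placeholders from the example above.
run_task_params = {
    "cluster": "my-cluster",
    "taskDefinition": "task-c",
    "placementConstraints": [
        {
            "type": "memberOf",
            # Cluster query language: only place on the instance that
            # carries the S3 read/write role.
            "expression": "ec2InstanceId == 'i-bbbbbbb'",
        }
    ],
}
```

With task-level roles now supported (see the update above), this kind of instance pinning is no longer necessary just to control permissions.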

Create AWS cache clusters in VPC with CloudFormation

I am creating an AWS stack inside a VPC using CloudFormation and need to create ElastiCache clusters on it. I have investigated and there is no support in CloudFormation to create cache clusters in VPCs.
Our "workaround" was to to create the cache cluster when some "fixed" instance (like a bastion for example) bootstrap using CloudInit and AWS AmazonElastiCacheCli tools (elasticache-create-cache-subnet-group, elasticache-create-cache-cluster). Then, when front end machines bootstrap (we are using autoscaling), they use elasticache-describe-cache-clusters to get cache cluster nodes and update configuration.
I would like to know if you have different solutions to this problem.
VPC support has now been added for Elasticache in Cloudformation Templates.
To launch an AWS::ElastiCache::CacheCluster in your VPC, create an AWS::ElastiCache::SubnetGroup that defines which subnets in your VPC you want ElastiCache to use, and assign it to the CacheSubnetGroupName property of the AWS::ElastiCache::CacheCluster.
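A minimal sketch of the two resources wired together, written here as the template's JSON structure (subnet IDs and sizing values are placeholders):

```python
# Sketch of a CloudFormation template fragment placing an ElastiCache
# cluster in a VPC. Subnet IDs and sizing values are placeholders.
template = {
    "Resources": {
        "CacheSubnetGroup": {
            "Type": "AWS::ElastiCache::SubnetGroup",
            "Properties": {
                "Description": "Subnets for ElastiCache",
                "SubnetIds": ["subnet-aaaa1111", "subnet-bbbb2222"],
            },
        },
        "CacheCluster": {
            "Type": "AWS::ElastiCache::CacheCluster",
            "Properties": {
                "Engine": "redis",
                "CacheNodeType": "cache.t2.micro",
                "NumCacheNodes": 1,
                # Referencing the subnet group is what places the
                # cluster inside the VPC.
                "CacheSubnetGroupName": {"Ref": "CacheSubnetGroup"},
            },
        },
    },
}
```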
Your workaround is a reasonable one (and shows that you are already in control of your AWS operations).
You could eventually improve on your custom solution by means of the dedicated CustomResource type: special AWS CloudFormation resources that "provide a way for a template developer to include resources in an AWS CloudFormation stack that are provided by a source other than Amazon Web Services". The AWS CloudFormation Custom Resource Walkthrough provides a good overview of what this is all about, how it works, and what's required to implement your own.
The benefit of using this facade for a custom resource (i.e. the Amazon ElastiCache cluster in your case) is that its entire lifecycle (create/update/delete) can be handled in a similar and controlled fashion, just like any officially supported CloudFormation resource type; e.g. resource creation failures would be handled transparently from the perspective of the entire stack.
However, for the use case at hand you might actually just want to wait for official support becoming available:
AWS has announced VPC support for ElastiCache in the context of the recent major Amazon EC2 Update - Virtual Private Clouds for Everyone!, which boils down to Default VPCs for (Almost) Everyone.
We want every EC2 user to be able to benefit from the advanced networking and other features of Amazon VPC that I outlined above. To enable this, starting soon, instances for new AWS customers (and existing customers launching in new Regions) will be launched into the "EC2-VPC" platform. [...]
You don’t need to create a VPC beforehand - simply launch EC2
instances or provision Elastic Load Balancers, RDS databases, or
ElastiCache clusters like you would in EC2-Classic and we’ll create a
VPC for you at no extra charge. We’ll launch your resources into that
VPC [...] [emphasis mine]
This update sort of implies that any new services will likely also be available in VPC right away going forward (otherwise the new EC2-VPC platform wouldn't work automatically for new customers as envisioned).
Accordingly, I'd expect the CloudFormation team to follow suit and complete/amend their support for deployment to VPC going forward as well.
My solution for this has been to have a controller process that polls a message queue, which is subscribed to the SNS topic to which CloudFormation publishes stack events (click Advanced in the console when you create a CloudFormation stack to send notifications to an SNS topic).
I pass the required parameters as tags to AWS::EC2::Subnet and have the controller pick them up when the subnet is created. I execute the setup when an AWS::CloudFormation::WaitConditionHandle is created, and use the PhysicalResourceId to cURL with PUT to satisfy an AWS::CloudFormation::WaitCondition.
It works somewhat, but doesn't handle resource deletion in ElastiCache, because there is no AWS::CloudFormation::WaitCondition analogue for stack deletion. That's a manual operation procedure with my approach.
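For reference, the body of that PUT to the WaitConditionHandle's presigned URL is a small JSON document. A sketch of how it can be assembled (the Reason/UniqueId/Data values are placeholders):

```python
import json

# Sketch of the JSON body PUT to the presigned URL of an
# AWS::CloudFormation::WaitConditionHandle to satisfy a WaitCondition.
# Reason, UniqueId, and Data are placeholders.
signal = {
    "Status": "SUCCESS",          # or "FAILURE" to fail the stack
    "Reason": "ElastiCache cluster configured",
    "UniqueId": "cache-setup-1",  # must be unique per signal
    "Data": "cache-node-endpoint",
}
body = json.dumps(signal)

# The actual request (not executed here) is an HTTP PUT of `body` to the
# handle URL, with an empty Content-Type header per the CloudFormation docs.
```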
The CustomResource approach looks more polished, but requires an endpoint, which I don't have. If you can put together an endpoint, that looks like the way to go.