How and why of awslogs on ECS (Fargate)

I am struggling to get a task running on ECS Fargate, launched (ecs.runTask) from an AWS SDK script (JS/Node).
My current struggle is getting logs out of the containers so that I can troubleshoot why they are stopping. I can't seem to get the Task Definition right so that logs are generated.
logConfiguration: {
  logDriver: 'awslogs',
  options: {
    "awslogs-region": 'us-west-2',
    "awslogs-group": 'myTask',
    "awslogs-stream-prefix": "myTask",
    "awslogs-create-group": "true"
  }
}
I have set the log driver for them to awslogs, but when I try to view the logs in CloudWatch, I get various kinds of nothing:
If I specify the awslogs-create-group as "true" (it requires a string, rather than a Boolean, which is strange; I assume case doesn't matter), I nevertheless find that the group is not created.
If I create the group manually, I find that the log stream is not created.
I suspect that there may be an error in my permissions, though of course there is no error messaging to confirm. The docs (here) indicate that I need to attach certain policies to ecsInstanceRole, which seems to be a placeholder for a role that is used somewhere in the process.
But I have attached such a policy to my ECS executionRole, to the role that executes my API call to runTask, and I have looked for any other role that might be involved (an actual "instanceRole" doesn't seem to exist in the Task Def), and nothing is improving my situation.
I'd be happy to supply more information, but at this point I'm not sure where my blind spot is.
Can anyone see it?

Go to your Task Definition. You should find a section called "Task execution IAM role". The description says -
This role is required by tasks to pull container images and publish container logs to Amazon CloudWatch.
The role you attach here needs a policy like AmazonECSTaskExecutionRolePolicy (AWS managed policy), and the Trusted Entity is ecs-tasks.amazonaws.com.
Also, the awslogs option awslogs-create-group is not needed, I think.
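For reference, here is a minimal sketch of what the registration call can look like with the execution role wired in, using the AWS SDK for JavaScript (v2). The role ARN, image, and account ID are hypothetical placeholders, not values from the question:

// Minimal sketch (AWS SDK for JS v2); role ARN and image are placeholders.
const AWS = require('aws-sdk');
const ecs = new AWS.ECS({ region: 'us-west-2' });

ecs.registerTaskDefinition({
  family: 'myTask',
  requiresCompatibilities: ['FARGATE'],
  networkMode: 'awsvpc',
  cpu: '256',
  memory: '512',
  // The execution role must trust ecs-tasks.amazonaws.com and carry
  // the AmazonECSTaskExecutionRolePolicy managed policy.
  executionRoleArn: 'arn:aws:iam::123456789012:role/ecsTaskExecutionRole',
  containerDefinitions: [{
    name: 'myTask',
    image: '123456789012.dkr.ecr.us-west-2.amazonaws.com/my-image:latest',
    logConfiguration: {
      logDriver: 'awslogs',
      options: {
        'awslogs-region': 'us-west-2',
        'awslogs-group': 'myTask',        // create the group beforehand
        'awslogs-stream-prefix': 'myTask'
      }
    }
  }]
}, (err, data) => {
  if (err) console.error(err);
  else console.log(data.taskDefinition.taskDefinitionArn);
});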

ECS logs: Fargate vs EC2

When I run a task in ECS using Fargate, STDOUT is usually redirected automatically to CloudWatch, and the application logs can be found there without any complication.
To clarify, for example, in C#:
Console.WriteLine("log to write to CloudWatch");
That output is automatically redirected to CloudWatch Logs when I use ECS with Fargate or Lambda functions.
I would like to do the same using EC2.
My first impression of ECS with EC2 is that it is not as automatic as Fargate. Am I right?
Looking for information, I have found the following (apart from other, older questions and posts):
This question refers to an old post from the AWS blog, so it could be obsolete.
This AWS page describes a few steps where you need to install some utilities on your EC2 instance.
So, summarizing, is there any way to see the STDOUT in cloudwatch when I use ECS with EC2 in the same way Fargate does?
So, summarizing, is there any way to see the STDOUT in cloudwatch when I use ECS with EC2 in the same way Fargate does?
If you mean EC2 logging as easily as Fargate does, without any complex configuration, then no. You need to provide some configuration and utilities on your EC2 instances to allow logging to CloudWatch. Like any EC2 instance we launch, ECS container instances are just virtual machines with an operating system and a default configuration, in this case the Amazon ECS-optimized AMI. Any other services and configuration we have to provide ourselves.
Besides the link you provided above, I found this CloudFormation template, which configures an EC2 Spot Fleet to log to CloudWatch in the same way your second link describes.
I don't think you're correct. The STDOUT logs from the ECS task launch are just as easily written and accessed running under EC2 as under Fargate.
You just have this in your task definition which, as far as I can tell, is the same as in Fargate:
"containerDefinitions": [
{
"dnsSearchDomains": null,
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "my-log-family",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "my-stream-name"
}
}
...
After it launches, you should see your logs under my-log-family.
If you are trying to put application logs in CloudWatch, that's another matter... this is typically done using the CloudWatch Logs agent, which you'd have to install into the container, but the above will capture STDOUT.
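To confirm the stream is actually receiving events after the task starts, a quick sketch with the AWS SDK for JavaScript (v2), using the same log group as above:

// Sketch: print the newest events from the example log group.
const AWS = require('aws-sdk');
const logs = new AWS.CloudWatchLogs({ region: 'us-east-1' });

logs.filterLogEvents({ logGroupName: 'my-log-family', limit: 20 }, (err, data) => {
  if (err) return console.error(err);
  data.events.forEach(e =>
    console.log(new Date(e.timestamp).toISOString(), e.message));
});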
This is how I did it.
Using the NuGet package: AWS.Logger.AspNetCore
An example of use:
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;

class Program
{
    static async Task Main(string[] args)
    {
        Logger loggerObj = new Logger();
        ILogger<Program> logger = await loggerObj.CreateLogger("test", "eu-west-1");
        logger.LogInformation("test info");
        logger.LogError("test error");
    }
}

class Logger
{
    // Builds a logger that ships log events to the given CloudWatch log group.
    public async Task<ILogger<Program>> CreateLogger(string logGroup, string region)
    {
        AWS.Logger.AWSLoggerConfig config = new AWS.Logger.AWSLoggerConfig();
        config.Region = region;
        config.LogStreamNameSuffix = "";
        config.LogGroup = logGroup;
        LoggerFactory logFactory = new LoggerFactory();
        logFactory.AddAWSProvider(config);
        return logFactory.CreateLogger<Program>();
    }
}

Terraform command to list existing AWS resources as a Hello World

I have the AWS CLI installed on my Windows computer, and running this command "works" exactly like I want it to.
aws ec2 describe-images
I get the following output, which is exactly what I want to see, because although I have access to AWS through my corporation (e.g. to check code into CodeCommit), I can see in the AWS web console for EC2 that I don't have permission to list running instances:
An error occurred (UnauthorizedOperation) when calling the DescribeImages operation: You are not authorized to perform this operation.
I've put terraform.exe onto my computer as well, and I've created a file "example.tf" that contains the following:
provider "aws" {
region = "us-east-1"
}
I'd like to issue some sort of Terraform command that would yell at me, explaining that my AWS account is not allowed to list Amazon instances.
Most Hello World examples involve using terraform plan against a resource to do an "almost-write" against AWS.
Personally, however, I always feel more comfortable knowing that things are behaving as expected with something a bit more "truly read-only." That way, I really know the round-trip to AWS worked but I didn't modify any of my corporation's state.
There's a bunch of stuff on the internet about "data sources" and their "aws_ami" or "aws_instances" flavors, but I can't find anything that tells me how to actually use them with a Terraform command for a simple print()-type interaction (the way it's obvious that, say, "resources" go with the terraform plan and terraform apply commands).
Is there something I can do with Terraform commands to "hello world" an attempt at listing all my organization's EC2 servers and, accordingly, watching AWS tell me to buzz off because I'm not authorized?
You can use the data source for AWS instances. You create a data source similar to the below:
data "aws_instances" "test" {
instance_tags = {
Role = "HardWorker"
}
filter {
name = "instance.group-id"
values = ["sg-12345678"]
}
instance_state_names = ["running", "stopped"]
}
This will attempt a read action, listing the EC2 instances matched by the filters in your config, using the IAM credentials of whatever identity you run terraform plan as. Given your lack of authorization, the plan will fail with the error you described, which is your stated goal. You can modify the filter to target your organization's EC2 instances.
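Since data sources are read during planning, a minimal round trip is just the following (the provider surfaces the underlying UnauthorizedOperation error during the plan; exact wording varies by provider version):

terraform init
terraform plan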

How to use AWS ECS Task Role in Node AWS SDK code

Code that uses the AWS Node SDK doesn't seem to be able to gain the role permissions of the ECS task.
If I run the code on an EC2 ECS instance, the code seems to inherit the role on the instance, not of the task.
If I run the code on Fargate, the code doesn't get any permission.
By contrast, any bash scripts that run within the instance seem to have the proper permissions.
Indeed, the documentation doesn't mention this as an option for the Node SDK, just:
Loaded from IAM roles for Amazon EC2 (if running on EC2),
Loaded from the shared credentials file (~/.aws/credentials),
Loaded from environment variables,
Loaded from a JSON file on disk,
Hardcoded in your application
Is there any way to have your node code gain the permissions of the ECS task?
This seems to be the logical way to pass permissions to your code. It works beautifully with code running on an instance.
The only workaround I can think of is to create one IAM user per ECS service and pass the API key/secret as environment variables in the task definition. However, that doesn't seem very secure, since it would be visible in plain text to anyone with access to the task definition.
Your question is missing a lot of details on how you set up your ECS cluster, and I am not sure whether the question is about ECS in general or Fargate specifically.
Make sure that you are using the latest version of the SDK; the JavaScript SDK supports ECS and Fargate task credentials.
Often there is confusion about credentials on ECS. There is the IAM role that is assigned to the Cluster EC2 instances and the IAM role that is assigned to ECS tasks.
The most common problem is the "Trust Relationship" has not been setup on the ECS Task Role. Select your IAM role and then the "Trust Relationships" tab and make sure that it looks like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs-tasks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
In addition to the standard Amazon ECS permissions required to run tasks and services, IAM users also require iam:PassRole permissions to use IAM roles for tasks.
Next, verify that you are using the IAM role in the task definition. Specify the correct IAM role ARN in the Task Role field. Note that this is different from the Task Execution Role (which allows containers to pull images and publish logs).
Next, make sure that your ECS instances are using the latest version of the ECS agent. The agent version is listed on the "ECS Instances" tab under the right-hand column "Agent version". The current version is 1.20.3.
Are you using an ECS-optimized AMI? If not, add --net=host to the docker run command that starts the agent. Review this link for more information.
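To check which credentials the SDK actually resolved inside the container, here is a small diagnostic sketch (AWS SDK for JavaScript v2); sts:GetCallerIdentity needs no special permissions:

// On ECS/Fargate with a task role, AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
// should be set, and the printed ARN should be the task role, not the
// instance role.
const AWS = require('aws-sdk');

console.log('credentials URI:', process.env.AWS_CONTAINER_CREDENTIALS_RELATIVE_URI);

new AWS.STS().getCallerIdentity({}, (err, data) => {
  if (err) return console.error(err);
  console.log('running as:', data.Arn);
});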
I figured it out. This was a weird one.
A colleague thought it would be "safer" if we called Object.freeze on process.env. This was somehow interfering with the SDK's ability to access the credentials.
Removed that "improvement" and all is fine again. I think the lesson is "do not mess with process.env".
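For what it's worth, a plausible mechanism (my guess; process.env is an exotic object in Node, so exact behavior varies by version): outside strict mode, writes to a frozen object fail silently, so any code that expects to set a property simply loses the write:

// Illustration of the failure mode with a plain object, not the SDK's code path.
const env = Object.freeze({ FOO: 'bar' });
env.BAZ = 'qux';          // silently ignored outside strict mode
console.log(env.BAZ);     // undefined -- the write was dropped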

Useless Amazon ECS Error Message when creating tasks

Using the ecs agent container on an Ubuntu instance, I am able to register the agent with my cluster.
I also have a service created in that cluster and task definitions as well. When I try to add a task to the cluster I get the useless error message:
Run tasks failed
Reasons : ["ATTRIBUTE"]
The ecs agent log has no related error message. Any thoughts on how I can get better debugging or what the issue might be?
The CLI also returns the same useless error message:
{
  "tasks": [],
  "failures": [
    {
      "arn": "arn:aws:ecs:us-east-1:sssssss:container-instance/sssssssssssss",
      "reason": "ATTRIBUTE"
    }
  ]
}
From the troubleshooting guide:
ATTRIBUTE (container instance ID)
Your task definition contains a parameter that requires a specific container instance attribute that is not available on your container instances. For more information on which attributes are required for specific task definition parameters and agent configuration variables, see Task Definition Parameters and Amazon ECS Container Agent Configuration.
You can find the attributes required for your task definition by looking at the requiredAttributes field. You can find the attributes that are present for your container instances in the result of the DescribeContainerInstances API call.
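To see the mismatch concretely, a sketch with the AWS SDK for JavaScript (v2); the task definition, cluster, and instance ID are placeholders (note the API spells the field requiresAttributes):

// Sketch: diff the task definition's required attributes against the
// attributes a container instance actually reports.
const AWS = require('aws-sdk');
const ecs = new AWS.ECS({ region: 'us-east-1' });

async function findMissingAttributes() {
  const td = await ecs.describeTaskDefinition({ taskDefinition: 'my-task:1' }).promise();
  const required = (td.taskDefinition.requiresAttributes || []).map(a => a.name);

  const ci = await ecs.describeContainerInstances({
    cluster: 'my-cluster',
    containerInstances: ['<container-instance-id>']
  }).promise();
  const available = ci.containerInstances[0].attributes.map(a => a.name);

  console.log('missing:', required.filter(n => !available.includes(n)));
}

findMissingAttributes().catch(console.error);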
The ECS console does not provide enough information, but you can connect to the EC2 instance to retrieve more logs.
You can try manually restarting the ECS agent daemon or, for a containerized agent like yours, the ecs-agent Docker container (typically docker restart ecs-agent).
Sometimes you need to manually delete the agent's checkpoint file (typically under /var/lib/ecs/data) before restarting.
A cheatsheet with the locations of logs and useful commands can be found at:
ecs-agent troubleshoot

aws elasticbeanstalk: cannot deploy to worker environment via eb cli

I've created a worker environment for my EB application in order to take advantage of its "periodic tasks" capability using cron.yaml (located in the root of my application). It's a simple Sinatra app (for now) that I would like to use to issue requests to my corresponding web server environment.
However, I'm having trouble deploying via the EB CLI. Below is what happens when I run eb deploy.
╰─➤ eb deploy
Creating application version archive "4882".
Uploading myapp/4882.zip to S3. This may take a while.
Upload Complete.
INFO: Environment update is starting.
ERROR: Service:AmazonCloudFormation, Message:Stack named 'awseb-e-1a2b3c4d5e-stack'
aborted operation. Current state: 'UPDATE_ROLLBACK_IN_PROGRESS'
Reason: The following resource(s) failed to create: [AWSEBWorkerCronLeaderRegistry].
I've looked around the CloudFormation dashboard to check for possible errors. After reading a bit about AWSEBWorkerCronLeaderRegistry, I found that it's most likely a DynamoDB table that gets created/updated. However, when I look in the DynamoDB dashboard, there are no tables listed.
As always, any help, feedback, or guidance is appreciated.
If you are reluctant to add full DynamoDB access (like I was), Beanstalk now provides a managed policy for worker environment permissions (AWSElasticBeanstalkWorkerTier). You can try attaching that to your instance profile role instead.
See http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/iam-instanceprofile.html
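If you'd rather script that, here is a sketch with the AWS SDK for JavaScript (v2); the role name is the one mentioned in the next answer and may differ in your account:

// Sketch: attach the AWSElasticBeanstalkWorkerTier managed policy to the
// instance profile role. Role name is an assumption -- check your own setup.
const AWS = require('aws-sdk');
const iam = new AWS.IAM();

iam.attachRolePolicy({
  RoleName: 'aws-elasticbeanstalk-ec2-role',
  PolicyArn: 'arn:aws:iam::aws:policy/AWSElasticBeanstalkWorkerTier'
}, err => { if (err) console.error(err); else console.log('policy attached'); });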
We had the same issue and fixed it by attaching AmazonDynamoDBFullAccess to Elastic Beanstalk role (which was named aws-elasticbeanstalk-ec2-role in our case).
I was using Codepipeline to deploy my worker and was getting the same error. Eventually I tried giving AWS-CodePipeline-Service the AmazonDynamoDBFullAccess policy and that seemed to resolve the issue.
As Anthony suggested, when triggering the deploy from another service such as CodePipeline, its service role needs the dynamodb:CreateTable permission to create the Leader Registry table (more info below) in DynamoDB.
Adding full-access permissions is bad practice and should be avoided. Also, the managed policy AWSElasticBeanstalkWorkerTier does not have the appropriate permissions, since it is meant for the workers themselves to access DynamoDB and check whether they are the current leader.
1. Find the Role that is trying to create the table:
Go to CloudTrail > Event History
Filter Event Name: CreateTable
Make sure the error code is AccessDenied
Locate the role name (e.g. AWSCodePipelineServiceRole-us-east-1-dev).
2. Add the permissions:
Go to IAM > Roles
Find the role in the list
Attach a policy with:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CreateCronLeaderTable",
      "Effect": "Allow",
      "Action": "dynamodb:CreateTable",
      "Resource": "arn:aws:dynamodb:*:*:table/*-stack-AWSEBWorkerCronLeaderRegistry*"
    }
  ]
}
3. Check results:
Redeploy by triggering the pipeline
Check Elastic Beanstalk for errors
Optionally, go to CloudTrail and make sure the request succeeded this time.
You may use this technique any time you are unsure of which permission should be attached to which role.
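If you prefer to script step 2, here is a sketch with the AWS SDK for JavaScript (v2); the role name is the hypothetical one from the CloudTrail example above:

// Sketch: attach the minimal inline policy from step 2 to the pipeline role.
const AWS = require('aws-sdk');
const iam = new AWS.IAM();

iam.putRolePolicy({
  RoleName: 'AWSCodePipelineServiceRole-us-east-1-dev',
  PolicyName: 'CreateCronLeaderTable',
  PolicyDocument: JSON.stringify({
    Version: '2012-10-17',
    Statement: [{
      Sid: 'CreateCronLeaderTable',
      Effect: 'Allow',
      Action: 'dynamodb:CreateTable',
      Resource: 'arn:aws:dynamodb:*:*:table/*-stack-AWSEBWorkerCronLeaderRegistry*'
    }]
  })
}, err => { if (err) console.error(err); else console.log('policy attached'); });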
About the Cron Leader Table
From the Periodic Tasks Documentation:
Elastic Beanstalk uses leader election to determine which instance in your worker environment queues the periodic task. Each instance attempts to become leader by writing to an Amazon DynamoDB table. The first instance that succeeds is the leader, and must continue to write to the table to maintain leader status. If the leader goes out of service, another instance quickly takes its place.
For those wondering, this DynamoDB table uses 10 RCU and 5 WCU, which is covered by the always-free tier.