Similar to this question, How to get Task ID from within ECS container?, but I want to get the TaskId for my Fargate task. How can you do this? Like others, I want this for logging purposes.
I'm running a Spring app with the ELK stack for logging and would like to include the TaskId in the logs if possible.
Edit
I actually never got this to work, by the way. Here is my code:
private String getTaskIdInternal() {
    // Check the variable itself: null + "/task" would yield the non-null string "null/task"
    String baseUrl = System.getenv("ECS_CONTAINER_METADATA_URI_V4");
    if (baseUrl == null) {
        throw new RuntimeException("ECS_CONTAINER_METADATA_URI_V4 env variable not defined");
    }
    String url = baseUrl + "/task";
    logger.info("Getting ecsMetaDataURL={}", url);
    RestTemplate restTemplate = new RestTemplate();
    ResponseEntity<JsonNode> response = restTemplate.getForEntity(url, JsonNode.class);
    logger.info("ecsMetaData={}", response);
    JsonNode map = response.getBody();
    // The task ID is the last "/"-separated segment of the task ARN
    String taskArn = map.get("TaskARN").asText();
    String[] splitTaskArn = taskArn.split("/");
    String taskId = splitTaskArn[splitTaskArn.length - 1];
    logger.info("ecsTaskId={}", taskId);
    return taskId;
}
But I always get this stack trace:
Could not get the taskId from ECS. exception=org.springframework.web.client.HttpClientErrorException: 403 Forbidden
at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:118)
at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:103)
at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63)
at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:732)
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:690)
at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:646)
at org.springframework.web.client.RestTemplate.getForEntity(RestTemplate.java:325)
If you're trying to get the task ID in Fargate for ECS, you can make use of the task metadata endpoints.
Assuming you're using platform version 1.4.0 of Fargate, you can get this via an HTTP request to ${ECS_CONTAINER_METADATA_URI_V4}/task.
An example response from this endpoint is below:
{
"Cluster": "arn:aws:ecs:us-west-2:111122223333:cluster/default",
"TaskARN": "arn:aws:ecs:us-west-2:111122223333:task/default/febee046097849aba589d4435207c04a",
"Family": "query-metadata",
"Revision": "7",
"DesiredStatus": "RUNNING",
"KnownStatus": "RUNNING",
"Limits": {
"CPU": 0.25,
"Memory": 512
},
"PullStartedAt": "2020-03-26T22:25:40.420726088Z",
"PullStoppedAt": "2020-03-26T22:26:22.235177616Z",
"AvailabilityZone": "us-west-2c",
"Containers": [
{
"DockerId": "febee046097849aba589d4435207c04aquery-metadata",
"Name": "query-metadata",
"DockerName": "query-metadata",
"Image": "mreferre/eksutils",
"ImageID": "sha256:1b146e73f801617610dcb00441c6423e7c85a7583dd4a65ed1be03cb0e123311",
"Labels": {
"com.amazonaws.ecs.cluster": "arn:aws:ecs:us-west-2:111122223333:cluster/default",
"com.amazonaws.ecs.container-name": "query-metadata",
"com.amazonaws.ecs.task-arn": "arn:aws:ecs:us-west-2:111122223333:task/default/febee046097849aba589d4435207c04a",
"com.amazonaws.ecs.task-definition-family": "query-metadata",
"com.amazonaws.ecs.task-definition-version": "7"
},
"DesiredStatus": "RUNNING",
"KnownStatus": "RUNNING",
"Limits": {
"CPU": 2
},
"CreatedAt": "2020-03-26T22:26:24.534553758Z",
"StartedAt": "2020-03-26T22:26:24.534553758Z",
"Type": "NORMAL",
"Networks": [
{
"NetworkMode": "awsvpc",
"IPv4Addresses": [
"10.0.0.108"
],
"AttachmentIndex": 0,
"IPv4SubnetCIDRBlock": "10.0.0.0/24",
"MACAddress": "0a:62:17:7a:36:68",
"DomainNameServers": [
"10.0.0.2"
],
"DomainNameSearchList": [
"us-west-2.compute.internal"
],
"PrivateDNSName": "ip-10-0-0-108.us-west-2.compute.internal",
"SubnetGatewayIpv4Address": ""
}
]
}
]
}
As you can see, you would need to parse the TaskARN to get the TaskID (it is the last part of the ARN if you split by "/").
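The parsing step described above can be sketched in a few lines. This is a minimal Python sketch rather than the asker's Spring code; the function names `task_id_from_arn` and `fetch_task_id` are made up for illustration, and the fetch helper assumes the code runs inside a task where `ECS_CONTAINER_METADATA_URI_V4` is set.

```python
import json
import os
import urllib.request


def task_id_from_arn(task_arn: str) -> str:
    """Return the last "/"-separated segment of an ECS task ARN."""
    return task_arn.rsplit("/", 1)[-1]


def fetch_task_id(timeout: float = 2.0) -> str:
    """Query the v4 task metadata endpoint and return the task ID.

    Assumption: this only works from inside a running task, because the
    endpoint URL is injected by the platform as an environment variable.
    """
    base_url = os.environ.get("ECS_CONTAINER_METADATA_URI_V4")
    if base_url is None:
        raise RuntimeError("ECS_CONTAINER_METADATA_URI_V4 is not set")
    with urllib.request.urlopen(f"{base_url}/task", timeout=timeout) as resp:
        metadata = json.load(resp)
    return task_id_from_arn(metadata["TaskARN"])
```

Keeping the ARN parsing separate from the HTTP call makes the parsing easy to check locally, without access to the metadata endpoint.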
Amazon does specify the following in the documentation, which should be noted:
For tasks using the Fargate launch type and platform versions prior to 1.4.0, the task metadata version 3 and 2 endpoint are supported. For more information, see Task Metadata Endpoint version 3 or Task Metadata Endpoint version 2.
The link in the accepted answer is for the EC2 launch type. The direct doc link for Fargate is: https://docs.aws.amazon.com/AmazonECS/latest/userguide/task-metadata-endpoint-v4-fargate.html. The JSON content seems to be pretty much the same, though.
AWS ECS cluster services do not start new tasks.
Already checked:
The ECS EC2 instances are registered and active, with full CPU and memory available, and the ECS agent is connected.
There are no events in the ECS service "Events" tab: nothing about registering, starting, or stopping, and no errors. It's just empty.
The registered EC2 instances are set up correctly; in another cluster the same AMI works perfectly.
The task definition is correct; it worked the day before and nothing has changed since then.
The service role contains all relevant policies.
Querying ECS with the AWS CLI (aws ecs describe-services --services my-service --cluster my-cluster) shows that the deployment rollout is constantly IN_PROGRESS and stays that way.
Full response with configuration is here (I've substituted real names and IDs):
{
"serviceArn": "arn:aws:ecs:eu-central-1:my-account-id:service/my-cluster/my-service",
"serviceName": "my-service",
"clusterArn": "arn:aws:ecs:eu-central-1:my-account-id:cluster/my-cluster",
"loadBalancers": [
{
"targetGroupArn": "arn:aws:elasticloadbalancing:eu-central-1:my-account-id:targetgroup/my-service-lb/load-balancer-id",
"containerName": "my-service",
"containerPort": 8065
}
],
"serviceRegistries": [
{
"registryArn": "arn:aws:servicediscovery:eu-central-1:my-account-id:service/srv-srv_id",
"containerName": "my-service",
"containerPort": 8065
}
],
"status": "ACTIVE",
"desiredCount": 1,
"runningCount": 0,
"pendingCount": 0,
"launchType": "EC2",
"taskDefinition": "arn:aws:ecs:eu-central-1:my-account-id:task-definition/my-service:76",
"deploymentConfiguration": {
"deploymentCircuitBreaker": {
"enable": false,
"rollback": false
},
"maximumPercent": 200,
"minimumHealthyPercent": 100
},
"deployments": [
{
"id": "ecs-svc/deployment_id",
"status": "PRIMARY",
"taskDefinition": "arn:aws:ecs:eu-central-1:my-account-id:task-definition/my-service:76",
"desiredCount": 1,
"pendingCount": 0,
"runningCount": 0,
"failedTasks": 0,
"createdAt": "2022-06-28T09:15:08.241000+02:00",
"updatedAt": "2022-06-28T09:15:08.241000+02:00",
"launchType": "EC2",
"rolloutState": "IN_PROGRESS",
"rolloutStateReason": "ECS deployment ecs-svc/deployment_id in progress."
}
],
"roleArn": "arn:aws:iam::my-account-id:role/aws-service-role/ecs.amazonaws.com/AWSServiceRoleForECS",
"events": [],
"createdAt": "2022-06-28T09:15:08.241000+02:00",
"placementConstraints": [],
"placementStrategy": [
{
"type": "spread",
"field": "attribute:ecs.availability-zone"
}
],
"healthCheckGracePeriodSeconds": 120,
"schedulingStrategy": "REPLICA",
"createdBy": "arn:aws:iam::my-account-id:role/my-role",
"enableECSManagedTags": false,
"propagateTags": "NONE",
"enableExecuteCommand": false
}
The ECS service and service discovery entry are created using Terraform, and the definitions are:
resource "aws_service_discovery_service" "ecs_discovery_service" {
name = var.service_name
dns_config {
namespace_id = var.service_discovery_hosted_zone_id
dns_records {
ttl = 10
type = "SRV"
}
}
health_check_custom_config {
failure_threshold = 1
}
}
resource "aws_ecs_service" "ecs_service" {
name = var.service_name
cluster = var.ecs_cluster_id
task_definition = var.task_definition_arn
desired_count = var.desired_count
deployment_minimum_healthy_percent = 100
deployment_maximum_percent = 200
health_check_grace_period_seconds = var.health_check_grace_period_seconds
load_balancer {
target_group_arn = aws_lb_target_group.target_group.arn
container_name = var.service_name
container_port = var.service_container_port
}
ordered_placement_strategy {
type = "spread"
field = "attribute:ecs.availability-zone"
}
service_registries {
registry_arn = aws_service_discovery_service.ecs_discovery_service.arn
container_name = var.service_name
container_port = var.service_container_port
}
}
This code used to work fine. Without any changes in the infrastructure code, after destroying and re-applying it, ECS no longer starts any new tasks.
I could narrow the problem down to service discovery: if I remove the service_registries section, the tasks are started as normal.
Removing service discovery works around the issue, but it's not a proper solution, and I don't understand the cause of the problem.
Again, the service role has the permissions for service discovery:
"servicediscovery:DeregisterInstance",
"servicediscovery:Get*",
"servicediscovery:List*",
"servicediscovery:RegisterInstance",
"servicediscovery:UpdateInstanceCustomHealthStatus"
I can't find any way to trace this strange behaviour and want to ask you for help:
Could you give me any hints on what or where I could check? I've checked multiple troubleshooting guides, but all of them rely on events in the ECS service and I don't have any there. Everything else I had in mind has been checked.
Maybe you know why service discovery would block ECS from starting new tasks? I thought ECS adds an SRV record to the registry once it has started the container and the container is healthy, but I could not see that any containers had been started at all.
I would be very thankful for any hints and let me know if you need any details.
Have a nice day and best regards.
I am trying to create an EC2 instance using boto3 client.run_instances(**parameters) method.
This is the value of my parameters:
{
"ImageId":"ami-XXXXXXXXX",
"InstanceType":"m4.large",
"KeyName":"my_key",
"UserData":"Content-Type: multipart/mixed",
"Monitoring":{
"Enabled":false
},
"MaxCount":1,
"MinCount":1,
"IamInstanceProfile":{
"Name":"proxyIp-YYYYYYYY"
},
"NetworkInterfaces":[
{
"DeviceIndex":0,
"AssociatePublicIpAddress":true,
"Groups":[
"sg-09999999fe111"
],
"SubnetId":"subnet-06XXXXXXXXX"
}
],
"PrivateIpAddress":"AA.BB.C.DDD",
"EbsOptimized":true
}
However, the stack is failing on creation of the EC2 instance with this error:
An error occurred (InvalidParameterCombination) when calling the RunInstances operation: Network interfaces and an instance-level private IP address may not be specified on the same request
Could you let me know what is missing? I have checked the parameters and they all look fine.
It looks like you can request a private IP address within the NetworkInterfaces block, rather than as a top-level PrivateIpAddress next to it:
"NetworkInterfaces": [
{
"PrivateDnsName": "ip-10-0-0-157.us-east-2.compute.internal",
"PrivateIpAddress": "10.0.0.157",
"SourceDestCheck": true,
"Status": "in-use",
"SubnetId": "subnet-04a636d18e83cfacb",
"VpcId": "vpc-1234567890abcdef0"
}
],
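Applied to the parameters in the question, that suggestion might look like the sketch below. The helper name `move_private_ip_into_interface` is hypothetical, and the sketch assumes the address belongs on the first entry of NetworkInterfaces (the one with DeviceIndex 0 in the question).

```python
import copy


def move_private_ip_into_interface(params: dict) -> dict:
    """Return a copy of params with the top-level PrivateIpAddress
    moved into the first NetworkInterfaces entry.

    RunInstances rejects a request that has both NetworkInterfaces and
    an instance-level PrivateIpAddress, so the address is relocated.
    """
    fixed = copy.deepcopy(params)
    if "PrivateIpAddress" in fixed and fixed.get("NetworkInterfaces"):
        fixed["NetworkInterfaces"][0]["PrivateIpAddress"] = fixed.pop("PrivateIpAddress")
    return fixed


# Usage (requires AWS credentials, not executed here):
#   import boto3
#   client = boto3.client("ec2")
#   client.run_instances(**move_private_ip_into_interface(parameters))
```

The input dictionary is left untouched, so the original parameters can still be logged or compared against the corrected request.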
I have a Node.js web service which I push to Cloud Foundry (PCF), and I store some credentials in Vault. When a user hits my web service endpoint with some credentials, I extract the stored credentials from Vault and compare them against the credentials from the request. If they match, I allow the request to be processed; otherwise I reject it.
To install Vault in PCF I use the following command:
cf create-service hashicorp-vault shared foo-vault
Then I create a key using this command:
cf create-service-key foo-vault foo-vault-key
Then I bind the service to the app like this:
cf bind-service foo-ws foo-vault
I restage the web service using this command:
cf restage foo-ws
When I then print the environment variables, I get these values:
{
"hashicorp-vault": [{
"credentials": {
"address": "http://somehost:433/",
"auth": {
"accessor": "kMr3iCSlekSN2d1vpPjbjzUk",
"token": "some token"
},
"backends": {
"generic": [
"cf/7f1a12a9-4a52-4151-bc96-874380d30182/secret",
"cf/c4073566-baee-48ae-88e9-7c7c7e0118eb/secret"
],
"transit": [
"cf/7f1a12a9-4a52-4151-bc96-874380d30182/transit",
"cf/c4073566-baee-48ae-88e9-7c7c7e0118eb/transit"
]
},
"backends_shared": {
"organization": "cf/8d4b992f-cca3-4876-94e0-e49170eafb67/secret",
"space": "cf/bdace353-e813-4efb-8122-58b9bd98e3ab/secret"
}
},
"label": "hashicorp-vault",
"name": "my-vault",
"plan": "shared",
"provider": null,
"syslog_drain_url": null,
"tags": [],
"volume_mounts": []
}]
}
So my question is if there is a way to define the backends, token and address?
Thanks in advance for your help.
Greetings
I was able to set up AutoScaling events as rules in EventBridge to trigger SSM commands, but I've noticed that with my chosen Target Value the event is passed to all my active EC2 instances. My Target key is a tag shared by those instances, so my mistake makes sense now.
I'm pretty new to EventBridge, so I was wondering if there's a way to actually target the instance that triggered the AutoScaling event (as in extracting the "InstanceId" that's present in the event data and use that as my new Target Value). I saw the Input Transformer, but I think that just transforms the event data to pass to the target.
Thanks!
EDIT - help with js code for Lambda + SSM RunCommand
I realize I can achieve this by setting EventBridge to invoke a Lambda function instead of the SSM RunCommand directly. Can anyone help with the JavaScript code to call a shell command on the EC2 instance specified in the event data (event.detail.EC2InstanceId)? I can't seem to find a relevant and up-to-date base template online, and I'm not familiar enough with JS or Lambda. Any help is greatly appreciated! Thanks
Sample of Event data, as per aws docs
{
"version": "0",
"id": "12345678-1234-1234-1234-123456789012",
"detail-type": "EC2 Instance Launch Successful",
"source": "aws.autoscaling",
"account": "123456789012",
"time": "yyyy-mm-ddThh:mm:ssZ",
"region": "us-west-2",
"resources": [
"auto-scaling-group-arn",
"instance-arn"
],
"detail": {
"StatusCode": "InProgress",
"Description": "Launching a new EC2 instance: i-12345678",
"AutoScalingGroupName": "my-auto-scaling-group",
"ActivityId": "87654321-4321-4321-4321-210987654321",
"Details": {
"Availability Zone": "us-west-2b",
"Subnet ID": "subnet-12345678"
},
"RequestId": "12345678-1234-1234-1234-123456789012",
"StatusMessage": "",
"EndTime": "yyyy-mm-ddThh:mm:ssZ",
"EC2InstanceId": "i-1234567890abcdef0",
"StartTime": "yyyy-mm-ddThh:mm:ssZ",
"Cause": "description-text"
}
}
Edit 2 - my Lambda code so far
'use strict'
const ssm = new (require('aws-sdk/clients/ssm'))()
exports.handler = async (event) => {
const instanceId = event.detail.EC2InstanceId
var params = {
DocumentName: "AWS-RunShellScript",
InstanceIds: [ instanceId ],
TimeoutSeconds: 30,
Parameters: {
commands: ["/path/to/my/ec2/script.sh"],
workingDirectory: [],
executionTimeout: ["15"]
}
};
const data = await ssm.sendCommand(params).promise()
const response = {
statusCode: 200,
body: "Run Command success",
};
return response;
}
Yes, but through Lambda
EventBridge -> Lambda (using SSM api) -> EC2
Thank you @Sándor Bakos for helping me out!! My JavaScript ended up not working for some reason, so I ended up just using part of the Python code linked in the comments.
1. add ssm:SendCommand permission:
After I let Lambda create a basic role during function creation, I added an inline policy to allow Systems Manager's SendCommand. This needs access to your documents/*, instances/* and managed-instances/*
2. code - python 3.9
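import boto3
import botocore.exceptions
import time

def lambda_handler(event=None, context=None):
    try:
        client = boto3.client('ssm')
        # The Auto Scaling event carries the launched instance's ID in detail.EC2InstanceId
        instance_id = event['detail']['EC2InstanceId']
        command = '/path/to/my/script.sh'
        client.send_command(
            InstanceIds=[instance_id],
            DocumentName='AWS-RunShellScript',
            Parameters={
                'commands': [command],
                'executionTimeout': ['60']
            }
        )
    except botocore.exceptions.ClientError as error:
        # Log and re-raise so the failed invocation shows up in the function's logs
        print(f'SendCommand failed: {error}')
        raise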
You can do this without using Lambda, as I just did, by using EventBridge's input transformers.
I specified a new automation document that called the document I was trying to use (AWS-ApplyAnsiblePlaybooks).
My document declares InstanceId as a parameter, which is passed in by the input transformer from EventBridge. I had to pass the event into Lambda just to see how to parse the JSON event object to get the desired instance ID. The path ended up being
$.detail.EC2InstanceId
(it was coming from an Auto Scaling group).
I then passed it into a template that was used for the runbook
{"InstanceId":[<instance>]}
This template was read in my runbook as a parameter.
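As a rough sketch of how the input path and template fit together, the snippet below builds a target definition of the kind accepted by the boto3 `put_targets` call and locally approximates the substitution for a sample event. The target `Id`, the function names, and the ARNs passed in are placeholders, and `render_input_template` is only an illustration of the idea, not EventBridge's actual implementation.

```python
import json


def build_automation_target(document_arn: str, role_arn: str) -> dict:
    """Build one EventBridge target entry with an input transformer.

    document_arn and role_arn are placeholders for the automation
    document ARN and the role EventBridge assumes to start it.
    """
    return {
        "Id": "run-automation-on-launched-instance",  # arbitrary label
        "Arn": document_arn,
        "RoleArn": role_arn,
        "InputTransformer": {
            # Variable name -> JSON path into the incoming event
            "InputPathsMap": {"instance": "$.detail.EC2InstanceId"},
            # <instance> is replaced with the extracted value
            "InputTemplate": '{"InstanceId":[<instance>]}',
        },
    }


def render_input_template(template: str, paths: dict, event: dict) -> str:
    """Local approximation: resolve simple "$.a.b" paths and substitute."""
    rendered = template
    for name, path in paths.items():
        value = event
        for key in (path[2:] if path.startswith("$.") else path).split("."):
            value = value[key]
        rendered = rendered.replace(f"<{name}>", json.dumps(value))
    return rendered


# Usage (requires AWS credentials, not executed here):
#   import boto3
#   boto3.client("events").put_targets(Rule="my-rule", Targets=[target])
```

Rendering the template locally against a saved sample event is a cheap way to confirm the JSON path, including its exact capitalization, before wiring up the rule.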
These were the runbook inputs I used to run the AWS-ApplyAnsiblePlaybooks document; I just mapped each parameter to the corresponding parameter in the nested document:
"inputs": {
"InstanceIds": ["{{ InstanceId }}"],
"DocumentName": "AWS-ApplyAnsiblePlaybooks",
"Parameters": {
"SourceType": "S3",
"SourceInfo": {"path": "https://testansiblebucketab.s3.amazonaws.com/"},
"InstallDependencies": "True",
"PlaybookFile": "ansible-test.yml",
"ExtraVariables": "SSM=True",
"Check": "False",
"Verbose": "-v",
"TimeoutSeconds": "3600"
}
See the document below for reference. They used a document that was already set up to receive the variable
https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-tutorial-eventbridge-input-transformers.html
This is the full automation playbook I used, most of the parameters are defaults from the nested playbook:
{
"description": "Runs Ansible Playbook on Launch Success Instances",
"schemaVersion": "0.3",
"assumeRole": "<Place your automation role ARN here>",
"parameters": {
"InstanceId": {
"type": "String",
"description": "(Required) The ID of the Amazon EC2 instance."
}
},
"mainSteps": [
{
"name": "RunAnsiblePlaybook",
"action": "aws:runCommand",
"inputs": {
"InstanceIds": ["{{ InstanceId }}"],
"DocumentName": "AWS-ApplyAnsiblePlaybooks",
"Parameters": {
"SourceType": "S3",
"SourceInfo": {"path": "https://testansiblebucketab.s3.amazonaws.com/"},
"InstallDependencies": "True",
"PlaybookFile": "ansible-test.yml",
"ExtraVariables": "SSM=True",
"Check": "False",
"Verbose": "-v",
"TimeoutSeconds": "3600"
}
}
}
]
}
I am trying to get AWS EC2 instance details using RunInstancesRequest. For that I followed AWS doc https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/examples-ec2-instances.html.
RunInstancesRequest runInstancesRequest = new RunInstancesRequest();
runInstancesRequest.withImageId(imageId).withInstanceType(instanceType).withMinCount(1).withMaxCount(count).withSecurityGroups(securityGroupName);
RunInstancesResult runInstancesResult = amazonEC2.runInstances(runInstancesRequest);
String instance_id = runInstancesResult.getReservation().getReservationId();
//waiting for 2 minute
DescribeInstancesRequest describeInstancesRequest = new DescribeInstancesRequest();
describeInstancesRequest.setInstanceIds(Arrays.asList(instance_id));
DescribeInstancesResult describeInstancesResult = amazonEC2.describeInstances(describeInstancesRequest);
for(Reservation reservation : describeInstancesResult.getReservations()){
for(Instance instance : reservation.getInstances()) {
System.out.println(instance.getPublicDnsName());
}
}
Here I am able to get the AWS EC2 instance up and running, but the problem is that I am not able to get the EC2 details using the RunInstancesResult object. The AWS documentation seems to treat the reservation ID as the instance ID, but I believe that is not so, as an instance ID starts with "i-" and a reservation ID with "r-".
How can I get the details of only the one EC2 instance that I created using the API? Since I got a RunInstancesResult object as the output of the previous API call, the question is: how can I get the AWS EC2 instance details from it?
Reservations are the request to launch instances. For example, you could use one launch request to create two instances. Thus, the Reservation contains multiple Instances.
If you look in the response object, you will see that the Reservation does indeed contain multiple instances, eg:
{
"OwnerId": "123456789012",
"ReservationId": "r-08626e73c547023b1",
"Groups": [
{
"GroupName": "MySecurityGroup",
"GroupId": "sg-903004f8"
}
],
"Instances": [
{
"Monitoring": {
"State": "disabled"
},
"PublicDnsName": null,
"RootDeviceType": "ebs",
"State": {
"Code": 0,
"Name": "pending"
},
"EbsOptimized": false,
"LaunchTime": "2013-07-19T02:42:39.000Z",
"ProductCodes": [],
"StateTransitionReason": null,
"InstanceId": "i-1234567890abcdef0",
"ImageId": "ami-1a2b3c4d",
"PrivateDnsName": null,
"KeyName": "MyKeyPair",
etc.
There is a small confusion in the AWS doc: it refers to the reservation ID as instance_id. After making the following change in my code, I was able to filter out the instances:
String instance_id = runInstancesResult.getReservation().getInstances().get(0).getInstanceId();
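The same idea can be sketched against the response shape shown earlier, here in Python instead of the Java SDK. The helper name `instance_ids_from_run_response` is made up for illustration, and it assumes a response dictionary with an "Instances" list next to "ReservationId", as in the sample above.

```python
from typing import List


def instance_ids_from_run_response(response: dict) -> List[str]:
    """Collect the instance IDs ("i-...") from a RunInstances-style response.

    The reservation ID ("r-...") identifies the launch request, not an
    instance, so it is deliberately ignored here.
    """
    return [instance["InstanceId"] for instance in response.get("Instances", [])]


# Usage (requires AWS credentials, not executed here):
#   import boto3
#   ec2 = boto3.client("ec2")
#   response = ec2.run_instances(ImageId="ami-1a2b3c4d", InstanceType="t2.micro",
#                                MinCount=1, MaxCount=1)
#   ids = instance_ids_from_run_response(response)
#   details = ec2.describe_instances(InstanceIds=ids)
```

Collecting every ID, rather than only the first, also covers launch requests with a MaxCount greater than one.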