AWS Batch job triggering multiple times for same event

I have an AWS Batch job queue that is targeted by an EventBridge rule. The rule triggers upon completion of a SageMaker training job. Here's the rule's event pattern:
{
  "source": ["aws.sagemaker"],
  "detail-type": ["SageMaker Training Job State Change"],
  "detail": {
    "TrainingJobStatus": ["Completed"],
    "Tags": {
      "model": ["model_tag"]
    }
  }
}
Every time the SageMaker training job completes, the EventBridge rule triggers twice and the Batch job runs twice. Here's a screenshot from the rule's monitoring tab:
Why is this happening, and how can I get the EventBridge rule to trigger only once for each matching event?

Related

Sending the inputs to target from a different event and not the triggered event in Amazon Eventbridge

Hi, I am still learning how to set up event rules in AWS EventBridge.
I have the following setup:
A release pipeline in CodePipeline
A test pipeline in CodePipeline
An EventBridge rule which gets triggered on success of the release pipeline
Currently I have the following event pattern:
{
  "source": ["aws.codepipeline"],
  "detail-type": ["CodePipeline Pipeline Execution State Change"],
  "resources": ["arn:aws:codepipeline:"],
  "detail": {
    "state": ["SUCCEEDED"]
  }
}
My input to target is "Matched event".
But when this rule triggers, I want to send the target the input not from the matched event but from, say, a CodeCommit event. Is this possible?
So, what I wish to achieve essentially is that the rule is triggered by CodePipeline succeeding, and the rule sends the details from CodeCommit to Lambda.

Is there a way to find the average runtime of an ECS scheduled task on AWS?

I have a scheduled task running on ECS with a launch type of Fargate. Is there a way to find the average runtime of the task over the past X amount of time?
As you're no doubt already aware, ECS CloudWatch metrics are limited to the cluster and service level. I was thinking that there might be task-level metrics for AWS Batch, and was going to recommend using that, but it seems there aren't any metrics for it either.
So that means generating the metrics yourself. I think the best approach is to implement a Lambda that is triggered by an EventBridge rule.
As it happens, I just did this (not for generating metrics, sorry), so I can outline the basic steps.
The first step is to set up the EventBridge rule. Here's the rule that I used, which triggers the Lambda whenever a task using a specific task definition completes:
{
  "detail-type": ["ECS Task State Change"],
  "source": ["aws.ecs"],
  "detail": {
    "taskDefinitionArn": [{
      "prefix": "arn:aws:ecs:us-east-1:123456789012:task-definition/HelloWorld:"
    }],
    "lastStatus": ["STOPPED"]
  }
}
This looks for a specific task definition. If you remove HelloWorld: from the prefix, the rule will trigger for every task definition.
Your Lambda will be invoked with the following event (I've removed everything that isn't relevant to this answer):
{
  "version": "0",
  "id": "2e08a760-c304-9681-9509-e6c9ca88ee36",
  "detail-type": "ECS Task State Change",
  "source": "aws.ecs",
  "detail": {
    "createdAt": "2022-03-10T15:18:57.782Z",
    "pullStartedAt": "2022-03-10T15:19:10.488Z",
    "pullStoppedAt": "2022-03-10T15:19:26.541Z",
    "startedAt": "2022-03-10T15:19:26.846Z",
    "stoppingAt": "2022-03-10T15:19:36.946Z",
    "stoppedAt": "2022-03-10T15:19:50.213Z",
    "stoppedReason": "Essential container in task exited",
    "stopCode": "EssentialContainerExited",
    "taskDefinitionArn": "arn:aws:ecs:us-east-1:123456789012:task-definition/HelloWorld:1"
  }
}
So, you've got a bunch of ISO-8601 timestamps, which should be easy to translate into elapsed-time values for whatever metrics you want to track. And you've got the task definition ARN, from which you can extract the task definition name. With those you can call the PutMetricData API. I also left stopCode in the example, because you might want to track successful versus non-successful executions.
I recommend that you name the metric after the elapsed-time value it represents (e.g., RunTime, TotalTime), and dimension it by the task definition name.
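A minimal sketch of such a Lambda in Python might look like this (the ScheduledTasks namespace and RunTime metric name are placeholders; pick whatever fits your account):
import boto3
from datetime import datetime

cloudwatch = boto3.client("cloudwatch")

def lambda_handler(event, context):
    detail = event["detail"]
    # Parse the ISO-8601 timestamps from the task state-change event
    started = datetime.fromisoformat(detail["startedAt"].replace("Z", "+00:00"))
    stopped = datetime.fromisoformat(detail["stoppedAt"].replace("Z", "+00:00"))
    run_seconds = (stopped - started).total_seconds()
    # Extract the task definition name from the ARN:
    # arn:aws:ecs:REGION:ACCOUNT:task-definition/NAME:REVISION
    task_def_name = detail["taskDefinitionArn"].split("/")[-1].split(":")[0]
    # Publish the elapsed time, dimensioned by task definition name
    cloudwatch.put_metric_data(
        Namespace="ScheduledTasks",  # placeholder namespace
        MetricData=[{
            "MetricName": "RunTime",
            "Dimensions": [{"Name": "TaskDefinition", "Value": task_def_name}],
            "Value": run_seconds,
            "Unit": "Seconds",
        }],
    )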

AWS EventBridge ECS task status change event

I want to trigger a Lambda function when a Fargate task is deprovisioning, so I created this EventBridge rule:
{
  "source": ["aws.ecs"],
  "detail-type": ["ECS Task State Change"],
  "detail": {
    "clusterArn": ["arn:aws:ecs:eu-west-3:xxx"],
    "lastStatus": ["DEPROVISIONING"]
  }
}
It does not seem to be working all the time, i.e., sometimes CloudWatch receives it and sometimes it doesn't (no logs are generated from the Lambda function).
What could cause this issue?
So it seems the error was coming from my Lambda function, and because it was failing so often, EventBridge was blocking some of the invocations of the Lambda.
Not that big of a deal after all...

How to get the event content in ECS when it is invoked by a CloudWatch/EventBridge event?

We can set up event rules to trigger an ECS task, but I don't see whether the triggering event is passed to the running ECS task, or how to fetch the content of this event inside the task. If a Lambda is triggered, we can get it from the event variable, for example, in Python:
def lambda_handler(event, context):
...
But in ECS I don't see how to do anything similar. Going to the CloudTrail log bucket doesn't seem to be a good way, because there is around a 5-minute delay for the new log/event to show up, which requires the ECS task to wait, plus additional logic to talk to S3 and find and read the log. And when the triggering events are frequent, this sounds hard to handle.
One way to handle this is to set two targets in the CloudWatch rule:
One target will launch the ECS task
The other target will push the same event to SQS
So the SQS queue will contain info like:
{
  "version": "0",
  "id": "89d1a02d-5ec7-412e-82f5-13505f849b41",
  "detail-type": "Scheduled Event",
  "source": "aws.events",
  "account": "123456789012",
  "time": "2016-12-30T18:44:49Z",
  "region": "us-east-1",
  "resources": [
    "arn:aws:events:us-east-1:123456789012:rule/SampleRule"
  ],
  "detail": {}
}
So when the ECS task comes up, it will be able to read the event from SQS.
For example, in a Docker entrypoint:
#!/bin/sh
echo "Starting container"
echo "Processing SQS event"
node process_schedule_event.js
# or, if you need to process it at run time:
schedule_event=$(aws sqs receive-message --queue-url https://sqs.us-west-2.amazonaws.com/123456789/demo --attribute-names All --message-attribute-names All --max-number-of-messages 1)
echo "Schedule Event: ${schedule_event}"
# once that processing is done, start the main process of the container
exec "$@"
After further investigation, I finally worked out another solution: use S3 to invoke Lambda, and then in that Lambda use the ECS SDK (boto3; I use Python) to run my ECS task. This way I can easily pass the event content to ECS, and it is nearly real-time.
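A minimal sketch of that Lambda, assuming a Fargate launch type and placeholder names for the cluster, task definition, subnet, and container:
import json
import boto3

ecs = boto3.client("ecs")

def lambda_handler(event, context):
    # Run the task and pass the triggering event to the container
    # as an environment variable via a container override
    ecs.run_task(
        cluster="my-cluster",          # placeholder
        taskDefinition="my-task-def",  # placeholder
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-12345678"],  # placeholder
                "assignPublicIp": "ENABLED",
            }
        },
        overrides={
            "containerOverrides": [{
                "name": "my-container",  # placeholder
                "environment": [
                    {"name": "TRIGGER_EVENT", "value": json.dumps(event)}
                ],
            }]
        },
    )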
But I still give credit to @Adiii, because his solution also works.

How to test Lambda using a test event

I have a Lambda which is triggered by a CloudWatch event when VPN tunnels are down or up. I searched online but can't find a way to trigger this CloudWatch event.
I see an option for a test event, but what can I enter here for it to trigger an event that the tunnel is up or down?
You can look into CloudWatch Events and Event Patterns.
Events in Amazon CloudWatch Events are represented as JSON objects.
For more information about JSON objects, see RFC 7159. The following
is an example event:
{
  "version": "0",
  "id": "6a7e8feb-b491-4cf7-a9f1-bf3703467718",
  "detail-type": "EC2 Instance State-change Notification",
  "source": "aws.ec2",
  "account": "111122223333",
  "time": "2017-12-22T18:43:48Z",
  "region": "us-west-1",
  "resources": [
    "arn:aws:ec2:us-west-1:123456789012:instance/i-1234567890abcdef0"
  ],
  "detail": {
    "instance-id": "i-1234567890abcdef0",
    "state": "terminated"
  }
}
Also, to log based on an event, you can pick your required event type from the AWS CloudWatch Event Types.
I believe in your scenario you don't need to pass any input data, as you must have built the logic to test the VPN tunnels' connectivity within the Lambda. You can remove that JSON from the test event and then run the test.
If you need to pass in some information as part of the input event, then follow the approach mentioned by @Adiii.
EDIT
The question is made clearer by the comment, which says:
But question is how will I trigger the lambda? Lets say I want to
trigger it when tunnel is down? How will let lambda know tunnel is in
down state? – NoviceMe
This can be achieved by setting up a rule in CloudWatch to trigger the Lambda on a periodic schedule. More details here:
Tutorial: Schedule AWS Lambda Functions Using CloudWatch Events
Lambda does not currently have an invocation trigger that can monitor a VPN tunnel, so the only workaround is to poll the status from the Lambda.
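As an illustration, a polling Lambda along these lines might work (the VPN connection ID below is a placeholder):
import boto3

ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    # Poll the tunnel telemetry for a specific VPN connection
    response = ec2.describe_vpn_connections(
        VpnConnectionIds=["vpn-0123456789abcdef0"]  # placeholder ID
    )
    for connection in response["VpnConnections"]:
        for tunnel in connection.get("VgwTelemetry", []):
            if tunnel["Status"] != "UP":
                # Tunnel is down: alert, publish a metric, etc.
                print(f"Tunnel {tunnel['OutsideIpAddress']} is {tunnel['Status']}")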