PutLifecycleHook events not showing up?

I have an autoscaling group with lifecycle hooks for autoscaling:EC2_INSTANCE_LAUNCHING and autoscaling:EC2_INSTANCE_TERMINATING.
I have EventBridge configured to watch for those events, and as I understand it they are supposed to show up in CloudTrail. The problem is that even watching CloudTrail directly I don't see nearly as many PutLifecycleHook events as I would expect.
Testing method:
create ASG with hooks as above
wait for it to stabilize
bump up 'desired capacity'
new member instance created
wait for event to show up in CloudTrail. Sometimes it shows up, sometimes not.
decrement 'desired capacity'.
new member instance terminates.
wait for event to show up in CloudTrail. Sometimes it shows up, sometimes not.
This almost feels like an IAM problem, but even doing all the above as full admin the results are spotty.
I've also tried the ASG -> SNS -> SQS route and gotten similar results.
Is there something in the guts of ASG events I'm not understanding? Is there somewhere else I should be looking?

As it turns out, the events don't go anywhere but the event bus.
The PutLifecycleHook events I was seeing in CloudTrail were actually from me tweaking the lifecycle hook section of the ASG page. That event is a notification that someone changed the lifecycle hook config, not that an actual lifecycle event happened.
To see the actual lifecycle events I had to create a log group and then set up an EventBridge rule with a CloudWatch Logs target.
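For reference, an event pattern along these lines should match the actual lifecycle actions on the default event bus (the ASG name below is a placeholder; drop the "detail" block to match all groups):

```json
{
  "source": ["aws.autoscaling"],
  "detail-type": [
    "EC2 Instance-launch Lifecycle Action",
    "EC2 Instance-terminate Lifecycle Action"
  ],
  "detail": {
    "AutoScalingGroupName": ["my-asg"]
  }
}
```

Attach this pattern to the rule that targets the log group, and the launch/terminate lifecycle actions should appear there as instances scale in and out.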

Related

AWS Eventbridge Notifications Does Not Work Using SNS topic

I want to receive notifications from AWS Eventbridge when there's a scheduled event for my Amazon Elastic Compute Cloud (Amazon EC2) instance.
I created an EventBridge rule and set the target to an already-working SNS topic. The SNS topic is subscribed to a working Lambda function that is already used for other "CloudWatch to Slack" alarms. The rule's event pattern is as follows:
{
  "source": ["aws.health"],
  "detail-type": ["AWS Health Event"],
  "detail": {
    "service": ["EC2"],
    "eventTypeCategory": ["scheduledChange"]
  }
}
I already received an EC2 scheduled maintenance (reboot) notification by e-mail from AWS, but the EventBridge rule I created did not trigger for it and did not send any notification to the Slack channel.
I am now unsure whether I am missing something in the setup. I am setting this up for the first time and have no way to simply test it with fake input. It is supposed to work even if there is only a single scheduled event showing under the top bell icon (as shown in the screenshot above), correct?
In order to find the root cause of this issue, I suggest taking a look at the CloudWatch usage metrics for SNS. SNS reports the following metrics, which might be useful for you: NumberOfMessagesPublished, NumberOfNotificationsDelivered, and NumberOfNotificationsFailed. If these metrics are reported and have a value other than 0, it means that SNS is receiving events from EventBridge and the problem is somewhere else.
If you are using a Lambda to send messages to Slack, you should take a look at the logs in CloudWatch to see whether the Lambda executed successfully. You might want to check out the setup for Lambda recommended by AWS: (link)
For further debugging you may want to check out the test-event-pattern CLI command.
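The CLI command takes the pattern and a candidate event, e.g. `aws events test-event-pattern --event-pattern file://pattern.json --event file://event.json`. If you just want a quick local sanity check before calling the API, a rough approximation of what it does for the simple pattern above (nested keys whose values are lists of allowed exact matches; none of EventBridge's richer operators) might look like this:

```python
# Minimal local approximation of EventBridge pattern matching.
# Covers only the subset used in the pattern above: nested keys
# whose values are lists of allowed literal values. For
# authoritative results, use `aws events test-event-pattern`.

def matches(pattern, event):
    for key, expected in pattern.items():
        if isinstance(expected, dict):
            # nested pattern: recurse into the corresponding sub-object
            if not isinstance(event.get(key), dict):
                return False
            if not matches(expected, event[key]):
                return False
        else:
            # list of allowed literal values
            if event.get(key) not in expected:
                return False
    return True

pattern = {
    "source": ["aws.health"],
    "detail-type": ["AWS Health Event"],
    "detail": {
        "service": ["EC2"],
        "eventTypeCategory": ["scheduledChange"],
    },
}

event = {
    "source": "aws.health",
    "detail-type": "AWS Health Event",
    "detail": {"service": "EC2", "eventTypeCategory": "scheduledChange"},
}

print(matches(pattern, event))                                 # True
print(matches(pattern, {**event, "detail": {"service": "RDS"}}))  # False
```

This can quickly rule out a malformed pattern as the cause before digging into the SNS and Lambda side.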
It is supposed to work even if there is a single scheduled event that appears in the top bell icon (as shown in the screenshot above), correct?
Yes, it is supposed to work even if there already is an event.
I'm having a similar issue with an EventBridge rule built with CloudFormation. I had to manually go into the EventBridge rule via the AWS console, open the trigger, and select the SNS topic again. It works now, but it took me a while to figure out. Can you confirm that this fix worked for you? I'm not sure how else to fix it...

AWS: how to determine who deleted a Lambda trigger

Over the past week we have recorded irregular deletions of an AWS Lambda trigger.
We would like to find out exactly when this happened in order to determine the reason/cause of the deletion. We tried looking for the entries in CloudTrail, but we're not sure what exactly to look for.
How do we find the root cause and reason for the deletion?
Thanks Marcin and ydaetskcoR. We found the problem. The Lambda trigger is a property of the S3 bucket, and we had different Lambda triggers in different projects (with different Terraform states). So every time one Terraform project was applied, the trigger of the other project was overwritten, because that Terraform state is not aware of it. We saw PutBucketNotification events in CloudTrail, but didn't recognize the connection...
You can troubleshoot operational and security incidents over the past 90 days in the CloudTrail console by viewing Event history. You can look up events related to the creation, modification, or deletion of resources. To view events in logs, follow this guide:
https://docs.aws.amazon.com/awscloudtrail/latest/userguide/get-and-view-cloudtrail-log-files.html
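If you know roughly which API call to look for, Event history can also be queried from the CLI, e.g. `aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=PutBucketNotification`. Once you have log files, pulling the who/when out of them is straightforward; the record below is made up but follows the shape of CloudTrail log files as described in the linked guide:

```python
import json

# Made-up record following the CloudTrail log file shape
# ({"Records": [...]}); only the fields used below are shown.
log_file = """
{"Records": [{
  "eventTime": "2023-05-01T12:34:56Z",
  "eventName": "PutBucketNotification",
  "eventSource": "s3.amazonaws.com",
  "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"}
}]}
"""

for record in json.loads(log_file)["Records"]:
    # who changed what, and when
    print(record["eventTime"],
          record["eventName"],
          record["userIdentity"]["arn"])
```

Filtering on eventName and eventSource like this usually narrows deletions or overwrites down to a single principal and timestamp.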

How to extract event relayed from AWS EventBridge to ECS Fargate

I articulate the question as follows:
Is the EventBridge event relayed to the ECS task? (I can't see how useful it would be if the event is not relayed.)
If the event is relayed, how can it be extracted from within, say, a Node app running as the task?
Some context is due: it is possible to set an EventBridge rule to trigger ECS Fargate tasks as the result of events sourced from, say, CodeCommit. Mind you, the issue here is the sink/target, not the source. I was able to trigger a Fargate task as I updated my repo; I could have used other events. My challenge resides in extracting the relayed event (in this case the repository name, commitId, etc.) from Fargate.
The EventBridge documentation is clear on how to set up rules to trigger on events but is mum on how events can be extracted, which makes sense, as the sink/target documentation would carry the necessary reference. But the ECS documentation is not clear on how to extract relayed events.
I was able to inspect the metadata and process.env. I could not find the event in either of the stores.
I have added a CloudWatch Log Group as a target for the same rule and was able to extract the event. So it certainly relayed to some of the targets, but not sure if events are relayed to ECS Task.
Therefore, the questions arise: is the event relayed to the ECS Task? If so, how would you access it?
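One approach worth trying, assuming the event is not injected into the container automatically: for ECS targets, the rule's input (or input transformer output) is treated as a task override document, so you can map fields from the event into container environment variables. A sketch of the target's input transformer, where the container name "app" and the detail paths are assumptions that depend on your task definition and source event:

```json
{
  "InputPathsMap": {
    "repo": "$.detail.repositoryName",
    "commit": "$.detail.commitId"
  },
  "InputTemplate": "{\"containerOverrides\": [{\"name\": \"app\", \"environment\": [{\"name\": \"REPO\", \"value\": <repo>}, {\"name\": \"COMMIT\", \"value\": <commit>}]}]}"
}
```

With something like this in place, the Node app should be able to read the values from process.env (e.g. process.env.REPO) rather than expecting the raw event on disk or in task metadata.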

Is it a correct design diagram for processing events in AWS?

I am not very experienced in AWS, so I would like to check my design diagram. The workflow for processing Inspection Events can go in the following way:
Inspection Events belonging to a certain job should be processed by EventHandlers separately for every job. Then the handling results are persisted in S3. After finishing processing and persisting for a particular job the EventConsumer should retrieve the results from S3 based on a processing finishing message.
So, for directing events to EventHandlers, I show the SNS topic InspectionEvent.
Since handling events for a particular job requires significant resources, we think about creating a Lambda for every job.
How do we create a trigger for this Lambda? On the diagram I showed DynamoDB, which could trigger the EventHandler Lambda when an Inspection Event with a new job appears. The EventHandler then retrieves the events for that job from DynamoDB and, after finishing, publishes the Finishing Event to another SNS topic, EventProcessingEnds. The EventConsumer gets the message for a particular job and retrieves the results from S3.
The image is attached. Does this design make sense? What else can be suggested?
First, I'd recommend using SQS instead of SNS for starting both the job triggering circle you have there, and for the event consumer square near the bottom of your diagram.
AWS Lambda can be triggered by events from an SQS queue, an S3 change, or a DynamoDB change, so I recommend you use one of these instead of SNS, which is typically used for mobile push notifications or situations where you need many-to-many messaging/notifications to a group. In this case, SQS is what you want, for a couple of reasons:
It works very well for a producer/consumer pattern.
You can hook up an Elastic Beanstalk consumer app to an SQS queue and have it auto-scale up based on the size of the queue for consumers that execute on a server.
AWS Lambda can read events directly from an SQS queue, so in a serverless job processing queue, your number of asynchronously running lambda functions will scale well as queue throughput goes up.
Since situations like this are what SQS was designed for, it is full of features you can customize to tailor the solution to your needs: https://aws.amazon.com/sqs/features/
For triggering a lambda function from some of these sources using events:
(dynamodb) https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.html
(S3) https://aws.amazon.com/premiumsupport/knowledge-center/lambda-configure-s3-event-notification/
(SQS) https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
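To make the SQS-to-Lambda leg concrete, a minimal handler sketch: Lambda delivers SQS messages in batches under event["Records"], with each payload in the "body" field. The JSON message shape and the "jobId" field are assumptions standing in for whatever your Inspection Events carry:

```python
import json

def handler(event, context):
    """Hypothetical handler for an SQS-triggered Lambda: processes
    each message in the delivered batch and returns the job IDs."""
    results = []
    for record in event["Records"]:
        job = json.loads(record["body"])  # assumes JSON message bodies
        results.append(job["jobId"])      # "jobId" is an assumed field
    return results

# Sample invocation with the shape Lambda passes for SQS events:
sample_event = {"Records": [{"body": json.dumps({"jobId": "job-42"})}]}
print(handler(sample_event, None))  # ['job-42']
```

If the handler raises, the messages return to the queue after the visibility timeout, which is what gives you the retry behavior the producer/consumer pattern relies on.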
Furthermore, for your diagram, check out this tool which is great for AWS diagrams: https://cloudcraft.co/
I'm not sure what to recommend for when event processing ends, as it looks like that is what triggers the event consumer, and I'm not sure what requirement that satisfies. Please feel free to leave a comment, and anyone here can help elaborate on best practices for notifying certain functions/resources when an event is finished, depending on what you're trying to do.
Good luck, hope this helps.

Auto Scaling group lifecycle hook periodic notification

So I have this scenario where an Amazon EC2 instance in an Auto Scaling group will be terminated. My problem is that I don’t want it terminated until it has finished whatever it’s doing.
If I hook up a lambda, the lambda would check a metric, if this metric is > 0 then it needs to wait 60 seconds.
I have this done already, the problem is it may take more than the Max timeout for lambdas of 15 minutes, to finish the jobs it’s processing.
If I read correctly, the lifecycle notification is only sent once, so this lambda won’t work for me.
Is there any other way of doing this?
Here is how I would try to approach this problem (this needs a POC; the answer is theoretical):
Create an Auto Scaling Group
Put a lifecycle hook on this ASG as described here, sending notifications to Amazon SNS
Create a launch script for the instances which will do the following:
subscribe to SNS on instance launch and start an SNS listener script
the SNS listener will wait for the instance-termination message and do whatever is necessary until the instance is ready to terminate, including sending heartbeats if termination needs more than 1 hour, and then complete the lifecycle hook (described here). It should also handle unsubscribing from SNS.
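The heartbeat-then-complete part of that listener could be sketched with boto3 roughly as below. The hook, group, and instance names are placeholders, and `work_remaining` stands in for whatever "still busy" check applies (the metric > 0 test from the question, for example); the client calls themselves (`record_lifecycle_action_heartbeat`, `complete_lifecycle_action`) are real boto3 Auto Scaling APIs:

```python
import time

def drain_then_terminate(asg_client, work_remaining,
                         hook_name="my-hook",                # placeholder
                         asg_name="my-asg",                  # placeholder
                         instance_id="i-0123456789abcdef0",  # placeholder
                         poll_seconds=60):
    """Keep the lifecycle hook alive until in-flight work finishes,
    then allow the instance to terminate."""
    while work_remaining():
        # Extend the hook's timeout; needed when draining takes longer
        # than the hook's heartbeat timeout.
        asg_client.record_lifecycle_action_heartbeat(
            LifecycleHookName=hook_name,
            AutoScalingGroupName=asg_name,
            InstanceId=instance_id,
        )
        time.sleep(poll_seconds)
    # Work is done: let the ASG proceed with termination.
    asg_client.complete_lifecycle_action(
        LifecycleHookName=hook_name,
        AutoScalingGroupName=asg_name,
        InstanceId=instance_id,
        LifecycleActionResult="CONTINUE",
    )
```

In real use you would pass `boto3.client("autoscaling")` as `asg_client`; because the loop runs on the instance itself (not in a Lambda), the 15-minute Lambda timeout from the question no longer applies.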