I have an AWS Auto Scaling group. I collect a variety of metrics from the instances and have placed some CloudWatch alarms on these metrics. In specific scenarios I would like to add a CloudWatch alarm action that terminates the entire Auto Scaling group. Is this possible? I have been going over the AWS documentation, but it does not seem to be possible.
Thanks!!
You can do this by invoking a Lambda function from your custom CloudWatch event.
You will need to write a Lambda function that can use STS to assume a role that permits it to issue an EC2 terminate command.
The workflow would be:
CloudWatch event triggers
Lambda function is invoked
Lambda function assumes role via STS
Lambda function retrieves list of instances in the ASG
Lambda function cycles through instances, issuing termination commands
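The last two steps of that workflow could be sketched roughly as follows. This is only an illustration under some assumptions: the Lambda execution role is taken to already allow autoscaling:DescribeAutoScalingGroups and ec2:TerminateInstances (so the explicit STS step is left out), and the ASG_NAME environment variable and the helper names are made up for the example.

```python
import os


def instance_ids_in_group(autoscaling, group_name):
    """Return the IDs of all instances currently in the Auto Scaling group."""
    response = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    groups = response.get("AutoScalingGroups", [])
    if not groups:
        return []
    return [instance["InstanceId"] for instance in groups[0].get("Instances", [])]


def terminate_group_instances(autoscaling, ec2, group_name):
    """Terminate every instance in the group; return the IDs that were targeted."""
    instance_ids = instance_ids_in_group(autoscaling, group_name)
    if instance_ids:
        ec2.terminate_instances(InstanceIds=instance_ids)
    return instance_ids


def lambda_handler(event, context):
    # boto3 is imported here so the helpers above can be tried with stub clients.
    import boto3

    group_name = os.environ["ASG_NAME"]  # hypothetical environment variable
    terminated = terminate_group_instances(
        boto3.client("autoscaling"), boto3.client("ec2"), group_name
    )
    return {"terminated": terminated}
```

Keep in mind that an Auto Scaling group normally launches replacements for terminated instances, so you would probably also want to lower its minimum size and desired capacity to 0 (or delete the group) if the goal is to shut the whole thing down.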
I have been trying to create a CloudWatch alarm that fires when an instance is stopped, but I couldn't find a direct way. With event subscriptions I can send a notification when an instance is stopped. Is there any way a CloudWatch alarm can be triggered for the same event?
Amazon EventBridge can be configured to trigger an event when a state change occurs on an EC2 instance. Use:
Event source: EC2
Event type: EC2 Instance State-change Notification
You can set up a Lambda function as the target for the event.
The Lambda function can call PutMetricData for a metric you create. You can then set up a CloudWatch alarm on this metric.
This tutorial shows you how to set up the EventBridge rule and the Lambda function.
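For illustration, the Lambda side of this could look something like the sketch below. It assumes the function receives the standard EC2 Instance State-change Notification event (which carries instance-id and state under detail); the namespace Custom/EC2State and the metric name InstanceStopped are invented for the example.

```python
def build_metric_data(event):
    """Turn an EC2 state-change event into a CloudWatch MetricData list."""
    detail = event["detail"]
    return [
        {
            "MetricName": "InstanceStopped",  # hypothetical metric name
            "Dimensions": [{"Name": "InstanceId", "Value": detail["instance-id"]}],
            # 1 when the instance has stopped, 0 for any other state.
            "Value": 1 if detail["state"] == "stopped" else 0,
            "Unit": "Count",
        }
    ]


def lambda_handler(event, context):
    # boto3 is imported here so build_metric_data can be tried without the SDK.
    import boto3

    boto3.client("cloudwatch").put_metric_data(
        Namespace="Custom/EC2State",  # hypothetical custom namespace
        MetricData=build_metric_data(event),
    )
```

An alarm on that metric with a threshold of 1 would then go into ALARM whenever a stopped state is reported.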
Hope you are all doing well. I am new to SQS and CloudWatch, and I need to set up CloudWatch to monitor SQS and trigger a Lambda function with an event every time a message enters the queue and every time a message leaves it.
On another note, the Lambda function should scale the ASG up and down. If anyone has a cookbook covering this, it would be very helpful.
Thank you so much!
It appears that your requirement is to scale Amazon EC2 instances when messages are waiting to be processed in an Amazon SQS queue.
The correct architecture for this would be to configure the Auto Scaling group to use a scaling policy based on the metric ApproximateNumberOfMessagesVisible. This is a metric that Amazon SQS queues send to Amazon CloudWatch Metrics. There is no need to use an AWS Lambda function.
For reference, see:
Scaling based on Amazon SQS - Amazon EC2 Auto Scaling
Rapid Auto Scaling with Amazon SQS | AWS News Blog
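As a rough sketch of what such a policy might look like with boto3, here is a target tracking policy driven directly by ApproximateNumberOfMessagesVisible. The function names, the policy name and the target value are assumptions for the example, and the AWS guide listed above describes a backlog-per-instance metric that may suit real workloads better than the raw queue depth.

```python
def sqs_target_tracking_policy(group_name, queue_name, target_messages):
    """Build put_scaling_policy arguments that track the visible-message count."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-sqs-backlog",  # hypothetical policy name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "CustomizedMetricSpecification": {
                "MetricName": "ApproximateNumberOfMessagesVisible",
                "Namespace": "AWS/SQS",
                "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
                "Statistic": "Average",
            },
            # Scale so the metric stays near this many waiting messages.
            "TargetValue": float(target_messages),
        },
    }


def apply_policy(policy):
    # boto3 is imported here so the builder above can be tried without the SDK.
    import boto3

    return boto3.client("autoscaling").put_scaling_policy(**policy)
```

For example, apply_policy(sqs_target_tracking_policy("workers", "jobs", 100)) would ask the group to keep roughly 100 messages waiting, adding instances when the backlog grows and removing them when it shrinks.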
Could anyone please help me with the Lambda code? Whenever an AWS EC2 instance is stopped, we need to get an email notification through SNS. The email needs to include the instance name. I am able to get the instance ID but not the instance name.
AWS CloudTrail allows you to identify and track EC2 instance lifecycle API calls (launch, start, stop, terminate). See How do I use AWS CloudTrail to track API calls to my Amazon EC2 instances?
And you can trigger a Lambda function to run arbitrary code when CloudTrail logs certain events. See Triggering a Lambda function with AWS CloudTrail events.
You can also create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and triggers a Lambda via CloudWatch Events.
You can create a rule in Amazon CloudWatch Events that:
Triggers when an instance enters the Stopped state
Sends a message to an Amazon SNS Topic
If you want to modify the message that is being sent, then configure the Rule to trigger an AWS Lambda function instead. Your function should:
Extract the instance information (eg InstanceId) from the event parameter
Call describe-instances to obtain the Name of the instance (presumably the Tag with a Key of Name)
Publish a message to the Amazon SNS Topic
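Those three steps could be sketched along these lines. It assumes the rule passes the EC2 state-change event through unchanged, that the instance name lives in a tag with the key Name, and that the topic ARN is supplied through a TOPIC_ARN environment variable (a name chosen just for this example).

```python
import os


def instance_name(ec2, instance_id):
    """Return the value of the instance's Name tag, or None if it has none."""
    response = ec2.describe_instances(InstanceIds=[instance_id])
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            for tag in instance.get("Tags", []):
                if tag["Key"] == "Name":
                    return tag["Value"]
    return None


def notify_stopped(ec2, sns, topic_arn, event):
    """Look up the instance name and publish a readable message to SNS."""
    detail = event["detail"]
    instance_id = detail["instance-id"]
    name = instance_name(ec2, instance_id) or "(no Name tag)"
    message = f"Instance {name} ({instance_id}) entered state {detail['state']}."
    sns.publish(TopicArn=topic_arn, Subject="EC2 instance stopped", Message=message)
    return message


def lambda_handler(event, context):
    # boto3 is imported here so the helpers above can be tried with stub clients.
    import boto3

    topic_arn = os.environ["TOPIC_ARN"]  # hypothetical environment variable
    return notify_stopped(boto3.client("ec2"), boto3.client("sns"), topic_arn, event)
```

The function's role would need ec2:DescribeInstances and sns:Publish for this to work.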
I have an EC2 instance with 300 GB of data (EBS volumes attached). I would like to develop a Lambda function to start/stop this EC2 instance during non-business hours to save on cloud costs. Can anyone help me by sharing some sample code or a function?
I think the scenario could be addressed without lambda:
Cron expressions for CloudWatch Events with
targets of SSM Automation,
and documents of AWS-StartEC2Instance and AWS-StopEC2Instance.
Note that CloudWatch Events has a target for stopping an instance, but there is no target for starting one. Thus, SSM Automation is proposed.
But if Lambda is a requirement, then instead of SSM Automation, just use a Lambda function with CloudWatch Events.
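If you go the Lambda route, one possible shape is a single function driven by two scheduled rules, for example cron(0 19 ? * MON-FRI *) to stop and cron(0 7 ? * MON-FRI *) to start (schedule times are in UTC). The sketch below assumes each rule sends a constant JSON input such as {"action": "stop", "instance_ids": ["i-..."]}; that input shape and the function name are made up for the example.

```python
def apply_schedule_action(ec2, action, instance_ids):
    """Start or stop the given instances depending on the scheduled action."""
    if action == "start":
        ec2.start_instances(InstanceIds=instance_ids)
    elif action == "stop":
        # Stopping keeps the attached EBS volumes and their data.
        ec2.stop_instances(InstanceIds=instance_ids)
    else:
        raise ValueError(f"unsupported action: {action!r}")
    return action


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be tried with a stub client.
    import boto3

    # "action" and "instance_ids" are hypothetical keys set in the rule's input.
    return apply_schedule_action(
        boto3.client("ec2"), event["action"], event["instance_ids"]
    )
```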
You can use an Amazon EventBridge (formerly CloudWatch Events) rule with a cron expression to define a schedule on which a Lambda function runs. Within that Lambda function, you can then turn off your EC2 instance easily.
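    import logging
    import boto3
    logger = logging.getLogger(__name__)

    def turn_off_instance(instance_ids, region=None):
        ec2 = boto3.client('ec2', region_name=region)  # None falls back to the default region
        ec2.stop_instances(InstanceIds=instance_ids)
        logger.info('Stopped instance(s): %s', instance_ids)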
These two guides do something very similar:
EventBridge:
https://medium.com/geekculture/enable-or-disable-aws-alarms-at-given-intervals-d2f867aa9aa4
Lambda code:
https://medium.com/geekculture/terraform-setup-for-automatically-turning-off-ec2-instances-upon-inactivity-d7f414390800
I have an autoscaling group created by CloudFormation. When a scale out or scale in event occurs, I have configured an SNS topic to trigger a lambda function. Everything works as expected, except when I delete my CloudFormation stack.
When I delete my CloudFormation stack (I use short-lived stacks for integration testing), the autoscaling group is deleted and the instances enter the Terminating:Wait stage as expected. But, the autoscaling:EC2_INSTANCE_TERMINATING lifecycle hook is never called (neither the Lambda monitoring nor the CloudWatch logs show any evidence of the lifecycle hook getting called). The autoscaling group appears to wait for the heartbeat timeout to expire, then deletes the instances and the autoscaling group.
Is there a way I can have the autoscaling:EC2_INSTANCE_TERMINATING lifecycle hook called when the EC2 instances are terminated because the ASG is deleted?
I figured this out. In my case I had an AWS::Lambda::Permission resource which granted SNS permission to invoke my Lambda function. The permission was being deleted before the Auto Scaling group, so the SNS topic no longer had permission to invoke my Lambda function when the message arrived at the topic.
Adding a DependsOn attribute to my ASG so it depends on the permission object solved this.
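To make the fix concrete, here is a small sketch that adds the dependency to a template held as a Python dict (as you might have when generating the JSON template from code). The logical IDs MyAutoScalingGroup and LambdaInvokePermission are placeholders, not the names from my stack. Because CloudFormation deletes resources in reverse dependency order, the group is then removed before the permission.

```python
import json


def add_depends_on(template, resource_id, dependency_id):
    """Add dependency_id to the DependsOn attribute of a template resource."""
    resource = template["Resources"][resource_id]
    existing = resource.get("DependsOn", [])
    # DependsOn may be a single string or a list of strings.
    if isinstance(existing, str):
        existing = [existing]
    if dependency_id not in existing:
        existing = existing + [dependency_id]
    resource["DependsOn"] = existing
    return template


# Hypothetical, heavily trimmed template with placeholder logical IDs.
template = {
    "Resources": {
        "LambdaInvokePermission": {"Type": "AWS::Lambda::Permission"},
        "MyAutoScalingGroup": {"Type": "AWS::AutoScaling::AutoScalingGroup"},
    }
}
add_depends_on(template, "MyAutoScalingGroup", "LambdaInvokePermission")
print(json.dumps(template["Resources"]["MyAutoScalingGroup"], indent=2))
```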