AWS Lambda function for auto-scaling

I am creating a disaster recovery solution in AWS. For the second (fallback) region I want to have only 1 EC2 instance to minimise cost. In case of disaster I would like to know if it's possible to write a Lambda function in the second region that increases the desired capacity of the Auto Scaling group to some number.
To achieve this I can subscribe the function to the health check alarm SNS topic.
I would like to know if there is an API to scale an EC2 Auto Scaling group from Lambda, and what sort of roles/permissions are needed?

Yes, this is entirely possible.
In Boto3 you can use the update_auto_scaling_group function and specify the MinSize, MaxSize and DesiredCapacity. By doing this you can adjust all three values to match what you expect them to be during a failover.
Alternatively, you could set the minimum capacity to 1 and the maximum capacity to whatever it should be; if the alarms never trigger, the group will never scale. You could then simply call set_desired_capacity to set the number of instances to a specific count.
The permissions for these options are as follows:
autoscaling:SetDesiredCapacity
autoscaling:UpdateAutoScalingGroup
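As a rough sketch of how this fits together, here is a minimal Lambda handler using Boto3. The ASG name, region, and capacity numbers are hypothetical placeholders; the execution role needs the two permissions above.

import boto3

# Hypothetical values -- substitute your fallback ASG name, region,
# and desired disaster-recovery capacity.
ASG_NAME = "dr-fallback-asg"
REGION = "us-west-2"
DR_CAPACITY = 4

autoscaling = boto3.client("autoscaling", region_name=REGION)

def lambda_handler(event, context):
    # Invoked via the health-check alarm's SNS topic.
    # Raise the bounds and desired capacity in one call...
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=ASG_NAME,
        MinSize=1,
        MaxSize=DR_CAPACITY,
        DesiredCapacity=DR_CAPACITY,
    )
    # ...or, if Min/Max are already correct, just set the desired count:
    # autoscaling.set_desired_capacity(
    #     AutoScalingGroupName=ASG_NAME,
    #     DesiredCapacity=DR_CAPACITY,
    # )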


Is this the correct way to solve this problem or have I gone wrong?

I am told to create a Simple auto-scaling policy and use CloudWatch to trigger an increase in resources based on an alarm. I have created a target tracking scaling policy within my ASG, set the target value to 50, and created an SNS topic for the alarm to send a notification to my email when the metric goes above the target value. But I'm not entirely sure if that's exactly what was asked for.
Is that what was meant by creating a 'Simple auto-scaling policy'? Any confirmation would be helpful.
As mentioned in the above comments, you need to confirm with whoever made that request.
You said:
I am told to create a Simple auto-scaling policy and use CloudWatch to trigger an increase in resources based on an alarm.
Therefore, you should create a simple auto-scaling policy which will trigger an increase (scale out) in resources.
Note: when you create a new policy, the alarms are automatically created and managed for you. Once created, you can edit them as well. When you delete the policy, the alarms are deleted too. There is usually an alarm for "scale up" (the HIGH alarm) and one for "scale in" (the LOW alarm). But, again, these are created for you automatically when you create the policy.
So, I'd say you just need to set up autoscaling for whichever service you're working on (EC2, ECS, or other; this is not mentioned in your question) and assign a policy to it (autoscale based on a target metric: CPU, memory, number of requests, etc., or based on some other custom metric).
In the end, you'd want to test it by applying a load test and confirming that your service scales out when the load thresholds are breached (the threshold, how many datapoints must breach it, and in what period of time are all defined in your policy and the associated alarms created automatically after you set up the policy).
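For reference, such a target tracking policy can also be created programmatically. A minimal Boto3 sketch, assuming a hypothetical ASG named my-asg and the 50% CPU target from the question:

import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: keep average CPU across the group near 50%.
# The HIGH/LOW alarms are created and managed automatically.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",  # hypothetical name
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,
    },
)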
So, to answer the question: only the person who made the request can confirm, but I'll say with good certainty that the end goal is to increase the resources under load. So, no, the goal is not to send an email. You might want to send an email as well, but I would bet a benjamin that what's really wanted here is for you to make some service autoscale under load (scale out and scale in).

Is there a way to scale in an "instance" (part of an ASG) based on a certain custom metric?

I'm using an Auto Scaling group to launch a group of EC2 instances. These instances act as workers which continuously listen to SQS for any new requests.
Requirement:
Scale out based on something like throughput (i.e., the total number of messages in SQS divided by the total number of instances).
And I want to scale in whenever any instance that is part of the ASG has been sitting idle (CPUIdle) for, say, more than 15 minutes.
Note: I am not looking for any metric that applies to a particular ASG as a whole (e.g., average CPU).
One way of doing this could be defining a custom metric and letting it trigger a CloudWatch alarm.
Is there a better way to accomplish this?
If you are defining the scaling policy at the instance level, then you are defeating the entire purpose of an ASG. If you need to scale based on changing conditions, such as the queue size, then you can configure the ASG based on the approach described here:
https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-using-sqs-queue.html
A custom metric to send to Amazon CloudWatch that measures the number of messages in the queue per EC2 instance in the Auto Scaling group.
A target tracking policy that configures your Auto Scaling group to scale based on the custom metric and a set target value. CloudWatch alarms invoke the scaling policy.
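A minimal sketch of publishing that backlog-per-instance metric with Boto3 (queue URL, group name, and metric namespace are hypothetical; this would typically run on a schedule):

import boto3

sqs = boto3.client("sqs")
cloudwatch = boto3.client("cloudwatch")
autoscaling = boto3.client("autoscaling")

# Hypothetical identifiers.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/worker-queue"
ASG_NAME = "worker-asg"

def publish_backlog_per_instance():
    # Messages waiting in the queue.
    attrs = sqs.get_queue_attributes(
        QueueUrl=QUEUE_URL,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    backlog = int(attrs["Attributes"]["ApproximateNumberOfMessages"])

    # Instances in the group (floor of 1 to avoid dividing by zero).
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[ASG_NAME]
    )["AutoScalingGroups"][0]
    instances = max(len(group["Instances"]), 1)

    # Publish the custom metric for the target tracking policy to follow.
    cloudwatch.put_metric_data(
        Namespace="Custom/Workers",
        MetricData=[{
            "MetricName": "BacklogPerInstance",
            "Value": backlog / instances,
            "Unit": "Count",
        }],
    )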
If you know a specific time window when the queue size goes up or down, you can also scale on a schedule.
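Scheduled scaling is a single API call; a sketch with hypothetical names and times:

import boto3

autoscaling = boto3.client("autoscaling")

# Scale out every weekday morning, when queue traffic is known to
# ramp up (Recurrence is a cron expression, evaluated in UTC).
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="worker-asg",  # hypothetical name
    ScheduledActionName="weekday-morning-scale-out",
    Recurrence="0 8 * * MON-FRI",
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=6,
)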
You can always start with a very low instance count in the ASG, set the desired capacity accordingly (say, 1), and scale out based on the queue, so you can continue using ASG policies.

AutoScaling Based on Comparing Query Metrics

I have a not-so-complicated situation, but it can get complicated in AWS CloudFormation:
I would like to autoscale up and down based on the number of messages on SQS.
But I am not sure what I need to specify in AWS CloudFormation. I would imagine that I would need:
some sort of Lambda/CloudFormation resource that queries the current number of instances in the AutoScalingGroup
some sort of Lambda/CloudFormation resource that queries the current number of messages in SQS
some comparison operation that compares #1 and #2
a scale-up policy for when #1 < #2
a scale-down policy for when #1 > #2
Not sure where I should get started... can someone be kind enough to show some examples?
You have several different concepts all mixed together (CloudFormation, Auto Scaling, Lambda). It is best to keep things simple, at least for an initial deployment. You can then automate it with CloudFormation later.
The most difficult part of Auto Scaling is actually determining the best Scaling Policies to use. A general rule is to quickly add capacity when it is needed, and then slowly remove capacity when it is no longer needed. This way, you can avoid churn, where instances are added and removed within short spaces of time.
The simplest setup would be:
Scale-out when the queue size is larger than X (To be determined by testing)
Scale-in when the queue is empty (You can later tweak this to be more efficient)
Use the ApproximateNumberOfMessagesVisible metric for your scaling policies. (See Amazon SQS Metrics and Dimensions). It provides a count of messages waiting to be processed. However, I have seen situations where a zero count is not actually sent as a metric, so also trigger your scale-in policy on an alarm status of INSUFFICIENT_DATA, which also means that the queue is empty.
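To illustrate that tip, the scale-in alarm can list the same scaling policy under both AlarmActions and InsufficientDataActions, so a queue that reports no metric at all still triggers scale-in. A Boto3 sketch with hypothetical names (the policy ARN is whatever put_scaling_policy returned for your scale-in policy):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical values.
QUEUE_NAME = "my-queue"
SCALE_IN_POLICY_ARN = "arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:example-scale-in"

cloudwatch.put_metric_alarm(
    AlarmName="queue-empty-scale-in",
    Namespace="AWS/SQS",
    MetricName="ApproximateNumberOfMessagesVisible",
    Dimensions=[{"Name": "QueueName", "Value": QUEUE_NAME}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=3,  # scale in slowly to avoid churn
    Threshold=1,
    ComparisonOperator="LessThanThreshold",
    AlarmActions=[SCALE_IN_POLICY_ARN],
    # A queue sending no datapoints at all is also an empty queue.
    InsufficientDataActions=[SCALE_IN_POLICY_ARN],
)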
There is no need to use AWS Lambda functions unless you have very complex requirements for when to scale.
If your requests come on a regular basis throughout the day, set the minimum to one instance to always have capacity available.
If your requests are infrequent (and there could be several hours with no requests coming in), then set the minimum to zero instances so you save money.
You will need to experiment to determine the best queue size that should trigger a scale-out event. This depends upon how frequently the messages arrive and how long they take to process. You can also experiment with the Instance Type -- figure out whether it is better to have many smaller (eg T2) instances, or fewer larger instances (eg M4 or C4, depending upon need).
If you do not need to process the requests within a short time period (that is, you can be a little late sometimes), you could consider using spot pricing that will dramatically lower your costs, with the potential to occasionally have no instances running due to a high spot price. (Or, just bid high and accept that occasionally you'll pay more than on-demand prices but in general you will save considerable costs.)
Create all of the above manually in the console, then experiment and measure results. Once it is finalized, you can then implement it as a CloudFormation stack if desired.
Update:
The Auto Scaling screens will only create an alarm based on EC2 metrics. To create an alarm on a different metric, first create the alarm, then put it in the policy.
To add a rule based on an Amazon SQS queue:
Create an SQS queue
Put a message in the queue (otherwise the metrics will not flow through to CloudWatch)
Create an alarm in CloudWatch based on the ApproximateNumberOfMessagesVisible metric (which will appear after a few minutes)
Edit your Auto Scaling policies to use the above alarm
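The alarm-creation step can also be done from code. A minimal sketch of the scale-out alarm, with a hypothetical queue name, threshold, and policy ARN:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical values; the ARN comes from your scale-out policy.
QUEUE_NAME = "my-queue"
SCALE_OUT_POLICY_ARN = "arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:example-scale-out"

cloudwatch.put_metric_alarm(
    AlarmName="queue-backlog-high",
    Namespace="AWS/SQS",
    MetricName="ApproximateNumberOfMessagesVisible",
    Dimensions=[{"Name": "QueueName", "Value": QUEUE_NAME}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,  # scale out quickly when needed
    Threshold=100,  # the "X" to be determined by testing
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[SCALE_OUT_POLICY_ARN],
)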

Testing AWS spot instance provisioning

My company is looking to switch to using Spot pricing when provisioning EC2 instances. I've been tasked with writing some unit tests that test things such as:
Our Spot instance count is at a certain threshold
When that threshold isn't met, on-demand replacements are brought up to replace them
I'm not an adept tester and haven't had much exposure to AWS on the whole. So my question is: what approach, tools, or software could I use to begin implementing this? My initial thinking is to write a bash script with AWS CLI commands and go from there.
Any pointers or recommendations would be greatly appreciated!
I thought about this a little and I would recommend you have two Auto Scaling groups: one for spot instances and one for on-demand instances. For the spot instance Auto Scaling group you would essentially set your desired capacity. For the on-demand Auto Scaling group you would simply set the min and max to 0.
Next you would set up two CloudWatch alarms. One would be for GroupInServiceInstances less than whatever maximum you declared; this would be enabled by default. The other would be for GroupInServiceInstances equal to the maximum you declared; this would be disabled by default.
Now when the GroupInServiceInstances alarm for instances less than your desired maximum goes off, it would invoke a Lambda function (sketched after the list below). This Lambda function would do the following:
Enable the GroupInServiceInstances equal to your maximum capacity alarm
Disable the GroupInServiceInstances less than your desired capacity alarm
Call the Auto Scaling group API to compute (max instances - currently running instances)
Set the min and max instances in the on-demand Auto Scaling group to that value
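A rough sketch of that first Lambda in Boto3, with hypothetical group and alarm names:

import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Hypothetical names.
SPOT_ASG = "spot-workers"
ONDEMAND_ASG = "ondemand-workers"
AT_CAPACITY_ALARM = "spot-at-capacity"
BELOW_CAPACITY_ALARM = "spot-below-capacity"

def lambda_handler(event, context):
    # Swap which alarm is active.
    cloudwatch.enable_alarm_actions(AlarmNames=[AT_CAPACITY_ALARM])
    cloudwatch.disable_alarm_actions(AlarmNames=[BELOW_CAPACITY_ALARM])

    # Shortfall = max instances - currently running spot instances.
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[SPOT_ASG]
    )["AutoScalingGroups"][0]
    running = sum(1 for i in group["Instances"]
                  if i["LifecycleState"] == "InService")
    shortfall = max(group["MaxSize"] - running, 0)

    # Backfill the shortfall with on-demand instances.
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=ONDEMAND_ASG,
        MinSize=shortfall,
        MaxSize=shortfall,
        DesiredCapacity=shortfall,
    )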
It would also be a good idea to set up a Simple Notification Service (SNS) topic that emails someone when the spot instance Auto Scaling group has had an insufficient number of instances for X amount of time. That lets you decide whether you need to rework the spot prices.
Now when the GroupInServiceInstances equal to your maximum desired capacity alarm goes off, it will invoke a lambda function to do the following:
Enable the GroupInServiceInstances less than desired alarm
Disable the GroupInServiceInstances equal to desired alarm
Set the min and max instances in the on-demand Auto Scaling group to 0
This will essentially terminate all the instances in the on-demand Auto Scaling group, so you can revert to using the (hopefully) lower-cost spot instances.
This solution does require knowledge of Lambda, but I think it ends up a lot more automated and reduces the additional logic a CLI script would require.

AWS EC2 Auto Scaling Groups: I get Min and Max, but what's Desired instances limit for?

When you set up an Auto Scaling group in AWS EC2, the Min and Max bounds seem to make sense:
The minimum number of instances to scale down to based on policies
The maximum number of instances to scale up to based on policies
However, I've never been able to wrap my head around what the heck Desired is intended to affect.
I've always just set Desired equal to Min, because generally, I want to pay Amazon the minimum tithe possible, and unless you need an instance to handle load it should be at the Min number of instances.
I know if you use Elastic Beanstalk and set Min to 1 and Max to 2, it sets Desired to 2 (of course!)--you can't choose a value for Desired.
What would be the use case for a different Desired number of instances, and how does it differ? When would you expect AWS to scale lower than your Desired, if Desired is larger than Min?
Here are the explanations for the "min, desired and max" values from AWS support:
MIN: This will be the minimum number of instances that can run in your auto scale group. If your scale-down CloudWatch alarm is triggered, your auto scale group will never terminate instances below this number.
DESIRED: If you trip a CloudWatch alarm for a scale-up event, then it will notify the auto scaler to change its desired capacity to a specified higher amount, and the auto scaler will start instance(s) to meet that number. If you trip a CloudWatch alarm to scale down, then it will change the auto scaler's desired capacity to a specified lower number, and the auto scaler will terminate instance(s) to get to that number.
MAX: This will be the maximum number of instances that can run in your auto scale group. If your scale-up CloudWatch alarm stays triggered, your auto scale group will never create more instances than the maximum amount specified.
Think about it like a sliding range UI element.
With min and max, you are setting the lower and upper bounds of your instance scaling. With desired capacity, you are setting what you'd currently like the instance count to hover at.
Example:
You know your application will have heavy load due to a marketing email or product launch...simply scale up your desired capacity beforehand:
aws autoscaling set-desired-capacity --auto-scaling-group-name my-auto-scaling-group --desired-capacity 2 --honor-cooldown
"Desired" is (necessarily) ambiguous.
It means the "initial" number of instances. Why not just "initial" then? Because the number may change by autoscaling events.
So it means "current" number of instance. Why not just "current" then? Because during an autoscaling event, instances will start / terminate. Those instances do not count towards "current" number of instances. By "current", a user expects instances that are operate-able.
So it means "target" number of instance. Why not just "target" then? I guess "target" is just as good (ambiguous) as "desired"...
When would you expect AWS to scale lower than your Desired, if Desired is larger than Min?
This happens when you set a CloudWatch alarm based on some Auto Scaling policy. Whenever that alarm is triggered, it will update the desired count to whatever is specified in the config.
E.g., if an AutoScalingGroup config has Min=1, Desired=3, Max=5, and there is an alarm set on an Auto Scaling policy which says "if CPU usage is <50% for 10 consecutive minutes, then remove 1 instance", then the alarm will keep reducing the instance count by 1 whenever it is triggered, until DesiredCount = MinCount.
Lesson learnt: set the MinCount to be > 0 or equal to the DesiredCount. This makes sure that the application is not brought down when MinCount=0 and CPU usage goes down.
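That alarm-plus-policy pair can be expressed in Boto3 roughly like this (group name, alarm name, and thresholds are hypothetical):

import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Simple scaling policy: remove one instance per trigger.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",  # hypothetical name
    PolicyName="scale-in-by-one",
    PolicyType="SimpleScaling",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=-1,
    Cooldown=600,
)

# Low-CPU alarm: <50% average CPU for two consecutive 5-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="asg-cpu-low",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "my-asg"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=50,
    ComparisonOperator="LessThanThreshold",
    AlarmActions=[policy["PolicyARN"]],
)

Auto Scaling stops at MinSize, so with Min=1 the alarm can keep firing but the group never drops below one instance.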
In layman's terms, the DesiredCapacity value is automatically updated on scale-in and scale-out events.
In other words,
scale-in and scale-out are done by decreasing or increasing the DesiredCapacity value.
Desired capacity simply means the number of instances that will be fired up when you launch the Auto Scaling group. That means if desired capacity = 4, then 4 instances will keep running until a scale-up or scale-down event triggers. If a scale-up event occurs, the number of instances will go up to the maximum capacity; if a scale-down event occurs, it will go down to the minimum capacity.
Correct me if wrong, thanks.
I noticed that the desired capacity went down, and no new instance came up, when I set one of the instances to standby. It kept running but was detached from the ELB (requests were not forwarded to that particular instance when accessed via the ELB DNS). No new instance was initiated by AWS; rather, the desired capacity was decreased by 1.
When I changed the state of the instance (from standby), the instance was again attached to the ELB (the instance started to get requests when accessed via the ELB DNS). The desired capacity was increased by 1 and became 2.
Hence it seems the number of instances attached to the ELB can't cross the limits set by min and max, but the desired capacity is adjusted automatically as scale-in or scale-out events occur. It was definitely something unknown to me.
It might be a way to let AWS know that this is the desired capacity required for the respective ELB at a given point in time.
Min and max are self-explanatory, but desired was confusing to me until I attached a target tracking auto-scaling policy to the ASG with CPU utilization as the target metric. There, the desired instance count was scaled out and in based on the target CPU utilization. If a desired count is set through CloudFormation or manually, the ASG will initially create that number of instances. But later, the ASG policy will automatically adjust the desired instance count based on the target CPU utilization.
Desired is what we start initially. It will go to min or max depending on the scale-in / scale-out.
I liked the analogy with a slider to understand this - https://stackoverflow.com/a/36272945/10779109
Think of min and max as the minimum and maximum allowed brightness of a screen. You probably don't want min to be 0 in that case (sidenote). The desired quantity keeps changing based on the environment (in the case of an ASG, it depends on the scaling policies).
For instance, if the following check runs every hour, this is where the desired quantity is needed:
if low_load(<CPU or Mem etc>) and desired_capacity > min_capacity:
    desired_capacity = desired_capacity - 1
Max capacity can be understood in the same way: you keep increasing the desired quantity based on a cloudwatch_alarm (or any scaling policy), up to the max capacity.