AWS auto scaling works based on the load (number of concurrent requests). It works perfectly for web sites and web APIs. However there are situations in which the number of required EC2 instances is not related to the requests but it depends on something else such as number of items in a queue.
For example an order processing system which pulls the orders from a custom queue (and not SQS) might need to scale out to process the order quicker. How can we make this happpen?
Auto scaling groups can be configured to scale in or out by linking their scaling policies to Cloud Watch alarms. Many people use CPU utilization as a scaling trigger but you can use any Cloud Watch metric you like. In your case you could use your queue's ApproximateNumberOfMessagesVisible metric.
For example, if you create an alarm that fires when the ApproximateNumberOfMessagesVisible > 500 and link that to the scale out policy of your auto scaling group, the group will create new instances whenever the queue has more that 500 messages in it.
Related
I'm using the AutoScalingGroup to launch a group of EC2 instances. These instances are acting as workers which are continuously listening to SQS for any new request.
Requirement:
Do upscale on something like throughput (i.e Total number of messages present in SQS by total number instances).
And I want to downscale whenever any instance which is part of ASG is sitting idle (CPUIdle) for let's say more than 15 mins.
Note: I am not looking for any metric which applies as whole to a particular ASG (eg: Average CPU).
One way of doing that could be defining the custom metric and allowing it to trigger a cloudwatch alarm to do that.
Is there a better way to accomplish this?
If you are defining the scaling policy at instance level, then you defeating the entire purposes of ASG. If you need to scale based on changing conditions, such as the queue size, then you can configure ASG based on the conditions specified here
https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-using-sqs-queue.html
A custom metric to send to Amazon CloudWatch that measures the number of messages in the queue per EC2 instance in the Auto Scaling group.
A target tracking policy that configures your Auto Scaling group to scale based on the >custom metric and a set target value. CloudWatch alarms invoke the scaling policy.
If you know a specific time window when the queue size goes up or down, you can also scale based on schedule.
You can always start with a very low instance count in ASG and set the desired capacity as such (say 1) and scale up based on queue, so you can continue using ASG policies.
We have a service running in aws ecs that we want to scale in and out based on 2 metrics.
Scale out when: cpu > 80% or connection_count > 9500
Scale in when: cpu < 50% and connection_count < 5000
We have access to both the cpu and connection count metrics and alarms in cloud watch. However, we can't figure out how to setup a dynamic scaling policy like this based on both of them.
Using the standard aws console interface for creating the auto scaling rules I don't see any options for multiple. Any links to a tutorial or aws docs on this would be appreciated.
Based on the responses posted in the support aws forums, nothing can be done for AND/OR/IF conditions. (https://forums.aws.amazon.com/thread.jspa?threadID=94984)
It does mention however that they already put a feature request to the cloudwatch team.
The following is mentioned as a workaround:
"In the meantime, a possible workaround can be to create a custom metric using a custom script which would run after every five minutes and get the data points from the CloudWatch metrics, then perform the AND or OR operation and then push the output to a custom metric. You can then create a CloudWatch alarm which would monitor this custom metric and then trigger actions accordingly."
I'm currently trying to get around the issue in AWS where CloudWatch alarms cannot contain more than one metric (in this case, SQS message counts).
Scenario:
I have an ASG that contains a set amount of on-demand instances for my application. I have another ASG, where I plan on using spot instances to scale out when it gets busy.
What I'm trying to achieve is, for my application that consumes from 3 SQS queues
if at least 1 queue has a message count above the threshold, scale out the spot instances ASG
if ALL queues have a message count below the threshold for at least X minutes, scale in
To get around this I'm attempting to publish a custom metric with a count of how many queues have a message count above a certain limit, and then use this metric to decide whether to scale in my auto scaling groups.
However... in Spinnaker, there doesn't seem to be a way to refer to a custom metric (from the UI at least) - I am missing something here or is it just not possible?
From what I understand as well, you can only publish metric data to your own namespaces - attempting to publish to any 'AWS/*' namespace will result in an error?
In your settings.js file for deck, include the following block:
providers: {
aws: {
// ...
metrics: {
customNamespaces: ['yourcustomnamespace'],
},
// ...
}
}
I don't think this is explicitly documented anywhere - you'd have to dig into the source code to find this bit of configuration.
I have a not-so-complicated situation but it can be complicated on AWS cloudformations:
I would like to autoscale up and down based on the number of messages on SQS.
But I am not sure what I need to specify on AWS cloudformation, I would imagine that I would need:
some sort of lambda/cloudformation that perform query on the current number of instances on AutoScalingGroup
some sort of lambda/cloudformation that perform query on the current number of messages on SQS.
some comparison operations that compares #1 and #2.
create scale up policy when #1 < #2
create scale down policy when #1 > #2
Not sure where I should get started... can someone kind enough to show some examples?
You have several different concepts all mixed together (CloudFormation, Auto Scaling, Lambda). It is best to keep things simple, at least for an initial deployment. You can then automate it with CloudFormation later.
The most difficult part of Auto Scaling is actually determining the best Scaling Policies to use. A general rule is to quickly add capacity when it is needed, and then slowly remove capacity when it is no longer needed. This way, you can avoid churn, where instances are added and removed within short spaces of time.
The simplest setup would be:
Scale-out when the queue size is larger than X (To be determined by testing)
Scale-in when the queue is empty (You can later tweak this to be more efficient)
Use the ApproximateNumberOfMessagesVisible metric for your scaling policies. (See Amazon SQS Metrics and Dimensions). It provides a count of messages waiting to be processed. However, I have seen situations where a zero count is not actually sent as a metric, so also trigger your scale-in policy on an alarm status of INSUFFICIENT_DATA, which also means that the queue is empty.
There is no need to use AWS Lambda functions unless you have very complex requirements for when to scale.
If your requests come on a regular basis throughout the day, set the minimum to one instance to always have capacity available.
If your requests are infrequent (and there could be several hours with no requests coming in), then set the minimum to zero instances so you save money.
You will need to experiment to determine the best queue size that should trigger a scale-out event. This depends upon how frequently the messages arrive and how long they take to process. You can also experiment with the Instance Type -- figure out whether it is better to have many smaller (eg T2) instances, or fewer larger instances (eg M4 or C4, depending upon need).
If you do not need to process the requests within a short time period (that is, you can be a little late sometimes), you could consider using spot pricing that will dramatically lower your costs, with the potential to occasionally have no instances running due to a high spot price. (Or, just bid high and accept that occasionally you'll pay more than on-demand prices but in general you will save considerable costs.)
Create all of the above manually in the console, then experiment and measure results. Once it is finalized, you can then implement it as a CloudFormation stack if desired.
Update:
The Auto Scaling screens will only create an alarm based on EC2. To create an alarm on a different metric, first create the alarm, then put it in the policy.
To add a rule based on an Amazon SQS queue:
Create an SQS queue
Put a message in the queue (otherwise the metrics will not flow through to CloudWatch)
Create an alarm in CloudWatch based on the ApproximateNumberOfMessagesVisible metric (which will appear after a few minutes)
Edit your Auto Scaling policies to use the above alarm
I've got my web servers set up to autoscale at certain times of the day.
I can measure the load on the box using scripts executed by Consul - and this can trigger events at certain thresholds.
I want to push these two together and trigger autoscaling at certain load levels. (Assume CPU load at 75% is the threshold).
My question is: What is the mechanism to get load on a box to trigger an autoscaling group in AWS?
Assumptions:
I was not planning to use AWS Cloudwatch - but am interested if this is the solution.
I'm more interested in the autoscale triggering interface. Is it a queue or a rest endpoint?
As #mahdi said, you can easily use AWS Cloudwatch to do this.
However, if you want Consul (or anything outside the scope of an AWS "service") to do it, you can use lambda.
You would create a lambda function that scales your instance up or down (or both). Lambda can have many triggers, such as an HTTP endpoint through API Gateway. If you already have Consul set up to do it (sounds like you do since you said can trigger events at certain thresholds.) just make it issue the HTTP request to API Gateway to scale up or down.
You can create a CloudWatch alarm with a CPUUtilization metric and set it to change state when your instance has CPU utilization more than 75%. Then in the Auto Scaling Group, you use this alarm for scaling (in/out) policy. You can also control the number of instances in an Auto Scaling Group by manually (e.g. through your application running on one the instances) changing the Desired value. This documentation can be helpful.