Dynamic auto scaling AWS based on messages in SQS - amazon-web-services

Use case:
Every morning the SQS queue will be populated (only once, and the number of messages can vary drastically). I want to spawn new instances according to the number of messages in the queue.
E.g.: 4 instances for 200,000 messages, 8 instances for 400,000.
Is there a way by which we can achieve this?

You can set up a cron job on your server, or a time-triggered Lambda function, to query SQS for the number of visible messages in the queue. With the AWS CLI you would run aws sqs get-queue-attributes and read the ApproximateNumberOfMessages field of the response to get the number of items in the queue. You would then use that number to calculate the required number of instances and call aws ec2 run-instances --count <n> along with the rest of the required parameters. Once the messages are processed you would terminate the instances.
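The approach above can be sketched as a small script. The queue URL, AMI, instance type, and the 50,000-messages-per-instance ratio are assumptions for illustration (chosen to match the question's 200,000 → 4 example), not values from the answer:

```shell
#!/usr/bin/env bash
# Sketch of the cron-job approach: read the queue depth, derive an
# instance count, and launch that many workers.

# ceil(messages / 50000): 200000 -> 4, 400000 -> 8, as in the question.
instances_for() {
  local messages=$1
  echo $(( (messages + 49999) / 50000 ))
}

# Number of visible messages in the queue (requires AWS CLI + credentials).
queue_depth() {
  aws sqs get-queue-attributes \
    --queue-url "$QUEUE_URL" \
    --attribute-names ApproximateNumberOfMessages \
    --query 'Attributes.ApproximateNumberOfMessages' \
    --output text
}

# Launch N worker instances; AMI and instance type are placeholders.
launch_workers() {
  aws ec2 run-instances \
    --image-id ami-12345678 \
    --instance-type t3.medium \
    --count "$1"
}

# Typical invocation from the morning cron job:
#   launch_workers "$(instances_for "$(queue_depth)")"
```

The workers themselves would then be responsible for terminating (or being terminated) once the queue is drained.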
Another way to do this would be to use Auto Scaling and alarms. You can set up a scale-out policy that adds 1 server to your Auto Scaling group, and trigger that policy with a CloudWatch alarm on the SQS ApproximateNumberOfMessages metric >= some threshold. This option wouldn't wait for morning to process the queue; it would run all the time. You could also have a scale-in policy that reduces the desired capacity (number of servers) of your Auto Scaling group when ApproximateNumberOfMessages <= some threshold.
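A minimal sketch of that wiring with the AWS CLI; the group name, queue name, and thresholds are assumptions:

```shell
#!/usr/bin/env bash
# Sketch: attach a CloudWatch alarm on queue depth to a simple
# scale-out policy on an existing Auto Scaling group.

create_scale_out() {
  # Simple scaling policy: add 1 instance each time the alarm fires.
  # The command prints the policy ARN needed by the alarm below.
  aws autoscaling put-scaling-policy \
    --auto-scaling-group-name my-asg \
    --policy-name sqs-scale-out \
    --adjustment-type ChangeInCapacity \
    --scaling-adjustment 1
}

create_alarm() {
  # Fire when the queue holds 1000+ visible messages for one 5-minute period.
  # $1 = policy ARN printed by create_scale_out
  aws cloudwatch put-metric-alarm \
    --alarm-name sqs-backlog-high \
    --namespace AWS/SQS \
    --metric-name ApproximateNumberOfMessagesVisible \
    --dimensions Name=QueueName,Value=my-queue \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 1000 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions "$1"
}
```

A matching scale-in policy would use --scaling-adjustment -1 with an alarm on the same metric and LessThanOrEqualToThreshold.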

Related

AWS Auto Scaling Fargate tasks up or down depending on SQS queue length

I have a long-running containerized process that is a Fargate service.
I would like the setup to work as follows:
1. The service uses the Squiss poller to poll messages from an AWS SQS queue and handles them.
2. Each instance processes one message at a time (Squiss can handle multiple messages in flight, but since the tasks are long-running I currently process one at a time).
3. When the queue is drained, I would like the instances to be auto-scaled down to zero.
4. When messages start arriving in the queue, I would like the instances to be auto-scaled up to a maximum of N instances.
I have implemented 1 and 2. How should I configure 3 and 4? Do I need an AWS Lambda function to manage the scale up/down, or can autoscaling configuration alone accomplish this?
No, you don't need Lambda to scale. You need to create two CloudWatch alarms: one for a high number of messages in the queue and one for a low number. The high alarm is used by the ECS service to scale out, the low one to scale in. Then, when you create your ECS service, configure its scaling policy to use those CloudWatch alarms to trigger either a scale-out or a scale-in action. Have a read of this: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-auto-scaling.html
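As a sketch, ECS service scaling is configured through Application Auto Scaling; the cluster and service names, capacities, and cooldown below are assumptions:

```shell
#!/usr/bin/env bash
# Sketch: let an ECS service scale between 0 and 10 tasks on queue depth.

register_target() {
  # Make the service's DesiredCount scalable between 0 and 10 tasks.
  aws application-autoscaling register-scalable-target \
    --service-namespace ecs \
    --resource-id service/my-cluster/my-service \
    --scalable-dimension ecs:service:DesiredCount \
    --min-capacity 0 \
    --max-capacity 10
}

scale_out_policy() {
  # Step scaling: +1 task whenever the attached "high" alarm fires.
  aws application-autoscaling put-scaling-policy \
    --service-namespace ecs \
    --resource-id service/my-cluster/my-service \
    --scalable-dimension ecs:service:DesiredCount \
    --policy-name queue-high \
    --policy-type StepScaling \
    --step-scaling-policy-configuration \
      'AdjustmentType=ChangeInCapacity,StepAdjustments=[{MetricIntervalLowerBound=0,ScalingAdjustment=1}],Cooldown=60'
}
```

A mirror-image policy with ScalingAdjustment=-1, attached to the "low" alarm, handles scale-in; the CloudWatch alarms on ApproximateNumberOfMessagesVisible invoke the policies via the ARNs these commands print.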

Is there a way to scale in "instance" (part of ASG ) on certain custom metric?

I'm using the AutoScalingGroup to launch a group of EC2 instances. These instances are acting as workers which are continuously listening to SQS for any new request.
Requirement:
Do upscale on something like throughput (i.e. the total number of messages present in SQS divided by the total number of instances).
And I want to downscale whenever any instance that is part of the ASG sits idle (CPUIdle) for, say, more than 15 minutes.
Note: I am not looking for any metric which applies as whole to a particular ASG (eg: Average CPU).
One way of doing this could be to define a custom metric and let it trigger a CloudWatch alarm.
Is there a better way to accomplish this?
If you define the scaling policy at the instance level, you are defeating the entire purpose of the ASG. If you need to scale based on changing conditions, such as the queue size, you can configure the ASG based on the conditions specified here:
https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-using-sqs-queue.html
A custom metric to send to Amazon CloudWatch that measures the number of messages in the queue per EC2 instance in the Auto Scaling group.
A target tracking policy that configures your Auto Scaling group to scale based on the custom metric and a set target value. CloudWatch alarms invoke the scaling policy.
If you know a specific time window when the queue size goes up or down, you can also scale based on schedule.
You can always start with a very low instance count in the ASG, set the desired capacity accordingly (say 1), and scale up based on the queue, so you can continue using ASG policies.
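The "backlog per instance" custom metric described in the quoted docs can be sketched like this; the namespace, metric name, and group name are assumptions:

```shell
#!/usr/bin/env bash
# Sketch: publish messages-per-instance as a custom CloudWatch metric,
# which a target tracking policy can then keep near a target value.

# Pure arithmetic: backlog per instance, rounded up.
# With zero instances, report the raw backlog so scale-out still triggers.
backlog_per_instance() {
  local messages=$1 instances=$2
  if [ "$instances" -eq 0 ]; then
    echo "$messages"
  else
    echo $(( (messages + instances - 1) / instances ))
  fi
}

publish_metric() {
  # $1 = computed backlog-per-instance value
  aws cloudwatch put-metric-data \
    --namespace MyApp \
    --metric-name BacklogPerInstance \
    --dimensions AutoScalingGroup=my-asg \
    --value "$1" \
    --unit Count
}
```

A cron job or scheduled Lambda would compute the value from ApproximateNumberOfMessagesVisible and the group's current instance count, then call publish_metric with the result.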

Is it possible to have an AWS EC2 scale group that defaults to 0 and only contains instances when there is work to do?

I am trying to setup a EC2 Scaling group that scales depending on how many items are in an SQS queue.
When the SQS queue has items visible I need the Scaling group to have 1 instance available and when the SQS queue is empty (e.g. there are no visible or non-visible messages) I want there to be 0 instances.
Desired instances is set to 0, min is set to 0 and max is set to 1.
I have set up CloudWatch alarms on my SQS queue to trigger when visible messages are greater than zero, and also to trigger when non-visible messages are less than one (i.e. no more work to do).
Currently the CloudWatch alarm triggers and creates an instance, but then the scaling group automatically kills the instance to meet the desired setting. I expected the alarm to adjust the desired instance count within the min and max settings, but this seems not to be the case.
Yes, you can certainly have an Auto Scaling group with:
Minimum = 0
Maximum = 1
Alarm: When ApproximateNumberOfMessagesVisible > 0 for 1 minute, Add 1 Instance
This will cause Auto Scaling to launch an instance when there are messages waiting in the queue. It will keep trying to launch more instances, but the Maximum setting will limit it to 1 instance.
Scaling-in when there are no messages is a little bit trickier.
Firstly, it can be difficult to actually know when to scale-in. If there are messages waiting to be processed, then ApproximateNumberOfMessagesVisible will be greater than zero. However, if there are no messages waiting, it doesn't necessarily mean you wish to scale-in, because messages might be currently processing ("in flight"), as indicated by ApproximateNumberOfMessagesNotVisible. So, you only want to scale-in if both of these are zero. Unfortunately, a CloudWatch alarm can only reference one metric, not two.
Secondly, when an Amazon SQS queue is empty, it does not send metrics to Amazon CloudWatch. This sort of makes sense, because queues are mostly empty, so it would be continually sending a zero metric. However, it causes a problem: CloudWatch does not receive a metric when the queue is empty. Instead, the alarm will enter the INSUFFICIENT_DATA state.
Therefore, you could create your alarm as:
When ApproximateNumberOfMessagesVisible = 0 for 15 minutes, Remove 1 instance but set the action to trigger on INSUFFICIENT_DATA rather than ALARM
Note the suggested "15 minutes" delay to avoid thrashing instances. This is where instances are added and removed in rapid succession because messages are coming in regularly, but infrequently. Therefore, it is better to wait a while before deciding to scale-in.
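A hedged sketch of such a scale-in alarm with the CLI; the alarm and queue names are placeholders, and the 15-minute delay is expressed as three 5-minute evaluation periods:

```shell
#!/usr/bin/env bash
# Sketch: scale in on INSUFFICIENT_DATA, since an empty queue stops
# sending metrics to CloudWatch entirely.

scale_in_alarm() {
  # $1 = ARN of the "Remove 1 instance" scaling policy
  aws cloudwatch put-metric-alarm \
    --alarm-name sqs-queue-empty \
    --namespace AWS/SQS \
    --metric-name ApproximateNumberOfMessagesVisible \
    --dimensions Name=QueueName,Value=my-queue \
    --statistic Average \
    --period 300 \
    --evaluation-periods 3 \
    --threshold 1 \
    --comparison-operator LessThanThreshold \
    --insufficient-data-actions "$1"
}
```

Note the scaling policy ARN goes in --insufficient-data-actions rather than --alarm-actions, so the scale-in fires on missing data instead of the ALARM state.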
This leaves the problem of having instances terminated while they are still processing messages. This can be avoided by taking advantage of Auto Scaling Lifecycle Hooks, which send a signal when an instance is about to be terminated, giving the application the opportunity to delay the termination until work is complete. Your application should then signal that it is ready for termination only when message processing has finished.
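The lifecycle-hook part might look like this; the hook and group names and the 15-minute heartbeat timeout are assumptions:

```shell
#!/usr/bin/env bash
# Sketch: delay termination until in-flight message processing finishes.

create_hook() {
  # Pause terminating instances in the Terminating:Wait state for up to
  # 15 minutes before Auto Scaling proceeds with termination.
  aws autoscaling put-lifecycle-hook \
    --lifecycle-hook-name drain-before-terminate \
    --auto-scaling-group-name my-asg \
    --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
    --heartbeat-timeout 900 \
    --default-result CONTINUE
}

finish_hook() {
  # Called by the worker once its current message is done; $1 = instance id.
  aws autoscaling complete-lifecycle-action \
    --lifecycle-hook-name drain-before-terminate \
    --auto-scaling-group-name my-asg \
    --instance-id "$1" \
    --lifecycle-action-result CONTINUE
}
```

The application would watch for the termination notification (e.g. via the hook's SNS/SQS notification target), finish its current message, and then call finish_hook to release the instance.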
Bottom line
Much of the above depends upon:
How often your application receives messages
How long it takes to process a message
The cost savings involved
If your messages are infrequent and simple to process, it might be worthwhile to continuously run a t2.micro instance. At 2c/hour, the benefit of scaling-in is minor. Also, there is always the risk when adding and removing instances that you might actually pay more, because instances are charged by the hour -- running an instance for 30 minutes, terminating it, then launching another instance for 30 minutes will actually be charged as two hours.
Finally, you could consider using AWS Lambda instead of an Amazon EC2 instance. Lambda is ideal for short-lived code execution without requiring a server. It could totally remove the need to use Amazon EC2 instances, and you only pay while the Lambda function is actually running.
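If the per-message work fits within Lambda's limits, the queue can drive a function directly through an event source mapping (this capability postdates some of the answers in this thread; the function name and queue ARN below are placeholders):

```shell
#!/usr/bin/env bash
# Sketch: let Lambda poll the queue itself; no EC2 or ASG needed.
wire_queue_to_lambda() {
  aws lambda create-event-source-mapping \
    --function-name process-message \
    --event-source-arn arn:aws:sqs:us-east-1:123456789012:my-queue \
    --batch-size 10
}
```

With this in place, scaling to zero is automatic: Lambda only runs (and only bills) while there are messages to process.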
For a simple configuration, with per-second billing on Amazon Linux/Ubuntu AMIs, don't worry about wasted startup/shutdown time: just terminate the EC2 instance yourself, without any ASG scale-in policy. Add a little bash to the client startup code, or pre-install a cron job that polls for process presence or CPU load, and terminate or shut down the instance after processing is done (termination is better if you attach volumes and need them to auto-destruct). There is one annoying thing about an ASG defined as 0/0/1 (min/desired/max) with defaults and ApproximateNumberOfMessagesNotVisible on SQS: after an EC2 instance is launched, the group somehow switches to 1/0/1 and starts launching instances in a loop even when there is nothing in SQS. (I'm doing video transcoding, queueing jobs to SNS/SQS and launching ffmpeg nodes with an ASG triggered on a non-empty queue.)

CloudWatch alarm for Amazon EC2 Service Instance Limits

Is it possible to set a CloudWatch alarm for when we are approaching the limit of EC2 instances currently allowed on our account?
For instance, if limit for EC2 instances is currently 250, when instance number 240 is provisioned, I want an alarm to trigger.
If you have an auto scaling group which launches new instances and you want to control it, you can use GroupInServiceInstances which gives you the number of instances running as part of the ASG. Read more here.
Yes, you could do this with a Lambda function, a CloudWatch Metric and a CloudWatch alarm.
Your alarm would be configured to alarm on the metric, if it exceeds some threshold (the threshold being your instance limit).
Your Lambda function, would run on a schedule e.g. every 5 mins, and would do the following:
Use the ec2:DescribeAccountAttributes API to get the account instance limit and cloudwatch:DescribeAlarms to get the current threshold of the alarm. If they differ, the alarm threshold should be updated to the instance limit via the cloudwatch:PutMetricAlarm API.
Use the ec2:DescribeInstances API and count the number of instances that are running and publish the value to a custom CloudWatch metric with the cloudwatch:PutMetricData API.
If the value published to the metric exceeds the threshold of the alarm, it will fire. The lambda function will keep the alarm threshold configured to the limit of instances and will publish datapoints to the metric based on the number of instances currently running.
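The function's two steps might be sketched with the CLI as follows (a real Lambda would use an SDK, but the calls map one-to-one; the namespace, metric and alarm names, and the safety margin are assumptions, with the margin of 10 matching the question's 240-of-250 example):

```shell
#!/usr/bin/env bash
# Sketch of the scheduled check: sync the alarm threshold to the account
# limit, then publish the running-instance count as a custom metric.

running_instance_count() {
  aws ec2 describe-instances \
    --filters Name=instance-state-name,Values=running \
    --query 'length(Reservations[].Instances[])' \
    --output text
}

instance_limit() {
  aws ec2 describe-account-attributes \
    --attribute-names max-instances \
    --query 'AccountAttributes[0].AttributeValues[0].AttributeValue' \
    --output text
}

publish_count() {
  aws cloudwatch put-metric-data \
    --namespace Account \
    --metric-name RunningInstances \
    --value "$(running_instance_count)"
}

sync_alarm_threshold() {
  # put-metric-alarm updates an existing alarm of the same name in place,
  # so re-running this keeps the threshold in step with the limit.
  local margin=10  # fire 10 below the limit, per the 240-of-250 example
  aws cloudwatch put-metric-alarm \
    --alarm-name instance-limit-approaching \
    --namespace Account \
    --metric-name RunningInstances \
    --statistic Maximum \
    --period 300 \
    --evaluation-periods 1 \
    --threshold "$(( $(instance_limit) - margin ))" \
    --comparison-operator GreaterThanOrEqualToThreshold
}
```

Scheduling both functions every 5 minutes (e.g. from an EventBridge rule) keeps the metric and threshold current.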

AWS Autoscaling and CloudWatch with SQS

I have an application that performs long-running tasks, so I decided to use AWS SQS with an Auto Scaling policy and CloudWatch.
I read that Amazon SQS queues send metrics to CloudWatch every five minutes. I know that a single task takes 10 seconds, so one worker can handle 30 tasks in five minutes. I would like messages to live in SQS for as short a time as possible. For example:
if 30 messages are added to SQS I would like to have one worker,
if 60 messages are added to SQS I would like to have two workers,
if 90 messages are added to SQS I would like to have three workers,
etc.
According to the documentation I created an Auto Scaling policy (that adds 1 instance) and a CloudWatch alarm (that fires this policy if ApproximateNumberOfMessagesVisible is more than 30). So should I add a second CloudWatch alarm for more than 60 messages? And a third CloudWatch alarm for more than 90 messages?
No. Your policy will keep adding machines repeatedly until the metric falls below 30.