Is there any way to apply an autoscaling configuration to AWS Lambda provisioned concurrency using terraform?
I want to scale it up during peak hours, and ideally maintain an N+1 hot concurrency rate.
I looked here but found no reference to Lambdas: https://www.terraform.io/docs/providers/aws/r/appautoscaling_policy.html
The ability to control the auto-scaling of Lambdas was added in December 2019 (see this blog). As long as this is not available through Terraform, you have a couple of options to work around it:
Use a Terraform provisioner to set up the provisioning rules through the aws-cli. Instructions on which commands to run can be found in the AWS docs.
Invoke the Lambda yourself from time to time to keep it warm; see e.g. this post or this Stack Overflow question.
Use a different service that provides more control, like ECS
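For the first option, the aws-cli commands boil down to two Application Auto Scaling calls: registering the function alias as a scalable target, then attaching a target-tracking policy. Here is a minimal Python sketch of the request parameters, assuming boto3 is what you script with. The function name "my-fn", the alias "live", the policy name, and the 0.7 utilization target are all placeholders of mine, not values from the question.

```python
def provisioned_concurrency_requests(function_name, alias, min_capacity,
                                     max_capacity, target_utilization=0.7):
    """Build kwargs for register_scalable_target and put_scaling_policy.

    Provisioned concurrency is configured on an alias or version, so the
    resource ID has the form function:<name>:<alias>.
    """
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    common = {
        "ServiceNamespace": "lambda",
        "ResourceId": f"function:{function_name}:{alias}",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
    }
    target = {**common, "MinCapacity": min_capacity, "MaxCapacity": max_capacity}
    policy = {
        **common,
        "PolicyName": f"{function_name}-{alias}-utilization",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_utilization,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "LambdaProvisionedConcurrencyUtilization"
            },
        },
    }
    return target, policy


def apply(client, target, policy):
    """Send both requests using an application-autoscaling client."""
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)


target, policy = provisioned_concurrency_requests("my-fn", "live", 2, 10)
# With credentials configured, something like:
#   import boto3
#   apply(boto3.client("application-autoscaling"), target, policy)
```

Keeping the parameter building separate from the API call makes it easy to inspect what would be sent before touching a real account.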
I have been using SageMaker recently, running inference on GPU-based instances.
I am thinking of turning off the SageMaker inference instances at night, for example from 8 pm to 8 am.
I want to do that using CDK. Not sure if it is a crazy idea or not?
Any help?
Amazon SageMaker supports different inference options that fit various use cases. You can use SageMaker Asynchronous endpoints to save cost during idle time (after operational hours). You don't have to use AWS CDK or AWS CloudFormation with this option.
Amazon SageMaker supports automatic scaling (autoscaling) your asynchronous endpoint. Autoscaling dynamically adjusts the number of instances provisioned for a model in response to changes in your workload. Unlike other hosted models Amazon SageMaker supports, with Asynchronous Inference you can also scale down your asynchronous endpoints instances to zero. Requests that are received when there are zero instances are queued for processing once the endpoint scales up.
Refer to the documentation, samples, and blogs here.
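If you would rather follow the clock (8 pm to 8 am) than the backlog, Application Auto Scaling also supports scheduled actions. Below is a rough Python sketch of the two put_scheduled_action requests, assuming an asynchronous endpoint that is already registered as a scalable target. The endpoint name, variant name, action names, and capacities are placeholders, and whether your endpoint accepts a capacity of zero is something to verify for your setup.

```python
def nightly_schedule(endpoint_name, variant_name, day_capacity,
                     timezone="UTC", stop_hour=20, start_hour=8):
    """Build kwargs for two put_scheduled_action calls (night off, morning on)."""
    if not (0 <= stop_hour <= 23 and 0 <= start_hour <= 23):
        raise ValueError("hours must be in 0..23")
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "Timezone": timezone,
    }
    # Cron fields: minutes hours day-of-month month day-of-week year
    stop = {
        **common,
        "ScheduledActionName": "scale-in-at-night",       # hypothetical name
        "Schedule": f"cron(0 {stop_hour} * * ? *)",
        "ScalableTargetAction": {"MinCapacity": 0, "MaxCapacity": 0},
    }
    start = {
        **common,
        "ScheduledActionName": "scale-out-in-morning",    # hypothetical name
        "Schedule": f"cron(0 {start_hour} * * ? *)",
        "ScalableTargetAction": {"MinCapacity": 1, "MaxCapacity": day_capacity},
    }
    return [stop, start]


actions = nightly_schedule("my-endpoint", "AllTraffic", day_capacity=4)
# With credentials configured, something like:
#   import boto3
#   client = boto3.client("application-autoscaling")
#   for action in actions:
#       client.put_scheduled_action(**action)
```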
I have an application running in ECS with an ASG as the capacity provider, and I use CodeDeploy to roll out new images.
The scaling policy configured on the ECS service triggers autoscaling based on CPU/memory metrics through CloudWatch alarms (target tracking policy).
When I trigger a blue/green rollout in CodeDeploy, exactly double the number of instances is needed in the ASG to accommodate the replacement version before traffic is routed to it.
However, at this point the ASG won't trigger a scale-out, so CodeDeploy does not get enough instances to start the replacement version.
I think there are a few ways to achieve this (although I have not tried them), but I am looking for a simpler out-of-the-box solution that does not require maintaining a lot of configuration.
I am curious to know what the model.deploy command actually does in the background when run in an AWS SageMaker notebook,
e.g.:
predictor = sagemaker_model.deploy(initial_instance_count=9,instance_type='ml.c5.xlarge')
I would also like to know what happens in the background during SageMaker endpoint autoscaling. It takes too long, almost 10 minutes, to launch new instances, during which most requests get dropped or are not processed, and I also get connection timeouts while load testing through JMeter. Is there any way to get a faster boot-up, or something like a golden AMI, in SageMaker?
Are there any other means by which this issue can be solved?
The docs mention what the deploy method does: https://sagemaker.readthedocs.io/en/stable/model.html#sagemaker.model.Model.deploy
You could also take a look at the source code here: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/model.py#L377
Essentially the deploy method hosts your model on a SageMaker Endpoint, launching the number of instances using the instance type that you specify. You can then invoke your model using a predictor: https://sagemaker.readthedocs.io/en/stable/predictors.html
For autoscaling, you may want to consider lowering your threshold for scaling out so that the additional instances start to be launched earlier. This page offers some good advice on how to determine the RPS your endpoint can handle. Specifically, you may want to have a lower SAFETY_FACTOR to ensure new instances are provisioned in time to handle your expected traffic.
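As a rough illustration of that advice: the AWS load-testing guidance derives the target for the SageMakerVariantInvocationsPerInstance metric (which is per minute) as MAX_RPS * SAFETY_FACTOR * 60. A small Python sketch of the arithmetic follows; the example numbers are made up, not measurements from the question.

```python
def invocations_per_instance_target(max_rps, safety_factor=0.5):
    """Target value for SageMakerVariantInvocationsPerInstance.

    max_rps: peak requests per second one instance handled in a load test.
    safety_factor: fraction of that peak at which scale-out should start.
    The metric is per minute, hence the factor of 60.
    """
    if max_rps <= 0:
        raise ValueError("max_rps must be positive")
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    return max_rps * safety_factor * 60


# Hypothetical instance that peaks at 20 requests per second:
default_target = invocations_per_instance_target(20, 0.5)
earlier_target = invocations_per_instance_target(20, 0.25)
print(default_target, earlier_target)
```

The lower safety factor halves the threshold, so scale-out starts at half the traffic and new instances have more time to boot before the existing ones saturate.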
I have been using AWS for a while now, and I always have difficulty tracking AWS resources and how they are interconnected. I am using Terraform, but there are always ad-hoc operations that cut down my visibility.
As a result, I have been charged multiple times for resources/services that exist but that I do not use.
By unused, I mean resources that are present in the AWS environment but are not referenced by any other service.
Tools suggestions are also welcome.
Also, posted on DevOps. Posting here since there are fewer people there.
I have used Janitor Monkey, Cloud Custodian and we do have a bunch of AWS Config + Lambda for cleaning up.
Janitor Monkey determines whether a resource should be a cleanup
candidate by applying a set of rules on it. If any of the rules
determines that the resource is a cleanup candidate, Janitor Monkey
marks the resource and schedules a time to clean it up.
I think that a viable answer here is the same as the popular answer for when to auto-scale - use CloudWatch alarms.
Whenever you have a service that you need to auto-scale, you do something like monitor for high CPU. If the CPU usage trips some threshold, the alarm can be configured to scale up your fleet. Correspondingly, if CPU usage goes below some threshold, the alarm can be configured to scale down the fleet. Similar alarms can be configured for other metrics such as memory or disk usage.
So, instead of configuring CloudWatch alarms to scale your fleet up or down, you can just configure a CloudWatch alarm to email you when a host becomes idle (e.g. its CPU usage is too low).
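A minimal sketch of such an "idle host" alarm, written as the keyword arguments for CloudWatch's put_metric_alarm. The 2% threshold, the 6-hour window, the alarm name, and the SNS topic ARN are assumptions of mine; the email itself comes from subscribing your address to that SNS topic.

```python
def idle_instance_alarm(instance_id, sns_topic_arn,
                        cpu_threshold_percent=2.0, hours=6):
    """Build put_metric_alarm kwargs that fire when average CPU stays low."""
    if not 1 <= hours <= 24:
        raise ValueError("hours must be between 1 and 24")
    return {
        "AlarmName": f"idle-{instance_id}",               # hypothetical naming scheme
        "AlarmDescription": "Average CPU below threshold; host may be unused",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 3600,                                   # one datapoint per hour
        "EvaluationPeriods": hours,                       # all must breach to alarm
        "Threshold": cpu_threshold_percent,
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [sns_topic_arn],                  # SNS topic emails subscribers
    }


alarm = idle_instance_alarm(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:idle-hosts",      # placeholder ARN
)
# With credentials configured, something like:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```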
Similar to Janitor Monkey, I've created a tool to track different types of unused resources (ELB, EBS, AMI, Security groups, etc) : https://github.com/romibuzi/majordome
Here's what I have in AWS:
Application ELB
Auto Scaling Group with 2 instances in different regions (Windows IIS servers)
Launch Config pointing to AMI_A
All associated back-end resources configured (VPC, subnets, security groups, etc.)
Everything works. However, when I need to make an update or change to the servers, I am currently manually creating a new AMI_B, creating a new LaunchConfig using AMI_B, updating the AutoScalingGroup to use the new LaunchConfig, increasing min number of instances to 4, waiting for them to become available, then decreasing the number back to 2 to kill off the old instances.
I'd really love to automate this process. Amazon gave me some links to CLI stuff, and I'm able to script the AMI creation, create the LaunchConfig, and update the AutoScalingGroup...but I don't see an easy way to script spinning up the new instances.
After some searching, I found some CloudFormation templates that look like they'd do what I want, but most do more, and it's a bit confusing to me.
Should I be exploring CloudFormation? Is there a simple guide I can follow to get started? Or should I stay with the scripting I have started?
PS - sorry if this is a repeated question. Things change frequently at AWS, so sometimes the older responses may not be the current best answers.
You have a number of options to automate the process of updating the instances in an Auto Scaling Group to a new or updated Launch Configuration:
CloudFormation
If you do want to use CloudFormation to manage updates to your Auto Scaling Group's instances, refer to the UpdatePolicy attribute of the AWS::AutoScaling::AutoScalingGroup Resource for documentation, and the "What are some recommended best practices for performing Auto Scaling group rolling updates?" page in the AWS Knowledge Center for more advice.
If you'd also like to script the creation/update of your AMI within a CloudFormation resource, see my answer to the question, "Create AMI image as part of a cloudformation stack".
Note, however, that CloudFormation is not a simple tool. It is a complex, relatively low-level service for orchestrating AWS resources, and migrating your existing scripts to it will likely take some time because of its steep learning curve.
Elastic Beanstalk
If simplicity is most important, then I'd suggest you evaluate Elastic Beanstalk, which also supports both rolling and immutable updates during deployments, in a more fully managed, console-oriented, platform-as-a-service environment. Refer to my answer to the question, "What is the difference between Elastic Beanstalk and CloudFormation for a .NET project?" for further comparisons between CloudFormation and Elastic Beanstalk.
CodeDeploy
If you want a solution for updating instances in an auto-scaling group that you can plug into existing scripts, AWS CodeDeploy might be worth looking into. You install an agent on your instances, then trigger deployments through the API/CLI/Console and it manages deploying application updates to your fleet of instances. See Deploy an Application to an Auto Scaling Group Using AWS CodeDeploy for a complete tutorial. While CodeDeploy supports 'in-place' deployments and 'blue-green' deployments (see Working With Deployments for details), I think this service assumes an approach of swapping out S3-hosted application packages onto a static base AMI rather than replacing AMIs on each deployment. So it might not be the best fit for your AMI-swapping use case, but perhaps worth looking into anyway.
You want a custom Termination policy on the Auto Scaling Group.
OldestLaunchConfiguration. Auto Scaling terminates instances that have the oldest launch configuration. This policy is useful when you're updating a group and phasing out the instances from a previous configuration.
To customize a termination policy using the console
Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
On the navigation pane, choose Auto Scaling Groups.
Select the Auto Scaling group.
For Actions, choose Edit.
On the Details tab, locate Termination Policies. Choose one or more
termination policies. If you choose multiple policies, list them in
the order that you would like them to apply. If you use the Default
policy, make it the last one in the list.
Choose Save.
On the CLI
aws autoscaling update-auto-scaling-group --auto-scaling-group-name my-asg --termination-policies "OldestLaunchConfiguration"
https://docs.aws.amazon.com/autoscaling/latest/userguide/as-instance-termination.html
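If you script this from Python rather than the CLI, the same change maps to update_auto_scaling_group with a TerminationPolicies list. Here is a small sketch that builds the arguments and keeps Default last, as the console instructions recommend; the helper name is mine and "my-asg" is the group name from the CLI example.

```python
VALID_POLICIES = {
    "Default", "AllocationStrategy", "OldestLaunchTemplate",
    "OldestLaunchConfiguration", "ClosestToNextInstanceHour",
    "NewestInstance", "OldestInstance",
}


def termination_policy_update(asg_name, policies):
    """Build update_auto_scaling_group kwargs, moving Default to the end."""
    unknown = [p for p in policies if p not in VALID_POLICIES]
    if unknown:
        raise ValueError(f"unknown termination policies: {unknown}")
    ordered = [p for p in policies if p != "Default"]
    if "Default" in policies:
        ordered.append("Default")
    return {"AutoScalingGroupName": asg_name, "TerminationPolicies": ordered}


update = termination_policy_update(
    "my-asg", ["Default", "OldestLaunchConfiguration"]
)
# With credentials configured, something like:
#   import boto3
#   boto3.client("autoscaling").update_auto_scaling_group(**update)
```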
We use Ansible's ec2_asg module for this. It has replace_all_instances and replace_batch_size settings. Per the documentation:
In a rolling fashion, replace all instances that used the old launch configuration with one from the new launch configuration.
It increases the ASG size by replace_batch_size, waits for the new instances to be up and running.
After that, it terminates a batch of old instances, waits for the replacements, and repeats, until all old instances are replaced.
Once that's done the ASG size is reduced back to the expected size.
If you provide target_group_arns, module will check for health of instances in target groups before going to next batch.
Edit: in order to maintain desired number of instances, we first set min to desired.
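To make the sequence concrete, here is a toy Python simulation of the rolling replacement described above. It is a simplified model of my own, not the module's actual implementation: it assumes the group is held at desired + batch size during the rollout and that every terminated old instance is replaced by a new-configuration one.

```python
def simulate_rolling_replace(desired, batch_size):
    """Return (step, old_count, new_count) tuples for a rolling replace."""
    if desired < 1 or batch_size < 1:
        raise ValueError("desired and batch_size must be at least 1")
    old, new = desired, 0
    steps = [("start", old, new)]
    # Grow the group by one batch of new-configuration instances.
    new += batch_size
    steps.append(("grow", old, new))
    while old > 0:
        batch = min(batch_size, old)
        # Terminate a batch of old instances...
        old -= batch
        steps.append(("terminate_old", old, new))
        # ...and wait for the group to launch replacements.
        new += batch
        steps.append(("wait_for_replacements", old, new))
    # Reduce the group back to its expected size.
    new = desired
    steps.append(("shrink", old, new))
    return steps


steps = simulate_rolling_replace(desired=2, batch_size=1)
for name, old, new in steps:
    print(f"{name:<22} old={old} new={new} total={old + new}")
```

In this model the total never drops below the desired count, which is the point of growing the group before terminating anything.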