I have 2 instances connected to a load balancer. I would like to stop one instance and start it only when a certain alarm fires, for example when the first instance has a high CPU load.
I couldn't find out how to do it. In the Auto Scaling group, I see I can launch a brand new instance, but that's not what I want; I want a specific instance to start.
I couldn't find how to connect an alarm to an action such as "wake up this specific instance".
Should this be done in the load balancer configuration? I couldn't find how.
This is not really the way autoscaling is supposed to work, and hence the solution to your particular problem is a little bit more complex than simply using autoscaling to create new instances in response to metric thresholds being crossed. It may be worth asking yourself exactly why you need to do it this way and whether it could be achieved in the usual way.
In order to achieve starting (and stopping) a particular instance you'll need three pieces:
A CloudWatch Alarm triggered by the metric you need (CPUUtilization) crossing your desired threshold.
An SNS topic that is triggered by the alarm in the previous step.
A Lambda function (with the correct IAM permissions) that is subscribed to the SNS topic and sends the relevant API calls to EC2 to start or stop the instances when the notification from SNS arrives. You can find examples of the code needed to do this, e.g. here in Node.js and here from AWS, although there are probably others if you prefer another language.
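As a sketch of the Lambda piece, assuming the alarm notification arrives through SNS: the instance ID below is a placeholder, and boto3 is imported inside the handler so the state-mapping helper can be exercised without AWS credentials.

```python
# Minimal sketch: start a specific EC2 instance when the CloudWatch alarm
# (delivered via SNS) enters the ALARM state, stop it when it returns to OK.
import json

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder: the specific instance to wake/sleep

def desired_action(alarm_state):
    """Map the CloudWatch alarm state to the EC2 action we want to take."""
    return {"ALARM": "start", "OK": "stop"}.get(alarm_state)

def lambda_handler(event, context):
    # SNS delivers the CloudWatch alarm document as a JSON string in the message body
    alarm = json.loads(event["Records"][0]["Sns"]["Message"])
    action = desired_action(alarm["NewStateValue"])
    import boto3  # lazy import: keeps the helper above testable without AWS deps
    ec2 = boto3.client("ec2")
    if action == "start":
        ec2.start_instances(InstanceIds=[INSTANCE_ID])
    elif action == "stop":
        ec2.stop_instances(InstanceIds=[INSTANCE_ID])
    return {"instance": INSTANCE_ID, "action": action}
```

The Lambda's execution role needs `ec2:StartInstances` and `ec2:StopInstances` for this to work.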
Once you put all these together you should be able to respond to a change in CPU with starting and stopping particular instances.
Related
I am told to create a simple auto-scaling policy and use CloudWatch to trigger an increase in resources based on an alarm. I have created a target tracking scaling policy within my ASG and set the target value to 50, and for the alarm I have created an SNS topic that sends a notification to my email when the metric goes above the target value. But I'm not entirely sure if that's exactly what was asked for.
Is that what was meant by creating a "simple auto-scaling policy"? Any confirmation would be helpful.
As mentioned in the above comments, you need to confirm with whoever made that request.
You said:
I am told to create a Simple auto-scaling policy and using CloudWatch to trigger an increase in resources based on an alarm,
Therefore, you should create a simple auto-scaling policy which will trigger an increase (scale out) in resources.
Note: when you create a new policy, the alarms are automatically created and managed for you. Once created, you can edit them as well; when you delete the policy, the alarms get deleted. There is usually an alarm for scaling out (the HIGH alarm) and one for scaling in (the LOW alarm). But, again, these are automatically created for you when you create the policy.
So, I'd say you just need to set up autoscaling for whichever service you're working on (EC2, ECS, or other; this is not mentioned in your question) and assign a policy to it (autoscale based on a target metric: CPU, memory, number of requests, etc., or based on some other custom metric).
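For illustration, a CPU target tracking policy like the one you describe can be attached to an existing ASG with boto3 along these lines (the group name and policy name are placeholders):

```python
# Sketch: attach a target tracking policy to an Auto Scaling group.
# TargetValue=50.0 means "keep the group's average CPU around 50%".
def target_tracking_policy_params(asg_name, target_cpu=50.0):
    """Build the put_scaling_policy arguments for an average-CPU target."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"cpu-target-{int(target_cpu)}",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,
        },
    }

def create_policy(asg_name="my-asg"):  # placeholder group name
    import boto3  # lazy import: the params builder is testable without AWS deps
    autoscaling = boto3.client("autoscaling")
    # This single call also creates and manages the HIGH/LOW alarms for you.
    return autoscaling.put_scaling_policy(**target_tracking_policy_params(asg_name))
```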
In the end, you'll want to test it by applying a load test and confirming that your service scales out when the load thresholds are breached (the threshold, how many datapoints must breach it, and over what period are all defined in your policy and in the alarms automatically created after you set up the policy).
So, to answer the question: only the person who made the request can confirm, but I'd say with good certainty that the end goal is to increase the resources under load. So no, the goal is not to send an email. You might want to send an email as well, but I would bet a Benjamin that what's really wanted here is for you to make some service autoscale under load (scale out and scale in).
I am trying to add the Lambda function as an Auto Scaling target, and I am getting the error "No scalable resources found" when trying to fetch it by tag.
Is it possible or allowed to add a Lambda function as an auto-scaling target?
UPDATE:
I am trying to figure out how to reduce the provisioned concurrency during off-peak hours, which would help save some cost, so I was exploring the auto-scaling option.
Lambda automatically scales out for incoming requests, if all existing execution contexts (lambda instances) are busy. There is basically nothing you need to do here, except maybe set the maximum allowed concurrency if you want to throttle.
As a result of that, there is no integration with AutoScaling, but you can still use an Application Load Balancer to trigger your Lambda Function if that's what you're after.
If you're building a purely serverless application, you might want to look into the API Gateway instead of the ALB integration.
Update
Since you've clarified what you want to use auto scaling for, namely changing the provisioned concurrency of the function, there are ways to build something like that. Clément Duveau has mentioned a solution in the comments that I can get behind.
You can create a Lambda function with two scheduled CloudWatch Events triggers using cron expressions: one for when you want to scale out and another for when you want to scale in.
Inside the Lambda function you can use the name of the rule that triggered it to determine whether to scale out or scale in. You can then use the PutProvisionedConcurrencyConfig API call (note that PutFunctionConcurrency sets reserved, not provisioned, concurrency) through one of the SDKs mentioned at the bottom of the documentation to adjust the concurrency as you see fit.
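A minimal sketch of that scheduled Lambda, assuming two rules whose names end in `scale-out-rule` and `scale-in-rule`; the rule names, function name, alias, and concurrency values are all placeholders:

```python
# Sketch: one Lambda triggered by two scheduled rules; the rule ARN in the
# event tells us whether this is the peak or off-peak schedule.
PEAK_CONCURRENCY = 10     # placeholder values
OFF_PEAK_CONCURRENCY = 2

def concurrency_for_rule(rule_arn):
    """Pick the target provisioned concurrency from the triggering rule's ARN."""
    if rule_arn.endswith("scale-out-rule"):
        return PEAK_CONCURRENCY
    if rule_arn.endswith("scale-in-rule"):
        return OFF_PEAK_CONCURRENCY
    raise ValueError(f"unexpected rule: {rule_arn}")

def lambda_handler(event, context):
    # Scheduled events carry the triggering rule's ARN in "resources"
    target = concurrency_for_rule(event["resources"][0])
    import boto3  # lazy import: the helper above is testable without AWS deps
    boto3.client("lambda").put_provisioned_concurrency_config(
        FunctionName="my-app-function",   # placeholder
        Qualifier="live",                 # placeholder alias or version to scale
        ProvisionedConcurrentExecutions=target,
    )
```

Note that provisioned concurrency can only be set on a published version or alias, hence the `Qualifier`.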
Update 2
spmdc has mentioned an interesting blog post that uses Application Auto Scaling to achieve this, which I had missed. You might want to check it out; it looks promising.
I have a spot instance running and am struggling with 2 issues:
Testing the termination
It seems that if we use Spot Fleet, reducing the fleet size would help with triggering the termination notice. Is there any way to test this without Spot Fleet, just by running a single spot instance?
It seems the ways to read the spot notice are: querying the instance metadata (from within the node), using the spot request state, or using the DescribeInstances API call.
I can't use the instance metadata or the spot request state due to my application requirements, which leaves the DescribeInstances API. Using it, what value do I need to parse to figure out that the instance is marked for interruption?
Appreciate any suggestions!
The Amazon EC2 Spot two-minute warning is available via Amazon CloudWatch Events. You can create a CloudWatch Event Rule to automatically trigger a response in near-real time.
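For example, such a rule can be created with boto3 along these lines (the rule name and target ARN are placeholders; the event pattern is the one documented for the Spot two-minute warning):

```python
# Sketch: a CloudWatch Events / EventBridge rule that fires on the Spot
# two-minute interruption warning and invokes a target (e.g. a Lambda).
import json

def spot_interruption_event_pattern():
    """Event pattern matching the EC2 Spot two-minute interruption warning."""
    return {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Spot Instance Interruption Warning"],
    }

def create_rule(target_lambda_arn):
    import boto3  # lazy import: the pattern builder is testable without AWS deps
    events = boto3.client("events")
    events.put_rule(
        Name="spot-interruption-warning",  # placeholder rule name
        EventPattern=json.dumps(spot_interruption_event_pattern()),
    )
    events.put_targets(
        Rule="spot-interruption-warning",
        Targets=[{"Id": "1", "Arn": target_lambda_arn}],
    )
```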
I don't have an answer for your first question (you might want to split this into two separate questions anyway, as that 1/ makes each easier to answer and 2/ narrows the focus, making it more useful for future visitors).
I'm trying to find a way to make an Amazon EC2 instance stop automatically when a certain custom metric on CloudWatch passes a limit. So far if I've understood correctly based on these articles:
Discussion Forum: Custom Metric EC2 Action
CloudWatch Documentation: Create Alarms to Stop, Terminate, Reboot, or Recover an Instance
This will only work if the metric is defined as follows:
Tied to a certain instance
Of type System/Linux
However, in my case I have a custom metric that is not instance-related but "global", and if a certain limit is passed, I need to stop all instances, no matter which instance the limiting log is received from.
Does anybody know if there is way to make this work? What I'd need is some way to make CloudWatch work like this:
If arbitrary custom metric value passes a certain limit -> stop defined instances not tied to the metric itself.
The main problem is that the EC2 action option is greyed out because the metric is not tied to a certain EC2 instance, and I'm not sure there's any way to do this without making the metric itself instance-related.
Have the custom CloudWatch metric post alerts to an SNS topic.
Have the SNS topic trigger a Lambda function that shuts down your EC2 instances via a call to the AWS API.
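Put together, the Lambda side might look like this minimal sketch (the instance IDs are placeholders for whatever fixed set you want to stop):

```python
# Sketch: Lambda subscribed to the SNS topic; stops a fixed list of instances
# whenever the global-metric alarm enters the ALARM state.
import json

INSTANCE_IDS = ["i-0123456789abcdef0", "i-0fedcba9876543210"]  # placeholders

def should_stop(sns_message):
    """Only act when the alarm actually entered the ALARM state."""
    return json.loads(sns_message).get("NewStateValue") == "ALARM"

def lambda_handler(event, context):
    message = event["Records"][0]["Sns"]["Message"]
    if should_stop(message):
        import boto3  # lazy import: should_stop is testable without AWS deps
        boto3.client("ec2").stop_instances(InstanceIds=INSTANCE_IDS)
```

Because the Lambda decides which instances to act on, the metric itself never needs to be tied to any instance.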
I have been given the business logic of
A customer makes a request for services through a third party
gateway GUI to an EC2 instance
Processing for some time (15hr)
Data retrieval
Currently this is implemented by statically giving each user an EC2 instance to use to handle their requests. (This instance actually creates some sub instances to parallel process the data).
What should happen is that an EC2 instance is fired up automatically for each request.
In the long term, I was thinking that this should be done using SWF (given the use of sub processes), however, I wondered if as a quick and dirty solution, using Autoscaling with the correct settings is worthwhile pursuing.
Any thoughts?
You can "trick" autoscaling into spinning up instances based on metrics:
http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/policy_creating.html
So on each request, keep track of / increment a metric, and decrement it when the process completes. Drive the autoscaling group from that metric.
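Publishing such a metric might look like this sketch with boto3 (the namespace and metric name are placeholders):

```python
# Sketch: publish a "requests in flight" custom metric for the scaling
# policy to watch; call record(+1) on request start, record(-1) on completion.
def in_flight_datapoint(delta):
    """Build the put_metric_data payload for an increment (+1) or decrement (-1)."""
    return {
        "Namespace": "MyApp",                  # placeholder namespace
        "MetricData": [{
            "MetricName": "RequestsInFlight",  # placeholder metric name
            "Value": delta,
            "Unit": "Count",
        }],
    }

def record(delta):
    import boto3  # lazy import: the payload builder is testable without AWS deps
    boto3.client("cloudwatch").put_metric_data(**in_flight_datapoint(delta))
```

With the Sum statistic over a short period, this gives the alarm a running picture of outstanding requests.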
Use Step Adjustments to control the number of instances: http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/as-scale-based-on-demand.html#as-scaling-types
Interesting challenge: binding customers to specific EC2 instances. Do you have a hard requirement to give each customer their own instance? It sounds like autoscaling is actually better suited to the parallel processing of the actual data than to request routing. You may get away with having a fixed number of machines for this, and/or scaling based on traffic rather than the number of customers.