AWS, how to prepare for user rush due to Push notification? - amazon-web-services

AWS should be useful for getting more server resources to handle a user rush (for instance, one caused by a push notification).
Which components should I look at, given that AWS has so many services?
Would it be possible to increase server capacity dynamically (programmatically) just before we send out a push notification to all users?

Assuming that push notifications are just one example of something that overloads your EC2 instances, you should first put the instances in an Auto Scaling group:
If the period when your load increases is known and fixed, you can set up scheduled scaling for that Auto Scaling group.
Or, you can watch metrics such as CPU usage and trigger scaling in and out from CloudWatch alarms. For a deeper understanding of the concept, see: https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-simple-step.html
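To make the "just before we send the push" case concrete, here is a minimal sketch of a one-off scheduled action created through boto3's `put_scheduled_update_group_action`. The group name, the 15-minute lead time, and the helper names are assumptions for illustration, not something from the question.

```python
from datetime import datetime, timedelta, timezone


def pre_scale_request(group_name, push_time, capacity, lead_minutes=15):
    """Build kwargs for a one-off scheduled action that raises capacity before a push."""
    start = push_time - timedelta(minutes=lead_minutes)
    return {
        "AutoScalingGroupName": group_name,
        # Hypothetical naming scheme; any unique name within the group works.
        "ScheduledActionName": f"pre-push-{start:%Y%m%dT%H%M}",
        "StartTime": start,
        "DesiredCapacity": capacity,
    }


def schedule_pre_scale(request, client=None):
    """Send the request to Auto Scaling; this step needs boto3 and AWS credentials."""
    if client is None:
        import boto3  # imported lazily so the builder above works without it
        client = boto3.client("autoscaling")
    return client.put_scheduled_update_group_action(**request)


if __name__ == "__main__":
    push = datetime.now(timezone.utc) + timedelta(hours=1)
    print(pre_scale_request("web-asg", push, capacity=10))
```

The desired capacity has to fit within the group's minimum and maximum size, so raise the maximum first if needed. A second scheduled action (or a normal scale-in policy) can bring the group back down after the rush.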

Related

Is this the correct way to solve this problem or have I gone wrong?

I was told to create a simple auto-scaling policy and use CloudWatch to trigger an increase in resources based on an alarm. I have created a target tracking scaling policy within my ASG and set the target value to 50. For the alarm, I created an SNS topic that sends a notification to my email when the metric goes above the target value. But I'm not entirely sure that's exactly what was asked for.
Is that what was meant by creating a 'Simple auto-scaling policy'? Any confirmation would be helpful.
As mentioned in the above comments, you need to confirm with whoever made that request.
You said:
I was told to create a simple auto-scaling policy and use CloudWatch to trigger an increase in resources based on an alarm.
Therefore, you should create a simple auto-scaling policy which will trigger an increase (scale out) in resources.
Note: with a target tracking policy like the one you created, the CloudWatch alarms are created and managed for you automatically, and they are deleted when you delete the policy. There is usually one alarm for scaling out (the HIGH alarm) and one for scaling in (the LOW alarm). With a simple or step scaling policy, you create the alarm yourself and set the policy as the alarm's action.
So, I'd say you just need to set up autoscaling for whichever service you're working on (EC2, ECS, or another; your question does not say which) and assign a policy to it (autoscale based on a target metric such as CPU, memory, or number of requests, or based on some other custom metric).
In the end, you'd want to test it by running a load test and confirming that your service scales out when the load thresholds are breached. The threshold, and how many datapoints must breach it within what period of time, are all defined in your policy and its associated alarms.
So, to answer the question, only the person who made the request can confirm but I'll say with a good certainty that the end goal is to increase the resources under load. So, no, the goal is not to send an email. You might want to send an email as well. But I would bet a benjamin that what's really wanted here is for you to make some service autoscale under load (scale out and scale in).
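For comparison with the target tracking setup described in the question, a simple scaling policy wired to a CloudWatch alarm could look roughly like the sketch below. It uses boto3's `put_scaling_policy` and `put_metric_alarm`; the policy and alarm names, the 50% threshold, the periods, and the cooldown are assumptions chosen for illustration.

```python
def simple_scale_out_policy(group_name, step=1, cooldown=300):
    """Kwargs for put_scaling_policy: add `step` instances each time the alarm fires."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-scale-out",  # hypothetical name
        "PolicyType": "SimpleScaling",
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": step,
        "Cooldown": cooldown,
    }


def high_cpu_alarm(group_name, policy_arn, threshold=50.0):
    """Kwargs for put_metric_alarm: average group CPU above threshold for two periods."""
    return {
        "AlarmName": f"{group_name}-cpu-high",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": group_name}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # The alarm action is the scaling policy itself, not an email.
        "AlarmActions": [policy_arn],
    }


def create_policy_with_alarm(group_name, autoscaling, cloudwatch):
    """Create the policy, then the alarm that triggers it. Clients are boto3 clients."""
    policy = autoscaling.put_scaling_policy(**simple_scale_out_policy(group_name))
    cloudwatch.put_metric_alarm(**high_cpu_alarm(group_name, policy["PolicyARN"]))
    return policy["PolicyARN"]
```

With real credentials the two clients would be `boto3.client("autoscaling")` and `boto3.client("cloudwatch")`. An SNS topic ARN can be appended to `AlarmActions` if an email is wanted in addition to the scale-out.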

How to start a specific instance, when another instance is overloaded?

I have 2 instances connected to a load balancer. I would like to stop 1 instance and start it only when a certain alarm fires, for example when the first instance has a high CPU load.
I couldn't find out how to do it. In the Auto Scaling group, I see I can launch a brand new instance, but that's not what I want. I want a specific instance to start.
I couldn't find out how to connect an alert to an action that wakes up this specific instance.
Should this be done in the load balancer configuration? I couldn't find how...
This is not really the way autoscaling is supposed to work, and hence the solution to your particular problem is a little bit more complex than simply using autoscaling to create new instances in response to metric thresholds being crossed. It may be worth asking yourself exactly why you need to do it this way and whether it could be achieved in the usual way.
In order to achieve starting (and stopping) a particular instance you'll need three pieces:
A CloudWatch Alarm triggered by the metric you need (CPUUtilization) crossing your desired threshold.
An SNS topic that is triggered by the alarm in the previous step.
A Lambda function (with the correct IAM permissions) that is subscribed to the SNS topic and sends the relevant API calls to EC2 to start or stop the instances when the notification from SNS arrives. You can find some examples of the code needed to do this, e.g. here in node.js and here from AWS, although there are probably others if you prefer another language.
Once you put all these together you should be able to respond to a change in CPU with starting and stopping particular instances.
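The third piece, the Lambda function, might look something like this in Python. This is a sketch under assumptions: the instance ID is a placeholder, and it assumes the alarm publishes to the SNS topic on both the ALARM and OK transitions so the instance can be stopped again. The optional `ec2` argument exists only so the handler can be exercised without AWS.

```python
import json

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder: the specific instance to control


def action_for(sns_event):
    """Map the CloudWatch alarm state carried in the SNS message to start/stop/None."""
    message = json.loads(sns_event["Records"][0]["Sns"]["Message"])
    state = message.get("NewStateValue")
    if state == "ALARM":
        return "start"
    if state == "OK":
        return "stop"
    return None  # e.g. INSUFFICIENT_DATA: do nothing


def handler(event, context, ec2=None):
    """Lambda entry point; starts or stops INSTANCE_ID based on the alarm state."""
    action = action_for(event)
    if action is None:
        return {"action": None}
    if ec2 is None:
        import boto3  # available in the Lambda Python runtime
        ec2 = boto3.client("ec2")
    if action == "start":
        ec2.start_instances(InstanceIds=[INSTANCE_ID])
    else:
        ec2.stop_instances(InstanceIds=[INSTANCE_ID])
    return {"action": action, "instance": INSTANCE_ID}
```

The function's IAM role needs `ec2:StartInstances` and `ec2:StopInstances` on that instance.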

What is the mechanism to get load on a box to trigger an autoscaling group in AWS?

I've got my web servers set up to autoscale at certain times of the day.
I can measure the load on the box using scripts executed by Consul - and this can trigger events at certain thresholds.
I want to push these two together and trigger autoscaling at certain load levels. (Assume CPU load at 75% is the threshold).
My question is: What is the mechanism to get load on a box to trigger an autoscaling group in AWS?
Assumptions:
I was not planning to use AWS Cloudwatch - but am interested if this is the solution.
I'm more interested in the autoscale triggering interface. Is it a queue or a rest endpoint?
As @mahdi said, you can easily use AWS CloudWatch to do this.
However, if you want Consul (or anything outside the scope of an AWS "service") to do it, you can use Lambda.
You would create a Lambda function that scales your group out or in (or both). Lambda can have many triggers, such as an HTTP endpoint through API Gateway. If you already have Consul set up to do it (it sounds like you do, since you said it can trigger events at certain thresholds), just make it issue the HTTP request to API Gateway to scale out or in.
You can create a CloudWatch alarm on the CPUUtilization metric and set it to change state when your instance's CPU utilization is above 75%. Then, in the Auto Scaling group, you use this alarm for a scaling (in/out) policy. You can also control the number of instances in an Auto Scaling group manually (e.g. through your application running on one of the instances) by changing the Desired value. This documentation can be helpful.
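On the question of the triggering interface: it is the Auto Scaling API itself (for example `SetDesiredCapacity`), which any caller with credentials can invoke. A rough sketch of a Lambda handler behind an API Gateway proxy integration follows. The request body shape (`group`, `direction`) is an assumption made up for this example.

```python
import json


def next_capacity(group, direction, step=1):
    """Return the new desired capacity, clamped to the group's min and max size."""
    delta = step if direction == "out" else -step
    wanted = group["DesiredCapacity"] + delta
    return max(group["MinSize"], min(group["MaxSize"], wanted))


def handler(event, context, autoscaling=None):
    """Expects a JSON body like {"group": "web-asg", "direction": "out"} (assumed shape)."""
    body = json.loads(event.get("body") or "{}")
    name, direction = body.get("group"), body.get("direction")
    if not name or direction not in ("out", "in"):
        return {"statusCode": 400, "body": json.dumps({"error": "bad request"})}
    if autoscaling is None:
        import boto3  # available in the Lambda Python runtime
        autoscaling = boto3.client("autoscaling")
    groups = autoscaling.describe_auto_scaling_groups(AutoScalingGroupNames=[name])
    group = groups["AutoScalingGroups"][0]
    desired = next_capacity(group, direction)
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=name, DesiredCapacity=desired, HonorCooldown=False
    )
    return {"statusCode": 200, "body": json.dumps({"desired": desired})}
```

Consul's threshold event would then POST to the API Gateway URL. Clamping to the group's limits keeps a burst of repeated events from requesting a capacity the group would reject.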

AWS: Autoscaling based on the size of the queue

AWS auto scaling works based on the load (number of concurrent requests). It works perfectly for web sites and web APIs. However, there are situations in which the number of required EC2 instances is not related to the requests but depends on something else, such as the number of items in a queue.
For example, an order processing system which pulls orders from a custom queue (and not SQS) might need to scale out to process the orders more quickly. How can we make this happen?
Auto Scaling groups can be configured to scale in or out by linking their scaling policies to CloudWatch alarms. Many people use CPU utilization as a scaling trigger, but you can use any CloudWatch metric you like. In your case you could use your queue's ApproximateNumberOfMessagesVisible metric (that is the SQS metric name; for a custom queue you would publish the queue length as a custom CloudWatch metric).
For example, if you create an alarm that fires when ApproximateNumberOfMessagesVisible > 500 and link that to the scale-out policy of your Auto Scaling group, the group will create new instances whenever the queue has more than 500 messages in it.
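Since the question is about a custom queue rather than SQS, the queue length would first have to reach CloudWatch as a custom metric. A minimal sketch using boto3's `put_metric_data` is shown below. The namespace, metric name, and dimension are made-up names for illustration, and how the depth is read from the queue is left to the caller.

```python
def queue_depth_metric(queue_name, depth):
    """Kwargs for put_metric_data carrying the current length of a custom queue."""
    return {
        "Namespace": "Custom/OrderQueue",  # hypothetical namespace
        "MetricData": [
            {
                "MetricName": "QueueDepth",  # hypothetical metric name
                "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
                "Value": float(depth),
                "Unit": "Count",
            }
        ],
    }


def publish_queue_depth(queue_name, depth, cloudwatch=None):
    """Publish one datapoint; call this periodically (e.g. once a minute) from a worker."""
    if cloudwatch is None:
        import boto3  # needs AWS credentials with cloudwatch:PutMetricData
        cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(**queue_depth_metric(queue_name, depth))
```

An alarm on this metric (for instance, greater than 500) can then be linked to the group's scale-out policy in the same way as an alarm on a built-in metric.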

How do I set up CloudWatch to detect when an EC2 instance goes down?

I've got an app running on AWS. How do I set up Amazon CloudWatch to notify me when the EC2 instance fails or is no longer responsive?
I went through the CloudWatch screens, and it appears that you can monitor certain statistics, like CPU or disk utilization, but I didn't see a way to monitor an event like "the instance got an http request and took more than X seconds to respond."
Amazon's Route 53 Health Check is the right tool for the job.
Route 53 can monitor the health and performance of your application as well as your web servers and other resources.
You can set up HTTP resource checks in Route 53 that will trigger an e-mail notification if the server is down or responding with an error.
http://eladnava.com/monitoring-http-health-email-alerts-aws/
To monitor an event in CloudWatch you create an Alarm, which monitors a metric against a given threshold.
When creating an alarm you can add an "action" for sending a notification. AWS handles notifications through SNS (Simple Notification Service). You can subscribe to a notification topic and then you'll receive an email for your alarm.
For EC2 metrics like CPU or disk utilization this is the guide from the AWS docs: http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/US_AlarmAtThresholdEC2.html
As answered already, use an ELB to monitor HTTP.
This is the list of available metrics for ELB:
http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/US_MonitoringLoadBalancerWithCW.html#available_metrics
To answer your specific question, for monitoring X seconds for the http response, you would set up an alarm to monitor the ELB "Latency".
CloudWatch monitoring is just as you have discovered. You will be able to infer that one of your instances is frozen by looking at the metrics, but CloudWatch won't, for example, send you an email when your app is down or too slow.
If you are looking for some sort of notification when your app or instance is down, I suggest you use a monitoring service. Pingdom is a good option. You can also set up a new instance on AWS and install a monitoring tool, like Nagios, which would be my preferred option.
Good practices that always pay off in the long run: load balancing (Amazon ELB), more than one instance running your app, Autoscaling (when an instance is down, Amazon will automatically start a new one and maintain your SLA), and custom monitoring.
My team used a custom monitoring script for a long time, and we always knew of failures as soon as they occurred. Basically, if we had two nodes running our app, node 1 sent HTTP requests to node 2 and node 2 to node 1. If any request took longer than expected, or returned an unexpected HTTP status or response body, the script sent an email to the system admins. Nowadays, we rely on more robust approaches, like Nagios, which can even monitor operating system details (threads, etc.), application servers (connection pool health, etc.) and so on. It's worth every cent invested in setting it up.
CloudWatch recently added "status check" metrics that will answer one of your questions: whether an instance is down or not. It does not send a request to your web server but rather performs a system check. As the previous answer suggests, use ELB for HTTP health checks.
You could always have another instance for tools/testing. That instance would make the HTTP request on a schedule and measure the response time. You could then publish that response time to CloudWatch and set an alarm for when it goes over a certain threshold.
You could even do that from the instance itself.
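A probe like that could be sketched as follows: time one HTTP request with the standard library and publish the result as a custom metric through boto3's `put_metric_data`. The namespace and metric name are assumptions, and the `fetch` and `clock` parameters exist only so the timing logic can be tested without a network.

```python
import time
import urllib.request


def measure_ms(url, fetch=None, clock=time.perf_counter):
    """Return how long one request to `url` takes, in milliseconds."""
    if fetch is None:
        # urlopen raises on connection errors and on HTTP 4xx/5xx responses.
        fetch = lambda u: urllib.request.urlopen(u, timeout=10).read()
    start = clock()
    fetch(url)
    return (clock() - start) * 1000.0


def response_time_metric(url, millis):
    """Kwargs for put_metric_data; namespace and metric name are hypothetical."""
    return {
        "Namespace": "Custom/HttpProbe",
        "MetricData": [
            {
                "MetricName": "ResponseTime",
                "Dimensions": [{"Name": "Url", "Value": url}],
                "Value": millis,
                "Unit": "Milliseconds",
            }
        ],
    }


def probe_and_publish(url, cloudwatch=None):
    """Measure once and publish; run this from cron or a scheduled job."""
    millis = measure_ms(url)
    if cloudwatch is None:
        import boto3  # needs AWS credentials with cloudwatch:PutMetricData
        cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(**response_time_metric(url, millis))
    return millis
```

If the request fails outright, the exception propagates and no datapoint is published, so the alarm on this metric should also treat missing data as a problem.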
As Kurst Ursan mentioned above, using "Status Check" metrics is the way to go. In some cases you won't be able to browse those metrics (e.g. if you're using AWS OpsWorks), so you're going to have to report that custom metric on your own. However, you can set up an alarm built on a metric that always matches (in an OK state) and have the alarm trigger when the state changes to "INSUFFICIENT DATA". This technically means CloudWatch can't tell whether the state is OK or ALARM because it can't reach your instance, i.e. your instance is offline.
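As a rough illustration of the status check approach, the alarm parameters might be built like this for boto3's `put_metric_alarm`. `StatusCheckFailed` is the EC2 metric name; the alarm name, periods, and the choice to notify on both ALARM and INSUFFICIENT_DATA are assumptions made for this sketch.

```python
def status_check_alarm(instance_id, topic_arn):
    """Kwargs for put_metric_alarm: notify when the instance fails a status check
    or when CloudWatch stops receiving data for it."""
    return {
        "AlarmName": f"{instance_id}-status-check-failed",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],
        # Covers the case described above where the state becomes INSUFFICIENT_DATA.
        "InsufficientDataActions": [topic_arn],
    }


def create_status_check_alarm(instance_id, topic_arn, cloudwatch=None):
    """Create the alarm; needs boto3 and AWS credentials unless a client is passed in."""
    if cloudwatch is None:
        import boto3
        cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**status_check_alarm(instance_id, topic_arn))
```

Here `topic_arn` is the ARN of an SNS topic with an email subscription.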
There are a bunch of ways to get instance health info. Here are a couple.
Watch for instance status checks and EC2 events (planned downtime) in the EC2 API. You can poll those and send them to CloudWatch to create an alarm.
Create a simple daemon on the server which writes to DynamoDB every second (this has better granularity than CloudWatch). Have a second process query the heartbeats and alert when they are missing.
Put all instances behind a load balancer with a dummy port open that gives a TCP response. Set up TCP health checks on the ELB, and alert on unhealthy instances.
Unless you use a product like Blue Matador (automatically notifies you of production issues), it's actually quite heinous to set something like this up, let alone maintain it. That said, if you're going down that road and want some help getting started with CloudWatch (terminology, alerts, logs, etc.), start with this blog: How to Monitor Amazon EC2 with CloudWatch
You can use a CloudWatch Events rule to monitor whenever any EC2 instance goes down. You can create the rule from the CloudWatch console as follows:
In the CloudWatch console, choose Events -> Rules
For Event Pattern, under Service Name, choose EC2
For Event Type, choose EC2 Instance State-change Notification
For Specific States, choose Stopped
In Targets, choose any previously created SNS topic for sending a notification
Source : Create a Rule - https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CloudWatch-Events-Input-Transformer-Tutorial.html#input-transformer-create-rule
This is not exactly a CloudWatch alarm, however this serves the purpose of monitoring/notification.
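The same rule can also be created programmatically. The sketch below builds the event pattern that corresponds to the console choices above and passes it to the boto3 `events` client's `put_rule` and `put_targets`. The rule name and target ID are assumptions for illustration.

```python
import json


def stopped_instance_rule(rule_name="ec2-instance-stopped"):
    """Kwargs for put_rule matching EC2 state-change events with state 'stopped'."""
    pattern = {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": {"state": ["stopped"]},
    }
    return {"Name": rule_name, "EventPattern": json.dumps(pattern), "State": "ENABLED"}


def sns_target(rule_name, topic_arn):
    """Kwargs for put_targets pointing the rule at an existing SNS topic."""
    return {"Rule": rule_name, "Targets": [{"Id": "notify-sns", "Arn": topic_arn}]}


def create_rule(topic_arn, events=None, rule_name="ec2-instance-stopped"):
    """Create the rule and attach the SNS topic as its target."""
    if events is None:
        import boto3  # needs AWS credentials
        events = boto3.client("events")
    events.put_rule(**stopped_instance_rule(rule_name))
    events.put_targets(**sns_target(rule_name, topic_arn))
```

The SNS topic's access policy must allow `events.amazonaws.com` to publish to it, otherwise the rule matches but no notification is delivered.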