I manage a group of integration projects, and for many of them we provide an Amazon instance with our product for development and/or demo purposes. To be economical with IT budgets, I wonder whether there is proxy software that can measure the traffic to those servers, start the instance on the first request, and shut it down if there is no request for a set time (e.g. 60 minutes).
Ideally the first request would trigger a page that informs the user about the delay and keeps auto-reloading until the instance is up.
I'd also love to see usage statistics by IP, so I can measure the spread of users, how many different IPs, and the time they kept up the instance. But that is secondary.
Is there any such software or service out there? Preferably FOSS?
If you utilize Auto Scaling and Custom CloudWatch Metrics you can potentially use any data you want to decide how to Auto Scale. If you have other log sources or application level code, you won't need a proxy, just something to interpret that data, pass it to your CloudWatch metric and then the auto scaling will occur as needed.
Utilizing t1.micro, you can have one instance handling requests and scale out from there with your Auto Scaling group. One- or three-year reserved instances cost very little. You need something to understand incoming volume, so one instance would be required anyway. Using t1.micro, your operating costs are low and you can scale very granularly.
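To make the "something to interpret that data" part concrete, here is a minimal sketch of the idle-timeout bookkeeping the question asks for. The class name `IdleTracker`, its fields, and the 60-minute default are my own assumptions for illustration, not part of any AWS API. It only decides when an instance has been idle long enough; actually stopping the instance or publishing a metric is left out.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional

IDLE_TIMEOUT_SECONDS = 60 * 60  # assumed idle window from the question: 60 minutes


@dataclass
class IdleTracker:
    """Hypothetical helper: remembers request times and per-IP request counts."""
    idle_timeout: float = IDLE_TIMEOUT_SECONDS
    last_request_at: Optional[float] = None
    requests_by_ip: Counter = field(default_factory=Counter)

    def record_request(self, client_ip: str, now: float) -> None:
        # Every request resets the idle clock and is counted against its source IP.
        self.last_request_at = now
        self.requests_by_ip[client_ip] += 1

    def should_stop(self, now: float) -> bool:
        # No request seen yet means nothing was started on demand, so do not stop.
        if self.last_request_at is None:
            return False
        return now - self.last_request_at >= self.idle_timeout


tracker = IdleTracker()
tracker.record_request("203.0.113.5", now=0)
tracker.record_request("203.0.113.9", now=600)
print(tracker.should_stop(now=600 + 3599))  # still inside the idle window
print(tracker.should_stop(now=600 + 3600))  # idle window has elapsed
print(len(tracker.requests_by_ip))          # number of distinct client IPs
```

Timestamps are passed in rather than read from the clock so the logic is easy to test; a real proxy or log parser would feed it `time.time()` values.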
CPU metrics cannot be selected with a period below 1 minute in the CloudWatch service. How can I lower this period so that Auto Scaling is triggered faster? I just need the Auto Scaling instances to launch within a short time. (By the way, the alarm is set to 1 out of 1 datapoints.)
The minimum granularity for the metrics that EC2 provides is 1 minute.
Source: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/viewing_metrics_with_cloudwatch.html
I would also say that if you need to scale that quickly, wouldn't the startup time be an issue anyway?
You are correct -- basic monitoring of an Amazon EC2 instance provides metrics over 5-minute periods. If you activate EC2 Detailed Monitoring, metrics are provided over 1-minute periods. Extra charges apply for Detailed Monitoring.
When launching a new instance via Amazon EC2 Auto-Scaling, it can take a few minutes for the new instance to launch and for the User Data script (if any) to run. Linux instances are quite fast, but Windows instances take a while on their first boot due to sysprep operations.
You mention that you want to react to a metric in less than one minute. I would suggest that this would not be an ideal way to trigger Auto-scaling. Sometimes a computer can be busy for a while, then can drop down again. Reacting too quickly to a high CPU load would cause the Auto-Scaling group to flap between adding instances and terminating instances. It is better to provision enough capacity for a reasonable amount of extra load and then gradually add more capacity as it is required over time.
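As a rough illustration of why reacting to a single datapoint causes flapping, here is a small sketch that only signals a scale-out when the metric stays above a threshold for several consecutive periods. The class name `SustainedBreachDetector` and the numbers (80% threshold, 3 periods) are assumptions chosen for the example; this is plain Python, not CloudWatch's implementation.

```python
from collections import deque


class SustainedBreachDetector:
    """Hypothetical helper: fires only after `periods` consecutive breaches."""

    def __init__(self, threshold: float, periods: int):
        self.threshold = threshold
        # Keep only the most recent `periods` readings.
        self.recent = deque(maxlen=periods)

    def observe(self, value: float) -> bool:
        self.recent.append(value)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(v > self.threshold for v in self.recent)


detector = SustainedBreachDetector(threshold=80.0, periods=3)
readings = [95, 40, 85, 90, 92]  # one brief spike, then a sustained rise
decisions = [detector.observe(r) for r in readings]
print(decisions)
```

The brief spike at the start does not trigger anything; only the last reading, which completes three consecutive values above 80, does.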
If you have a need to react so quickly, then perhaps you should investigate using AWS Lambda to perform small amounts of work in a highly-parallel fashion rather than relying on Amazon EC2 instances.
I am studying AWS, going by the illustration from AWS here:
For the min/max=1 case, what does it imply? It seems to me that there is no scaling, since min = max.
Thank you for enlightening me.
UPDATE:
So here is an example use case:
http://www.briefmenow.org/amazon/how-can-you-implement-the-order-fulfillment-process-while-making-sure-that-the-emails-are-delivered-reliably/
Your startup wants to implement an order fulfillment process for
selling a personalized gadget that needs an average of 3-4 days to
produce, with some orders taking up to 6 months. You expect 10 orders
per day on your first day, 1,000 orders per day after 6 months, and
10,000 orders after 12 months. Orders coming in are checked for
consistency, then dispatched to your manufacturing plant for production,
quality control, packaging, shipment, and payment processing. If the
product does not meet the quality standards at any stage of the
process, employees may force the process to repeat a step. Customers are
notified via email about order status and any critical issues with
their orders, such as payment failure. Your case architecture includes
AWS Elastic Beanstalk for your website with an RDS MySQL instance for
customer data and orders. How can you implement the order fulfillment
process while making sure that the emails are delivered reliably?
Options:
A.
Add a business process management application to your Elastic Beanstalk app servers and re-use the RDS
database for tracking order status. Use one of the Elastic Beanstalk instances to send emails to customers.
B.
Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group
with min/max=1. Use the decider instance to send emails to customers.
C.
Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group
with min/max=1. Use SES to send emails to customers.
D.
Use an SQS queue to manage all process tasks. Use an Auto Scaling group of EC2 instances that poll the tasks
and execute them. Use SES to send emails to customers.
The voted answer is C.
Can anyone kindly explain the reasoning? Thank you very much.
Correct, there will be no scaling outward or inward when min/max=1, or more generally when min=max. This setup is generally used to keep a service available in case of failures.
Consider the alternative; you launch with an EC2 instance that's been bootstrapped with some user data script. If the instance has issues, you'll need to stop it and begin another.
Instead, you launch using an AutoScaling Group with a Launch Configuration that takes care of bootstrapping instances. If your application server begins to fail, you can just de-register it from the AutoScaling Group. AWS will take care of bringing up another instance while you triage the defective one.
Another situation you might consider is when you want the option to deploy a new version of an application with the same AutoScaling Group. In this case, create a new Launch Configuration and register it with the ASG. Increase max and desired by 1 temporarily. AWS will launch the instance for you and if it succeeds, you can then reduce Max and Desired back down to 1. By default, AWS will remove the oldest server but you can guarantee that the new one stays up by using termination protection.
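To illustrate both situations, here is a toy model of a group that always reconciles toward its desired size. It is a simulation written for this answer, not the AWS API: `FixedSizeGroup`, `reconcile`, `mark_unhealthy`, and the instance IDs are all made up, and removing the oldest instance first simply follows the description above.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FixedSizeGroup:
    """Toy stand-in for an Auto Scaling group; names are hypothetical."""
    min_size: int = 1
    max_size: int = 1
    desired: int = 1
    healthy: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def reconcile(self) -> None:
        # Clamp desired capacity into [min_size, max_size], then converge on it.
        target = max(self.min_size, min(self.desired, self.max_size))
        while len(self.healthy) < target:
            self.healthy.append(f"i-{next(self._ids):04d}")  # "launch" a replacement
        while len(self.healthy) > target:
            self.healthy.pop(0)  # remove the oldest instance first

    def mark_unhealthy(self, instance_id: str) -> None:
        self.healthy.remove(instance_id)


group = FixedSizeGroup()
group.reconcile()
print(group.healthy)            # one instance running

group.mark_unhealthy("i-0001")  # the instance fails
group.reconcile()
print(group.healthy)            # a replacement was launched

group.max_size = group.desired = 2  # temporary bump for a deployment
group.reconcile()
group.max_size = group.desired = 1  # back down to one
group.reconcile()
print(group.healthy)            # only the newest instance remains
```

With min=max=1 the group never grows under load, but it still replaces a failed instance, which is the point of the decider setup in option C.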
Our web application has 5 pages (Signin, Dashboard, Map, Devices, Notification)
We have run a load test for this application, and the load test script does the following:
Signin and go to Dashboard page
Click Map
Click Devices
Click Notification
We have a basic free plan in AWS.
During the load test, we did not get any errors up to about 100 users (please see the image below). NetworkIn and CPUUtilization seemed normal, but NetworkOut showed 846K.
But when we reached around 114 users, we started getting errors on the Map page (highlighted in red). At that time, only NetworkOut seemed to be high. Please see the image below.
We want to know what the optimal value for NetworkOut is. If this number is high, is there any way to reduce it?
Please let me know if you need more information. Thanks in advance for your help.
You are using a t2.micro instance.
This instance type has CPU limitations that make it good for bursty workloads, but sustained loads will consume all the available CPU credits. Thus, it might perform poorly under sustained loads over long periods.
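A back-of-the-envelope sketch of how quickly sustained load drains CPU credits. One CPU credit corresponds to one vCPU running at 100% for one minute. The earn rate of 6 credits per hour and the starting balance of 144 used below are the figures I would assume for a t2.micro; treat them as assumptions and check the current AWS documentation for your instance type. The function name is hypothetical.

```python
from typing import Optional


def minutes_until_credits_exhausted(
    balance: float,
    utilization: float,
    credits_earned_per_hour: float = 6.0,  # assumed t2.micro earn rate
) -> Optional[float]:
    """Minutes until the balance hits zero at a constant utilization (0.0 to 1.0).

    Returns None when credits are earned at least as fast as they are spent.
    """
    # One credit = one vCPU-minute at 100%, so 100% load burns 60 credits/hour.
    burn_per_hour = utilization * 60.0
    net_per_hour = burn_per_hour - credits_earned_per_hour
    if net_per_hour <= 0:
        return None
    return balance * 60.0 / net_per_hour


# Full balance (assumed 144 credits) under a load test pinning the CPU at 100%.
print(minutes_until_credits_exhausted(balance=144, utilization=1.0))
# Light load below the earn rate never exhausts the balance.
print(minutes_until_credits_exhausted(balance=144, utilization=0.05))
```

Once the balance reaches zero the instance is throttled to its baseline, which is why a long load test can look fine at first and then degrade.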
The instance also has limited network bandwidth that might impact the throughput of the server. While all Amazon EC2 instances have limited allocations of bandwidth, the t2.micro and t2.nano have particularly low bandwidth allocations. You can see this when copying data to/from the instance and it might be impacting your workloads during testing.
The t2 family, especially at the low-end, is not a good choice for production workloads. It is great for workloads that are sometimes high, but not consistently high. It is also particularly low-cost, but please realise that there are trade-offs for such a low cost.
See:
Amazon EC2 T2 Instances – Amazon Web Services (AWS)
CPU Credits and Baseline Performance for Burstable Performance Instances - Amazon Elastic Compute Cloud
Unlimited Mode for Burstable Performance Instances - Amazon Elastic Compute Cloud
That said, the network throughput showing on the graphs is a result of your application. While the t2 might be limiting the throughput, it is not responsible for the spike on the graph. For that, you will need to investigate the resources being used by the application(s) themselves.
NetworkOut simply refers to the volume of outgoing traffic from the instance. To reduce NetworkOut, you need to reduce the amount of data being sent from this instance. So you may need to see which of Click Map, Click Devices, and Click Notification is sending traffic out of the instance. It may not be related only to the number of users, but to a combination of the number of users and the application module.
I have auto-scaling set up, currently listening to CPU usage for scaling in and out. There are now scenarios where our servers go out of service because they run out of memory. I applied custom metrics to get that data from the instances using the Perl scripts. Is it possible to have a scaling policy that listens to those custom metrics?
Yes!
Just create an Alarm (eg Memory-Alarm) on the Custom Metric and then adjust the Auto Scaling group to scale based on the Memory-Alarm.
You should pick one metric to trigger the scaling (CPU or Memory) -- attempting to scale with both could cause problems where one alarm is high and another is low.
Update:
When creating an Alarm on an Auto Scaling group, it uses only one alarm and the alarm uses an aggregated metric across all instances. For example, it might be Average CPU Utilization. So, if one instance is at 50% and another is at 100%, the metric will be 75%. This way, it won't add instances just because one instance is too busy.
This will probably cause a problem for your memory metric because aggregating memory across the group makes little sense. If one machine has zero free memory but another has plenty, it won't add more instances. This is fine in the sense that one machine can handle more load, but it won't really be a good measure of 'how busy' the servers are.
If you are experiencing "servers got out of service due to out of memory", the best thing to do is to configure the Health Check on the load balancer such that it can detect whether an instance can handle requests. If the health check fails on an instance, the load balancer will stop sending requests to that server until the Health Check is successful. This is the correct way to identify specific instances that are having problems, rather than trying to scale out.
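A tiny sketch of the aggregation effect described above. The function `alarm_fires` and the 80% threshold are assumptions for illustration; the point is only that an alarm on the group average can stay quiet while one instance is saturated.

```python
from statistics import mean


def alarm_fires(per_instance_values, threshold):
    """Return (group average, whether an average-based alarm would fire)."""
    average = mean(per_instance_values)
    return average, average > threshold


# CPU example from above: 50% and 100% average out to 75%.
print(alarm_fires([50, 100], threshold=80))

# Memory used (%): one instance is completely out of memory, the other is
# nearly idle. The average is 60%, so the alarm stays quiet.
print(alarm_fires([100, 20], threshold=80))
```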
At any rate, you should investigate your memory issues and determine whether it is actually related to load (how many requests are being handled) or whether it's a memory leak in the application.
I am in need of a fairly short/simple script to monitor my EC2 instances for Memory and CPU (for now).
After running Get-EC2Instance -Region , I get a list of all the instances. Where can I go from here?
CloudWatch is the monitoring tool for AWS instances. While it can support custom metrics, by default it only measures what the hypervisor can see for your instance.
CPU utilization is supported by default. This is often a more accurate way to see your true CPU utilization, since the value comes from the hypervisor.
Memory utilization, however, is not. It depends largely on your OS and is not visible to the hypervisor. However, you can set up a script that reports this metric to CloudWatch. Some scripts to help you do this are here: http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/mon-scripts-perl.html
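For a sense of what such a script does, here is a minimal sketch that assumes a Linux guest: it computes used memory from text in the `/proc/meminfo` format and builds a datum in the shape that boto3's `put_metric_data` accepts. The metric name `MemoryUtilization`, the instance ID, and the helper names are placeholders I chose; the linked Perl scripts may calculate and name things differently. The sample text is inlined so the example runs anywhere.

```python
def parse_meminfo(text: str) -> dict:
    """Parse '/proc/meminfo'-style lines into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def memory_used_percent(meminfo: dict) -> float:
    # Treat everything that is not "available" as used.
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def build_metric_datum(instance_id: str, used_percent: float) -> dict:
    # Shape of one MetricData entry; the metric name is a placeholder.
    return {
        "MetricName": "MemoryUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Value": used_percent,
        "Unit": "Percent",
    }


SAMPLE = """MemTotal:        1000000 kB
MemFree:          100000 kB
MemAvailable:     250000 kB
"""

datum = build_metric_datum("i-0123456789abcdef0",
                           memory_used_percent(parse_meminfo(SAMPLE)))
print(datum)
```

On a real instance you would read `/proc/meminfo` instead of `SAMPLE` and pass the datum in a `MetricData` list to a CloudWatch client's `put_metric_data` call under a custom namespace, run on a schedule such as cron. On Windows the memory figure would have to come from a different source.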
There are a few possibilities for monitoring EC2 instances.
Nagios - http://www.nagios.com/solutions/aws-monitoring
StackDriver - http://www.stackdriver.com/
CopperEgg - http://copperegg.com/aws/
But my favorite is Datadog - http://www.datadoghq.com/ - (not just because I work there, but it's important to disclose that I do work for Datadog). 5 hosts or fewer is free, and I bet you can be up and running in less than 5 minutes.
It depends on what your requirements are for service availability of the monitoring solution itself, as well as how you want to be alerted about host/service notifications.
Nagios, Icinga, etc. will allow you to customise an extremely large number of parameters that can be passed to your EC2 hosts, specifying exactly what you want to monitor or check up on. You can run any of the default (or custom) scripts, which then feed data back to a central system, and then handle those notifications however you want (e.g. send an email or SMS, or execute an arbitrary script). The downside of this approach is that you need to self-manage your backend for all of the aggregated monitoring data.
With the CloudWatch approach, your instances push metric data into AWS and you then define custom policies around thresholds, for example 90% CPU usage for more than 5 minutes on an instance or ASG, which might then push a message out to your email via SNS (Simple Notification Service). This method reduces the number of backend components to manage and maintain, but lacks the extreme customisation abilities of self-hosted monitoring platforms.