Increase vCPUs/RAM if needed - amazon-web-services

I have created an AWS EC2 instance to run a computation routine. It works for most cases, but every now and then a user needs to run a computation that crashes my program due to lack of RAM.
Is it possible to scale the EC2 instance's RAM and/or vCPUs when required, or when a certain threshold is reached (say, when 80% of RAM is used)? What I'm trying to avoid is keeping an unnecessarily large instance; I only want to scale resources when needed.

It is not possible to adjust the number of vCPUs or the amount of RAM on a running Amazon EC2 instance.
Instead, you must:
Stop the instance
Change the Instance Type
Start the instance
The virtual machine will be provisioned on a different 'host' computer that has the correct resources matched to the Instance Type.
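The three steps above can be scripted. Here is a minimal sketch, assuming `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`) and that the instance is EBS-backed so it can be stopped. The function name `resize_instance` is my own, not part of any AWS API.

```python
def resize_instance(ec2, instance_id, new_type):
    """Stop an instance, change its type, and start it again.

    `ec2` is assumed to be a boto3 EC2 client; `resize_instance` itself
    is a hypothetical helper name used for illustration.
    """
    # 1. Stop the instance and block until it is fully stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # 2. Change the instance type (only possible while stopped).
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # 3. Start the instance and wait until it is running again.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

Keep in mind that the instance is unavailable while this runs, so it is not a substitute for scaling on a live RAM threshold.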
A common approach is to scale the quantity of instances to handle the workload. This is known as horizontal scaling, and it works well where work can be distributed among multiple computers, rather than making a single computer 'bigger' (which is vertical scaling).
The only exception to the above is burstable performance instances, which can provide high amounts of CPU but only for limited periods. This is great when you have bursty needs (e.g. hourly processing or spiky workloads), but they should not be used when there is a need for consistently high workloads.
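To get a feel for what "limited periods" means, here is a rough back-of-the-envelope sketch. It assumes the CPU credit model used by burstable instances (one credit is one vCPU at 100% for one minute) and, for the example call, the t2.micro figures of 6 credits earned per hour and a maximum balance of 144. Check the current numbers for your instance type; the function name is hypothetical.

```python
def minutes_at_full_load(credit_balance, credits_earned_per_hour, vcpus=1):
    """Estimate how long a burstable instance can hold 100% CPU.

    Assumes one CPU credit = one vCPU at 100% for one minute, so full load
    spends `vcpus` credits per minute while credits keep accruing.
    """
    spend_per_minute = float(vcpus)
    earn_per_minute = credits_earned_per_hour / 60.0
    net_drain = spend_per_minute - earn_per_minute
    if net_drain <= 0:
        return float("inf")  # earning at least as fast as spending
    return credit_balance / net_drain


# Assumed t2.micro figures: full balance of 144 credits, 6 credits earned per hour.
print(round(minutes_at_full_load(144, 6)))  # about 160 minutes
```

Once the balance is gone the instance is throttled back to its baseline, which is why this does not help with consistently high workloads.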

Related

Why is my EC2 instance not accessible to others?

I successfully deployed a machine learning classification model on an AWS EC2 (Ubuntu) instance. I am able to access the instance at "http://ec2-18-191-31-0.us-east-2.compute.amazonaws.com" and predictions work fine, but only for a few minutes. After that, neither I nor my colleagues can access it. We get the error "cannot connected to the server".
The security group that I created is attached.
t2.micro instances are not suitable for long-running calculations. They are burstable, which means that high performance can be sustained only for short periods of time, e.g. sudden, short-lived spikes in CPU usage. On top of that, they have only 1 GB of RAM, which limits their usefulness for machine learning.
For calculations, you could consider Compute optimized or Memory optimized instances. Obviously, these instance types are not free, but they are suited for calculations.
You can change the instance type if you want and test with other, more powerful types. What you are describing indicates that your t2.micro exhausts all its RAM and/or CPU burst credits after a few minutes and freezes.
You can use CloudWatch metrics for EC2 to monitor your instance and observe its CPU utilization and other metrics, which can help you determine what exactly is causing the problem. You can also monitor RAM and disk usage, but this requires setting up the CloudWatch agent on the instance.
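As an illustration of checking the burst credits programmatically, here is a small sketch. It assumes `cloudwatch` is a boto3 CloudWatch client (for example `boto3.client("cloudwatch")`) and reads the `CPUCreditBalance` metric that EC2 publishes for burstable instances. The helper name and the 30-minute lookback are my own choices.

```python
from datetime import datetime, timedelta, timezone


def latest_credit_balance(cloudwatch, instance_id, lookback_minutes=30):
    """Return the most recent CPUCreditBalance average, or None if no data.

    `cloudwatch` is assumed to be a boto3 CloudWatch client.
    """
    end = datetime.now(timezone.utc)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUCreditBalance",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=end - timedelta(minutes=lookback_minutes),
        EndTime=end,
        Period=300,  # 5-minute datapoints
        Statistics=["Average"],
    )
    # Datapoints are not guaranteed to arrive in time order, so sort first.
    points = sorted(resp["Datapoints"], key=lambda p: p["Timestamp"])
    return points[-1]["Average"] if points else None
```

A balance that trends towards zero shortly before the instance stops responding would support the burst credit explanation; a healthy balance would point towards RAM instead.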

Would it be best to scale fewer larger instances, or more smaller instances?

Which is the better option where performance is concerned: a smaller number of large instances, or a larger number of small instances? CloudWatch (with load balancing and auto scaling) will be used if traffic floods the servers.
AWS is all about ELASTICITY
There is no need to provision large instances when they are not needed and burn money.
There can be many cases where the CPU on one instance runs high while the next large instance you created remains under-utilized.
You should use small to medium instances of the family you require (memory-intensive, CPU, or network) and scale those instances with properly written policies.
As long as the user data and AMI are stable, you can spawn many instances within minutes, which keeps you from overspending and saves every penny.
SCALE WHEN NEEDED HORIZONTALLY
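As one possible shape for a "properly written policy", here is a sketch that builds the parameters for a target tracking policy on average CPU. The group name, policy name and the 50% target are placeholders of mine; the resulting dict is meant to be passed to the boto3 Auto Scaling client's `put_scaling_policy`.

```python
def cpu_target_tracking_policy(group_name, target_cpu_percent=50.0):
    """Build put_scaling_policy parameters that keep average CPU near a target.

    The policy name and default target are illustrative assumptions.
    """
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu_percent,
        },
    }


# Usage, assuming boto3 is installed and a group called "web-asg" exists:
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(
#       **cpu_target_tracking_policy("web-asg"))
```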
This is heavily dependent on your application.
I agree with Faisal Nizam's intuition of favoring horizontal scaling. However, there are many applications that will not run very well on small instances.
For example, Elastic recommends running Elasticsearch cluster nodes with 64 GB of RAM. Similar reasoning applies to many other data-related applications, where it is beneficial for a single instance to keep large chunks of data in memory.
I would recommend finding the ideal instance size for your application, and scaling horizontally from there.
Each EC2 instance also has some overhead, so you need to find a balance between a few large, costly instances and many small instances that each carry that overhead.
(As of today) To vertically scale up/scale down an EC2 server, it needs to be shut down and spun back up - something to keep in mind before deciding to go for it.

Which AWS EC2 instance types are suitable for a chat application?

Currently I'm building a chat application based on Node.js.
So I am considering which instance type is best for our server.
AWS has a lot of choices: general purpose, compute optimized, memory optimized, ...
Could you please give me some advice? :(
You can read this - https://aws.amazon.com/blogs/aws/choosing-the-right-ec2-instance-type-for-your-application/
Actually, it doesn't matter which hosting you choose: AWS, MS Azure, Google Compute Engine, etc.
If you want to get as much as you can from your servers and infrastructure, you need to focus on your current requirements.
First of all, decide how many simultaneously active users you expect in the next 3-6 months.
If there will be fewer than 1000k active users (connections) per second, I think you can start with the smallest instance type. You should check how you can increase the CPU/RAM/HDD (or SSD) of your instance,
so that when you get more users you already have a plan for how to speed up your server.
And keep an eye on your server analytics (CPU/RAM/IO utilization) as you get more and more users.
Another question is whether you need to pass any certifications related to security restrictions...
Since you are not quite sure where to start, I would recommend starting with a general purpose EC2 instance from the M category (M3 or M4) for production. You can start with a smaller instance type like m3.medium.
Note: If it's an internal chat application with low traffic, you can even consider T series EC2 instances.
The important part here is not to try to predict the capacity needs. Instead, start small with a general purpose EC2 instance and, down the line, do proper capacity planning by looking at the resource consumption of that instance. Since you can scale instances both horizontally and vertically, you will need to trade off instance type against cost and load over time before selecting the EC2 instance size to use as your scaling unit.
One approach I follow is:
1. Start with a general purpose instance (unless I'm confident there are special needs such as networking, I/O, etc.).
2. Load test the application on a single EC2 instance (without auto scaling), varying the number of users to find its limits (how many users a single EC2 instance can handle).
3. After analyzing memory, CPU and I/O utilization, consider shifting to a different EC2 category or sticking with the same one. (Say CPU hits its limit but memory is hardly used: you can consider C series instances.)
4. Scale the EC2 instance vertically by moving to the next size (e.g. m3.medium to m3.large) and repeat the load tests to find its limits.
5. After repeating steps 3 and 4, you can find an optimal balance between cost and performance.
Take three instance types, with X as the cost of the smallest one selected (increasing the EC2 size by one step normally doubles the cost; the figures below are simplified for illustration):
m3.medium - can serve 100 users, cost X
m3.large - can serve 220 users, cost 2X
m3.xlarge - can serve 300 users, cost 3X
It's an easy choice to select m3.large as the EC2 instance size, since it serves 110 users per X of cost, against 100 for each of the other two.
However, it's not straightforward for some applications, where you need to decide the instance type based on your average expected load.
6. Set up auto scaling and load balancing to scale the EC2 instances horizontally and handle above-average load.
For more details, refer to the Architecting for the Cloud: Best Practices whitepaper.
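The cost comparison in the example above is just users served divided by cost. A tiny sketch, using the illustrative figures from that example (costs expressed in units of X, and user counts that would really come from your own load tests):

```python
# name: (users served in the load test, cost in units of X)
# These are the illustrative figures from the example, not measurements.
options = {
    "m3.medium": (100, 1),
    "m3.large": (220, 2),
    "m3.xlarge": (300, 3),
}


def users_per_cost(options):
    """Return users served per unit of cost for each instance type."""
    return {name: users / cost for name, (users, cost) in options.items()}


ratios = users_per_cost(options)
best = max(ratios, key=ratios.get)
print(ratios)  # {'m3.medium': 100.0, 'm3.large': 110.0, 'm3.xlarge': 100.0}
print(best)    # m3.large
```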
I would recommend starting with a t2.micro Linux instance. Watch the CPU usage in CloudWatch. Once the CPU usage starts to exceed 50% to 75%, or free memory gets low, or disk I/O gets saturated, switch to the next larger instance.
t2.micro Linux instances are (for the most part) free. Read the fine print. t2.micro instances are burstable, which means that you can get good performance from a small instance.
Unless your chat application has a huge customer / transaction base, you (probably) won't need the other instance types.
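The "switch to the next larger instance" rule above can be written down as a simple check. This is only a sketch: the thresholds are assumptions to tune, disk I/O is left out for brevity, and the function name is hypothetical. The metric values would come from CloudWatch (free memory requires the CloudWatch agent).

```python
# T2 sizes in increasing order.
T2_SIZES = [
    "t2.nano", "t2.micro", "t2.small", "t2.medium",
    "t2.large", "t2.xlarge", "t2.2xlarge",
]


def next_instance_type(current, cpu_percent, free_memory_mb,
                       cpu_threshold=75.0, min_free_memory_mb=100.0):
    """Suggest the next larger T2 size if CPU is high or free memory is low.

    Thresholds are illustrative defaults, not AWS recommendations.
    """
    overloaded = (cpu_percent > cpu_threshold
                  or free_memory_mb < min_free_memory_mb)
    if not overloaded:
        return current
    idx = T2_SIZES.index(current)
    # Stay on the largest size if there is nothing bigger in the family.
    return T2_SIZES[min(idx + 1, len(T2_SIZES) - 1)]
```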

Ensuring consistent network throughput from AWS EC2 instance?

I have created a few AWS EC2 instances. However, my data throughput (both upload and download) sometimes becomes highly limited on certain servers.
For example, I typically get about 15-17 MB/s throughput from an instance located in US West (Oregon). However, sometimes, especially when I transfer a large amount of data in a single day, my throughput drops to 1-2 MB/s. When it happens on one server, the other servers still have typical network throughput (as expected).
How can I avoid it? And what can cause this?
If it is due to the amount of data I upload/download, how can I avoid it?
At the moment, I am using t2.micro type instances.
Simple answer: don't use micro instances.
AWS is a multi-tenant environment; as such, resources are shared. When it comes to network performance, the larger instance sizes get higher priority. Only the largest instances get any sort of dedicated performance.
Micro and nano instances get the lowest priority of all instance types.
This matrix will show you what priority each instance size gets:
https://aws.amazon.com/ec2/instance-types/#instance-type-matrix

Confusion about instances used inside an Amazon EC2 Container Service

When an EC2 Container Engine cluster is created, it creates a Compute Engine managed instance group to manage the created instances. These instances come from the EC2 service, which means they are virtual machines.
But we know that containers represent a new way to deploy applications, based on operating-system-level virtualization rather than hardware virtualization like VMs, which are heavyweight and non-portable. Isn't this a contradiction? Correct me if I'm wrong.
We use containers because they are extremely fast (both in boot time and in task execution) compared to VMs, and they save a lot of storage space. So if we have one node (VM) that can support at most 4 containers, our clients can rapidly launch 4 containers, but beyond this number the EC2 autoscaler will need to launch a new node (VM) to support upcoming containers, which delays some tasks.
Is it impossible to launch containers on physical machines?
And what do you recommend for running time-critical tasks?
I believe you are working under an erroneous assumption that ECS scales the virtual machines ("container instances" -- the instances where containers will run) directly with task demand.
If that were true, you would have a point, because the cluster would be sluggish and unresponsive any time sufficient container instance resources were not immediately available.
ECS doesn't do that, the presence of the Auto Scaling Group notwithstanding.
Depending on the Amazon EC2 instance types you use in your clusters, and quantity of container instances you have in a cluster, your tasks have a limited amount of resources that they can use when they are run. ECS monitors the resources available in the cluster to work with the schedulers to place tasks. If your cluster runs low on any of these resources, such as memory, you will eventually be unable to launch more tasks until you add more container instances, reduce the number of desired tasks in a service, or stop some of the running tasks in your cluster to free up the constrained resource. (emphasis added)
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch_alarm_autoscaling.html
So, no... it doesn't launch the new tasks slowly when you are out of capacity. It doesn't launch them at all.
But don't get ahead of me.
The link above explains, with examples, how scaling of the virtual machines (container instances) is designed to actually work.
Of course, you don't have to make them adaptively scalable at all. You can go with your physical server model (note: I say physical server model -- meaning a fixed, inelastic pool of resources, on always-running virtual machines, since virtual machines are what EC2 provides), and just choose how many instances you want to have running at all times, essentially emulating physical servers. If you wanted, say, 8 container instances, the "auto scaling group" would maintain exactly 8 at all times, creating replacements if, say, one of them experienced a hardware failure. That "auto" accomplishment would be maintaining the status quo. And, of course, in this configuration, you could manually reconfigure from 8 to, say, 12 and the "auto" accomplishment would be that you'd automatically get 4 new ones to add to the existing 8.
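For what it's worth, the fixed-size configuration described above amounts to setting the group's minimum, maximum and desired capacity to the same number. A minimal sketch, assuming `autoscaling` is a boto3 Auto Scaling client (for example `boto3.client("autoscaling")`); the helper name and group name are hypothetical.

```python
def pin_group_size(autoscaling, group_name, size):
    """Hold an Auto Scaling group at exactly `size` instances.

    With min == max == desired, the group only replaces failed instances;
    it never scales in or out on its own.
    """
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=group_name,
        MinSize=size,
        MaxSize=size,
        DesiredCapacity=size,
    )


# Going from 8 to 12 container instances is then one more call:
#   pin_group_size(autoscaling, "ecs-container-instances", 12)
```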
But the idea of how the service is ideally used is that your group of virtual machines scales up and down by rules you devise, to anticipate the resources needed by future tasks -- or a future lack of tasks.
In the example given, memory reservation is the trigger:
When the memory reservation of your cluster rises above 75% (meaning that only 25% of the memory in your cluster is available for new tasks to reserve), the alarm triggers the Auto Scaling group to add another instance and provide more resources for your tasks and services.
It triggers the addition of more container instances so that you always have whatever you have determined to be the appropriate threshold of surplus capacity already online by the time you need it.
Of course, memory is just one resource, and 75% is just an arbitrary threshold chosen for the example.
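To make the memory reservation example concrete, here is a sketch of the alarm parameters. It assumes you already have a scale-out policy on the Auto Scaling group and know its ARN; the alarm name, period and evaluation settings are placeholders of mine. The dict is meant for the boto3 CloudWatch client's `put_metric_alarm`.

```python
def memory_reservation_alarm(cluster_name, scale_out_policy_arn, threshold=75.0):
    """Build put_metric_alarm parameters for an ECS memory reservation alarm.

    Fires when the cluster's MemoryReservation average rises above
    `threshold` percent and invokes the given scale-out policy.
    """
    return {
        "AlarmName": f"{cluster_name}-memory-reservation-high",
        "Namespace": "AWS/ECS",
        "MetricName": "MemoryReservation",
        "Dimensions": [{"Name": "ClusterName", "Value": cluster_name}],
        "Statistic": "Average",
        "Period": 60,             # seconds; illustrative choice
        "EvaluationPeriods": 1,   # illustrative choice
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [scale_out_policy_arn],
    }


# Usage, assuming boto3 is installed:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **memory_reservation_alarm("my-cluster", policy_arn))
```

The point is that the alarm watches reserved capacity ahead of time, so the extra container instance is already online before a task actually needs it.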
Auto Scaling groups can scale on a variety of triggers -- the phase of the moon, the price trends in the stock market; anything that is appropriate for anticipating your desired amount of surplus capacity, and that can be quantified and monitored, can be used... but this service does not scale itself directly in response to an actual attempt to launch a new task that fails due to insufficient resources.
Herein lies the flaw in your original argument.
Why virtual machines? Simply enough, because when you destroy a virtual machine because the capacity is not expected to be needed, you stop paying for it.
In this light, perhaps you'll agree that this is not a weakness, it's a strength. Physical servers never stop costing you when you are not using them.
You don't need to pay anything at all for capacity you will not be needing with VMs -- you only have to pay for the capacity you're using plus the amount you need to keep immediately available to handle anticipated demand.
You can have as much idle surplus immediately ready as you are willing to pay for, or you can maximize savings by allowing as little surplus capacity as you are comfortable with being able to access without delay.