AWS autoscaling an existing instance

AWS autoscaling an existing instance - amazon-web-services

This question has a conceptual and practical parts.
Conceptually I'd like to know if using the autoscaling functionality is equivalent to simply increasing the compute power by a factor of the number of added instances?
Practically ... how does this work? I have one running instance, its database sitting on an LVM composed of multiple EBS volumes, similarly with all website data. Judging from the load on the instance I either need to upgrade to a more powerful instance or introduce this autoscaling. Is it a copy of the running server? If so, how is the database (etc) kept consistent?
I've read through the AWS documentation, and still haven't got the picture yet - I could set one autoscaling group up which would probably clear my doubts, but I am very leery to do this with a production server.
Any nudges in the right direction would be welcome.

Normally if you have a solution that also uses a database, and several machines in the solution, the database is typically not on any of the machines but is instead hosted seperately with each worker machine pointing to the same database - if you are on AWS platform already, then DynamoDB or RDS are both good solutions for this.
In theory, for some applications, upgrading the size of the single machine will give you the same power as adding several smaller machines, but increasing the size of the single machine, while usually these easiest thing to do at first, should not be considered autoscaling and has its own drawbacks. Here are some things to consider:
Using multiple machines instead of one big one gives you some fault tolerance. One or more machines can go down and if your solution is properly designed new machines will spin up to replace them.
Increasing the size of a single machine solution means you are probably paying too much. If you size that single machine big enough to handle peak workloads, that means at other times (maybe most of the time), you are paying for a bigger machine than you need. If you setup your autoscaling solution properly more machines come on line in response to increasing demand, and then they terminate when that demand decreases - you only pay for the power you need when you need it.
When your solution is designed in this manner, you need to think of all of the worker machines as ephermal - likely to disappear at any time, so you need to build your solution differently. Besides using a hosted database (like on DynamoDB or AWS RDS), you also should not store any data on the machines in your auto-scaling group that doesn't also live somewhere else. For example, if part of your app allows users to upload images, you don't store them on the instances, you store them in S3. Same would apply to any other new data that comes in.
You need to be able to figuratively 'pull the plug' at any instant on any of the machines in your ASG without losing data.
Ultimately a properly setup auto-scaling solution will likely serve you better, but without doubt it is simpler to just 'buy a bigger machine' and the extra money you spend on running that bigger machine may be more than offset by the time and effort you don't have to spend re-architecting your solution to properly run in an autoscaling environment. The unique requirements of your solution will ultimately decide which approach is better.

Related

Replicated caching solutions compatible with AWS

My use case is as follow:
We have about 500 servers running in an autoscaling EC2 cluster that need to access the same configuration data (layed out in a key/value fashion) several million times per second.
The configuration data isn't very large (1 or 2 GBs) and doesn't change much (a few dozen updates/deletes/inserts per minute during peak time).
Latency is critical for us, so the data needs to be replicated and kept in memory on every single instance running our application.
Eventual consistency is fine. However we need to make sure that every update will be propagated at some point. (knowing that the servers can be shutdown at any time)
The update propagation across the servers should be reliable and easy to setup (we can't have static IPs for our servers, or we don't wanna go the route of "faking" multicast on AWS etc...)
Here are the solutions we've explored in the past:
Using regular java maps and use our custom built system to propagate updates across the cluster. (obviously, it doesn't scale that well)
Using EhCache and its replication feature. But setting it up on EC2 is very painful and somehow unreliable.
Here are the solutions we're thinking of trying out:
Apache Ignite (https://ignite.apache.org/) with a REPLICATED strategy.
Hazelcast's Replicated Map feature. (http://docs.hazelcast.org/docs/latest/manual/html-single/index.html#replicated-map)
Apache Geode on every application node. (http://geode.apache.org/)
I would like to know if each of those solutions would work for our use case. And eventually, what issues I'm likely to face with each of them.
Here is what I found so far:
Hazelcast's Replicated Map is somehow recent and still a bit unreliable (async updates can be lost in case of scaling down)
It seems like Geode became "stable" fairly recently (even though it's supposedly in development since the early 2000s)
Ignite looks like it could be a good fit, but I'm not too sure how their S3 discovery based system will work out if we keep adding / removing node regularly.
Thanks!

Geode should work for your use case. You should be able to use a Geode Replicated region on each node. You can choose to do synchronous OR asynchronous replication. In case of failures, the replicated region gets an initial copy of the data from an existing member in the system, while making sure that no in-flight operations are lost.
In terms of configuration, you will have to start a couple/few member discovery processes (Geode locators) and point each member to these locators. (We recommend that you start one locator/AZ and use 3 AZs to protect against network partitioning).
Geode/GemFire has been stable for a while; powering low latency high scalability requirements for reservation systems at Indian and Chinese railways among other users for a very long time.
Disclosure: I am a committer on Geode.

Ignite provides native AWS integration for discovery over S3 storage: https://apacheignite-mix.readme.io/docs/amazon-aws. It solves main issue - you don't need to change configuration when instances are restarted. In a nutshell, any nodes that successfully joins topology writes its coordinates to a bucket (and removes them when fails or leaves). When you start a new node, it reads this bucket and connects to one the listed addresses.

Hazelcast's Replicated Map will not work for your use-case. Note that it is a map that is replicated across all it's nodes not on the client nodes/servers. Also, as you said, it is not fully reliable yet.
Here is the Hazelcast solution:
Create a Hazelcast cluster with a set of nodes depending upon the size of data.
Create a Distributed map(IMap) and tweak the count & eviction configurations based on size/number of key/value pairs. The data gets partitioned across all the nodes.
Setup Backup count based on how critical the data is and how much time it takes to pull the data from the actual source(DB/Files). Distributed maps have 1 backup by default.
In the client side, setup a NearCache and attach it to the Distributed map. This NearCache will hold the Key/Value pair in the local/client side itself. So the get operations would end up in milliseconds.
Things to consider with NearCache solution:
The first get operation would be slower as it has to go through network to get the data from cluster.
Cache invalidation is not fully reliable as there will be a delay in synchronization with the cluster and may end reading stale data. Again, this is same case across all the cache solutions.
It is client's responsibility to setup timeout and invalidation of Nearcache entries. So that the future pulls would get fresh data from cluster. This depends on how often the data gets refreshed or value is replaced for a key.

What is an instance and how do I convert this to $

You'll have to excuse my ignorance on this one...but honestly, I've had a hard time finding clarity on this. That being said, I'm looking for a non technical answer...something in layman's terms!
Anyways, I've been playing around building a web app (first time obviously) and I'm getting to the point where I've started looking into hosting services. A quick google search and a few blogs later, I thought AWS would be a good place to start, since they give a free-year trial. I don't care about speedy upstarts or other hosting serves, so save your key strokes on offering other services.
My question is based on the fact that AWS charges "Linux Usage per hour" and they also use this term "instance". Yeah...an "instance" is an "object", which is also above my head (probably the real source of the problem), but that was the extent I was able to learn via a google search. That being said, I don't know how to translate the cost into a ball park cost. Yes, I can probably use the trial to help monitor predictable costs, but I don't want to go through the effort of learning one hosting companies system just to find out it's not going to work in the end.
OK...so hopefully by now you see where I'm coming from. What is an "instance" and how do I use the "Linux Usage per hour" to estimate cost? Is an instance a server? For example if I start NGINX is that in instance? Is it just one instance running NGINX or does every VPN represent an instance? If I have 100 people calling the server at once, can they fit on one instance? If I start another server say, Apache or Node, does that become another instance? If I connect to a database, is that an instance? Do instances start as needed? Yes, I know, that's more than one question...I'm just trying to express my confusion.
If I'm suppose to choose a pricing model from this list, "Linux Usage per hour", I need to know what them mean by "Linux Usage". If it's based on an "instance", I need to know what that is. So please, in layman's terms, help clear this up. Maybe some examples or analogies, but no deep technical stuff.
This is more a side note, but I was reading this article and it said
For a client needing to run 800 virtual instances, the annual cost of
a private cloud came to below $400,000 vs. somewhere between $800,000
and $1.2 million for public cloud services.
Considering I don't know what an instance is, that kinda made me a bit nervous...WAAAAAAyyyyyy outta my price range! Yes, it's obviously a big company, but can you imagine "hitting the lottery" with an app everyone loves then before you know it, AWS hits you with a bill of $1,000,000. Or even worse, your security sucks and someone spawns millions of these "instances"...help alive my paranoia!!

Basically, an instance is a virtual machine, which looks very much like a server. As such it's running an operating system - e.g. linux - which is capable of running many programs (aka 'processes' or sometimes, 'services') at the same time.
To go through your questions (some of the explanations below are not technically accurate, but are hopefully more explanatory for it - if anything is obvious or already known, apologies - trying not to assume any knowledge)
An instance is an object
This definition is coming up in your searches because 'instance' has many definitions in different situations. If you see the definition of 'instance' as an object, it's from the topic of object oriented programming languages - you define a class in your code (kind of like a 'template'), and then create instances of the class - kind of like real copies of the template.
Amazon borrowed the term to be analogous - because in the 'cloud' world, you can create an AMI (Amazon Machine Image - the template) and then create lots of instances that are copies or clones of that template.
Is an instance a server?
In terms of what you can do with it, yes, it's a server.
(Technically it's a virtual server - Amazon runs multiple virtual servers on each physical server.)
how do I use the "Linux Usage per hour" to estimate cost?
Estimate how long you will have your instance running for in hours per month, multiply it by cost per hour and you will have your estimated cost per instance per month.
e.g. - one instance always turned on would be - 24 hrs * 31 days = 744 hours. At $0.013/hr (for a t2.micro) that would be 744 * $0.013 = $9.672/mth.
(And that's the reason the free tier gives you 750 hours of instance time per month.)
Instances come in different types and sizes and each size costs a different amount. If you are not sure what size you need, I'd start with the smallest until you discover you need more - which would be when your program starts running too slowly.
For example if I start NGINX is that in instance?
Nginx is a program that runs as a daemon in linux terms - a program that runs in the background so it's always on. It will be one of the many programs running on the server (aka the instance)
If I have 100 people calling the server at once, can they fit on one instance?
It depends - on how big your instance is, and how efficient the program is that is responding to their requests. If you are just getting started learning to program websites, I wouldn't worry about handling 100 people issuing requests to the server all at once just yet - walk before you run :) (also, even when there are 100 people visiting your website, the odds that all of them issue a request at exactly the same time is low - usually they load a page and read it - while they're reading it, some of the other people are loading other pages, and it all spreads out so you might only have ~10 page requests actively being processed by your server at the same time.)
However, if you have 2,000 people on your site at the same time, you might be processing 200 page requests at once, so by then you do need to have put some thought into performance and scalability.
(Note: these numbers are arbitrary and depend entirely on the type of site and it's traffic patterns.)
Generally, most websites pick a mid-level instance size, and then to handle more requests they 'scale out' - create lots of copies of that instance, and allow each instance to handle a portion of the traffic.
If I start another server say, Apache or Node, does that become another instance
The language to use here would be 'start another service say, Apache or Node' - they are other programs, and your instance will be perfectly fine running nginx, apache and node all at the same time. Although each will consume some of the resources (e.g. memory and cpu) and the more activity they are doing, the faster you will run out of resources and need to get a bigger instance size
So - no, they don't automatically become another instance. The language is confusing because sometimes people don't distinguish between the 'server' (aka the instance) and the service (aka the program) and will say the 'apache server' and the 'apache service' interchangably.
If I connect to a database, is that an instance?
Your instance, as a fully capable server, could run a database service on it at the same time as the other services - e.g. you could install and run mysql on your instance.
There is another option, though - if you use the AWS RDS product, then you will be starting an RDS instance. An RDS instance is different from an EC2 instance (what we've been talking about so far) in that RDS instances are specialised to just run the database service and nothing else, but EC2 instances are general servers that you can do pretty much anything on.
It's usually recommended to use RDS, but if you are trying to save money and aren't serving many users, there's nothing particularly wrong with installing mysql on your instance yourself (especially while you're learning how it works) and then moving your data to an RDS instance when you want to support more load or traffic.
Do instances start as needed?
Not by default, no - you have to manually start and stop them.
However, there are options other than manually starting and stopping. Amazon provides a lot of APIs, so you could write a program that would connect to the API and automatically start and stop your instance(s) based on rules you build into your program..
Also, Amazon offers a product called "AutoScalingGroups" which allows you to have a related group of instances and for Amazon to automatically start and stop them according to rules that you configure into that product. These rules can be 'scheduled actions' - start/stop at certain times of day - or they can be reactive - e.g. when the average CPU usage is > 50% for more than 5 minutes, start another instance.
This is more a side note, but I was reading this article and it said
For a client needing to run 800 virtual instances, the annual cost of
a private cloud came to below $400,000 vs. somewhere between $800,000
and $1.2 million for public cloud services.
The 'free tier' gives you a t2.micro sized instance (1 vCPU, 1 GiB RAM) which you could leave turned on permanently for free during that free year.
Even after your free tier expires, that same instance would cost you $9.67/mth, and you have the option to go downgrade to a t2.nano (0.5 GiB RAM) which would only cost ~$4/mth - but 0.5GiB RAM isn't much these days, so may not be enough for you.
A t2.micro should be more than enough to learn how to build websites on. If you are fortunate enough to build a site that is popular enough that you are getting more requests than that server can handle, then you will have to decide if you can generate revenue from that popularity sufficient to cover the cost, but by then you'll have more of a sense of how efficient your program is, and what instance size (and/or how many instances) you'll need.
Yes, it's obviously a big company, but can you imagine "hitting the
lottery" with an app everyone loves then before you know it, AWS hits
you with a bill of $1,000,000
AWS protects you from yourself here a bit - they have limits which generally restrict you from running more than 20 instances at a time - unless you ask for permission. So, by default, your instance won't go multiplying like rabbits on it's own - unless you set it up to. And even if you have set it up to, it won't be able to grow beyond 20 instances unless you have asked amazon to let you. So, worst case is 20 x $9.67/mth - $197/mth.
But - that's just the instance cost. Amazon charges you for lots of things including data traffic in and out, RDS instance costs, and if you start using other service such as S3 buckets and/or elastic load balancers, they all attract their own costs.
But hopefully, if you hit the lottery with an app everyone loves, you've worked out how to convert that love into dollars and cents so you can pay for all those instances you're going to need :)

Best practice: Multiple django applications on a single Amazon EC2 instance

I've been runnning a single django application on Amazon EC2 using gunicorn to serve the django portion and nginx for the static files.
I'm going to be starting new project soon, and wondering which of the following options would be better:
A larger amazon EC2 instance (Medium) runnning multiple django applications
Multiple smallers EC2 instances (Small/Micro) all running their own django applications?
Would anybody have any experience with this? What would the relevant performance metrics I could measure to get a good cost to performance ratio?

The answer to this question really depends on your app I'm afraid. You need to benchmark to be sure you are running on the right instance type. Some key metrics to watch are:
CPU
Memory usage
Requests per second, per instance size
App startup time
You will also need to tweak nginx/gunicorn settings to make sure you are running with a configuration that is optimised for your instance size.
If costs are a factor for you, one interesting metric is "cost per ten thousand requests", i.e. how much are you paying per 10000 requests for each instance type?

I agree with Mike Ryan's answer. I would add that you also have to evaluate whether your app needs a separate database. Sometimes it makes sense to isolate large/complex applications with their own database, which makes changes and maintenance easier. (Also reduces your risk in the case that something goes wrong). Not all of your user base would be affected in the case of an outage. You might want to create a separate instance for these applications. Note: Django supports multiple databases in one project but, again, that increases complexity for changes and maintenance.

Can you get a cluster of Google Compute Engine instances that are physically local?

Google Compute Engine lets you get a group of instances that are semantically local in the sense that only they can talk to each other and all external access has to go through a firewall etc. If I want to run Map-Reduce or other kinds of cluster jobs that are going to induce high network traffic, then I also want machines that are physically local (say, on the same rack). Looking at the APIs and initial documentation, I don't see any way to request that; does anyone know otherwise?

There is no support in GCE right now for specifying rack locality. However, we built the system to work well in the face of large numbers of instances talking to each other in a fully connected way, as long as they are in the same zone.
This is one of the things that allowed MapR to approach the record for a hadoop terasort. You can see that in action in the video for the Criag Mcluckie's talk from IO:
https://developers.google.com/events/io/sessions/gooio2012/302/
The best way to see is to test out your application and see how it works.

EC2 server, lots of micro instances or fewer larger instances?

I was wondering which would be better, to host a site on EC2 with many micro instances, or fewer larger instances such as m1.large. All will sit behind one or a few larger instances as load balancers. I will say what my understanding is, and anybody who knows better can add or correct me if I'm wrong
Main reason for choosing micro instances is cost. A single micro instance on average will give around 0.35ECU for $0.02/hour, while one small instance will give 1ECU for $0.085. If you do the math of $/ECU/hour, a micro instance works out to be $0.057/ECU/hour, whereas for a small instance it's $0.085/ECU/hour. So for the same average computing power, choosing 100 micro instances would be cheaper than 35 small instances.
Main problem with micro instances is more fluctuating performance, but I'm not sure if this will be less of a problem when you have many instances.
So does anybody have experience benching such setups and see the benefits and drawbacks? I'm trying to choose which way to go.
PS: an article on the subject, http://huanliu.wordpress.com/2010/09/10/amazon-ec2-micro-instances-deeper-dive/

Beware of micro-instances, they may bite you. We have out test environment all on micro-instances. Since they are just functional test environment, it works smoothly. However, we happened to have update some application (well, Jetty 7.5.3) that has known bug of spinning higher CPU usage. This rendered those instances useless as Amazon throttles the available CPU to 2%.
Also, micro instances are EBS backed. EBS is not advisable (over instance-store) for high IO operations like the ones require for Cassandra or the likes.
If you want to save money and your software is architected to handle interruptions, you may opt for spot instances. They usually cost less than on-demand ones.
If all these are not a issue to you, I would say, micro-instances is the way to go! :)
Basics questions about micro instances performance
CPU pattern for micro
Stolen CPU on micro

I would say: depends on what kind of architecture your app will have and how reliable it will need to be:
AWS Load Balancers does not provide instant (maybe real-time is a better word?)
auto-scale which is different of fail-over concept. It works with
health checks from time to time and have its small delay because it
is done via http requests (more overhead if you choose https).
You will have more points of failure if you choose more instances depending on architecture. To avoid it, your app will need to be async between instances.
You must benchmark and test more your application if you choose more
instances, to guarantee those bursts won't affect your app too much.
That's my point of view and it would be a very pleasant discussion between experienced people.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js