I have simple server now (some xeon cpu hosted somewhere), running apache/php/mysql (no docker, but its a possibility) and Im expecting some heavy traffic and I need my server to handle that.
Currently the server can handle about 100 users at once, I need it to handle couple thousands possibly.
What would be easiest and fastest solution to move my app to some scalable hosting?
I have no experience with AWS or something like that.
I was reading about AWS and similar, but Im mostly confused and not sure what should I choose.
The basic choice is:
Scale vertically by using a bigger computer. However, you will eventually hit a limit and you will have a single-point of failure (one server!), or
Scale horizontally by adding more servers and spreading the traffic across the servers. This has the added advantage of handling failure because, if one server fails, the others can continue serving traffic.
A benefit of doing horizontal scaling in the cloud is the ability to add/remove servers based on workload. When things are busy, add more servers. When things are quiet, remove servers. This also allows you to lower costs when things are quiet (which is not possible on-premises when you own your own equipment).
The architecture involves putting multiple servers behind a Load Balancer:
Traffic comes into a Load Balancer
The Load Balancer sends the request to a server (often based upon some measure of how "busy" each server is)
The server processes the request and sends a response back to the Load Balancer
The Load Balancer sends the response to the original requester
AWS has several Load Balancers available, which vary by need. If you are simply sending traffic to a single application that is installed on all servers, a Network Load Balancer should be sufficient. For situations where different parts of the application are on different servers (eg mobile interface vs web interface), you could use a Application Load Balancer.
AWS also assists with horizontal scaling by providing the Amazon EC2 Auto Scaling service. This allows you to specify details of the servers to launch (disk image, instance type, network settings) and Auto Scaling can then automatically launch new servers when required and terminate ones that aren't required. (Note that they launch and terminate, not start and stop.)
You can further define scaling policies that tell Auto Scaling when to launch/terminate instances by measuring metrics such as CPU Utilization. This way, the number of servers can approximately match the volume of traffic.
It should be mentioned that if you have a database, it should be stored separately to the application servers so that it does not get terminated. You could use the Amazon Relational Database Service (RDS) to run a database for you, or you could run one on a separate Amazon EC2 instance.
If you want to find out more about any of the above technologies, there are plenty of talks on YouTube or blog posts that can explain and demonstrate their use.
Related
Can an AWS Elastic Load Balancer be setup so it sends all traffic to a main server and if that server fails, only then send traffic to a second server.
Have an existing web app I picked up that was never built to run on multiple servers and the client has become worried about redundancy. They don't want to invest enough to make it run well across multiple servers so I was thinking I could setup a second EC2 server with a MySQL slave and periodically copy files from the primary server to the secondary using rsync. Then have an AWS ELB send traffic to the primary server and only if that fails send it to the second server.
AWS load balancers don't support "backup" nodes that only take traffic when the primary is down.
Beyond that, you are proposing a complicated scenario.
was thinking I could setup a second EC2 server with a MySQL slave
If you do that, you can only fail over once, then you can't fail back, because the master database will then be obsolete. For a configuration like this to work and be useful, your two MySQL servers need to be configured with master/master (circular) replication, so that each is a replica of the other. This is an advanced configuration that requires expertise and caution.
For the MySQL component, an RDS instance with multi-AZ enabled will provide you with hands-off fault tolerance of the database.
Of course, the client may be unwilling to pay for this as well.
A reasonable shortcut for small systems might be EC2 instance recovery which will bring the site back up if the underlying hardware fails. This feature replaces a failed instance with a new instance, reattaches the EBS volumes, and starts it back up. If the system is stable and you have a solid backup strategy for all data, this might be sufficient. Effective redundancy as a retrofit is non-trivial.
I am reading about load balancing.
I understand the idea that load balancers transfer the load among several slave servers of any given app. However very few literature that I can find talks about what happens when the load balancers themselves start struggling with the huge amount of requests, to the point that the "simple" task of load balancing (distribute requests among slaves) becomes an impossible undertaking.
Take for example this picture where you see 3 Load Balancers (LB) and some slave servers.
Figure 1: Clients know one IP to which they connect, one load balancer is behind that IP and will have to handle all those requests, thus that first load balancer is the bottleneck (and the internet connection).
What happens when the first load balancer starts struggling? If I add a new load balancer to side with the first one, I must add even another one so that the clients only need to know one IP. So the dilema continues: I still have only one load balancer receiving all my requests...!
Figure 2: I added one load balancer, but for having clients to know just one IP I had to add another one to centralize the incoming connections, thus ending up with the same bottleneck.
Moreover, my internet connection will also reach its limit of clients it can handle so I probably will want to have my load balancers in remote places to avoid flooding my internet connection. However if I distribute my load balancers, and want to keep my clients knowing just one single IP they have to connect, I still need to have one central load balancer behind that IP carrying all the traffic once again...
How do real world companies like Google and Facebook handle these issues? Can this be done without giving the clients multiple IPs and expect them to choose one at random avoiding every client to connect to the same load balancer, thus flooding us?
Your question doesn't sound AWS specific, so here's a generic answer (elastic LB in AWS auto-scales depending on traffic):
You're right, you can overwhelm a loadbalancer with the number of requests coming in. If you deploy a LB on a standard build machine, you're likely to first exhaust/overload the network stack including max number of open connections and handling rate of incoming connections.
As a first step, you would fine tune the network stack of your LB machine. If that still does not provide you the required throughput, there are special purpose loadbalancer appliances on the market, that are built ground-up and highly optimized to handle a large number of incoming connections and routing them to several servers. Examples of these are F5 and netscaler
You can also design your application in ways that help you split traffic to different sub domains, thereby reducing the number of requests 1 LB has to handle.
It is also possible to implement a round-robin DNS, where you would have 1 DNS entry point to several client facing LBs instead of just one as you've depicted.
Advanced load balancers like Netscaler and similar also does GSLB with DNS not simple DNS-RR (to explain further scaling)
if you are to connect to i.e service.domain.com, you let the load balancers become Authorative DNS for the zone and you add all the load balancers as valid name servers.
When a client looks up "service.domain.com" any of your loadbalancers will answer the DNS request and reply with the IP of the correct data center for your client. You can then further make the loadbalancer reply on the DNS request based of geo location of your client, latency between clients dns server and netscaler, or you can answer based on the different data centers load.
In each datacenter you typically set up one node or several nodes in cluster. You can scale quite high using such a design.
Since you tagged Amazon, they have load balancers built in to their system so you don't need to. Just use ELB and Amazon will direct the traffic to your correct system.
If you are doing it yourself, load balancers typically have a very light processing load. They typically do little more than redirect a connection from one machine to another based on a shallow inspection (or no inspection) of the data. It is possible for them to be overwhelmed, but typically that requires a load that would saturate most connections.
If you are running it yourself, and if your load balancer is doing more work or your connection is getting saturated, the next step is to use Round-Robin DNS for looking up your load balancers, generally using a combination of NS and CNAME records so different name lookups give different IP addresses.
If you plan to use amazon elastic load balancer they claim that
Elastic Load Balancing automatically scales its request handling
capacity to meet the demands of application traffic. Additionally,
Elastic Load Balancing offers integration with Auto Scaling to ensure
that you have back-end capacity to meet varying levels of traffic
levels without requiring manual intervention.
so you can go with them and do not need to handle the Load Balancer using your own instance/product
I’m using Amazon Web Services, and trying to set up a modest system for load balancing and disaster recovery. The application is PHP based, with Zend Framework 2 (ZF2) on the front end, a local memcached server and MySQL through RDS. All servers are running Amazon Linux.
I am trying to configure the elastic load balancer to use two servers in two different AWS “availability zones.” To seamlessly allow one server to shut down and another take over, we need shared PHP sessions. So I set up PHP database sessions with ZF2.
In general, I assume the likelihood of an outage of an AWS zone is considerably lower than chance of a fatal problem in the individual servers or the application itself. So I am considering a different approach:
All the servers in the same availability zone
Separate AWS ElastiCache server (essentially memcached, cannot be used across zones)
PHP sessions stored in the cache (built-in support for memcached)
One emergency server in a different zone – in the rare case of a zone outage, we would change the DNS record to use the different server
Is this a good standard approach to DR and load balancing? I don’t like the DR solution in the case of zone outage, but I haven’t seen a zone go down much, and we can probably handle that level of risk if it simplifies the design. If the load balancer could weight be servers, I would pull all the weight on one zone, with the backup server weighted much lower.
What would be the benefit of keeping all the PHP servers in the same AZ vs distributing them among multiple AZs? I can't think of any, except a very small (3-5ms) latency improvement. Since there's very little downside, why not spread the servers among multiple AZs?
Your Elasticache memcached is still a single point of failure. If the AZ that the Elasticache instance is running in has a problem you will lose sessions. You could switch to use Elasticache w/ Redis (which supports master/slave) to achieve multi-AZ for your cache layer as well.
Questions about load balancers if you have time.
So I've been using AWS for some time now. Super basic instances, using them to do some tasks whenever I needed something done.
I have a task that needs to be load balanced now. It's not a public service though. It's pretty much a giant cron job that I don't want running on the same servers as my website.
I set up an AWS load balancer, but it doesn't do what I expected it to do.
It get's stuck on one server, and doesn't load balance at all. I've read why it does this, and that's all fine and well, but I need it to be a serious round-robin load balancer.
edit:
I've set up the instances on different zones, but no matter how many instances I add to the ELB, it just uses one. If I take that instance down, it switches to a different one, so I know it's working. But I really would like it to always use a different one under every circumstance.
I know there are alternatives. Here's my question(s):
Would a custom php load balancer be an ok option for now?
IE: Have a list of servers, and have php randomly select a ec2 instance. Wouldn't be scalable at all, bu atleast I could set this up in 2 mins and it can work for now.
or
Should I take the time to learn how HAProxy works, and set that up in place of the AWS ELB?
or
Am I doing it wrong, and AWS's ELB does do round-robin. I just have something configured wrong?
edit:
Structure:
1) Web server finds a task to do.
2) If it's too large it sends it off to AWS (to load balancer).
3) Do the job on EC2
4) Report back via curl to an API
5) Rinse and repeat
Everything works great. But because the connection always comes from my server (one IP) it get's sticky'd to a single EC2 machine.
ELB works well for sites whose loads increase gradually. If you are expecting an uncommon and sudden increase on the load, you can ask AWS to pre-warm it for you.
I can tell you I used ELB in different scenarios and it always worked well for me. As you didn't provide too much information about your architecture, I would bet that ELB works for you, and the case that all connections are hitting only one server, I would ask you:
1) Did you check the ELB to see how many instances are behind it?
2) The instances that you have behind the ELB, are all alive?
3) Are you accessing your application through the ELB DNS?
Anyway, I took an excerpt from the excellent article that does a very good comparison between ELB and HAProxy. http://harish11g.blogspot.com.br/2012/11/amazon-elb-vs-haproxy-ec2-analysis.html
ELB provides Round Robin and Session Sticky algorithms based on EC2
instance health status. HAProxy provides variety of algorithms like
Round Robin, Static-RR, Least connection, source, uri, url_param etc.
Hope this helps.
This point comes as a surprise to many users using Amazon ELB. Amazon
ELB behaves little strange when incoming traffic is originated from
Single or Specific IP ranges, it does not efficiently do round robin
and sticks the request. Amazon ELB starts favoring a single EC2 or
EC2’s in Single Availability zones alone in Multi-AZ deployments
during such conditions. For example: If you have application
A(customer company) and Application B, and Application B is deployed
inside AWS infrastructure with ELB front end. All the traffic
generated from Application A(single host) is sent to Application B in
AWS, in this case ELB of Application B will not efficiently Round
Robin the traffic to Web/App EC2 instances deployed under it. This is
because the entire incoming traffic from application A will be from a
Single Firewall/ NAT or Specific IP range servers and ELB will start
unevenly sticking the requests to Single EC2 or EC2’s in Single AZ.
Note: Users encounter this usually during load test, so it is ideal to
load test AWS Infra from multiple distributed agents.
More info at the Point 9 in the following article http://harish11g.blogspot.in/2012/07/aws-elastic-load-balancing-elb-amazon.html
HAProxy is not hard to learn and is tremendously lightweight yet flexible. I actually use HAProxy behind ELB for the best of both worlds -- the hardened, managed, hands-off reliability of ELB facing the Internet and unwrapping SSL, and the flexible configuration of HAProxy to allow me to fine tune how things hit my servers. I've never lost an HAProxy instance yet, but it I do, ELB will just take that one out of rotation... as I have seen happen when the back-end servers have all become inaccessible, which (because of the way it's configured) makes ELB think the HAProxy is unhealthy, but that's by design in my setup.
I have created two Amazon EC2 instances. After that I created an Elastic Load Balancer and registered the two instances in it.
Now what I would like to know is, when we use the DNS name of the load balancer, which instance will the load balancer use?
The idea of Load balancing is to distribute workload across multiple computers or a computer cluster, network links, central processing units, disk drives, or other resources [...].
While there are many algorithms conceivable, the general goal is to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid overload, which usually implies transparent distribution of the load between the load balanced resources. Therefore you usually won't know (and shouldn't need to know), which load balanced resource serves a particular request.
Accordingly, Elastic Load Balancing (ELB) automatically distributes incoming application traffic across multiple Amazon EC2 instances.
How this is done specifically is a fairly complicated topic, mostly due to the ELB routing documentation falling short of being non existent, so one needs to assemble some pieces to draw a conclusion - see my answer to the related question Can Elastic Load Balancers correctly distribute traffic to different size instances for a detailed analysis including all the references I'm aware of.
For the question at hand I think it boils down to the somewhat vague AWS team response from 2009 to ELB Strategy:
ELB loosely keeps track of how many requests (or connections in the
case of TCP) are outstanding at each instance. It does not monitor
resource usage (such as CPU or memory) at each instance. ELB
currently will round-robin amongst those instances that it believes
has the fewest outstanding requests. [emphasis mine]
stf ,
you cannot come to know, for which server load is distributing through EBS , EBS internally take care of request distribution .
Of course you can figure out which server your request goes to!
On each server you are going to need something akin to a health_check.html file (can be named anything, someone suggested index.htm but that is a bad idea and is another discussion entirely) so the load balancer can call it and determine how long it took to get a response.
On server #1 put the following in the health_check.html file: <HTML><BODY>1</BODY></HTML>
On server #2 put this in the health_check.html file: <HTML><BODY>2</BODY></HTML>
Now when you navigate to www.YourDomain.com/health_check.html you will know exactly which server you are on.
Clear your cookies and re-navigate to the same URL to see which server you get next. Good luck cloud developer!