Load Balancers, server task, splitting servers and AWS instance - amazon-web-services

I am trying to learn how load balancing and servers work (both cloud servers and regular pooled servers). From my understanding, load balancers redirect requests from users to the servers with the least stress or the fewest connections, so that web pages load quickly no matter how many users are making the same request.
The part I am confused about is the TASK each server does. From the diagrams I see online, it seems there are multiple servers that do different tasks, such as sending a view file (HTML), sending static content, or hosting the database (MySQL). But I also hear that splitting your servers like this can be bad because it makes things complicated, for example by having different DNS names that each do different things. So what I need clarified is: does ONE server do everything? That is, does a request for a view file go to the same server that handles requests for static images, POST requests, GET requests, etc.?
And regarding AWS instances, is each instance just another copy of the "setup" you have? Meaning, if one instance has Server A with Database Server A, is another instance just a copy of it, so that the second instance also has Server A with Database Server A?
                               Server A (for doing everything, such as sending
                                         view files, GET requests, static
                                         assets, etc.)
                               Server B (for the MySQL database)
                             /
user ---> DNS ---> load b. -   Server A (for doing everything, such as sending
                                         view files, GET requests, static
                                         assets, etc.)
                               Server B (for the MySQL database)
                             \
                               Server A (for doing everything, such as sending
                                         view files, GET requests, static
                                         assets, etc.)
                               Server B (for the MySQL database)
So basically what I am trying to ask is: do all the servers that the load balancer redirects to have the same task, or does the load balancer send requests to separate servers with different tasks? If the latter, how does it know when to send a request to, let's say, the server for static files, and if there were many requests to that server, how would the load balancer handle that?
And is each group of servers in my little diagram (Server A and Server B) roughly what an AWS instance is?

The answer is that the servers do whatever you configure them to do.
Sometimes people design their system so that every server can respond to any request, and sometimes they have separate servers for separate tasks (eg some servers handling mobile requests, some handling authentication requests, others handling web requests). In AWS terminology, these different groups of servers would be separate Target Groups and the Application Load Balancer would be configured to send requests to different Target Groups based upon the content of the URL.
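To illustrate the idea of routing by URL, here is a minimal Python sketch of how a load balancer could pick a group of servers from the request path. It is not the ALB's actual implementation; the rule patterns and the group names (`auth-servers`, `mobile-servers`, `web-servers`) are made up for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    path_pattern: str   # wildcard pattern, e.g. "/auth/*"
    target_group: str   # hypothetical name of the group of servers


# Hypothetical rules, checked in order; the first match wins.
RULES = [
    Rule("/auth/*", "auth-servers"),
    Rule("/mobile/*", "mobile-servers"),
]
DEFAULT_TARGET_GROUP = "web-servers"


def choose_target_group(path: str) -> str:
    """Return the group of servers that should handle this request path."""
    for rule in RULES:
        if fnmatchcase(path, rule.path_pattern):
            return rule.target_group
    return DEFAULT_TARGET_GROUP


if __name__ == "__main__":
    for path in ("/auth/login", "/mobile/feed", "/index.html"):
        print(path, "->", choose_target_group(path))
```

Within the chosen group the load balancer still spreads requests across all of that group's servers, so a busy "static files" group is handled by adding more servers to that group.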
One thing, however: database servers are never placed behind a load balancer like you have shown above. Only the application servers communicate with the database server. This makes a 3-tier architecture:
Load Balancer -> Application Servers -> Database
Normal practice is to put the database on a separate server (or a cluster of servers) so that it is independent of the application. This allows the application to scale by adding/removing app servers, without impacting the database.
It is also worthwhile offloading static content to Amazon S3. This can be as simple as changing the URL in an <img src=...> tag. Browsers then fetch those files directly from S3, so less traffic goes through the application servers, which frees up their bandwidth and helps the application scale.

Related

Application ELB - sticky sessions based on consistent hashing

I couldn't find anything in the documentation but still writing to make sure I did not miss it. I want all connections from different clients with the same value for a certain request parameter to end up on the same upstream host. With ELB sticky session, you can have the same client connect to the same host but no guarantees across different clients.
This is possible with Envoy proxy, see: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/load_balancers#ring-hash
We already use ELB so if the above is possible with ELB then we can avoid introducing another layer in between with envoy.
UPDATE:
Use-case - in a multi-tenant cloud solution, we want all clients from a given customer account to connect to the same upstream host.
Unfortunately, this is not possible with an ALB.
An Application Load Balancer itself controls all the logic that decides which host receives the traffic, through features such as sticky sessions and pattern-based routing.
If there is no workaround, you could look at a Classic Load Balancer, which supports the application setting the sticky session cookie name and value.
As a best practice, your application should ideally be stateless. Is it possible to look at rearchitecting your app instead of trying to work around this? Some suggestions:
Use DynamoDB to store any session-based data, moving away from disk-based sessions (if that's what your application uses).
Any disk-based files that need to persist could be shared between all hosts, using EFS for Linux-based hosts or FSx for Windows hosts.
Medium/long-term persistent files could be migrated to S3. Any assets that rarely change could also be stored there, and your application could then use S3 rather than disk.
As I stated above, it's important to keep your application as stateless as you can. Assume that your EC2 instances could fail; preparing for this makes it easier to recover.
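For reference, the ring-hash behaviour the question asks about (every request carrying the same parameter value lands on the same host, whichever client sent it) can be sketched in a few lines of Python. This is only an illustration of the general consistent-hashing technique, not Envoy's or AWS's implementation; the `HashRing` class, the host names and the `vnodes` parameter are assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, hosts, vnodes=100):
        if not hosts:
            raise ValueError("at least one host is required")
        # Place each host on the ring many times to even out the distribution.
        self._ring = sorted(
            (self._hash(f"{host}#{i}"), host)
            for host in hosts
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def host_for(self, key: str) -> str:
        """Return the first host at or after the key's position on the ring."""
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["host-a", "host-b", "host-c"])
    # Two different clients of the same (hypothetical) customer account
    # are mapped to the same upstream host.
    print(ring.host_for("account-42"), ring.host_for("account-42"))
```

A useful property of this scheme is that removing a host only remaps the keys that were on that host; all other keys stay where they were.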

Scalable server hosting

I have a simple server now (some Xeon CPU hosted somewhere), running Apache/PHP/MySQL (no Docker, but it's a possibility). I'm expecting some heavy traffic and I need my server to handle it.
Currently the server can handle about 100 users at once; I need it to handle possibly a couple of thousand.
What would be the easiest and fastest way to move my app to some scalable hosting?
I have no experience with AWS or anything like it.
I was reading about AWS and similar services, but I'm mostly confused and not sure what I should choose.
The basic choice is:
Scale vertically by using a bigger computer. However, you will eventually hit a limit and you will have a single-point of failure (one server!), or
Scale horizontally by adding more servers and spreading the traffic across the servers. This has the added advantage of handling failure because, if one server fails, the others can continue serving traffic.
A benefit of doing horizontal scaling in the cloud is the ability to add/remove servers based on workload. When things are busy, add more servers. When things are quiet, remove servers. This also allows you to lower costs when things are quiet (which is not possible on-premises when you own your own equipment).
The architecture involves putting multiple servers behind a Load Balancer:
Traffic comes into a Load Balancer
The Load Balancer sends the request to a server (often based upon some measure of how "busy" each server is)
The server processes the request and sends a response back to the Load Balancer
The Load Balancer sends the response to the original requester
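The steps above can be sketched in Python using "fewest active connections" as the measure of how busy a server is. That is just one possible measure, chosen for the example, and not necessarily what any particular AWS load balancer does; the class and server names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Sends each new request to the server with the fewest active requests."""

    def __init__(self, servers):
        # Number of requests currently being processed by each server.
        self.active = {server: 0 for server in servers}

    def acquire(self) -> str:
        """Pick the least busy server and count the new request against it."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Call when the server has sent its response back."""
        self.active[server] -= 1


if __name__ == "__main__":
    lb = LeastConnectionsBalancer(["web-1", "web-2", "web-3"])
    first = lb.acquire()    # all idle, so the first server is chosen
    second = lb.acquire()   # the next idle server
    lb.release(first)       # the first request finishes
    print(first, second, lb.acquire())
```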
AWS has several Load Balancers available, which vary by need. If you are simply sending traffic to a single application that is installed on all servers, a Network Load Balancer should be sufficient. For situations where different parts of the application are on different servers (eg mobile interface vs web interface), you could use an Application Load Balancer.
AWS also assists with horizontal scaling by providing the Amazon EC2 Auto Scaling service. This allows you to specify details of the servers to launch (disk image, instance type, network settings) and Auto Scaling can then automatically launch new servers when required and terminate ones that aren't required. (Note that they launch and terminate, not start and stop.)
You can further define scaling policies that tell Auto Scaling when to launch/terminate instances by measuring metrics such as CPU Utilization. This way, the number of servers can approximately match the volume of traffic.
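As a rough illustration of how a CPU-based scaling policy keeps the number of servers in line with traffic, here is a simplified proportional rule in Python. It is a sketch of the general idea only, not the algorithm Auto Scaling actually uses; the function name, its parameters and the numbers are assumptions for the example.

```python
import math


def desired_capacity(current: int, cpu_percent: float, target_cpu_percent: float,
                     min_size: int, max_size: int) -> int:
    """Scale the server count in proportion to how far CPU is from the target."""
    wanted = math.ceil(current * cpu_percent / target_cpu_percent)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, wanted))


if __name__ == "__main__":
    # 4 servers at 90% CPU with a 50% target: more servers are needed.
    print(desired_capacity(4, 90, 50, min_size=2, max_size=10))
    # 4 servers at 10% CPU: shrink, but not below the minimum.
    print(desired_capacity(4, 10, 50, min_size=2, max_size=10))
```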
It should be mentioned that if you have a database, it should be stored separately from the application servers so that it does not get terminated. You could use the Amazon Relational Database Service (RDS) to run a database for you, or you could run one on a separate Amazon EC2 instance.
If you want to find out more about any of the above technologies, there are plenty of talks on YouTube or blog posts that can explain and demonstrate their use.

AWS Application Load Balancer with HTTP2

I have a RESTful app deployed on a number of EC2 instances sitting behind a Load Balancer.
Authentication is handled in part by a custom request header called "X-App-Key".
I have just migrated my classic Load Balancers to Application Load Balancers and I'm starting to experience intermittent issues where some valid requests (via testing with CURL) are failing authentication for some users. It looks like the custom request header is only intermittently being passed through. Using apache bench approx 100 of 500 requests failed.
If I test with a classic Load Balancer all 500 succeed.
I looked into this a bit more and found that the users who this is failing for are using a slightly newer version of CURL and specifically the requests coming from these users are using HTTP2. If I add "--http1.1" to the CURL request they all pass fine.
So the issues seem to be specific to us using a custom request header with the new generation application load balancers and HTTP2.
Am I doing something wrong?!
I found the answer on this post...
AWS Application Load Balancer transforms all headers to lower case
It seems the headers come through from the ALB in lowercase, so I needed to update my backend to handle this. (HTTP header names are case-insensitive, and HTTP/2 requires them to be sent in lowercase.)
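A backend that looks headers up by exact name can be made tolerant of this with a case-insensitive lookup. The following Python sketch shows the idea; the `get_header` helper and the plain dictionary of headers are assumptions for the example, since the original post does not say which framework the backend uses (many frameworks already do this for you).

```python
def get_header(headers: dict, name: str, default=None):
    """Look up a header by name, ignoring upper/lower case."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return default


if __name__ == "__main__":
    # The same header as it might arrive over HTTP/1.1 and over HTTP/2.
    http1_headers = {"X-App-Key": "secret", "Accept": "*/*"}
    http2_headers = {"x-app-key": "secret", "accept": "*/*"}
    print(get_header(http1_headers, "X-App-Key"))
    print(get_header(http2_headers, "X-App-Key"))
```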
You probably have to enable sticky sessions on your load balancer.
They are needed to keep a session tied to the same instance.
However, the need to keep a session active arises at the application level, and sticky sessions are not useful for every kind of service. Depending on the nature of your system they may not be recommended, since they can reduce performance in REST-like systems.

How to carry out performance testing on a sticky-session load-balanced web application?

Hi,
I have read a lot of blogs and tutorials, but I cannot figure out how to carry out performance testing on a cookie-based sticky web application that sits behind a reverse-proxy load balancer. I have 3 backend application servers serving the same instance of a shopping cart. A load balancer sits in front of them and directs the traffic.
Problem: when I send HTTP requests for performance analysis, the load balancer (which tracks the client IP through a cookie) redirects every request to the same backend server that was originally assigned. I have the option of using IP spoofing, but it won't work when the backend servers are distributed across a WAN rather than a LAN. Moreover, each backend server has its own public IP address and sits behind the firewall.
Question: Is there a way JMeter can be configured to load test in this scenario, or is there a better solution?
Much appreciate your thoughts and contribution.
Regards
Here are a few possible workarounds:
Point different JMeter instances directly to different backend hosts bypassing the load balancer.
Use Distributed Testing with JMeter nodes somewhere in the cloud, for example on Amazon EC2 micro instances, which are eligible for the AWS Free Tier. You can use the JMeter ec2 Script to simplify installation, configuration and execution.
Try using DNS Cache Manager, it enables individual DNS resolution for each JMeter thread.

AWS Load Balancer - Remove cache elements on EC2

I'm currently upscaling from 1xEC2 server to:
1xLoad Balancer
2xEC2 servers
I have quite a lot of customers, each running our service on their own domain.
We have a web front end and an admin interface, and we use a lot of caching. When something is changed in the admin part, the server calls e.g. customer.net/cacheutil.ashx?f=delete&obj=objectname to remove the object across domains.
With the new setup, I don't know how to do this with multiple servers while ensuring that the cached objects are deleted on both servers (or more, if we choose to launch more).
I think it is a "bit much" to require our customers to add e.g. "web1.customer.net", "web2.customer.net" and "customer.net" pointing at 3 different DNS CNAMEs, since they are not that experienced with IT.
How does anyone else do this?
When scaling horizontally, it is recommended to keep your web servers stateless. That is, do not store data on a specific server. Instead, store the information in a database or cache that can be accessed by all servers. (eg DynamoDB, ElastiCache)
Alternatively, use the Sticky Sessions feature of the Elastic Load Balancing service, which uses a cookie to always redirect a user's connection back to the same server.
See documentation: Configure Sticky Sessions for Your Load Balancer
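The stateless approach recommended above can be sketched in Python as follows. The `SharedCache` class is only an in-memory stand-in for a shared store such as ElastiCache or DynamoDB, and the `WebServer` class, the object names and the `source` dictionary are hypothetical. The point is that when every server reads from the same cache, a single delete is seen by all of them, however many servers are running.

```python
class SharedCache:
    """In-memory stand-in for a cache that all web servers can reach."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class WebServer:
    """A web server that keeps no cached state of its own."""

    def __init__(self, name, cache, load_from_source):
        self.name = name
        self.cache = cache
        self.load_from_source = load_from_source

    def get_object(self, key):
        value = self.cache.get(key)
        if value is None:
            # Cache miss: rebuild the object and store it for every server.
            value = self.load_from_source(key)
            self.cache.set(key, value)
        return value


if __name__ == "__main__":
    source = {"menu": "v1"}            # stands in for the real database
    cache = SharedCache()
    web1 = WebServer("web1", cache, source.__getitem__)
    web2 = WebServer("web2", cache, source.__getitem__)

    print(web1.get_object("menu"), web2.get_object("menu"))
    source["menu"] = "v2"              # an admin changes the object
    cache.delete("menu")               # one delete, no per-server calls
    print(web1.get_object("menu"), web2.get_object("menu"))
```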