AWS Load Balancer - Remove cache elements on EC2

I'm currently scaling up from 1x EC2 server to:
1xLoad Balancer
2xEC2 servers
I have quite a lot of customers, each running our service on their own domain.
We have a web front end and an admin interface and use a lot of caching. When something is changed in the admin part, the server calls e.g. customer.net/cacheutil.ashx?f=delete&obj=objectname to remove the cached object across the domains.
With the new setup, I don't know how to do this with multiple servers while ensuring that the cached objects are deleted on both servers (or more, if we choose to launch more).
I think it is a "bit much" to require our customers to add e.g. "web1.customer.net", "web2.customer.net" and "customer.net" pointing at 3 different DNS CNAMEs, since they are not that IT-experienced.
How do others handle this?

When scaling horizontally, it is recommended to keep your web servers stateless. That is, do not store data on a specific server. Instead, store the information in a database or cache that can be accessed by all servers (e.g. DynamoDB, ElastiCache).
Alternatively, use the Sticky Sessions feature of the Elastic Load Balancing service, which uses a cookie to always redirect a user's connection back to the same server.
See documentation: Configure Sticky Sessions for Your Load Balancer
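For example, with a shared cache such as ElastiCache for Redis, every web server reads and writes through the same endpoint, so a single delete removes the object for all servers at once. A minimal sketch of the idea (the endpoint, key prefix and TTL are placeholder assumptions, and the original application is ASP.NET; this just illustrates the pattern in Python):

```python
import redis

# Placeholder endpoint: replace with your ElastiCache for Redis endpoint.
cache = redis.Redis(host="my-cache.abc123.0001.euw1.cache.amazonaws.com", port=6379)

def get_object(name):
    """Read a cached object shared by all web servers."""
    value = cache.get(f"obj:{name}")
    return value.decode() if value is not None else None

def set_object(name, value, ttl_seconds=3600):
    """Cache an object with an expiry; visible to every server immediately."""
    cache.set(f"obj:{name}", value, ex=ttl_seconds)

def delete_object(name):
    """One delete against the shared cache, instead of calling cacheutil.ashx on each server."""
    cache.delete(f"obj:{name}")
```

With a shared cache like this, the admin interface deletes the key once rather than having to call every web server (or every customer domain) individually.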

Related

Application ELB - sticky sessions based on consistent hashing

I couldn't find anything in the documentation, but I'm writing to make sure I did not miss it. I want all connections from different clients with the same value for a certain request parameter to end up on the same upstream host. With ELB sticky sessions, the same client connects to the same host, but there are no guarantees across different clients.
This is possible with Envoy proxy, see: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/load_balancers#ring-hash
We already use ELB, so if the above is possible with ELB then we can avoid introducing another layer in between with Envoy.
UPDATE:
Use-case - in a multi-tenant cloud solution, we want all clients from a given customer account to connect to the same upstream host.
Unfortunately, this is not possible with an ALB.
An Application Load Balancer controls all the logic over which host receives the traffic, with features such as sticky sessions and pattern-based routing.
If there is no workaround, you could look at a Classic Load Balancer, which supports application-controlled stickiness where your application sets the sticky session cookie name and value.
Best practice is for your application to be stateless; is it possible to look at rearchitecting your app instead of working around this? Some suggestions I would have are:
Use DynamoDB to store any session-based data, moving away from disk-based sessions (if that's what your application does); see the sketch after this answer.
Any disk-based files that need to persist could be shared between all hosts using EFS for Linux-based hosts, or FSx on Windows.
Medium/long-term persistent files could be migrated to S3; any assets that rarely change could be stored there, and your application could then use S3 rather than disk.
As stated above, it's important to keep your application as stateless as you can. Assume that your EC2 instances can fail; preparing for this makes it easier to recover.
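To illustrate the DynamoDB suggestion, here is a minimal sketch of a server-side session store (the sessions table name, its session_id key and the expires_at TTL attribute are assumptions; the table would need to be created beforehand, ideally with TTL enabled on expires_at):

```python
import time
import boto3

# Assumed table: "sessions" with partition key "session_id" (string)
# and DynamoDB TTL enabled on the "expires_at" attribute.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("sessions")

def save_session(session_id, data, ttl_seconds=3600):
    """Write session data so any app server behind the load balancer can read it."""
    table.put_item(Item={
        "session_id": session_id,
        "data": data,
        "expires_at": int(time.time()) + ttl_seconds,
    })

def load_session(session_id):
    """Fetch the session regardless of which server handles the request."""
    response = table.get_item(Key={"session_id": session_id})
    return response.get("Item")
```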

What is the difference between Load Balancer cookie stickiness and ElastiCache for storing user sessions?

I have heard about two approaches to storing user sessions in Amazon AWS. One approach is to use cookie stickiness with the Load Balancer, and the other is to store the user session in ElastiCache. What are the advantages and disadvantages if I want to use the EC2 Load Balancer as well as ElastiCache? Where should I store the user session?
AWS LB stickiness is something else; you cannot store anything in it, as it is controlled by the underlying AWS service. The load balancer uses a special cookie to track the instance for each request to each listener. When the load balancer receives a request, it first checks whether this cookie is present in the request. If so, the request is sent to the instance specified in the cookie. If there is no cookie, the load balancer chooses an instance based on the existing load balancing algorithm.
You can use the sticky session feature (also known as session affinity), which enables the load balancer to bind a user's session to a specific instance. This ensures that all requests from the user during the session are sent to the same instance.
LB sticky sessions just route subsequent requests from the same user to the same EC2 instance, which helps applications like WebSocket.
lb-sticky-sessions
So if you are looking for a way to manage and store sensitive data that should be available across multiple nodes, then you need distributed session management using Redis or Memcached. If your use case is just to stick subsequent requests to the same EC2 instance, then LB stickiness is enough.
There are many ways of managing user sessions in web applications, ranging from cookies-only to distributed key/value databases, including server-local caching. Storing session data in the web server responding to a given request may seem convenient, as accessing the data incurs no network latency. The main drawback is that requests have to be routed carefully so that each user interacts with one server and one server only. Another drawback is that once a server goes down, all the session data is gone as well. A distributed, in-memory key/value database can solve both issues by paying the small price of a tiny network latency. Storing all the session data in cookies is good enough most of the time; if you plan to store sensitive data, then using server-side sessions is preferable.
building-fast-session-caching-with-amazon-elasticache-for-redis
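Following the ElastiCache for Redis approach referenced above, a minimal sketch of distributed session management (the endpoint, key prefix and session lifetime are placeholder assumptions):

```python
import json
import redis

# Placeholder endpoint: replace with your ElastiCache for Redis endpoint.
r = redis.Redis(host="my-sessions.abc123.0001.euw1.cache.amazonaws.com", port=6379)

SESSION_TTL_SECONDS = 1800  # assumed 30-minute session lifetime

def save_session(session_id, data):
    """Store the session centrally so any instance behind the LB can serve the user."""
    r.setex(f"session:{session_id}", SESSION_TTL_SECONDS, json.dumps(data))

def load_session(session_id):
    """Return the session data, or None if it has expired or never existed."""
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw is not None else None

def delete_session(session_id):
    """Drop the session on logout; effective for every instance at once."""
    r.delete(f"session:{session_id}")
```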

Scalable server hosting

I have a simple server now (some Xeon CPU hosted somewhere) running Apache/PHP/MySQL (no Docker, but it's a possibility), and I'm expecting some heavy traffic that my server needs to handle.
Currently the server can handle about 100 users at once; I need it to handle possibly a couple of thousand.
What would be the easiest and fastest solution to move my app to some scalable hosting?
I have no experience with AWS or something like that.
I was reading about AWS and similar offerings, but I'm mostly confused and not sure what I should choose.
The basic choice is:
Scale vertically by using a bigger computer. However, you will eventually hit a limit and you will have a single point of failure (one server!), or
Scale horizontally by adding more servers and spreading the traffic across the servers. This has the added advantage of handling failure because, if one server fails, the others can continue serving traffic.
A benefit of doing horizontal scaling in the cloud is the ability to add/remove servers based on workload. When things are busy, add more servers. When things are quiet, remove servers. This also allows you to lower costs when things are quiet (which is not possible on-premises when you own your own equipment).
The architecture involves putting multiple servers behind a Load Balancer:
Traffic comes into a Load Balancer
The Load Balancer sends the request to a server (often based upon some measure of how "busy" each server is)
The server processes the request and sends a response back to the Load Balancer
The Load Balancer sends the response to the original requester
AWS has several Load Balancers available, which vary by need. If you are simply sending traffic to a single application that is installed on all servers, a Network Load Balancer should be sufficient. For situations where different parts of the application are on different servers (e.g. mobile interface vs web interface), you could use an Application Load Balancer.
AWS also assists with horizontal scaling by providing the Amazon EC2 Auto Scaling service. This allows you to specify details of the servers to launch (disk image, instance type, network settings) and Auto Scaling can then automatically launch new servers when required and terminate ones that aren't required. (Note that they launch and terminate, not start and stop.)
You can further define scaling policies that tell Auto Scaling when to launch/terminate instances by measuring metrics such as CPU Utilization. This way, the number of servers can approximately match the volume of traffic.
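As an illustration, a minimal sketch of a target-tracking scaling policy that tries to keep average CPU utilization around 50% (the Auto Scaling group name and the target value are placeholder assumptions; the group itself must already exist):

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Placeholder group name: the Auto Scaling group must already exist.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-web-asg",
    PolicyName="keep-cpu-around-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        # Auto Scaling launches instances when average CPU rises above ~50%
        # and terminates them when it falls well below.
        "TargetValue": 50.0,
    },
)
```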
It should be mentioned that if you have a database, it should be stored separately from the application servers so that it does not get terminated. You could use the Amazon Relational Database Service (RDS) to run a database for you, or you could run one on a separate Amazon EC2 instance.
If you want to find out more about any of the above technologies, there are plenty of talks on YouTube or blog posts that can explain and demonstrate their use.

Load Balancers, server task, splitting servers and AWS instance

I am trying to learn how load balancing and servers work (both cloud servers and regular server pools). From my understanding, load balancers redirect requests from users to the servers with the least stress/connections, so that webpages load quickly no matter how many users are making the same request.
The part I am confused about is the TASK each server does. From what I am seeing online in diagrams and such, it seems like there are multiple servers that do different tasks, such as sending a view file (HTML), serving static content, or hosting the database (MySQL). But I also hear that doing this can be bad, since splitting your servers can complicate things, for example by having different DNS entries that each do different things. So what I need clarified is: do servers do all the things they need to do on ONE server? I mean, if a request asks for a view file, does it go to the same server as the one that handles requests for static images, or a POST request, or a GET request, etc.?
And regarding AWS instances, does that just mean each instance is another copy of the "setup" you have? Meaning one instance has Server A with Database Server A, and another instance is just another copy of it, so the second instance will also have Server A with Database Server A?
                                 Server A (does everything: view files,
                               / static assets, GET/POST requests, etc.)
                              /  Server B (MySQL database)
                             /
user ---> DNS ---> load b. ----- Server A (does everything, as above)
                             \   Server B (MySQL database)
                              \
                               \ Server A (does everything, as above)
                                 Server B (MySQL database)
So basically what I am trying to ask is: do all the servers the load balancer redirects to have the same task, or does the load balancer send requests to different, separate servers with different tasks? If the latter, how does it know when to send a request to the server for, let's say, serving static files? And if there were many requests to that server, how would the load balancer handle that?
And is each group of servers in my little diagram (Server A and Server B) kind of what an AWS instance is?
The answer is that the servers do whatever you configure them to do.
Sometimes people design their system so that every server can respond to any request, and sometimes they have separate servers for separate tasks (e.g. some servers handling mobile requests, some handling authentication requests, others handling web requests). In AWS terminology, these different groups of servers would be separate Target Groups, and the Application Load Balancer would be configured to send requests to different Target Groups based upon the content of the URL.
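As a sketch of that kind of URL-based routing, the rule below forwards requests matching a path pattern to a separate Target Group (the listener and target group ARNs, the /mobile/* pattern and the priority are placeholder assumptions):

```python
import boto3

elbv2 = boto3.client("elbv2")

# Placeholder ARNs: substitute the ARNs of your own listener and target group.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/my-alb/...",
    Priority=10,
    Conditions=[{
        "Field": "path-pattern",
        "PathPatternConfig": {"Values": ["/mobile/*"]},
    }],
    Actions=[{
        # Requests matching /mobile/* go to the mobile target group;
        # everything else falls through to the listener's default action.
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/mobile-servers/...",
    }],
)
```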
One thing, however... Database servers are never placed behind a load balancer like you have shown above. Only the application servers communicate with the database server. This makes a 3-tier architecture:
Load Balancer -> Application Servers -> Database
Normal practice is to put the database on a separate server (or a cluster of servers) so that it is independent of the application. This allows the application to scale by adding/removing app servers without impacting the database.
It is also worthwhile offloading static content to Amazon S3. This can be as simple as changing the URL in an <img src=...> tag. It improves bandwidth because Amazon S3 is massively scalable, and it means less traffic goes through the application servers, which improves scaling of the application.

How to carry out performance testing on a sticky-enabled, load-balanced web application?

Hi,
I have read a lot of blogs and tutorials, but I cannot figure out how to carry out performance testing on a cookie-based sticky web application which sits behind a reverse-proxy load balancer. I have 3 backend application servers serving the same instance of a shopping cart. A load balancer sits in front of them and directs the traffic.
Problem: when I send HTTP requests for performance analysis, the load balancer (which tracks the client IP through a cookie) redirects every request to the same backend server it was originally assigned to. I have the option of using IP spoofing, but it won't work when the backend servers are distributed across a WAN rather than a LAN. Moreover, each backend server has its own public IP address and sits behind a firewall.
Question: Is there a way JMeter can be configured to load test in this scenario, or is there another, better solution?
I much appreciate your thoughts and contributions.
Regards
Here are a few possible workarounds:
Point different JMeter instances directly at different backend hosts, bypassing the load balancer.
Use Distributed Testing with JMeter nodes somewhere in the cloud, e.g. Amazon micro instances, which are free-tier eligible. You can use the JMeter ec2 Script to simplify installation, configuration and execution.
Try using the DNS Cache Manager; it enables individual DNS resolution for each JMeter thread.