AWS Application Load Balancer WebSocket metadata-based stickiness?

We run a clustered service. Clients connect to the cluster via WebSocket, and each client is routed to a node based on the group it belongs to (let's call it a "conference").
In other words, a whole group of clients (a conference) is served by one particular node, so the target node should be selected based on some metadata sent when initiating the WebSocket connection.
Client Tom Hanks connects to Actors conference -> LB routes to node EU Server
Client Tom Hanks connects to Tesla fans conference -> LB routes to node USA Server
Client Ada Zizkova connects to Actors conference -> LB routes to node EU Server
Client Ada Zizkova connects to Tesla fans conference -> LB routes to node USA Server
...
Notice that this is NOT HTTP-session-based stickiness; the HTTP session is the same for a given user across conferences.
All this is what we would like to have. But currently we are on the plain AWS Elastic Load Balancer, and we are about to implement this stickiness in-house and bypass the ELB.
Before doing that, I am looking into whether the ALB could do what I described above. I can't find anything except "Does an Application Load Balancer support WebSockets?", which looks like general connection stickiness (see the AWS docs).
How can I do a metadata-based WebSocket stickiness with ALB? (Or with something else in AWS).
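Conceptually, what we would like to express is something like the following sketch using the AWS SDK for JavaScript v3 (the ARNs, the X-Conference header name, and the per-conference target groups are all made-up placeholders; I don't know whether the ALB can actually be driven this way for our case):

```typescript
// Sketch: one ALB listener rule per conference, matching a custom header
// sent with the WebSocket upgrade request. All ARNs and the "X-Conference"
// header name are hypothetical placeholders.
import {
  ElasticLoadBalancingV2Client,
  CreateRuleCommand,
} from "@aws-sdk/client-elastic-load-balancing-v2";

const elb = new ElasticLoadBalancingV2Client({ region: "eu-west-1" });

await elb.send(
  new CreateRuleCommand({
    ListenerArn: "arn:aws:elasticloadbalancing:...:listener/app/...",
    Priority: 10,
    Conditions: [
      {
        Field: "http-header",
        HttpHeaderConfig: { HttpHeaderName: "X-Conference", Values: ["actors"] },
      },
    ],
    // Everyone in the "actors" conference lands on the EU node's target group.
    Actions: [
      {
        Type: "forward",
        TargetGroupArn: "arn:aws:elasticloadbalancing:...:targetgroup/eu-server/...",
      },
    ],
  })
);
```

This would need one rule and one target group per conference, so it only seems workable if the set of conferences is small and fairly static.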

For most applications, you can use the AWS ELB (Classic Load Balancer) with its "sticky sessions" feature.
By default, a Classic Load Balancer routes each request independently to the registered instance with the smallest load. However, you can use the sticky session feature (also known as session affinity), which enables the load balancer to bind a user's session to a specific instance. This ensures that all requests from the user during the session are sent to the same instance.
The key to managing sticky sessions is to determine how long your load balancer should consistently route the user's request to the same instance.
Also, the WebSockets connections are inherently sticky. If the client requests a connection upgrade to WebSockets, the target that returns an HTTP 101 status code to accept the connection upgrade is the target used in the WebSockets connection. After the WebSockets upgrade is complete, cookie-based stickiness is not used.
For more information, read the following doc on the AWS website:
https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-sticky-sessions.html
Alternatively, you can use the AWS ALB (Application Load Balancer), which supports WebSockets.
Just replace the ELB with the ALB and enable sticky sessions, as sketched below.
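Enabling stickiness is a single attribute change on the target group. A minimal sketch with the AWS SDK for JavaScript v3 (the target group ARN is a placeholder):

```typescript
// Sketch: enable cookie-based stickiness on an ALB target group.
import {
  ElasticLoadBalancingV2Client,
  ModifyTargetGroupAttributesCommand,
} from "@aws-sdk/client-elastic-load-balancing-v2";

const elb = new ElasticLoadBalancingV2Client({ region: "us-east-1" });

await elb.send(
  new ModifyTargetGroupAttributesCommand({
    TargetGroupArn: "arn:aws:elasticloadbalancing:...:targetgroup/websocket-tg/...",
    Attributes: [
      { Key: "stickiness.enabled", Value: "true" },
      { Key: "stickiness.type", Value: "lb_cookie" },
      // How long the load balancer keeps routing a client to the same target.
      { Key: "stickiness.lb_cookie.duration_seconds", Value: "86400" },
    ],
  })
);
```

Note that this gives per-client stickiness via a load-balancer cookie; it does not by itself route a whole conference to one node.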
The Application Load Balancer is designed to handle streaming, real-time, and WebSocket workloads in an optimized fashion. Instead of buffering requests and responses, it handles them in streaming fashion. This reduces latency and increases the perceived performance of your application.
For more information about AWS ALB, read the following doc on the AWS website:
https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html

Related

Does the global layer 7 HTTP(S) load balancer have an option for rate limiting?

I use a global HTTP(S) load balancer for backend services running on a Kubernetes cluster. I didn't find any information on how to limit the number of requests from one IP within a time window. There is Cloud Armor, but it only performs simple IP-, region-, and header-based access control. Could you please share how I can perform IP-based rate limiting on a global HTTP load balancer on Google Cloud to defend against DoS attacks?
Edit:
The backend service running on the Kubernetes cluster is a Symfony server with a web interface. I want to use Cloud CDN for the server, so I had to use the gce ingress instead of ingress-nginx. On Google Cloud, the gce ingress creates a global HTTP(S) load balancer, while ingress-nginx creates a TCP load balancer.
With ingress-nginx, I could simply use the nginx.ingress.kubernetes.io/limit-rps annotation, which helps limit floods of HTTP requests. I want a similar configuration on my global HTTP(S) load balancer. In the current setup, I observe that a flood of HTTP requests reaches the load balancer and is forwarded to the Symfony server, and at some point request latency increases, which makes the pod's liveness probe fail.
Cloud Armor is a WAF that you can configure to protect your service against DoS attacks, especially by blocking specific IPs.
Rate limiting isn't there to protect your service against DDoS. Indeed, if an attack floods your rate-limiting service, neither the valid IPs nor the bad IPs get served, because the service itself is flooded: that is the denial of service.
Rate limiting helps preserve resources for legitimate users. But some users may try to get around a constraint by using your APIs in an unintended (wrong/abusive) way.
For example, say you have a paid API to export all customers and a free API to request a single customer. A user could decide, "Hey, I don't want to pay; I'll call the single-customer API in a loop to build my daily report!" That is a misuse of the single-customer API, and you can protect against it with rate limiting.
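To make the idea concrete, here is a minimal in-memory token-bucket limiter of the kind such protection applies per API key or IP. It is only an illustrative sketch, not tied to any particular GCP product:

```typescript
// Sketch: per-client token bucket. Each client gets `capacity` tokens that
// refill at `refillPerSec`; a request is allowed only if a token is available.
interface Bucket {
  tokens: number;
  lastRefill: number; // ms timestamp of the last refill
}

class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();

  constructor(private capacity: number, private refillPerSec: number) {}

  allow(clientId: string): boolean {
    const now = Date.now();
    const b =
      this.buckets.get(clientId) ?? { tokens: this.capacity, lastRefill: now };
    // Refill proportionally to elapsed time, capped at capacity.
    const elapsedSec = (now - b.lastRefill) / 1000;
    b.tokens = Math.min(this.capacity, b.tokens + elapsedSec * this.refillPerSec);
    b.lastRefill = now;
    this.buckets.set(clientId, b);
    if (b.tokens >= 1) {
      b.tokens -= 1;
      return true; // request allowed
    }
    return false; // rate limited
  }
}

// Usage: a burst of 5 requests, refilling 1 token per second per client.
const limiter = new TokenBucketLimiter(5, 1);
console.log(limiter.allow("203.0.113.7")); // true until the bucket empties
```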
You can use Cloud Endpoints and ESP (Extensible Service Proxy). I wrote an article with an ESP deployed on Cloud Run, but you can deploy it on K8s as well.
You can also use API Gateway, the managed version of ESP, which will soon be pluggable into the HTTPS load balancer (so you can use it in addition to the WAF protection).

I want to deploy a multi-tier web app into AWS but don't understand how to set it up

I was hoping someone might be able to explain how I would set up a multi-tier web application. There is a database tier, an app tier, a web server tier, and then the client tier. I'm not exactly sure how to separate the app tier and the web server tier, since the app tier will be in a private subnet. I would have the client send requests directly to the app server, but the private subnet is a requirement, and keeping the app server separate from the web server is a requirement as well.
The only idea I have had is to serve the content from the web server and have the client send all subsequent requests to that same web server on another port, say port 3000. If a request arrives on that port, a Node app using Express forwards it to the app tier, since the web server can talk to the app server.
I set up a small proof of concept doing this: the web server serves the content, another Express app listens on port 3000, the client sends requests to port 3000, and that app just relays them unchanged to the app server.
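For reference, the forwarder is roughly this (simplified sketch; the internal ALB hostname is a placeholder):

```typescript
// Sketch of the port-3000 forwarder: every request is re-issued verbatim
// against the internal ALB. "internal-app-alb.example.local" is a placeholder.
import express from "express";

const APP_ALB = "http://internal-app-alb.example.local";
const app = express();
app.use(express.json());

app.all("/api/*", async (req, res) => {
  // Re-issue the same method/path/body against the internal app-tier ALB.
  const upstream = await fetch(`${APP_ALB}${req.originalUrl}`, {
    method: req.method,
    headers: { "content-type": "application/json" },
    body: ["GET", "HEAD"].includes(req.method)
      ? undefined
      : JSON.stringify(req.body),
  });
  res.status(upstream.status).send(await upstream.text());
});

app.listen(3000);
```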
This is my current setup, with each web server hosting two processes: one serving the frontend on port 80 and one receiving requests on port 3000. The server listening on port 3000 forwards all requests to the app server's ALB (it's basically a copy of all the routes on the app server, except it forwards each request instead of performing the action). But is there a way to avoid this extra hop in the middle and get rid of the additional server listening on 3000, without exposing the internal ALB?
To separate your web servers and application servers, you can use a VPC with public and private subnets. In fact, this is such a common scenario that Amazon has already provided us with documentation.
As for a "better way to do this," I assume you mean security. Here are some options:
You can (and should) run host-based firewalls such as iptables on your hosts.
AWS also provides a variety of options.
You can use Security Groups, which are stateful firewalls for your hosts (see the sketch after this list).
You can also use Network Access Control Lists (ACLs), which are stateless firewalls used to control traffic in and out of subnets.
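As a concrete illustration of the security-group option, the app tier's group can admit traffic only from the web tier's group rather than from a CIDR range. A sketch with the AWS SDK for JavaScript v3 (both group IDs and the port are placeholders):

```typescript
// Sketch: allow the app tier to be reached only from the web tier's
// security group on port 3000. Group IDs are placeholders.
import {
  EC2Client,
  AuthorizeSecurityGroupIngressCommand,
} from "@aws-sdk/client-ec2";

const ec2 = new EC2Client({ region: "us-east-1" });

await ec2.send(
  new AuthorizeSecurityGroupIngressCommand({
    GroupId: "sg-0appTier000000000", // app-tier security group
    IpPermissions: [
      {
        IpProtocol: "tcp",
        FromPort: 3000,
        ToPort: 3000,
        // Reference the web tier's group instead of a CIDR block.
        UserIdGroupPairs: [{ GroupId: "sg-0webTier000000000" }],
      },
    ],
  })
);
```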
AWS would also argue that many shops can improve their security posture by using managed services, so that all of the patching and maintenance is handled by AWS. For example, static content could be hosted on Amazon S3, with dynamic content provided by microservices behind API Gateway. Finally, from a security perspective, AWS provides services like Trusted Advisor, which can help you find and fix common security misconfigurations.

Load balancing gRPC requests using one of AWS Load Balancers

I'm trying to work out whether I could use one of the (A/E/N)LBs to load balance gRPC traffic. A simple round robin would suffice in our case.
I've read that the ALB doesn't fully support HTTP/2 and therefore can't be used with gRPC. Specifically, lack of support for sending HTTP/2 traffic downstream and lack of support for trailer headers were mentioned. Is that still true?
I couldn't find any definitive answers with regard to NLBs or "classic" ELBs. Any hints?
As of October 29, 2020, Application Load Balancers now support HTTP/2 and gRPC load balancing. From the announcement:
To use the feature on your ALB, choose HTTPS as your listener protocol, gRPC as the protocol version for your target group and register instance or IP as targets for the configured target group. ALB provides rich content based routing features that will let you inspect gRPC calls and route them to the appropriate target group based on the service and method requested. Within a target group, ALB will use gRPC specific health checks to determine availability of targets and provide gRPC specific access logs to monitor your traffic.
The support for gRPC and end-to-end HTTP/2 is available for existing and new Application Load Balancers at no extra charge in all AWS Regions. To learn more, please refer to the blog post, demo, and the ALB documentation.
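In practice the gRPC side of this is mostly a target-group setting. A rough sketch with the AWS SDK for JavaScript v3 (the name, port, and VPC ID are placeholders):

```typescript
// Sketch: create a target group with the gRPC protocol version and a
// gRPC-aware health check. The VPC ID, name, and port are placeholders.
import {
  ElasticLoadBalancingV2Client,
  CreateTargetGroupCommand,
} from "@aws-sdk/client-elastic-load-balancing-v2";

const elb = new ElasticLoadBalancingV2Client({ region: "us-east-1" });

await elb.send(
  new CreateTargetGroupCommand({
    Name: "grpc-backend",
    Protocol: "HTTP",
    ProtocolVersion: "GRPC", // the key setting for gRPC targets
    Port: 50051,
    VpcId: "vpc-0123456789abcdef0",
    TargetType: "ip",
    HealthCheckProtocol: "HTTP",
    // For GRPC target groups, the matcher checks gRPC status codes.
    Matcher: { GrpcCode: "0-99" },
  })
);
```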
Using gRPC on AWS used to have some major challenges. Without full HTTP/2 support on the AWS Application Load Balancer, you had to spin up and manage your own load balancers. Neither NLB nor ELB was a viable alternative on AWS, due to issues with traffic to and from the same host, dynamic port mappings, SSL termination complications, and sub-optimal client- and server-side round-robining of TCP connections.
gRPC demonstrated performance improvements; however, it would take considerable infrastructure effort to adopt, whether by running load balancers such as Nginx or Envoy, or by setting up a service mesh with something like Istio. Another possibility would be thick-client load balancing, though this also requires additional service-discovery infrastructure such as Consul or ZooKeeper.
AWS recently announced a new service called AWS App Mesh. AWS App Mesh supports HTTP/2 and gRPC services.
Services speaking HTTP/2 and gRPC can now model and manage their inter-service communications using AWS App Mesh.
Reference:
https://aws.amazon.com/about-aws/whats-new/2019/11/aws-app-mesh-now-supports-http2-and-grpc-services/
https://aws.amazon.com/app-mesh/
https://docs.aws.amazon.com/app-mesh/latest/userguide/what-is-app-mesh.html

How to create a scalable WebSocket application using AWS ELB?

I am developing a WebSocket application and I have doubts about how to make it scalable.
1 - Should I use Nginx? And if so, where does Nginx sit? It would be like this:
ELB -> Nginx -> Ec2 instances
or
Nginx -> ELB -> Ec2 instances
2 - Is it necessary to use a service like Redis for communication between servers? Example: I am connected to server1 and my friend is connected to server2, but we are in the same chat room. If I send a message, it needs to reach my friend.
3 - Is it possible to have my ELB receive only HTTPS calls while the conversation with the backend stays HTTP? I ask because I use OpsWorks, and it was very hard to normalize the cookbooks that create my environment.
Thank you.
Generally the architecture looks like:
ALB --> nginx1, nginx2 --> ALB --> ec2 websocket server1, server2
This allows your web servers and app servers to be load balanced independently of each other.
Not necessarily Redis itself, but for your chat example you do need some shared channel between the servers. Redis is primarily an in-memory data store used for caching, but its pub/sub feature is a common way to relay messages between WebSocket servers.
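A common pattern is to have every WebSocket server subscribe to a Redis channel per room and publish incoming messages to that channel. A minimal sketch using the ioredis and ws packages (the room name and port are made up):

```typescript
// Sketch: relay chat messages across WebSocket servers via Redis pub/sub.
// Every server publishes incoming messages and forwards subscribed ones.
import Redis from "ioredis";
import { WebSocketServer, WebSocket } from "ws";

const pub = new Redis(); // publishing connection
const sub = new Redis(); // a subscribed connection can't publish, so use two
const wss = new WebSocketServer({ port: 8080 });

sub.subscribe("room:example");
sub.on("message", (_channel, message) => {
  // Fan the message out to every client connected to *this* server.
  for (const client of wss.clients) {
    if (client.readyState === WebSocket.OPEN) client.send(message);
  }
});

wss.on("connection", (socket) => {
  socket.on("message", (data) => {
    // Publish to Redis so clients on *other* servers receive it too.
    pub.publish("room:example", data.toString());
  });
});
```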
Yes - you can terminate SSL on the ALB, and it is in fact the recommended approach, in order to offload SSL processing to the load balancer rather than doing it on the instances themselves. See https://docs.aws.amazon.com/elasticloadbalancing/latest/application/create-https-listener.html . An additional benefit is that you can use ACM to issue certificates for free and deploy them on the ALB; ACM can handle renewals for you automatically as well.
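For completeness, terminating TLS at the ALB while talking plain HTTP to the backend is just a listener configuration. A sketch with the AWS SDK for JavaScript v3 (all ARNs are placeholders):

```typescript
// Sketch: HTTPS listener on the ALB terminating TLS with an ACM certificate,
// forwarding to a target group that itself uses plain HTTP. ARNs are placeholders.
import {
  ElasticLoadBalancingV2Client,
  CreateListenerCommand,
} from "@aws-sdk/client-elastic-load-balancing-v2";

const elb = new ElasticLoadBalancingV2Client({ region: "us-east-1" });

await elb.send(
  new CreateListenerCommand({
    LoadBalancerArn: "arn:aws:elasticloadbalancing:...:loadbalancer/app/...",
    Protocol: "HTTPS",
    Port: 443,
    Certificates: [{ CertificateArn: "arn:aws:acm:...:certificate/..." }],
    DefaultActions: [
      {
        Type: "forward",
        TargetGroupArn: "arn:aws:elasticloadbalancing:...:targetgroup/...",
      },
    ],
  })
);
```

Because the target group behind the forward action keeps Protocol HTTP, this is exactly the HTTPS-in, HTTP-to-backend split asked about in question 3.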

Is any AWS service suitable for sending real-time updates to the browser?

I'm developing a stocks app and have to keep users' browsers updated with pricing changes.
I don't need access to past data; the browser just has to get the current data whenever it changes.
Is it possible to filter a DynamoDB stream and expose an endpoint (behind API Gateway) that could be used with a JavaScript EventSource?
I realize this is not using Server-Sent Events, but AWS just announced serverless WebSocket support for API Gateway. Pricing is based on minutes connected and the number of messages sent.
Product Launch Article: https://aws.amazon.com/about-aws/whats-new/2018/12/amazon-api-gateway-launches-support-for-websocket-apis/
Documentation: https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-websocket-api.html
Pricing: https://aws.amazon.com/api-gateway/pricing/
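Pushing a price update from the backend to a connected browser then looks roughly like this (a sketch; the endpoint URL and connection ID are placeholders, and in practice the connection ID is captured on the $connect route):

```typescript
// Sketch: push a message to a client connected to an API Gateway
// WebSocket API. Endpoint URL and connection ID are placeholders.
import {
  ApiGatewayManagementApiClient,
  PostToConnectionCommand,
} from "@aws-sdk/client-apigatewaymanagementapi";

const client = new ApiGatewayManagementApiClient({
  endpoint: "https://abc123.execute-api.us-east-1.amazonaws.com/prod",
});

await client.send(
  new PostToConnectionCommand({
    ConnectionId: "Jz1AbcdEFGHIJk=",
    Data: Buffer.from(JSON.stringify({ symbol: "AMZN", price: 1780.75 })),
  })
);
```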
API Gateway is a store-and-forward service. It collects the response from whatever the back-end may happen to be (Lambda, an HTTP server, etc.) and then returns it en bloc to the browser -- it doesn't stream the response, so it would not be suited for use as an EventSource backend.
AWS doesn't currently have a managed service offering that is obviously suited to this use case... you'd need a server (or more than one) on EC2, consuming the data stream and relaying it back to the connected browsers.
Assuming that running EC2 servers is an acceptable option, you then need HTTPS and load balancing. The Application Load Balancer supports WebSockets, so it might also support an EventSource. A Classic ELB in TCP (not HTTP) mode should support an EventSource without a problem, though it might not correctly signal to the back-end when the browser connection is lost. Both of those balancers can also offload HTTPS for you. A Network Load Balancer would definitely work for balancing an EventSource, but your instances would need to provide the HTTPS themselves, since NLB doesn't offload it for you.
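For completeness, the EC2-side relay behind such a balancer would just be an ordinary Server-Sent Events endpoint. A minimal sketch with Express, with the price feed faked by a timer:

```typescript
// Sketch: minimal Server-Sent Events endpoint. A real version would relay
// a market-data stream instead of the setInterval below.
import express from "express";

const app = express();

app.get("/prices", (req, res) => {
  res.set({
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  res.flushHeaders();

  // Fake feed: emit a price every second until the browser disconnects.
  const timer = setInterval(() => {
    const tick = { symbol: "AMZN", price: Math.random() * 2000 };
    res.write(`data: ${JSON.stringify(tick)}\n\n`);
  }, 1000);

  req.on("close", () => clearInterval(timer));
});

app.listen(8080);
```

On the browser side, new EventSource("/prices") consumes this.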
A somewhat unorthodox alternative might actually be AWS IoT, which has built-in WebSocket support... not the same as an EventSource, of course, but a streaming connection nonetheless. In such an environment, I suppose each browser user could be an addressable "thing."