I would like to understand the difference between a WAF, a Security Group, and a routing table.
Let's say I have a VPC with 2 subnets (one of them private) and I want to deploy a web application (a UI, a backend service, and an RDS database). In this scenario, where do the WAF and security groups come into the picture?
Can someone help me understand a use case?
The HTTP protocol is built on top of the TCP protocol.
A WAF inspects HTTP traffic before it reaches your web application, in order to block malicious web traffic.
To put a WAF in front of a containerized application (running on ECS, for example) or an application running on EC2, you should place an Application Load Balancer in front of the application servers and associate the WAF with that load balancer.
If your application runs on Lambda you can do the same, but using API Gateway instead.
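With WAFv2 the association is a single API call. A minimal boto3 sketch, assuming a REGIONAL web ACL and an ALB already exist (both ARNs below are placeholders):

```python
import boto3

wafv2 = boto3.client("wafv2", region_name="us-east-1")

# Attach an existing regional web ACL to an existing ALB (placeholder ARNs)
wafv2.associate_web_acl(
    WebACLArn="arn:aws:wafv2:us-east-1:123456789012:regional/webacl/my-acl/EXAMPLE-ID",
    ResourceArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-alb/abc123",
)
```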
A Security Group accepts or blocks traffic for network protocols such as TCP, UDP, and ICMP, based on ports. Open up ports 443 and 80 if you want to expose your web application.
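As a minimal boto3 sketch of opening those two ports (the security group ID is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2")

# Allow inbound HTTP and HTTPS from anywhere; the group ID is a placeholder
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
         "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "HTTP"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "HTTPS"}]},
    ],
)
```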
Routing tables should be associated with your subnets so that the network traffic (TCP) knows where to go.
Best practice is to put your application servers and databases in private subnets (with routing tables that do not route traffic from the Internet), and then put e.g. an Application Load Balancer in the public subnets to accept traffic from the Internet and route it to your private subnets.
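To illustrate, a rough boto3 sketch of wiring up the public subnet that hosts the load balancer (all resource IDs are placeholders); the private subnets' route tables simply omit the internet gateway route:

```python
import boto3

ec2 = boto3.client("ec2")

# Public route table: send internet-bound traffic to the internet gateway
ec2.create_route(
    RouteTableId="rtb-0aaaaaaaaaaaaaaaa",    # placeholder public route table
    DestinationCidrBlock="0.0.0.0/0",
    GatewayId="igw-0bbbbbbbbbbbbbbbb",       # placeholder internet gateway
)

# Associate it with the public subnet that hosts the ALB
ec2.associate_route_table(
    RouteTableId="rtb-0aaaaaaaaaaaaaaaa",
    SubnetId="subnet-0ccccccccccccccccc",    # placeholder public subnet
)
```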
With this flow:
external world --> AWS API Gateway ---> VPC Link ---> Network Load Balancer ---> my single EC2 instance
How can I configure the AWS Network Load Balancer such that:
requests to https://myapp.com are routed to port 80 of my EC2 instance, and
requests to https://myapp.com/api/* are routed to port 3000 of my EC2 instance?
Currently I have only configured one listener on the NLB, which listens on port 80, and all traffic from the API Gateway is routed to port 80 of my EC2 instance.
I have found that in an Application Load Balancer you can configure "Rules" that map paths to different ports: Path based routing in AWS ALB to single host with multiple ports
Is this available with NLB?
This is not possible with the Network Load Balancer, because it operates on a level of the network stack that has no concept of Paths.
The NLB operates on Layer 4 and supports the protocols TCP and UDP. These essentially create a connection between ports on two machines that allows data to flow between them.
Paths, as in HTTP(S) paths, are a higher-layer concept and belong to the HTTP protocol. They're not available to the NLB, because it can only work with data that's guaranteed to be available at Layer 4.
You can use an Application Load Balancer as the target for your Network Load Balancer and then configure the path-based rules there, because the ALB is a Layer 7 load balancer and understands the application-layer protocol HTTP.
Here is a blog detailing this: Application Load Balancer-type Target Group for Network Load Balancer
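A rough boto3 sketch of that setup, assuming the ALB and NLB already exist (all ARNs and IDs are placeholders). ALB-type target groups use the TCP protocol, and the registered target ID is the ALB's ARN:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Target group whose single target is an Application Load Balancer
tg = elbv2.create_target_group(
    Name="alb-behind-nlb",
    TargetType="alb",
    Protocol="TCP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",           # placeholder VPC ID
)["TargetGroups"][0]

# Register the existing ALB (placeholder ARN) as the target
elbv2.register_targets(
    TargetGroupArn=tg["TargetGroupArn"],
    Targets=[{"Id": "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-alb/abc123"}],
)

# NLB listener (placeholder NLB ARN) forwarding everything to the ALB
elbv2.create_listener(
    LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/net/my-nlb/def456",
    Protocol="TCP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg["TargetGroupArn"]}],
)
```

The path-based rules (/api/* to port 3000, everything else to port 80) are then configured as listener rules on the ALB itself.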
When connecting to an AWS load balancer, I know each type handles different protocols, e.g. a Network LB handles TCP and UDP and an Application LB handles HTTP/S, but is it written into the application code somewhere which protocol the app should connect with?
I'm in DevOps and trying to better understand how these connections work, as I'm looking to move away from the Classic LB.
So if I wanted to know whether I should create an NLB or an ALB, should I be asking developers which protocol the application uses to connect to the internet?
Network Load Balancers (NLB) are Layer 4 load balancers, meaning that they can route TCP and UDP traffic. Since HTTP/HTTPS goes over TCP, NLBs can be used for HTTP/HTTPS traffic as well.
Application Load Balancers (ALB) are Layer 7 load balancers; they can route HTTP/HTTPS traffic only (no bare TCP traffic, no UDP).
Although both ALBs and NLBs can route HTTP/HTTPS traffic, the difference between them is that the ALB understands the HTTP protocol, meaning we can have routing rules based on HTTP headers, path variables, query params, etc. This is not possible with NLBs, which can only route based on source/destination IP addresses and ports.
As a rule of thumb, use an ALB if you have mainly web applications and need the features that an L7 load balancer can provide. Use an NLB for anything else.
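For example, a path-based listener rule only makes sense on an ALB. A minimal boto3 sketch (the listener and target group ARNs are placeholders):

```python
import boto3

elbv2 = boto3.client("elbv2")

# Route /api/* to a dedicated target group; everything else falls through
# to the listener's default action. Both ARNs are placeholders.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/my-alb/abc/def",
    Priority=10,
    Conditions=[{"Field": "path-pattern",
                 "PathPatternConfig": {"Values": ["/api/*"]}}],
    Actions=[{"Type": "forward",
              "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/api/123abc"}],
)
```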
I have a PHP + Apache application running in ECS with an Application Load Balancer sitting in front of it. Everything works fine, except when the application makes requests to itself those requests time out.
Let's say the URL to reach the application is www.app.com, and in PHP I use Guzzle to send requests to www.app.com, but those requests always time out.
I suspect it is a networking issue with the ALB, but I do not know how to go about fixing it. Any help please?
Thanks.
As you're using ECS, I would recommend replacing calls to the public load balancer with a service mesh instead, to keep all of your application's HTTP(S) traffic internal to the network. This improves both security and performance (latency is reduced). AWS has a product that integrates with ECS to provide this functionality, named App Mesh.
Alternatively, if you want to stick with your current setup, you will need to check the following:
If the ECS hosts are private, they will need to connect outbound via a NAT Gateway/NAT instance in the route table for the 0.0.0.0/0 route (see the sketch after this list). For Fargate this will depend on whether the container is public or private.
If the host/container is public, it will need the internet gateway added to its route table for the 0.0.0.0/0 route. Even if inbound access from the ALB to the host is private, the host will always speak outbound to the internet via an internet gateway.
Ensure that the inbound/outbound security groups allow access over HTTP or HTTPS.
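As a minimal boto3 sketch of the first check (resource IDs are placeholders), the private hosts' route table needs a default route through a NAT gateway:

```python
import boto3

ec2 = boto3.client("ec2")

# Private route table: send internet-bound traffic through the NAT gateway
# so private ECS hosts can reach the ALB's public DNS name. IDs are placeholders.
ec2.create_route(
    RouteTableId="rtb-0aaaaaaaaaaaaaaaa",
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId="nat-0bbbbbbbbbbbbbbbb",
)
```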
I have a website hosted on SiteGround, let's say www.test.com.
I created a subdomain xyz.test.com and routed its traffic to a backend server A through a load balancer hosted in a private subnet in a VPC. It works fine, since I created a listener which forwards traffic from port 443 to port 3000. Now I want to add a backend server B in the same private subnet, and I want traffic hitting port 444 of the ELB to be routed to this server: requests to xyz.test.com:444 should go to port 3010 of server B. In short, I want to route traffic to different instances behind a load balancer, where the URL is the same and only the ports differ.
How can I achieve this?
You want to setup an Application Load Balancer. From the documentation (emphasis mine):
A load balancer serves as the single point of contact for clients. The load balancer distributes incoming application traffic across multiple targets, such as EC2 instances, in multiple Availability Zones. This increases the availability of your application. You add one or more listeners to your load balancer.
A listener checks for connection requests from clients, using the protocol and port that you configure, and forwards requests to one or more target groups, based on the rules that you define. Each rule specifies a target group, condition, and priority. When the condition is met, the traffic is forwarded to the target group. You must define a default rule for each listener, and you can add rules that specify different target groups based on the content of the request (also known as content-based routing).
Some of the benefits over a Classic Load Balancer that may interest you are:
Support for path-based routing. You can configure rules for your listener that forward requests based on the URL in the request. This enables you to structure your application as smaller services, and route requests to the correct service based on the content of the URL.
Support for host-based routing. You can configure rules for your listener that forward requests based on the host field in the HTTP header. This enables you to route requests to multiple domains using a single load balancer.
Support for routing requests to multiple applications on a single EC2 instance. You can register each instance or IP address with the same target group using multiple ports.
Support for registering targets by IP address, including targets outside the VPC for the load balancer.
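For the scenario in the question, a rough boto3 sketch: two listeners on the same ALB, one per port, each forwarding to its own target group (all IDs, ARNs, and names are placeholders):

```python
import boto3

elbv2 = boto3.client("elbv2")
vpc_id = "vpc-0123456789abcdef0"    # placeholder
alb_arn = "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-alb/abc123"
cert_arn = "arn:aws:acm:us-east-1:123456789012:certificate/placeholder"

# One target group per backend port
tg_a = elbv2.create_target_group(Name="server-a", Protocol="HTTP", Port=3000,
                                 VpcId=vpc_id)["TargetGroups"][0]
tg_b = elbv2.create_target_group(Name="server-b", Protocol="HTTP", Port=3010,
                                 VpcId=vpc_id)["TargetGroups"][0]

# Register each backend instance (placeholder instance IDs)
elbv2.register_targets(TargetGroupArn=tg_a["TargetGroupArn"], Targets=[{"Id": "i-0aaaaaaaaaaaaaaaa"}])
elbv2.register_targets(TargetGroupArn=tg_b["TargetGroupArn"], Targets=[{"Id": "i-0bbbbbbbbbbbbbbbb"}])

# Listener on 443 -> server A (port 3000); listener on 444 -> server B (port 3010)
for port, tg in ((443, tg_a), (444, tg_b)):
    elbv2.create_listener(
        LoadBalancerArn=alb_arn,
        Protocol="HTTPS",
        Port=port,
        Certificates=[{"CertificateArn": cert_arn}],
        DefaultActions=[{"Type": "forward", "TargetGroupArn": tg["TargetGroupArn"]}],
    )
```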
I am trying to understand in which scenario I should pick a service registry over a load balancer.
From my understanding, both solutions cover much of the same functionality.
For instance, consul.io lists the following features:
Service Discovery
Health Checking
Key/Value Store
Multi Datacenter
Whereas a load balancer like Amazon ELB, for instance, has:
your EC2 instances can be configured to accept traffic only from the load balancer
accept traffic using the following protocols: HTTP, HTTPS (secure HTTP), TCP, and SSL (secure TCP)
distribute requests to EC2 instances in multiple Availability Zones
The number of connections scales with the number of concurrent requests that the load balancer receives
configure the health checks that Elastic Load Balancing uses to monitor the health of the EC2 instances registered with the load balancer so that it can send requests only to the healthy instances
You can use end-to-end traffic encryption on those networks that use secure (HTTPS/SSL) connections
[EC2-VPC] You can create an Internet-facing load balancer, which takes requests from clients over the Internet and routes them to your EC2 instances, or an internal-facing load balancer, which takes requests from clients in your VPC and routes them to EC2 instances in your private subnets. Load balancers in EC2-Classic are always Internet-facing.
[EC2-Classic] Load balancers for EC2-Classic support both IPv4 and IPv6 addresses. Load balancers for a VPC do not support IPv6 addresses.
You can monitor your load balancer using CloudWatch metrics, access logs, and AWS CloudTrail.
You can associate your Internet-facing load balancer with your domain name.
etc.
So in this scenario I am failing to understand why I would pick something like consul.io or netflix eureka over Amazon ELB for service discovery.
I have a hunch that this might be due to implementing client side service discovery vs server side service discovery, but I am not quite sure.
You should think about it as client side load balancing versus dedicated load balancing.
Client side load balancers include Baker Street (http://bakerstreet.io); SmartStack (http://nerds.airbnb.com/smartstack-service-discovery-cloud/); or Consul HA Proxy (https://hashicorp.com/blog/haproxy-with-consul.html).
Client side LBs use a service discovery component (Baker Street uses a stateless pub/sub service discovery mechanism; SmartStack uses ZooKeeper; Consul HA Proxy uses Consul) as part of their implementation, but they provide the health checking / end-to-end functionality you're probably looking for.
AWS ELB and Eureka differ on many points:
Edge Services vs Mid-tier Services
AWS ELB is a load balancing solution for edge services exposed to end-user web traffic. Eureka fills the need for mid-tier load balancing.
A mid-tier server is an application server that sits between the user's machine and the database server; it is where the processing takes place and the business logic is performed.
While you can theoretically put your mid-tier services behind the AWS ELB, in EC2 Classic you would expose them to the outside world and thereby lose all the usefulness of the AWS security groups.
Dedicated vs Client side Load Balancing
AWS ELB is also a traditional proxy-based load balancing solution, whereas with Eureka the load balancing happens at the instance/server/host level in a round-robin fashion: the client instances know all the information about which servers they need to talk to.
If you are looking for sticky-session load balancing (all requests from a user during a session are sent to the same instance), which AWS now offers, Eureka does not offer a solution out of the box.
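Conceptually, a Eureka-style client keeps a local copy of the registry and rotates through it itself. A minimal Python sketch of that idea (purely illustrative, not Eureka's actual client API):

```python
import itertools
import urllib.request

class ClientSideLoadBalancer:
    """Round-robin over a locally cached snapshot of the service registry."""

    def __init__(self, instances):
        self._instances = list(instances)         # e.g. ["10.0.1.5:8080", ...]
        self._cycle = itertools.cycle(self._instances)

    def refresh(self, instances):
        # In Eureka this corresponds to the periodic (delta) registry fetch
        self._instances = list(instances)
        self._cycle = itertools.cycle(self._instances)

    def get(self, path):
        host = next(self._cycle)                  # pick the next instance, round robin
        with urllib.request.urlopen(f"http://{host}{path}") as resp:
            return resp.read()

# Usage: lb = ClientSideLoadBalancer(["10.0.1.5:8080", "10.0.1.6:8080"])
#        body = lb.get("/health")
```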
Load Balancer Outages
Another important aspect that differentiates proxy-based load balancing from load balancing using Eureka is that your application can be resilient to the outages of the load balancers since the information regarding the available servers is cached on the Eureka client.
This does require a small amount of memory, but buys better resiliency. The Eureka client gets all the registry information at once, and in subsequent requests to the Eureka server it only receives the delta, i.e. the changes in the registry information rather than the whole registry. Also, Eureka servers can operate in cluster mode, where each peer is not affected by the performance of other peers.
Scale and convenience
Also, imagine 1000s of microservices running, each with multiple instances. You would require 1000 ELBs, one for each microservice, or something like HAProxy sitting behind the ELB to make Layer 7 decisions based on the hostname, etc., and then forward the traffic to a subset of instances. With Eureka, you only deal with the application name, which is far less complicated.
A service discovery component usually has a notification mechanism. It is not a load balancer, even though some have the capability to act as one. It can notify registered clients about changes, for example a load balancer going down.
A client can query a service discovery/registry to get a load balancer that is running, whereas a load balancer does not notify a client when it is down.
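To make the distinction concrete, here is a toy in-memory registry with the notification behavior described above (purely illustrative, not any real product's API):

```python
class ServiceRegistry:
    """Toy registry: clients register watchers and are notified on changes."""

    def __init__(self):
        self._services = {}    # service name -> set of healthy endpoints
        self._watchers = []    # callbacks invoked whenever a service changes

    def register(self, name, endpoint):
        self._services.setdefault(name, set()).add(endpoint)
        self._notify(name)

    def deregister(self, name, endpoint):
        # e.g. a health check noticing that a load balancer went down
        self._services.get(name, set()).discard(endpoint)
        self._notify(name)

    def lookup(self, name):
        return sorted(self._services.get(name, ()))

    def watch(self, callback):
        self._watchers.append(callback)

    def _notify(self, name):
        for callback in self._watchers:
            callback(name, self.lookup(name))
```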
You should also read about Eureka.
Amazon ELB routes your service requests to EC2 instances whose IP addresses are not consistent, so you can also use Eureka, which does the same job but is based on a service registry and client-side load balancing, in which the application client for each region has its own copy of the registry.
You can read more about it here:
https://github.com/Netflix/eureka/wiki/Eureka-at-a-glance