What's the target group port for, when using Application Load Balancer + EC2 Container Service - amazon-web-services

I'm trying to set up an ALB that listens on port 443, load balancing to ECS Docker containers on random ports. Let's say I have 2 container instances of the same task definition, listening on ports 30000 and 30001.
When I try to create a target group in the AWS EC2 Management console, there's a "port" input field with 1-65535 range.
What number should I put there?
And when I try to create a new service in the AWS EC2 Container Service console, together with a new target group connected to an existing ALB, there's no input field for a target group "port".
After it's created, navigating to the EC2 console, the new target group has port "80".
Do I have to listen on port 80?
But the health check happens against the "traffic port", which is the container port, 30000 and 30001, so what's the point?

Turns out, when combined with ECS, the target group's port doesn't mean anything. You don't need to listen on that port.

I ran into this situation myself at work. I noticed the target group port and the port of the registered instance were different. I'd typically set them up to be the same thing, so I wondered what this was all about, which led me to this thread. I couldn't find a good answer in the AWS docs, but found this in the Terraform docs for the aws_lb_target_group resource:
port - (Optional, Forces new resource) The port on which targets receive traffic, unless overridden when registering a specific target.
So, I guess it's just the default port used unless you override it. Makes sense.
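For illustration, here's a minimal Terraform sketch of that default behavior (the resource and VPC names are hypothetical):

```hcl
# The target group's "port" is only a default target port.
# ECS overrides it per target when using dynamic port mapping.
resource "aws_lb_target_group" "app" {
  name     = "app-tg"
  port     = 80          # default; can be overridden at registration time
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id
}
```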

I think what he's referring to is the health checks. If your ELB is listening on port 443 but your target group is set for port 80, then every health check for the target group will attempt a request on port 80 and get redirected to port 443 by the load balancer. This results in a 301 code, which is considered unhealthy; only 200 codes are supposed to be considered healthy. At that point you either have all-unhealthy targets all the time, or you add 301 to the list of healthy codes, which defeats the whole point of health checks because port 80 will always return a 301. You might as well just match the ports.
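To sketch that fix in Terraform terms (names hypothetical): keep the target group's port and protocol matched to what the targets actually serve, and point the health check at the traffic port rather than a fixed one:

```hcl
resource "aws_lb_target_group" "web" {
  name     = "web-tg"
  port     = 443
  protocol = "HTTPS"
  vpc_id   = aws_vpc.main.id

  health_check {
    port    = "traffic-port"  # check the port the target actually receives traffic on
    path    = "/health"
    matcher = "200"           # widening this to "200,301" would hide real failures
  }
}
```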

By default, a load balancer routes requests to its targets using the protocol and port number that you specified when you created the target group. Alternatively, you can override the port used for routing traffic to a target when you register it with the target group.
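That per-target override is the same knob the Terraform docs quoted above describe. A hypothetical example of registering a target on a non-default port:

```hcl
# Register one specific target on port 30000 even though the
# target group's default port is something else.
resource "aws_lb_target_group_attachment" "custom" {
  target_group_arn = aws_lb_target_group.app.arn
  target_id        = aws_instance.web.id
  port             = 30000  # overrides the target group's default for this target
}
```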

The port in the target group is used in conjunction with Auto Scaling groups, and if you ever plan to use those you want to use the right port from the start. Why? Because you cannot change it after the target group has been created, and auto scaling will simply not work if you set the port wrong.

Related

ELB health check fail AWS

My ELB health check fails all the time and I cannot figure out why (502 Bad Gateway).
I have an ECS cluster with a service that runs at least one Fargate task, a Node API listening on ports 3000 and 3001 (3000 for HTTP and 3001 for HTTPS, since I cannot use ports below 1024).
I have an Application Load Balancer that is listening on port 80. It forwards the traffic to a target group on port 3000.
This target group has target type "IP address", since I use Fargate rather than EC2 for my tasks.
So when a task starts, I correctly see the private IP of the task registering in the target group.
My health route is server_ip_address/health and it returns a classic 200 status code. This route works well, because I tried it directly against the public IP address of the task (quickly, before it stopped due to the failing health check) and it returned a 200. I also tried it through the ELB DNS name (my-elb.eu-west-1.elb.amazonaws.com/health) and it worked well too, so I don't understand why the health check fails.
Anyone know what I missed?
In the screenshot of your targets in the target group, the port is showing as 80, which means the load balancer (and health check) will attempt to connect to the Fargate container on port 80.
You mentioned that it should be served from port 3000, so you will need to ensure that the target group is listening on port 3000 instead. Once this is in place, and assuming the security group of the host allows inbound access, the 502 error should go away.
To be clear: the listener port is the port the client connects to, whereas the target port is the port the load balancer connects to your target on.
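Assuming the setup described above, a Terraform sketch of a Fargate-compatible target group on port 3000 (names hypothetical) would look like:

```hcl
resource "aws_lb_target_group" "api" {
  name        = "api-tg"
  port        = 3000      # must match the port the container listens on
  protocol    = "HTTP"
  target_type = "ip"      # Fargate tasks register by private IP, not instance ID
  vpc_id      = aws_vpc.main.id

  health_check {
    path = "/health"
  }
}
```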

aws ECS, ECS instance is not registered to ALB target group

I created an ECS service and it runs 1 ECS instance, and I can see the instance is registered as a target of the load balancer.
Now I trigger the Auto Scaling group (by just incrementing the desired instance count) to launch a new instance.
The instance is launched and added to the ECS cluster. (I can see it on ECS instances tab)
But the instance is not added to the ALB target. (I expect to see 2 instances in the following image, but I only see 1)
I can edit the Auto Scaling group's target group like the following.
Then I see the following.
But the health check fails. It seems the 80 port is not reachable.
Although I have port 80 open to the public in the security group for the instance. (Also, the instance created by the ECS service uses dynamic port mapping, but the instance created by the ASG does not.)
So AutoScalingGroup can launch new instance but my load balancer never gives traffic to the new instance.
I did try https://aws.amazon.com/premiumsupport/knowledge-center/troubleshoot-unhealthy-checks-ecs/?nc1=h_ls and it shows I can connect to port 80 from the host to the Docker container with something like curl -v http://${IPADDR}/health.
So it must be the case that there's something wrong with host port 80 (the load balancer can't connect to it).
But it can't be the security group setting, because the working instance and this non-working instance use the same SG.
Edit
Because I used dynamic mapping, my web server is running on some random port.
As you can see, the instance started by the ECS service has registered itself to the target group with a random port.
However, the instance started by the ASG has registered itself to the target group with port 80.
The instance will not be added to the target group if it's not healthy. So you need to fix the health check first.
From your first instance, your mapped port is 32769, so I assume that if this is the same target group and the same application, the port on the new instance should be 32769.
When you curl the endpoint (curl -I -v http://${IPADDR}/health), is the HTTP status code 200? If it is 200, the target should be healthy; if it's not, update the backend to return 200 or update the health check's expected HTTP status code.
I assume that you are also running ECS on both instances, so ECS creates a target group for each ECS service. Are you running some mix of services such that you need a target group on the Auto Scaling group as well? If you are running dynamic ports, then set the target group's health check to use the traffic port.
Now let's look at the official possible causes for HTTP 502 Bad Gateway.
Dynamic port mapping is a feature of container instances in Amazon Elastic Container Service (Amazon ECS). Dynamic port mapping with an Application Load Balancer makes it easier to run multiple tasks on the same Amazon ECS service on an Amazon ECS cluster.
With the Classic Load Balancer, you must statically map port numbers on a container instance. The Classic Load Balancer does not allow you to run multiple copies of a task on the same instance because the ports conflict. An Application Load Balancer uses dynamic port mapping so that you can run multiple tasks from a single service on the same container instance.
A manually created target group will not work with dynamic ports; you have to bind the target group to the ECS service.
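In Terraform terms, that binding looks roughly like this (names hypothetical). With a load_balancer block on the service, ECS registers each task's dynamically mapped host port with the target group for you:

```hcl
resource "aws_ecs_service" "web" {
  name            = "web"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.web.arn
  desired_count   = 2

  load_balancer {
    target_group_arn = aws_lb_target_group.app.arn
    container_name   = "web"
    container_port   = 80  # container port; ECS registers the dynamic host port itself
  }
}
```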
dynamic-port-mapping-ecs
HTTP 502: Bad Gateway
Possible causes:
The load balancer received a TCP RST from the target when attempting to establish a connection.
The load balancer received an unexpected response from the target, such as "ICMP Destination unreachable (Host unreachable)", when attempting to establish a connection. Check whether traffic is allowed from the load balancer subnets to the targets on the target port.
The target closed the connection with a TCP RST or a TCP FIN while the load balancer had an outstanding request to the target. Check whether the keep-alive duration of the target is shorter than the idle timeout value of the load balancer.
The target response is malformed or contains HTTP headers that are not valid.
The load balancer encountered an SSL handshake error or SSL handshake timeout (10 seconds) when connecting to a target.
The deregistration delay period elapsed for a request being handled by a target that was deregistered. Increase the delay period so that lengthy operations can complete.
http-502-issues
It seems you know the root cause, which is that port 80 is failing the health check, and that's why the instance is never added to the ALB. Here is what you can try.
First, check that your service is listening on port 80 on the new host. You can use a command like netcat:
nc -v localhost 80
Once you know that the service is listening, the recommended way to allow your ALB to connect to your host is to add a security group inbound rule on your instance allowing traffic from your ALB's security group on port 80.
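A Terraform sketch of that inbound rule (the security group names are hypothetical):

```hcl
# Allow the ALB's security group -- and only it -- to reach the
# instance on port 80.
resource "aws_security_group_rule" "from_alb" {
  type                     = "ingress"
  from_port                = 80
  to_port                  = 80
  protocol                 = "tcp"
  security_group_id        = aws_security_group.instance.id
  source_security_group_id = aws_security_group.alb.id  # not 0.0.0.0/0
}
```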

AWS Application Load Balancer health checks fail

I have an ECS Fargate cluster with an ALB to route traffic to it. The Docker containers are listening on port 9000.
My containers are accessible over the ALB DNS name via HTTPS. That works. But they keep getting stopped/deregistered from the target group and restarted, only to be in an unhealthy state immediately after they are registered in the target group again.
The ALB has only one listener on 443.
The security groups are set up so that the sg-alb allows outbound traffic on port 9000 to sg-fargate and sg-fargate allows all inbound traffic on port 9000 from sg-alb.
The target group is also setup to use port 9000.
I'm not sure what the problem is, or how to debug it.
Everything is set up with cdk. Not sure if that's relevant.
As it turns out, this was not a problem with the security groups. It was just a coincidence that it worked at the time when I changed them.
It seems the containers weren't starting fast enough to accept connections from the ALB when it started the health checks.
What helped:
changing healthCheckGracePeriod to two minutes
tweaking the health check parameters for the target group: interval, unhealthyThreshold, healthyThreshold
Also, in my application logs it looks like the service gets two health check requests at once. By default, the unhealthy threshold is set to 2, so maybe the service was marked unhealthy after only one round of health checks.
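The same two fixes, sketched in Terraform terms (the original setup used CDK, where the property names differ; all names here are hypothetical and other required arguments are elided):

```hcl
resource "aws_ecs_service" "api" {
  # ...
  health_check_grace_period_seconds = 120  # give containers time to boot
}

resource "aws_lb_target_group" "api" {
  # ...
  health_check {
    interval            = 30
    healthy_threshold   = 2
    unhealthy_threshold = 5  # tolerate a few slow starts before deregistering
  }
}
```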

AWS auto scaling targets in target groups for Network Load Balancers

I recently started using a Network Load Balancer which listens on port 80 and forwards traffic to my target group. My Auto Scaling group is configured to add any new targets to this target group.
However, my application on the target EC2 instances runs on port 8001, not 80, so my targets should register under port 8001 in the target group. The Auto Scaling configuration doesn't seem to support that: all new instances created by auto scaling are added as targets with port 80, and there is no way to specify which port should be used instead (8001 for me).
Any ideas how to implement this?
The port definition in the target group is the port definition you're looking for. The port in the target group is the port on which the targets receive traffic. The port on the listener is the port on which the load balancer listens for requests.
So you should set port 80 on the listener and port 8001 on the target group.
What kind of application are you using (web server, application server, ...)? Maybe an ALB would be more suitable for you, as it works on layer 7 of the OSI model and is therefore able to process HTTP headers, for example.
Back to your question: to be able to forward traffic to your EC2 instances, which run the application on port 8001, you have to set the port on your target group to 8001. The Auto Scaling group knows nothing about what application is running on the EC2 instances it provisions, nor about the ports used by that application.
So the final flow is like:
LB listens on port 80 and forwards traffic to target group on port 8001. This target group then sends traffic to its targets (your EC2 instances) on port 8001.
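That flow, as a minimal Terraform sketch (names hypothetical):

```hcl
# Targets receive traffic on 8001 ...
resource "aws_lb_target_group" "app" {
  name     = "app-tg"
  port     = 8001
  protocol = "TCP"
  vpc_id   = aws_vpc.main.id
}

# ... while clients connect to the NLB on 80.
resource "aws_lb_listener" "front" {
  load_balancer_arn = aws_lb.nlb.arn
  port              = 80
  protocol          = "TCP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}
```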
I was crying over this for hours and the answers here gave me a clue and finally, I found out what the heck is going on!
The story of ports can get complicated, but let's clarify them.
Four Ports in the story!
It's crucial to know that you are dealing with 4 ports here! So let's name them one by one.
(I am using ECS, but the same applies to anything else that wires your code to the ELB.)
P1: LB's "Listening" Port
The port where the Load Balancer receives traffic. Usually 80 or 443.
P2: TG's Port
The port where the Target Group is set to work on. It's baked into the Target Group at the time of creation of TG (as jpact mentioned) and shows in the description. You cannot change it later.
You, however, can set the health-check of the Target Group on a different port, but it doesn't help you have a working set-up.
P3: Container's "exposed" port
This is the port that the container is giving out and expects to receive traffic.
P4: The application's/host's port
This is the port your code (say a Node.js app) is actually listening to.
Want no pain? Set P2 = P3
The thing is P2 (TG's) and P3 (container's) can differ. In fact, you will not even face a challenge if they are set to arbitrarily different numbers, initially.
Application <> LB (e.g. in ECS)
When you register a Service's container to a TG, no one asks about ports, and it can work well. You just say "attach this container to that TG" and it automatically picks the container's port (P3) here.
And then if you go to your TG's page, you will see it is on the container's port (P3), which is a different port than the TG's (P2), but it works well and who cares!
Here comes Auto Scaling!
The headache begins when you add Auto Scaling with healthcheck!
ASG knows it should create instances on the TG. However, it needs to assign a port between the EC2 instance and the TG. And clearly, ASG doesn't know what container (P3) is going to be on it, so by default it picks the TG's port (P2). And this is where the craziness happens!
TL;DR
Set TG's port == Container's port. They can be anything you want, but make them match.
Below, between the container and your app (P4), you can have a mapping. Above, between the TG and the LB (P1), you have another port mapping. But P2 and P3 MUST match!
Go ahead, create new TGs with the container's port (P3), wire them to above (ELB) and below (Services), and hopefully it will all work well!
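The P1/P2 half of that mapping might look like this in Terraform (names and ports are hypothetical; the P3 = P4 mapping lives in the task definition's portMappings):

```hcl
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443                           # P1: clients connect here
  protocol          = "HTTPS"
  certificate_arn   = aws_acm_certificate.cert.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.svc.arn
  }
}

resource "aws_lb_target_group" "svc" {
  name     = "svc-tg"
  port     = 9000            # P2: keep equal to the container's port (P3)
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id
}
```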
PS. Apparently, you cannot change a Service's TG! So new Service as well... :)

Health check on container port and host port in ECS + ALB

I have a problem with my deployment in ECS.
I try to deploy 4 instances of 2 docker images on 2 EC2 instances with an ALB in front.
So in my tasks definitions, I use the dynamic port mapping (2 Nginx on container port 80).
This creates trouble in the health check of my target group.
In fact, for each instance, I have a health check on the dynamic port (which is OK) and on the container port (80).
So the dynamic port says it's OK, and the container port, logically, says unhealthy...
(Like in my screenshot)
So can you help me find why I have this type of error? (This error makes my server terminate every 5 minutes.)
Thanks in advance for your help :D
So to me it looks like you aren't completely using dynamic port mapping. For dynamic port mapping you have
Client -> ALB (port 80) -> EC2 host (dynamic port) -> container (dynamic port) -> nginx (port 80)
None of your health checks should be hitting port 80, since the only thing that uses port 80 is the external connection into your application, and nginx (but it is mapped to a different port). For ALB health checks, all you really need is a path to hit; the port will default to the port the load balancer connects on.
See the host port mapping in this doc: http://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_PortMapping.html
ALB Health Check Docs: http://docs.aws.amazon.com/elasticloadbalancing/latest/application/target-group-health-checks.html
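For reference, dynamic port mapping is requested by setting hostPort to 0 in the task definition; a minimal Terraform sketch (names hypothetical):

```hcl
resource "aws_ecs_task_definition" "nginx" {
  family = "nginx"

  container_definitions = jsonencode([{
    name   = "nginx"
    image  = "nginx:latest"
    memory = 128
    portMappings = [{
      containerPort = 80
      hostPort      = 0  # 0 = dynamic: ECS picks an ephemeral host port per task
    }]
  }])
}
```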
I found the solution with the AWS support.
So there are two problems here:
To disable the health check that kills the EC2 instance, go to the Auto Scaling group and switch the health check type to "EC2"
To remove the health check on port 80, go to the Auto Scaling group and, under the "Target groups" section, remove the target groups managed by ECS
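The same two changes, sketched in Terraform (the attribute values shown are the fix; other required ASG arguments are omitted, and all names are hypothetical):

```hcl
resource "aws_autoscaling_group" "ecs" {
  # ...
  health_check_type = "EC2"  # not "ELB": let ECS-managed target groups do app checks
  target_group_arns = []     # don't attach ECS-managed target groups to the ASG
}
```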