I have created an AWS Network Load Balancer with a TCP:80 (HTTP) listener. This listener forwards requests to a Target Group called "My-TargetGroup".
I have created a Task Definition that points to a Docker image of a Spring Boot service that runs on port 8080. When I created the ECS Service, I selected "My-TargetGroup", with the listener port at 80.
I can see that my ECS Service has one Task running successfully. However, I do not know how to test whether the NLB is able to forward requests to the underlying Spring Boot service. For example, my Spring Boot API has the endpoint myapi/faq. How do I call this API through curl? Basically, I will be calling this API endpoint over HTTP/HTTPS, so I now want to test it as a GET call over HTTPS.
You can use the netcat command for a variety of connectivity tests. Here is the syntax:
nc -v {host} {port}
With the -v (verbose) option, you should see output if your server socket returns something on connection.
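If netcat is not installed on the machine you are testing from, the same TCP-level check can be sketched in Python. This is a minimal sketch using only the standard library; the function name can_connect is made up for illustration, and the host and port are whatever your NLB exposes.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (replace the placeholder with your NLB DNS name):
#   can_connect("<your-nlb-dns-name>", 80)
```

A True result only shows that the listener accepts TCP connections; it does not prove that the Spring Boot endpoint behind it answers correctly.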
Before reaching for curl, I suggest verifying things on the infrastructure side first; curl on its own will not help you debug the issue. curl may work in your case, but a Network Load Balancer normally operates at the transport layer (layer 4), so you can test with telnet:
telnet lb_endpoint 80
But what you really need to check is whether the target is healthy.
If the target is healthy, the application is running and the LB should respond; if it does not, check the security group of the LB.
If the target is unhealthy, something is wrong with the ECS service, so there is no need to debug the LB.
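To make the "is the target healthy?" check concrete, here is a hedged sketch that summarizes a response shaped like the one returned by the Elastic Load Balancing describe_target_health call. The helper name unhealthy_targets and the sample IP addresses are invented for illustration; the boto3 call is shown only as a comment because it needs credentials and your real target group ARN.

```python
from typing import Any


def unhealthy_targets(response: dict[str, Any]) -> list[tuple[str, str, str]]:
    """List (target id, state, reason) for every target that is not healthy."""
    problems = []
    for desc in response.get("TargetHealthDescriptions", []):
        health = desc.get("TargetHealth", {})
        state = health.get("State", "unknown")
        if state != "healthy":
            problems.append((desc["Target"]["Id"], state, health.get("Reason", "")))
    return problems


# With boto3 installed and credentials configured, the response could come from:
#   import boto3
#   response = boto3.client("elbv2").describe_target_health(TargetGroupArn=tg_arn)
# where tg_arn is the ARN of your target group (not given in the question).
sample = {
    "TargetHealthDescriptions": [
        {"Target": {"Id": "10.0.1.15", "Port": 8080},
         "TargetHealth": {"State": "healthy"}},
        {"Target": {"Id": "10.0.2.27", "Port": 8080},
         "TargetHealth": {"State": "unhealthy",
                          "Reason": "Target.FailedHealthChecks"}},
    ]
}
print(unhealthy_targets(sample))
# prints [('10.0.2.27', 'unhealthy', 'Target.FailedHealthChecks')]
```

An empty list suggests looking at the LB side next; a non-empty list points back at the ECS service.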
Related
The issue I currently have is that I used to be able to connect properly to my RabbitMQ cluster hosted on AWS MQ. After I changed its IP visibility to private, I had to create some configuration to access the cluster from outside the VPC.
Current example of how the cluster is accessed:
mq.example.com -> Load balancer (w/target group to cluster host IP & TLS port 5671) in public VPC -> Cluster in private VPC.
I've done the same thing for the web console. The web console works perfectly, so the issue isn't necessarily with the load balancing or with a certificate. I then checked whether the issue could be in the code I wrote, but that is not the case either, since the services sometimes connect and sometimes don't. When it fails, it throws the error: "Socket closed abruptly during opening handshake".
I think I know where the issue may arise from, but I don't have a clear view of how to solve it. I believe it has to do with the fact that the service has to go through the load balancer before it can connect to the RabbitMQ cluster. I just don't know what to do about it, and most documentation on amqplib is obscure as it is. I haven't found any (documented) similar issue with AWS MQ and a load balancer.
So my question, specifically, is: how can I resolve the fact that my services sometimes connect and sometimes don't connect to the cluster when they go through the load balancer?
Good to know: I use AWS MQ for rabbit, amqplib for the client connection, amqps as the protocol, web console works with the same setup but services don't.
For people who run into this issue later on, I have found a solution:
When creating a Network Load Balancer to route traffic to your cluster, you have to assign it a target group. Make sure NOT to do the following: do not register both port 5671 (amqps) and port 443 (web console) in the same target group. Doing so causes routing issues like the one described above.
Instead do the following:
Create two target groups in the AWS EC2 console:
TG1: Register: TLS - 443 (web console)
TG2: Register: TLS - 5671 (amqps)
Your NLB, which is configured with simple routing and an alias for IPv4 connections, then needs the following listeners:
Listener 1: TLS - 443 and assign it to TG1
Listener 2: TLS - 5671 and assign it to TG2
This ensures that whenever a microservice connects to the cluster, there is no confusion about which port the traffic is routed to.
You can then connect to your web console with your subdomain:
e.g. webconsole.example.com
and to your services: e.g. amqps://cluster.example.com:5671 as host (how your host is formatted depends on the library you're using on the client side)
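Since the original symptom was intermittent ("sometimes it connects, sometimes it doesn't"), it can help to repeat the TLS handshake against the amqps listener several times and count the failures. The following is a rough sketch using only the Python standard library; the function names are made up, and it only tests the TCP connection and TLS handshake, not the AMQP protocol itself.

```python
import socket
import ssl


def tls_handshake_ok(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection and TLS handshake to host:port succeed."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # wrap_socket performs the TLS handshake and verifies the certificate.
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # ssl.SSLError (including certificate errors) is a subclass of OSError.
        return False


def count_failures(host: str, port: int, attempts: int = 20,
                   timeout: float = 5.0) -> int:
    """Count failed handshakes over several attempts to expose flaky routing."""
    return sum(not tls_handshake_ok(host, port, timeout) for _ in range(attempts))


# Example call (hostname taken from the answer above):
#   count_failures("cluster.example.com", 5671)
```

If the count is above zero with a mixed target group and drops to zero after splitting the target groups, that supports the explanation given above.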
I have been at this for a couple of days and just can't figure it out.
I have tried this with gRPC in Node.js and Java on Elastic Beanstalk. On a normal VPS it's quite simple: just create a proxy with grpc_pass and it's set. I would like to move my microservices over to AWS Elastic Beanstalk but can't get gRPC to connect.
What I did:
Created a new Java environment on Elastic Beanstalk and deployed my service. The gRPC server is on port 9086.
I have looked around the net, and the closest thing I could find to a tutorial is "New – Application Load Balancer Support for End-to-End HTTP/2 and gRPC", but it does not cover how to set up the load balancer for gRPC for an instance.
Using the guide I made a few changes to the Target group like so:
Created a Target Group using the instances configuration
I have tried building the target group with both HTTP and HTTPS for port 9086.
After creating the target group, I registered the instance in the target group.
After that, I went to the load balancer, created a listener on port 443, and forwarded it to the target group. Port 443 is also open in the security policy.
The secure listener settings point to the AWS certificate allocated to the URL.
I have tried both HTTP and HTTPS on the target group on port 9086, but all my gRPC client calls fail with either status 13 (INTERNAL) or 14 (UNAVAILABLE), meaning the request is not going through. I have confirmed in the logs that the gRPC server is up and running.
Does anybody know where I am going wrong here? I feel like it's something simple that I am missing; I just can't find any tutorials or documentation on the proper way to set this up. Is what I am trying to do even possible on AWS Elastic Beanstalk?
From what I see in your screenshots, your ALB targets were added but did not pass the health check, which means they are not allowed to accept any traffic yet.
You can find a good sample of a gRPC application with an implemented health check in the attached file in this article:
https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/deploy-a-grpc-based-application-on-an-amazon-eks-cluster-and-access-it-with-an-application-load-balancer.html#attachments-abf727c1-ff8b-43a7-923f-bce825d1b459
When I send requests using the ALB's DNS host, the listener's path, and the web service's endpoint path, I don't get a response within the expected timeframe. I determined the expected timeframe by sending requests directly to each of the tasks using their public IP addresses; those requests return successful responses.
For example:
The ALB's DNS entry: http://myapp-alb-11111111.us-west-1.elb.amazonaws.com
The web app, "abc", listens on port 80 for requests on "/api/health".
The web app is using "abc-svc/*" as the path in the listener.
The web app was assigned a public ip address of 10.88.77.66.
Sending a GET request to 'http://10.88.77.66/api/health' is successful.
Sending a GET request to 'http://myapp-alb-11111111.us-west-1.elb.amazonaws.com/abc-svc/api/health' does not return within several minutes, which is not expected behavior.
I've looked through the logs, but cannot find anything that is amiss. I'd appreciate any ideas or suggestions...
AWS CONFIGURATION
I have three Docker images running in ECS. Each image is assigned to a separate service. Each service has a single task. Port 80 is open in the security group from the Internet to the ALB. Port 80 is open from the ALB to each task. The ALB's listener for port 80 uses path-based routing. There is a separate, unique path for each service. Each task contains a Docker Linux container running a Spring Boot 2 web service. Each web service's router has a "/api/health" route that expects a GET request with no parameters and returns a simple string. We are not using HTTPS or SSL at this time.
Thank you for your time and interest.
Mike
There can be different reasons for that, but here are some common issues you can check:
Check the health check for each target group under the LB; if a target is unhealthy, the LB will never route traffic to it.
Verify that the target port is correct.
Verify that the target group is associated properly with the LB and is not showing as unused.
Verify the LB security group.
Check the response from the LB: is it 504 Gateway Timeout or 503 Service Unavailable? A gateway timeout means the target is not reachable; service unavailable probably means the service is restarting.
Check the service's event logs to see whether the service is in a steady state; if it is not, it is restarting again and again.
Check the deployment logs of the service; if you see an unhealthy target group message, update the target group's health check path and expected status code.
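The distinction between the two gateway errors in the checklist above can be written down as a small lookup. This is only an illustrative sketch: the function name interpret_lb_status is invented, and the hints restate the rules of thumb from this answer rather than the full list of causes documented by AWS.

```python
def interpret_lb_status(status: int) -> str:
    """Map an HTTP status returned through the load balancer to a likely cause."""
    hints = {
        # Rules of thumb from the checklist above, not an exhaustive list.
        503: "service unavailable: the target group may have no usable "
             "targets, for example while the service is restarting",
        504: "gateway timeout: the target is not reachable or did not "
             "answer in time; check ports and security groups",
    }
    if status in hints:
        return hints[status]
    if 200 <= status < 300:
        return "ok: the request reached the target"
    return "not a gateway error; inspect the application response"


print(interpret_lb_status(504))
```

A request that hangs for minutes without any status, as described in the question, never reaches this mapping at all, which itself points toward a network-level block such as a security group.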
I created an ECS service that runs one ECS instance, and I can see that the instance is registered as a target of the load balancer.
Now I trigger the Auto Scaling Group (by just incrementing the desired instance count) to launch a new instance.
The instance is launched and added to the ECS cluster. (I can see it on the ECS instances tab.)
But the instance is not added to the ALB target group. (I expect to see 2 instances in the following image, but I only see 1.)
I can edit the Auto Scaling Group's target group like the following.
Then I see the following.
But the health check fails. It seems port 80 is not reachable,
although I have port 80 open to the public in the security group for the instance. (Also, the instance created by the ECS service uses dynamic port mapping, but the instance created by the ASG does not.)
So the Auto Scaling Group can launch a new instance, but my load balancer never sends traffic to it.
I did try https://aws.amazon.com/premiumsupport/knowledge-center/troubleshoot-unhealthy-checks-ecs/?nc1=h_ls and it shows that I can connect to port 80 from the host to the Docker container with something like curl -v http://${IPADDR}/health.
So there must be something wrong with host port 80 (the load balancer can't connect to it).
But the security group setting cannot be the problem either, because the working instance and the non-working instance use the same SG.
Edit
Because I used dynamic port mapping, my web server is running on some random port.
As you can see, the instance started by the ECS service has registered itself in the target group with a random port.
However, the instance started by the Auto Scaling Group has registered itself in the target group with port 80.
The instance will not be added to the target group if it's not healthy, so you need to fix the health check first.
On your first instance, the mapped port is 32769, so I assume that if this is the same target group and the same application, the port on the new instance should also be 32769.
When you curl the IP endpoint with curl -I -v http://${IPADDR}/health, is the HTTP status code 200? If it is 200, the target should be healthy. If it is not 200, either update the status code returned by the backend or update the HTTP status code expected by the health check.
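The same status-code check can be scripted so that it is easy to run against both the working and the non-working instance. This is a minimal sketch with the Python standard library; the function names and the default expected code of 200 are assumptions for illustration, and the URL should use the host port the target is actually registered with.

```python
import urllib.error
import urllib.request


def health_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is still useful.
        return err.code


def passes_health_check(url: str, expected_codes: tuple[int, ...] = (200,)) -> bool:
    """Return True if the endpoint answers with one of the expected codes."""
    try:
        return health_status(url) in expected_codes
    except OSError:
        # Connection refused, timeouts, and DNS failures count as unhealthy.
        return False


# Example call (IPADDR and the port are placeholders for your instance):
#   passes_health_check("http://IPADDR:32769/health")
```

Running it from a host inside the same VPC is closer to what the load balancer sees than running it on the instance itself.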
I assume that you are running ECS on both instances, in which case ECS manages the target group registration for each ECS service. Are you running some mix of services that requires a target group on the Auto Scaling Group? If you are using dynamic ports, then change the health check port to the traffic port.
Now, if we look at the official possible causes for 502 Bad Gateway:
Dynamic port mapping is a feature of container instance in Amazon Elastic Container Service (Amazon ECS)
Dynamic port mapping with an Application Load Balancer makes it easier to run multiple tasks on the same Amazon ECS service on an Amazon ECS cluster.
With the Classic Load Balancer, you must statically map port numbers on a container instance. The Classic Load Balancer does not allow you to run multiple copies of a task on the same instance because the ports conflict. An Application Load Balancer uses dynamic port mapping so that you can run multiple tasks from a single service on the same container instance.
The target group you created manually will not work with dynamic ports; you have to bind the target group to the ECS service.
dynamic-port-mapping-ecs
HTTP 502: Bad Gateway
Possible causes:
The load balancer received a TCP RST from the target when attempting to establish a connection.
The load balancer received an unexpected response from the target, such as "ICMP Destination unreachable (Host unreachable)", when attempting to establish a connection. Check whether traffic is allowed from the load balancer subnets to the targets on the target port.
The target closed the connection with a TCP RST or a TCP FIN while the load balancer had an outstanding request to the target. Check whether the keep-alive duration of the target is shorter than the idle timeout value of the load balancer.
The target response is malformed or contains HTTP headers that are not valid.
The load balancer encountered an SSL handshake error or SSL handshake timeout (10 seconds) when connecting to a target.
The deregistration delay period elapsed for a request being handled by a target that was deregistered. Increase the delay period so that lengthy operations can complete.
http-502-issues
It seems you know the root cause, which is that port 80 is failing the health check, and that's why the instance is never added to the ALB. Here is what you can try:
First, check that your service is listening on port 80 on the new host. You can use a command like netcat:
nc -v localhost 80
Once you know that the service is listening, the recommended way to allow your ALB to connect to your host is to add a security group inbound rule for your instance that allows traffic from your ALB security group on port 80.
We are using Eureka with AWS ECS service that can scale docker containers.
In ECS if you leave out the host port, or specify it as being '0', in your task definition, then the port will be chosen automatically and reported back to the service. After the task is running, describing it should show what port(s) it bound to.
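As a sketch of what "describing it" yields, the snippet below pulls the dynamically assigned host ports out of a response shaped like the ECS describe_tasks output, where each container lists its networkBindings. The helper name host_ports, the container name service-b, and the sample port numbers are made up; the boto3 call is shown only as a comment because it needs credentials and real identifiers.

```python
from typing import Any


def host_ports(response: dict[str, Any]) -> dict[str, list[tuple[int, int]]]:
    """Map container name to (containerPort, hostPort) pairs from describe_tasks."""
    ports: dict[str, list[tuple[int, int]]] = {}
    for task in response.get("tasks", []):
        for container in task.get("containers", []):
            for binding in container.get("networkBindings", []):
                ports.setdefault(container["name"], []).append(
                    (binding["containerPort"], binding["hostPort"])
                )
    return ports


# With boto3, the response could come from (cluster and task ARN are placeholders):
#   boto3.client("ecs").describe_tasks(cluster="my-cluster", tasks=[task_arn])
sample = {
    "tasks": [{
        "containers": [{
            "name": "service-b",
            "networkBindings": [
                {"bindIP": "0.0.0.0", "containerPort": 8080,
                 "hostPort": 32769, "protocol": "tcp"}
            ],
        }]
    }]
}
print(host_ports(sample))
# prints {'service-b': [(8080, 32769)]}
```

The hostPort value is what another instance would need in order to reach the container, which is exactly the piece of information Eureka is missing in the question below.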
How can Eureka resolve which port to use if we have several EC2 instances? For example, Service A on EC2-A tries to call Service B on EC2-B. Eureka can resolve the hostname, but it cannot identify the exposed port.
Hi @Aleksandr Filichkin,
I don't think an Application Load Balancer and a service registry do the same thing.
The main difference is that traffic flows through the (application) load balancer, whereas the service registry just gives you a healthy endpoint that your client can address directly (so the network traffic does not flow through the service registry).
Cheap is a very relative term, maybe it's cheap for some, maybe it's an unnecessary overhead for others.
The issue was resolved
https://github.com/Netflix/eureka/issues/937
Currently, the ECS agent knows about the running port.
But I don't recommend using Eureka with ECS, because an Application Load Balancer does the same job: it works as service registry and discovery. You don't need to run an additional service (Eureka), and an ALB is cheap.
There is another solution.
You can create an Application Load Balancer and a target group in which the Docker containers are launched.
Every Docker container sets its hostname to the hostname of the load balancer. If you need a pretty URL, you can use Route 53 for DNS routing.
It looks like this:
[Figure: Service Discovery with Loadbalancer-Hostname]
[Figure: Request Flow]
If you have two containers of the same task on different hosts, both will report the same load balancer hostname to Eureka.
With this solution you can use Eureka with Docker on AWS ECS without losing the advantages and flexibility of dynamic port mapping.