I want to add the servers which are behind the AWS Auto Scaling Group to the Nginx configuration file , I see with Nginx plus there is an agent nginx-asg-sync which we can use directly and it will do the work .
Is there any possibility that we can use the same in Nginx open source service ? , I am using Nginx open source and I am not finding a way to come up from this issue
Thanks
in AWS you only need to know how CLI/API works.
you can build this agent using only two cli commands:
aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names {PARAMS}
where {PARAMS} you query auto scaling group name and get instances IDs from it.
the second command is:
aws ec2 describe-instances --instance-ids {PARAMS}
then all you have to do is to build all the logic around this, for example in bash script you create nginx upstream template, and everytime new instance was launched you compare ip addresses and swap upstreams and reload nginx. or you can simply add/delete ip with sed
here is more eamples how you can do this:
https://serverfault.com/questions/704806/how-to-get-autoscaling-group-instances-ip-adresses
also you can add health check before changing upstreams.
Currently we are passing our requests through an AWS Network Load Balancer and then onto an AWS Application Load Balancer. However, we are trying to preserve the original IP address of the request, but this is being stripped out. We are attempting to enable Proxy Protocol v2, but this causes error. Does the AWS ALB speak proxy protocol v2?
Does the AWS ALB speak proxy protocol v2?
No it does not. The proxy protocol is for NLB and CLB only as they operate (CLB has TCP listeners) in layer 4. ALB is layer 7 and it uses X-Forwarded-For, X-Forwarded-Proto, and X-Forwarded-Port to preserve IP source information.
From this document it looks like you do it on the Target Groups:
https://docs.amazonaws.cn/en_us/elasticloadbalancing/latest/network/elb-ng.pdf
Specifically:
To enable proxy protocol v2 using the new console
Open the Amazon EC2 console at https://console.amazonaws.cn/ec2/.
On the navigation pane, under LOAD BALANCING, choose Target Groups.
Choose the name the target group to open its details page.
On the Group details page, in the Attributes section, choose Edit.
On the Edit attributes page, select Proxy protocol v2.
Choose Save changes.
Or even by script (taken from https://docs.cloudbees.com/docs/cloudbees-ci/latest/eks-install-guide/eks-prerequisites-helm-install) , e.g.:
#!/bin/bash -eu
export AWS_PAGER=""
hostname=$(kubectl get -n ingress-nginx services ingress-nginx-controller --output jsonpath='{.status.loadBalancer.ingress[0].hostname}')
loadBalancerArn=$(aws elbv2 describe-load-balancers --query "LoadBalancers[?DNSName==\`$hostname\`].LoadBalancerArn" --output text)
targetGroupsArn=$(aws elbv2 describe-target-groups --load-balancer-arn $loadBalancerArn --query TargetGroups[\*].TargetGroupArn --output text)
for targetGroupArn in $targetGroupsArn; do
aws elbv2 modify-target-group-attributes --target-group-arn $targetGroupArn --attributes Key=proxy_protocol_v2.enabled,Value=true --output text
done
I have service that exposes multiple ports and it worked fine with kubernetes but now we move it to AWS ECS. It seems I can only expose ports via Load Balancer and I am limited to 1 port per service/tasks even when docker defines multiple ports I have to choose one port
Add to load balancer button allows to add one port. Once added there is no button to add second port.
Is there any nicer workarround than making second proxy service to expose second port?
UPDATE: I use fargate based service.
You don't need any workaround, AWS ECS now supports multiple target groups within the same ECS service. This will be helpful for the use-cases where you wanted to expose multiple ports of the containers.
Currently, if you want to create a service specifying multiple target groups, you must create the service using the Amazon ECS API, SDK, AWS CLI, or an AWS CloudFormation template. After the service is created, you can view the service and the target groups registered to it with the AWS Management Console.
For example, A Jenkins container might expose port 8080 for the
Jenkins web interface and port 50000 for the API.
Ref:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/register-multiple-targetgroups.html
https://aws.amazon.com/about-aws/whats-new/2019/07/amazon-ecs-services-now-support-multiple-load-balancer-target-groups/
Update:
I was able to configure target group using Terraform but did not find so far this option on AWS console.
resource "aws_ecs_service" "multiple_target_example" {
name = "multiple_target_example1"
cluster = "${aws_ecs_cluster.main.id}"
task_definition = "${aws_ecs_task_definition.with_lb_changes.arn}"
desired_count = 1
iam_role = "${aws_iam_role.ecs_service.name}"
load_balancer {
target_group_arn = "${aws_lb_target_group.target2.id}"
container_name = "ghost"
container_port = "3000"
}
load_balancer {
target_group_arn = "${aws_lb_target_group.target2.id}"
container_name = "ghost"
container_port = "3001"
}
depends_on = [
"aws_iam_role_policy.ecs_service",
]
}
Version note:
Multiple load_balancer configuration block support was added in Terraform AWS Provider version 2.22.0.
ecs_service_terraform
I can't say that this will be a nice workaround but I was working on a project where I need to run Ejabberd using AWS ECS but the same issue happened when its come to bind port of the service to the load balancer.
I was working with terraform and due to this limitation of AWS ECS, we agree to run one container per instance to resolve the port issue as we were supposed to expose two port.
If you do not want to assign a dynamic port to your container and you want to run one container per instance then the solution will definitely work.
Create a target group and specify the second port of the container.
Go to the AutoScalingGroups of your ECS cluster
Edit and add the newly created target group of in the Autoscaling group of the ECS cluster
So if you scale to two containers it's mean there will be two instances so the newly launch instance will register to the second target group and Autoscaling group take care of it.
This approach working fine in my case, but few things need to be consider.
Do not bind the primary port in target, better to bind primary port in
ALB service. The main advantage of this approach will be that if your
container failed to respond to AWS health check the container will be
restart automatically. As the target groupe health check will not recreate your container.
This approach will not work when there is dynamic port expose in Docker container.
AWS should update its ECS agent to handle such scenario.
I have faced this issue while creating more than one container per instances and second container was not coming up because it was using the same port defined in the taskdefinition.
What we did was, Created an Application Load balancer on top of these containers and removed hardcoded ports. What application load balancer does when it doesn't get predefined ports under it is, Use the functionality of dynamic port mapping. Containers will come up on random ports and reside in one target group and the load balancer will automatically send the request to these ports.
More details can be found here
Thanks to mohit's answer, I used AWS CLI to register multiple target groups (multiple ports) into one ECS service:
ecs-sample-service.json
{
"serviceName": "sample-service",
"taskDefinition": "sample-task",
"loadBalancers":[
{
"targetGroupArn":"arn:aws:elasticloadbalancing:us-west-2:0000000000:targetgroup/sample-target-group/00000000000000",
"containerName":"faktory",
"containerPort":7419
},
{
"targetGroupArn":"arn:aws:elasticloadbalancing:us-west-2:0000000000:targetgroup/sample-target-group-web/11111111111111",
"containerName":"faktory",
"containerPort":7420
}
],
"desiredCount": 1
}
aws ecs create-service --cluster sample-cluster --service-name sample-service --cli-input-json file://ecs-sample-service.json --network-configuration "awsvpcConfiguration={subnets=[subnet-0000000000000],securityGroups=[sg-00000000000000],assignPublicIp=ENABLED}" --launch-type FARGATE
If the task needs internet access to pull image, make sure subnet-0000000000000 has internet access.
Security group sg-00000000000000 needs to give relevant ports inbound access. (in this case, 7419 and 7420).
If the traffic only comes from ALB, the task does not need public IP. Then assignPublicIp can be false.
Usually, I use the AWS CLI method itself by creating task def, target groups and attaching them to the application load balancer. But the issue is that when there is multiple services to be done this is a time-consuming task so I would use terraform to create such services
terraform module link This is a multiport ECS service with Fargate deployment. currently, this supports only 2 ports. when using multiports with sockets this socket won't be sending any response so the health check might fail.
so to fix that I would override the port in the target group to other ports and fix that.
hope this helps
I am using AWS application load balancer to connect to a target group that has an EC2 instance with docker installed using cloud init scripts. I am executing an Nginx dockercontainer inside EC2.
I am getting a request time out exception as an information.
I connected to the target and checked if the service is available. I received nginx default page. Performing a curl -I on the internal IP also gives a response code as 200.
Please help me in understanding how I can troubleshoot this to get the root cause.
Thanks in advance
The configuration should be:
A security group on the Application Load Balancer (ALB-SG) permitting inbound traffic from, presumably, the whole Internet (0.0.0.0/0) on the appropriate ports (80, 443?)
A security group on the EC2 instance (App-SG) that permits inbound access from ALB-SG on the appropriate ports (80, 443?)
That is, App-SG should specifically reference ALB-SG. (Type in the name, it will resolve to a sg-xxx ID.)
I am using cloud formation template to build the infrastructure (ECS fargate cluster).
Template executed successfully and stack has been created successfully. However, task has failed with the following error:
Task failed ELB health checks in (target-group arn:aws:elasticloadbalancing:eu-central-1:890543041640:targetgroup/prc-service-devTargetGroup/97e3566c8b307abf)
I am not getting what and where to look for this to troubleshoot the issue.
as it is fargate cluster, I am not getting how to login to container and execute some health check queries to debug further.
Can someone please help me to guide further on this and help me?
Due to this error, I am not even able to access my web app. As ALB won't route the traffic if it is unhealthy.
What I did
After some googling, I found this post:
https://aws.amazon.com/premiumsupport/knowledge-center/troubleshoot-unhealthy-checks-ecs/
However, I guess, this is related to EC2 compatibility in fargate. But in my case, EC2 is not there.
If you feel, I can paste the entire template as well.
please help
This is resolved.
It was the issue with the following points:
Docker container port mapping with host port were incorrect
ALB health check interval time was very short. Due to that, ALB was giving up immediately, not waiting for docker container to up and running properly.
after making these changes, it worked properly
There are quite a few of different possible reasons for this issue, not only the open ports:
Improper IAM permissions for the ecsServiceRole IAM role
Container instance security group Elastic Load Balancing load
balancer not configured for all Availability Zones Elastic Load
Balancing load balancer health check misconfigured
Unable to update the service servicename: Load balancer container name or port changed in task definition
Therefore AWS created an own website in order to address the possibilities of this error:
https://docs.aws.amazon.com/en_en/AmazonECS/latest/developerguide/troubleshoot-service-load-balancers.html
Edit: in my case the health check code of my application was different. The default is 200 but you can also add a range such as 200-499.
Let me share my experience.
In my case everything was correct, except the host on which the server listens, it was localhost which makes the server not reachable from the outside world and respectively the health check didn't work. It should be 0.0.0.0 or empty in some libraries.
I got this error message because the security group between the ECS service and the load balancer target group was only allowing HTTP and HTTPS traffic.
Apparently the health check happens over some other port and or protocol as updating the security group to allow all traffic on all ports (as suggested at https://docs.aws.amazon.com/AmazonECS/latest/userguide/create-application-load-balancer.html) made the health check work.
I had this exact same problem. I was able to get around the issue by:
navigate to EC2 service
then select Target Group in the side panel
select your target group for your load balancer
select the health check tab
make sure the health check for your EC2 instance is the same as the health check in the target group. This will tell your ELB to route its traffic to this endpoint when conducting its health check. In my case my health check path was /health.
In my case, ECS Fargate orchestration of the docker container functionality as a service and not a Web app or API. The service is that is not listening to any port (eg: Schedule corn/ActiveMQ message consumer ...etc).
In order words, it is a client and not a server node. So I made to listen to localhost for health check only...
All I added health check path in Target Group to -
And below code in index.ts -
import express from 'express';
const app = express();
const port = process.env.PORT || 8080;
//Health Check
app.get('/__health', (_, res) => res.send({ ok: 'yes' }));
app.listen(port, () => {
logger.info(`Health Check: Listening at http://localhost:${port}`);
});
As mentioned by tschumann above, check the security group around the ECS cluster. If using Terraform, allow ingress to all docker ephemeral ports with something like below:
resource "aws_security_group" "ecs_sg" {
name = "ecs_security_group"
vpc_id = "${data.aws_vpc.vpc.id}"
}
resource "aws_security_group_rule" "ingress_docker_ports" {
type = "ingress"
from_port = 32768
to_port = 61000
protocol = "-1"
cidr_blocks = ["${data.aws_vpc.vpc.cidr_block}"]
security_group_id = "${aws_security_group.ecs_sg.id}"
}
Possibly helpful for someone.. our target group health check path was set to /, which for our services pointed to Swagger and worked well. After updating to use Springfox instead of manually generating swagger.json, / now performs a 302 redirect to /swagger-ui.html, which caused the health check to fail. Since this was for a Spring Boot service we simply pointed the health check path in the target group to /health instead (OOTB Spring status page).
Solution is partial correct in response 'iravinandan', but in last part of your nodejs router just simple add status(200) and that's it. Or you can set your personal status clicking on advance tab, on end of the page.
app.get('/__health', (request, response) => response.status(200).end(""));
More info here: enter link description here
Regards
My case was a React application running on FARGATE mode.
The first issue was that the Docker image was built over NodeJS "serving" it with:
CMD npm run start # react-scripts start
Besides that's not a good practice at all, it requires a lot of resources (4GB & 2vCPU were not enough), and because of that, the checks were failing. (this article mentions this as a probable cause)
To solve the previous issue, we modify the image as a multistage build with NodeJS for the building phase + NGINX for serving the content. Locally that was working great, but we haven't realized that the default port for NGINX is 80, and you can not use a different host and container port on FARGATE with awsvpc network mode.
To troubleshoot it, I launched an EC2 instance with the right Security Groups to connect with the FARGATE targets on the same port the Load Balancer was failing to perform a Health Check. I was able to execute curl's commands against other targets, but with this unhealthy target (constantly being recycled) I received an instant Connection refused response. It wasn't a timeout, which told me that the target was not able to manage that request because it was not listening to that port. Then I realized that my container was expecting traffic on port 80 and my application was configured to work on a 3xxx port.
The solution here was to modify the default configuration of NGINX to listen to the port we wanted, re-build the image and re-launch the service.
On my case, my ECS Fargate service does not need load balancer so I've removed "Load Balancer" and "Security Group" then it works.
I had the same issue with deploying a java springboot app on ACS running as a fargate. There were 3 issues which I had to address to fix the problem, if this can help others in future.
The container was running on port 8080 (because of tomcat), so the ELB, target group and the two security groups (one with ELB and one with ECS) must allow 8080 in their inbounds rules. Also the task set up had to be revised to change the container to map at 8080.
The port on target group health check section (advance settings) had to be explicitly changed to 8080 instead of 80 as the default.
I had to create a dummy health check path in the application because pinging the root of the app at "/" was resulting in a 302 error code.
Hope this helps.
I have also faced the same issue while using the AWS Fargate.
Here are some possible solutions to try:
First Check the Security group of Service that Attached has outbound and Inbound rules in place.
If you are using the Loadbalancer and pointing out to target group then you must enable the docker container port on security group and attached the inbound traffic only coming from the ALB security group
3)Also check the healthcheck endpoint that we are assigning to target group are there any dependanies it should return only 200 status repsonse / what we have specifed in target group
In my case it was a security group rule which allowed connections only from a certain IP, and this was blocking healthchecks from LB. I added VPC's cidr as another rule to the security group and then it worked.