request times out when pinging aws load balancer - amazon-web-services

I have a dockerized Node.JS express application that I am migrating to AWS from Google Cloud. I had done this before successfully on the same project before deciding Cloud Run was more cost effective because of their free tier. Now, wanting to switch back to Fargate, but am unable to do it again due what I'm guessing is a crucial step. For minimal setup, I used the following guide: https://docs.docker.com/cloud/ecs-integration/ Essentially, using docker compose up with aws context and project name to deploy to ECS and Fargate.
The Load Balancer gives me a public DNS name in the format: xxxxx.elb.us-west-2.amazonaws.com and I have defined a port of 5002 in my Docker container. I know the issue is not related to exposing port numbers or any code-related issue since I had this successfully running in Google Cloud Run. When I try to hit any of my express endpoints, by sending POST to xxxxx.elb.us-west-2.amazonaws.com:5002/my_endpoint, I end up with Error: Request Timed Out
Note: I have already verified that my inbound security rules have been set to all traffic.
I am very new to AWS, so would love guidance if I am missing a critical step.
Thanks!
EDIT (SOLUTION): Turns out everything was deploying correctly, but after checking CloudWatch Logs, it turns out Fargate can't read environment variables defined inside of docker-compose file. Instead, they need to be defined in .env files, then read in docker-compose through -env-file flag. My code was then trying to listen on a port that was in environment variable but was undefined, so was receiving the below error in CloudWatch.

Related

AWS ECS Task can't connect to RDS Database

I'm a newer AWS user and today I got stuck while working on a sample project. I successfully created a docker container that runs a simple R script that connects to my AWS RDS MySQL Database and creates & writes some basic files to it. I built a public ECR repository, pushed my docker image there, and built a ECS cluster & task choosing Fargate and using the container image from my repository. My task ran and I could see the R code being executed when I went through the logs, but it was never able to connect to the SQL Database and exited afterwards.
I've had to whitelist my own IP address in the security group for the RDS Database so that I can connect to it, so I'm aware I probably have to do that for my ECS task to establish that connection too. But won't that IP address constantly change because I won't have a static IP for the Fargate Server that is executing my task? I'm trying to stay on the free tier so I'm not sure I want to setup an elastic IP address for this server.
These 2 articles seem close if not the same issue I'm having but I can't figure out a solution. I haven't found any other info.
https://aws.amazon.com/premiumsupport/knowledge-center/ecs-fargate-task-database-connection/
https://aws.amazon.com/premiumsupport/knowledge-center/ecs-fargate-static-elastic-ip-address/
The end goal is to get this sample project successfully running on a scheduled fixed interval, and then running actual scripts on there to help automate things and make my life easier, so this sample project is a first step towards that. Any help or info on the questions I'm having would be appreciated !
Yes, your task is ephemeral (whether you launch it manually or as part of an ECS service) and its private/public ip address may change over time if it gets replaced. The way you'd make the connectivity rules to stick is to assign a security group to the task (that may have inbound access on a specific port you need I assume and outbound to everything) and assign another security group to the RDS db that has inbound access on port 3306 for the security group you assigned to the task (this is the trick, the SG will not change and you are telling RDS to allow access to ALL traffic coming from that SG). I see the first article you posted doesn't talk about this part (it should).

Why are outbound SSH connections from Google CloudRun to EC2 instances unspeakably slow?

I have a Node API deployed to Google CloudRun and it is responsible for managing external servers (clean, new Amazon EC2 Linux VM's), including through SSH and SFTP. SSH and SFTP actually work eventually but the connections take 2-5 MINUTES to initiate. Sometimes they timeout with handshake timeout errors.
The same service running on my laptop, connecting to the same external servers, has no issues and the connections are as fast as any normal SSH connection.
The deployment on CloudRun is pretty standard. I'm running it with a service account that permits access to secrets, etc. Plenty of memory allocated.
I have a VPC Connector set up, and have routed all traffic through the VPC connector, as per the instructions here: https://cloud.google.com/run/docs/configuring/static-outbound-ip
I also tried setting UseDNS no in the /etc/ssh/sshd_config file on the EC2 as per some suggestions online re: slow SSH logins, but that has not make a difference.
I have rebuilt and redeployed the project a few dozen times and all tests are on brand new EC2 instances.
I am attempting these connections using open source wrappers on the Node ssh2 library, node-ssh and ssh2-sftp-client.
Ideas?
Cloud Run works only until you have a HTTP request active.
You proably don't have an active request during this on Cloud Run, as outside of the active request the CPU is throttled.
Best for this pipeline is Cloud Workflows and regular Compute Engine instances.
You can setup a Workflow to start a Compute Engine for this task, and stop once it finished doing the steps.
I am the author of article: Run shell commands and orchestrate Compute Engine VMs with Cloud Workflows it will guide you how to setup.
Executing the Workflow can be triggered by Cloud Scheduler or by HTTP ping.

How to access the apache container of a task on AWS ECS?

I am setting up an infrastructure to deploy my application on AWS. I am using ECS service because I am trying to deploy a Docker-based application. So far I have created a task definition with two containers one for the apache and another one for PHP. Then I launched an ECS cluster with an EC2 instance and a task running. They all seem to be up and running. Now, I am trying to figure out how I can access the apache of my EC2 instance with the Cluster on the browser.
This is how I created the apache container.
And then I created the php container as follow.
Then I launched an EC2 based ECS cluster with one instance in it. Then I run one task within the cluster. Then I tried to open the public IP address of my instance. It just keeps loading loading and loading. What is wrong with my configuration? How can I access it on the browser?
It seems to me there's a couple of possible scenarios here you could check:
If do you reach the service and are stuck on an endless reloading loop, which might point to something in your code that could be causing it to do that,
If you're having a long wait time till the browser actually gives a timeout, which might be caused by not having the right port open on the Security Group associated with your task definition.

Unable to access REST service deployed in docker swarm in AWS

I used the cloud formation template provided by Docker for AWS setup & prerequisites to set up a docker swarm.
I created a REST service using Tibco BusinessWorks Container Edition and deployed it into the swarm by creating a docker service.
docker service create --name aka-swarm-demo --publish 8087:8085 akamatibco/docker_swarm_demo:part1
The service starts successfully but the CloudWatch logs show the below exception:
I have tried passing the JVM environment variable in the Dockerfile as :
ENV JAVA_OPTS= "-Dbw.rest.docApi.port=7778"
but it doesn't help.
The interesting fact is at the end the log says:
com.tibco.thor.frwk.Application - TIBCO-THOR-FRWK-300006: Started BW Application [SFDemo:1.0]
So I tried to access the application using CURL -
curl -X GET --header 'Accept: application/json' 'URL of AWS load balancer : port which I exposed while creating the service/resource URI'
But I am getting the below message:
The REST service works fine when I do docker run.
I have checked the Security Groups of the manager and load-balancer. The load-balancer has inbound open to all traffic and for the manager I opened HTTP connections.
I am not able to figure out if anything I have missed. Can anyone please help ?
As mentioned in Deploy services to swarm, if you read along, you will find the following:
PUBLISH A SERVICE’S PORTS DIRECTLY ON THE SWARM NODE
Using the routing mesh may not be the right choice for your application if you need to make routing decisions based on application state or you need total control of the process for routing requests to your service’s tasks. To publish a service’s port directly on the node where it is running, use the mode=host option to the --publish flag.
Note: If you publish a service’s ports directly on the swarm node using mode=host and also set published= this creates an implicit limitation that you can only run one task for that service on a given swarm node. In addition, if you use mode=host and you do not use the --mode=global flag on docker service create, it will be difficult to know which nodes are running the service in order to route work to them.
Publishing ports for services works different than for regular containers. The problem was; the image does not expose the port after running service create --publish and hence the swarm routing layer cannot reach the REST service. To resolve this use mode = host.
So I used the below command to create a service:
docker service create --name tuesday --publish mode=host,target=8085,published=8087 akamatibco/docker_swarm_demo:part1
Which eventually removed the exception.
Also make sure to configure the firewall settings of your load balancer so as to allow communications through desired protocols in order to access your applications deployed inside the container.
For my case it was HTTP protocol, enabling port 8087 on load balancer which served the purpose.

How to connect to specific instance behind Elastic Load Balancer

I'm deploying my app via Elastic Beanstalk, which creates and Elastic load balancer and puts all my instances behind it (3 or more).
Is there a way to contact each of these instances directly? I want to trigger a specific command on each instances (git pull command to synchronize with the latest code in my remote repo).
I have the list of IP address and public DNS of the instances from PHP SDK but since the firewalls rules restricts the source of IP address to the elastic load balancer IP on port 80, I can't seem to access them directly.
Is there a way around it?
P.S. The SSH port seems to open for all traffic, but how can I create a trigger with that? I'm hoping to create a PHP script to automate this with a webhook on the remote repo.
I highly suggest you use the EB CLI with git integration for all deployments, no matter how small. It is great because you can map a git branch to an environment with eb use YOUR_ENV then when you run eb deploy with that branch checked out it will deploy to that environment.
There is a lot of work involved in ensuring multiple servers pull the correct code and everything is working as expected. What if a server is in the processes of spinning up but is not ready for SSH so your script skips it and it does not get the new code?
Also, what happens when a new server spins up but it is using the old application because that's what is in EB? You could have your kickstart do a git pull but then what happens when you are not ready to push, a new server starts and is alone with the new code?
I could probably find 5 more edge cases without breaking a sweat. Look into eb deploy, you will be happy you did.
You need to setup a CI (or make a simple web service) and create a webhook in your repository. Your CI need to get all instances under your Elastic Beanstalk environment and then call git pull via SSH.
Or, just create a cron job in your all instances via .ebxensions script.
I thought it's not a good practice in Elastic Beanstalk to run git pull in order to synchronize your app with your git repo. Because, it misused the Application Version semantic meaning. Sometimes, you can't determine which app version are in your instances from Application Version. It's better to create a new Application Version in Elastic Beanstalk to deploy a new app version.
If you host your repo in Github, you can take a look into CodeDeploy.