AWS ECS Task can't connect to RDS Database

I'm a new AWS user and today I got stuck while working on a sample project. I successfully created a Docker container that runs a simple R script that connects to my AWS RDS MySQL database and creates & writes some basic files to it. I built a public ECR repository, pushed my Docker image there, and built an ECS cluster & task, choosing Fargate and using the container image from my repository. My task ran and I could see the R code being executed when I went through the logs, but it was never able to connect to the SQL database and exited afterwards.
I've had to whitelist my own IP address in the security group for the RDS database so that I can connect to it, so I'm aware I probably have to do the same for my ECS task to establish that connection too. But won't that IP address constantly change, since I won't have a static IP for the Fargate server that is executing my task? I'm trying to stay on the free tier, so I'm not sure I want to set up an Elastic IP address for this server.
These two articles seem close to, if not the same as, the issue I'm having, but I can't figure out a solution. I haven't found any other info.
https://aws.amazon.com/premiumsupport/knowledge-center/ecs-fargate-task-database-connection/
https://aws.amazon.com/premiumsupport/knowledge-center/ecs-fargate-static-elastic-ip-address/
The end goal is to get this sample project running successfully on a fixed schedule, and then to run actual scripts there to help automate things and make my life easier, so this sample project is a first step towards that. Any help or info on the questions I'm having would be appreciated!

Yes, your task is ephemeral (whether you launch it manually or as part of an ECS service) and its private/public IP address may change over time if it gets replaced. The way to make the connectivity rules stick is to assign a security group to the task (with outbound access to everything, and inbound access only on whatever ports you actually need) and assign another security group to the RDS database that allows inbound access on port 3306 from the security group assigned to the task. This is the trick: the security group will not change, and you are telling RDS to allow all traffic coming from that SG. I see the first article you posted doesn't talk about this part (it should).
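For example, here is a rough sketch of the RDS-side rule with boto3 (the security group IDs are placeholders, and this assumes both groups already exist in the same VPC):

import boto3

ec2 = boto3.client("ec2")

TASK_SG_ID = "sg-0123456789abcdef0"  # placeholder: SG attached to the Fargate task
RDS_SG_ID = "sg-0fedcba9876543210"   # placeholder: SG attached to the RDS instance

# Allow MySQL (3306) into the RDS security group from anything
# that carries the task's security group.
ec2.authorize_security_group_ingress(
    GroupId=RDS_SG_ID,
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": 3306,
            "ToPort": 3306,
            "UserIdGroupPairs": [{"GroupId": TASK_SG_ID}],
        }
    ],
)

You can of course add the same rule through the console; the key point is that the rule's source is the task's security group rather than an IP address.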

Related

How can I deploy and connect to a PostgreSQL instance in AlloyDB without utilizing a VM?

Currently, I have followed the Google quick start docs for deploying a simple Cloud Run web server that is connected to AlloyDB. However, the docs all seem to point towards having to utilize a VM for a PostgreSQL client, which is then connected to my AlloyDB cluster instance. I believe a connection can only be made from within the same VPC and/or through a proxy service via the VM (please correct me if I'm wrong).
I was wondering: if I only want to give access to services within the same VPC, is having a VM a must? Or is there another way?
You're correct. AlloyDB currently only allows connecting via private IP, so the only way to talk directly to the instances is from within the same VPC. The reason all the tutorials (e.g. https://cloud.google.com/alloydb/docs/quickstart/integrate-cloud-run, which is likely the quickstart you mention) talk about a VM is that in order to create the databases themselves within the AlloyDB cluster, set user grants, etc., you need to be able to talk to it from inside the VPC. Another option, for example, would be to set up Cloud VPN to connect your local network to the VPC directly. But that's slow, costly, and kind of a pain.
Cloud Run itself does not require the VM piece; the quickstart I linked above walks through setting up the Serverless VPC Connector, which is the required piece to connect Cloud Run to AlloyDB. The VM in those instructions is only for configuring the PostgreSQL database itself. So once you've done all the configuration you need, you can shut down the VM so it's not costing you anything. If you need to step back in to make configuration changes, you can spin the VM back up, but it's not something that needs to be running for the Cloud Run -> AlloyDB connection.
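As a rough sketch of what the Cloud Run side then looks like (psycopg2 is just one client option, and the private IP, database name, and credentials below are hypothetical), your service simply connects to the instance's private IP once traffic is routed through the connector:

import os
import psycopg2  # any PostgreSQL client will do; psycopg2 is just an example

# Hypothetical values -- substitute your AlloyDB instance's private IP
# and the user/database you configured from the temporary VM.
conn = psycopg2.connect(
    host=os.environ.get("ALLOYDB_PRIVATE_IP", "10.0.0.5"),
    port=5432,
    dbname="appdb",
    user="appuser",
    password=os.environ["DB_PASSWORD"],
)

with conn.cursor() as cur:
    cur.execute("SELECT 1")
    print(cur.fetchone())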
Providing public IP functionality for AlloyDB is on the roadmap, but I don't have any kind of timeframe for when it will be implemented.

Connecting to DynamoDB from an EC2 instance running in an ECS cluster with VPC

I have an EC2 instance running inside an ECS cluster with VPC.
On the instance, I need to run an ECS task that needs access to DynamoDB.
When I try running the same task using Fargate, I can use the assignPublicIp = 'ENABLED' option to allow my task to have access to other AWS services, and everything works fine.
However, the assignPublicIp option is not available for the EC2 launch type, and I cannot figure out how to allow my EC2 instance to have access to other AWS services.
I read the AWS docs and followed guides like this one to set up a VPC endpoint for DynamoDB.
I also made sure that there aren't any network access restrictions, by verifying that the inbound/outbound rules for my NACL and the security group for the VPC are wide open (at least for the sake of testing).
Here is how the rules look, for both the NACL and my security group:
Finally, I used the VPC > Reachability Analyzer to check if AWS can detect any problems regarding the connection path between my EC2 instance and DynamoDB, but the analysis reported a Reachable status.
It basically told me that there were no issues establishing a connection along the following path:
Network interface for my EC2 instance (source)
Security group for the VPC
NACL for the VPC
Route table for the VPC
which includes the following route added by the VPC endpoint for DynamoDB
Destination: pl-02cd2c6b (com.amazonaws.us-east-1.dynamodb, 3.218.182.0/24, 3.218.180.0/23, 52.94.0.0/22, 52.119.224.0/20)
Target: the endpoint ID (e.g., vpce-foobar)
VPC endpoint for DynamoDB (destination)
Despite AWS telling me that I have a "Reachable" status, I still think it might be a network reachability problem, because when I run the task, the script I am running gets stuck right after it makes a GetItem call to DynamoDB.
If it was a permission error or an invalid parameter issue, I would get an error immediately, but everything just "hangs" there, until the task eventually times out.
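For reference, here is a simplified sketch of the call in boto3 terms (the table and key names are placeholders), with short timeouts and no retries configured so that a network-level hang surfaces quickly instead of waiting for the task to time out:

import boto3
from botocore.config import Config

# Short timeouts and no retries so a hang shows up within seconds.
cfg = Config(connect_timeout=5, read_timeout=5, retries={"max_attempts": 0})
dynamodb = boto3.client("dynamodb", region_name="us-east-1", config=cfg)

# Placeholder table and key names.
resp = dynamodb.get_item(
    TableName="my-table",
    Key={"id": {"S": "some-id"}},
)
print(resp.get("Item"))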
Any pointers on what I might be missing here, or other workarounds would be very appreciated.
Thank you.
EDIT 1 (2021/02/13):
So I went back to the AWS docs to see if I had missed anything in setting up the VPC endpoints. I originally had one set up for DynamoDB, but since I also need to use S3 in my service, I went ahead and set up a Gateway VPC Endpoint for S3 too (I also wanted to see whether the issue I am having is a generic network problem or specific to DynamoDB).
Then, I made some changes to my script to try to make a call to S3 (to get the bucket's location, for simplicity) as the very first thing to do. I knew that the call would end up timing out, so I wanted to trigger the error immediately upon starting my script execution.
I waited until my task would eventually fail because of the timeout, and this time I noticed something interesting.
Here are the error logs I got when the task failed:
The IP address that my task was trying to reach was 52.85.146.194:443.
And here are the IP addresses that I found in the managed prefix list for S3, which I found in the VPC console:
The IP address I got the timeout error from is not on the list. Could this be a hint to the cause of the issue? Or am I missing something and there is actually nothing wrong with that?
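To sanity-check that, here is a quick way to test an address against a prefix list's CIDR blocks (using the DynamoDB prefix list entries quoted earlier as an example, since the S3 list is only in the screenshot):

import ipaddress

# CIDR blocks from the com.amazonaws.us-east-1.dynamodb prefix list shown above.
prefixes = [
    "3.218.182.0/24",
    "3.218.180.0/23",
    "52.94.0.0/22",
    "52.119.224.0/20",
]

ip = ipaddress.ip_address("52.85.146.194")  # the address the task tried to reach
print(any(ip in ipaddress.ip_network(p) for p in prefixes))  # prints False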

How to access the Apache container of a task on AWS ECS?

I am setting up an infrastructure to deploy my application on AWS. I am using the ECS service because I am trying to deploy a Docker-based application. So far I have created a task definition with two containers, one for Apache and another one for PHP. Then I launched an ECS cluster with an EC2 instance and a task running. They all seem to be up and running. Now, I am trying to figure out how I can access the Apache server running on my EC2 instance in the cluster from the browser.
This is how I created the Apache container.
And then I created the PHP container as follows.
Then I launched an EC2-based ECS cluster with one instance in it. Then I ran one task within the cluster. Then I tried to open the public IP address of my instance. It just keeps loading and loading. What is wrong with my configuration? How can I access it in the browser?
It seems to me there are a couple of possible scenarios here you could check:
If you do reach the service but are stuck in an endless reloading loop, that might point to something in your code that could be causing it to do that.
If you're seeing a long wait until the browser eventually times out, that might be caused by not having the right port open on the Security Group associated with your task definition (see the sketch below).
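For the second scenario, here is a quick way to inspect what the security group actually allows (a sketch with boto3; the group ID is a placeholder):

import boto3

ec2 = boto3.client("ec2")

SG_ID = "sg-0123456789abcdef0"  # placeholder: the SG attached to the instance/service

sg = ec2.describe_security_groups(GroupIds=[SG_ID])["SecurityGroups"][0]
for rule in sg["IpPermissions"]:
    print(rule.get("IpProtocol"), rule.get("FromPort"), rule.get("ToPort"),
          [r["CidrIp"] for r in rule.get("IpRanges", [])])

# For a plain HTTP setup you would expect to see something like: tcp 80 80 ['0.0.0.0/0']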

SSL Install on AWS

I've been tasked with getting a new SSL certificate installed on a website; the site is hosted on AWS EC2.
I've discovered that I need the key pair in order to connect to the server instance; however, the client doesn't have contact with the former webmaster.
I don't have much familiarity with AWS, so I'm somewhat at a loss as to how to proceed. I'm guessing I would need the old key pair to access the server instance and install the SSL certificate?
I see there's also the Certificate Manager section in AWS, but I don't currently see an SSL certificate in there. Will installing it there attach it to the website, or do I need to access the server instance and install it there?
There is a documented process for updating the SSH keys on an EC2 instance. However, it will require some downtime, and it must not be run on an instance-store-backed instance. If you're new to AWS you might not be able to determine whether that's the case, so it would be risky.
Instead, I think your best option is to bring up an Elastic Load Balancer to be the new front-end for the application: clients will connect to it, and it will in turn connect to the application instance. You can attach an ACM cert to the ELB, and shifting traffic should be a matter of changing the DNS entry (but, of course, test it out first!).
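As a rough sketch of the ELB side with boto3 (all ARNs are placeholders, and this assumes an Application Load Balancer and a target group pointing at the instance already exist):

import boto3

elbv2 = boto3.client("elbv2")

# Placeholder ARNs for the load balancer, ACM certificate, and target group.
LB_ARN = "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-lb/abc123"
CERT_ARN = "arn:aws:acm:us-east-1:123456789012:certificate/11111111-2222-3333-4444-555555555555"
TG_ARN = "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/def456"

# HTTPS listener that terminates TLS with the ACM certificate and forwards to the instance.
elbv2.create_listener(
    LoadBalancerArn=LB_ARN,
    Protocol="HTTPS",
    Port=443,
    Certificates=[{"CertificateArn": CERT_ARN}],
    DefaultActions=[{"Type": "forward", "TargetGroupArn": TG_ARN}],
)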
Moving forward, you should redeploy the application to a new EC2 instance, and then point the ELB at this instance. This may be easier said than done, because the old instance is probably manually configured. With luck you have the site in source control, and can do deploys in a test environment.
If not, and you're running on Linux, you'll need to make a snapshot of the live instance and attach it to a different instance to learn how it's configured. Start with the EC2 EBS docs and try it out in a test environment before touching production.
I'm not sure if there's any good way to recover the content from a Windows EC2 instance. And if you're not comfortable with doing ops, you should find someone who is.

ZooKeeper installation on multiple AWS EC2 instances

I am new to ZooKeeper and AWS EC2. I am trying to install ZooKeeper on 3 EC2 instances.
As per the ZooKeeper documentation, I have installed ZooKeeper on all 3 instances, created zoo.conf, and added the configuration below:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=localhost:2888:3888
server.2=<public ip of ec2 instance 2>:2889:3889
server.3=<public ip of ec2 instance 3>:2890:3890
I have also created the myid file on all 3 instances as /opt/zookeeper/data/myid, as per the guidelines.
I have a couple of queries, as below:
Whenever I start the ZooKeeper server on an instance, it starts in standalone mode (as per the logs).
Is the above configuration really going to make the servers connect to each other? What are the ports 2889:3889 & 2890:3890 all about? Do I need to configure them on the EC2 machines, or should I give some other ports instead?
Do I need to create a security group to open these connections? I am not sure how to do that for an EC2 instance.
How do I confirm that all 3 ZooKeeper servers have started and can communicate with each other?
The ZooKeeper configuration is designed such that you can install the exact same configuration file on all servers in the cluster without modification. This makes ops a bit simpler. The component that specifies the configuration for the local node is the myid file.
The configuration you've defined is not one that can be shared across all servers. All of the servers in your server list should be binding to a private IP address that is accessible to other nodes in the network. You're seeing your server start in standalone mode because you're binding to localhost. So, the problem is the other servers in the cluster can't see localhost.
Your configuration should look more like:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=<private ip of ec2 instance 1>:2888:3888
server.2=<private ip of ec2 instance 2>:2888:3888
server.3=<private ip of ec2 instance 3>:2888:3888
The two ports listed in each server definition are respectively the quorum and election ports used by ZooKeeper nodes to communicate with one another internally. There's usually no need to modify these ports, and you should try to keep them the same across servers for consistency.
Additionally, as I said you should be able to share that exact same configuration file across all instances. The only thing that should have to change is the myid file.
You probably will need to create a security group and open up the client port to be available for clients and the quorum/election ports to be accessible by other ZooKeeper servers.
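As a sketch of what that security group might look like with boto3 (the group ID and CIDR are placeholders; the quorum/election rules reference the group itself so the three servers can reach each other):

import boto3

ec2 = boto3.client("ec2")

ZK_SG_ID = "sg-0123456789abcdef0"  # placeholder: SG shared by the three ZooKeeper instances

ec2.authorize_security_group_ingress(
    GroupId=ZK_SG_ID,
    IpPermissions=[
        # Client port, open to whatever needs to talk to ZooKeeper (placeholder VPC CIDR).
        {"IpProtocol": "tcp", "FromPort": 2181, "ToPort": 2181,
         "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
        # Quorum and election ports, only from members of this same group.
        {"IpProtocol": "tcp", "FromPort": 2888, "ToPort": 2888,
         "UserIdGroupPairs": [{"GroupId": ZK_SG_ID}]},
        {"IpProtocol": "tcp", "FromPort": 3888, "ToPort": 3888,
         "UserIdGroupPairs": [{"GroupId": ZK_SG_ID}]},
    ],
)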
Finally, you might want to look in to a UI to help manage the cluster. Netflix makes a decent UI that will give you a view of your cluster and also help with cleaning up old logs and storing snapshots to S3 (ZooKeeper takes snapshots but does not delete old transaction logs, so your disk will eventually fill up if they're not properly removed). But once it's configured correctly, you should be able to see the ZooKeeper servers connecting to each other in the logs as well.
EDIT
@czerasz notes that starting from version 3.4.0 you can use the autopurge.snapRetainCount and autopurge.purgeInterval directives to keep your snapshots clean.
@chomp notes that some users have had to use 0.0.0.0 for the local server IP to get the ZooKeeper configuration to work on EC2. In other words, replace <private ip of ec2 instance 1> with 0.0.0.0 in the configuration file on instance 1. This is counter to the way ZooKeeper configuration files are designed, but may be necessary on EC2.
Adding additional info regarding ZooKeeper clustering inside Amazon's VPC.
A solution using the VPC's public IP addresses should be preferable; using '0.0.0.0' should be your last option.
In case you are using Docker on your EC2 instances, '0.0.0.0' will not work properly with ZooKeeper 3.5.x after a node restart.
The issue lies in how '0.0.0.0' is resolved and in the ensemble's sharing of node addresses and SID order (if you start your nodes in descending order, this issue may not occur).
So far the only working solution is to upgrade to version 3.6.2 or later.