Basically, I'm follow these two guides:
Deploying Hasura on AWS with Fargate, RDS and Terraform
Deploying Containers on Amazon’s ECS using Fargate and Terraform: Part 2
I have:
Postgres RDS Database deployed in 'Multi-AZ'
My python/flask app deployed in Fargate across multiple AZ's
I run a migration inside the task definition before the app
ALB Load balancing between the tasks
Logging for RDS, ECS and ALB into Cloudwatch Logs.
A NAT gateway with an Elastic IP for each private subnet to get internet connectivity
A new route table for the private subnets
NO certificates
I use terraform 0.12 for the deploy.
The repository is on ECR
But...
My app can't connect to the RDS database:
sqlalchemy.exc.OperationalError
(psycopg2.OperationalError): FATAL: password authentication failed for user "postgres"
These are the logs on pastebin-logs
I've already tried changing the password to a very simple one, before deploy, on the console directly, opening ports, turning access public, changing private to public subnet, etcetera, etcetera...
Please, I have a week with this error!!!
UPDATE
I inject the database credentials in this way:
pastebin-terraform
I cannot comment, but I mean this as a comment.
What does the security group egress look like on your ECS service that runs the task? You need to make sure it can talk to the RDS, usually on port 5432.
Related
I have created an EKS cluster with the Managed Node Groups.
Recently, I have deployed Redis as an external Load Balancer service.
I am trying to to set up an authenticated connection to it via NodeJS and Python microservices but I am getting Connection timeout error.
However, I am able to enter into the deployed redis container and execute the redis commands.
Also, I was able to do the same when I deployed Redis on GKE.
Have I missed some network configurations to allow traffic from external resources?
The subnets which the EKS node is using are all public.
Also, while creating the Amazon EKS node role, I have attached 3 policies to this role as suggested in the doc -
AmazonEKSWorkerNodePolicy
AmazonEC2ContainerRegistryReadOnly
AmazonEKS_CNI_Policy
It was also mentioned that -
We recommend assigning the policy to the role associated to the Kubernetes service account instead of assigning it to this role.
Will attaching this to the Kubernetes service account, solve my problem ?
Also, here is the guide that I used for deploying redis -
https://ot-container-kit.github.io/redis-operator/guide/setup.html#redis-standalone
I have created an ECS Fargate Task, which I can manually run. It updates a Dynomodb and I get logs.
Now I want this to run on a schedule. I have setup a scheduled ECS task through EventBridge. However, this does not run.
My looking at the EventBridge logs I can see that the container has been stopped for the following stopped reason:
ResourceInitializationError: unable to pull secrets or registry auth: execution resource
retrieval failed: unable to retrieve ecr registry auth: service call has been retried 3
time(s): RequestError: send request failed caused by: Post https://api.ecr....
I thought this might be a problem with permissions. However, I tested giving the Task Execution Role full power user permissions and I still get the same error. Could the problem be something else?
This is due to a connectivity issue.
The docs say the following:
For tasks on Fargate, in order for the task to pull the container image it must either use a public subnet and be assigned a public IP address or a private subnet that has a route to the internet or a NAT gateway that can route requests to the internet.
So you need to make sure your task has a route to an internet gateway (i.e. it's in a Public subnet) or a NAT gateway.
Alternatively, if your service is in an isolated subnet, you need to create VPC endpoints for ECR and other services you need to call, as described in the docs:
To allow your tasks to pull private images from Amazon ECR, you must create the interface VPC endpoints for Amazon ECR.
When you create a scheduled task, you also specify the networking options. The docs mention this step:
(Optional) Expand Configure network configuration to specify a network configuration. This is required for tasks hosted on Fargate and for tasks using the awsvpc network mode.
For Subnets, specify one or more subnet IDs.
For Security groups, specify one or more security group IDs.
For Auto-assign public IP, specify whether to assign a public IP address from your subnet to the task.
So the networking configuration changed between the manually run task and the scheduled task. Refer to the above to figure out the needed settings for your case.
I fixed this by enabling auto-assign public IP.
However, to do this, I had to first change from "Capacity provider strategy" -
"Use cluster default", to "Launch type" - "FARGATE". Then the option to enable auto-assign public IP became available in the dropdown in the EventBridge UI.
This seems odd to me, because my default capacity provider strategy for my cluster is Fargate. But it is working now.
Need to use a gateway to follow the traffic from ECS to ECR. It can either Internet Gateway or NAT Gateway eventually which would be effecting cost factor.
But where we can resolve this scenario, by creating VPC Endpoints. Which maintains the traffic within the AWS Resources.
Endpoints Required for this would be :
S3 Gateway
ECR
ECS
I am using docker containers with secrets on ECS, without problems. After moving to fargate and platform 1.4 for efs support i start getting the following error.
Any help please?
ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve secret from asm: service call has been retried 1 time(s): secret arn:aws:secretsmanager:eu-central-1:.....
Here's a checklist:
If your ECS tasks are in a public subnet (0.0.0.0/0 routes to Internet Gateway) make sure your tasks can call the "public" endpoint for Secrets Manager. Basically, outbound TCP/443.
If your ECS tasks are in a private subnet, make sure that one of the following is true: (a) your instances need to connect to the Internet through a NAT gateway (0.0.0.0/0 routes to NAT gateway) or (b) you have an AWS PrivateLink endpoint to secrets manager connected to your VPC (and to your subnets)
If you have an AWS PrivateLink connection, make sure the associated Security Group has inbound access from the security groups linked to your ECS tasks.
Make sure you have set GetSecretValue IAM permission to the ARN(s) of the secrets manager entry(or entries) set in the ECS "tasks role".
Edit: Here's another excellent answer - https://stackoverflow.com/a/66802973
I had the same error message, but the checklist above misses the cause of my problem. If you are using VPC endpoints to access AWS services (ie, secretsmanager, ecr, SQS, etc) then those endpoints MUST permit access to the security group that is associated with the VPC subnet that your ECS instance is running in.
Another watchit is, if you are using EFS to host volumes, ensure that your volumes can be mounted by the same security group identified above. Go to EFS, select the appropriate file system, Network tab, then Manage.
I have a SQL Server database running on Windows Server EC2 instance. I also have Web API (ASP.NET Core WebAPI) deployed as a Service in ECS cluster (Fargate launch type).
What connection string should I use to access this database from my web API?
Right now I'm trying:
data source=NAME_OF_THE_EC2_INSTANCE;initial
catalog=DATABASE_NAME;User
Id=USER_NAME;Password=PASSWORD;MultipleActiveResultSets=True;App=EntityFramework;Connection Timeout=10;
But it doesn't work. The error returned suggests that the app doesn't even see the database at all.
It seems you'll need to use a NAT instance/Gateway
This will enable connectivity between your Fargate instance and EC2 instance where DB is installed.
Another source and also the official documentation
"...Container instances need external network access to communicate with the Amazon ECS service endpoint, so if your container instances are running in a private VPC, they need a network address translation (NAT) instance to provide this access. For more information, see NAT Instances in the Amazon VPC User Guide."
Can someone please clearly explain in a step-by-step guide and in simple terms from start to finish how to properly setup a private RDS instance that connects to:
Elastic Beanstalk instance where the the environment is using a load balancing, auto scaling web server environment using PHP as it’s platform.
MySQL Workbench
Side note, the EB and RDS instance(s) are all in the same VPC. I suppose in reality this may be more of a how to properly setup and connect IAM profiles and roles question.
In essence, I want to restrict all internet access from the RDS instance, while still allowing my EB instance or other resources i.e other EC2 instances (all located in the same VPC) the ability to connect to the RDS instance, while also allowing me to use (connect to) a DB tool like MySQL Workbench.
Elastic Beanstalk Security Questions:
Instance Profile: How should I setup/config this role and it’s associated policy
Service Profile: How should I setup/config this role and it’s associated policy
RDS Security Questions:
VPC Security Groups: How should I setup/config this security group(s) to allow access from EB instance, other specified resources (EC2), and MySQL Workbench