AWS - Connect lambdas to RDS in private isolated subnet - amazon-web-services

I'm wanting to establish connectivity to an RDS instance from some Lambda functions. Lambda functions are autodeployed with serverless framework, so ideally my config would be dynamic. I am currently managing infrastructure with CDK, and have the following resources:
1. RDS on Private Isolated subnet in VPC A, managed by CDK
2. EC2 instance on public subnet in VPC A, managed by CDK (For access to the RDS from the wider internet)
3. (Backend) 4 Lambdas without a VPC, behind an API Gateway in default VPC, managed by serverless deploy
4. Frontend hosted on S3 behind Cloudfront, managed by serverless deploy
I can deploy the lambdas to VPC A to either the private isolated or public subnets.
Additional constraints:
Lambdas require outbound connectivity, but should be protected from inbound internet requests from public internet.
I'm a bit stumped because I don't want to update my CDK script whenever the lambdas change. Help is much appreciated.

Your lambda functions need to be in the same VPC as the database, specifically in a private subnet.
You would then adjust the security group rules to allow connectivity from the functions to the DB using something like myFynction.connections.allowToDefaultPort(myDatabaseInstance);
The VPC needs to have a NAT gateway for the lambda functions to be able to access the internet. To clarify - the functions cannot be in an isolated subnet, because isolated subnets don't have Internet connectivity. Placing the functions in a public subnet will not work either - refer to this for an explanation.
Relevant documentation: https://aws.amazon.com/premiumsupport/knowledge-center/connect-lambda-to-an-rds-instance/

Related

AWS ECS: Issue when access external network

I have an issue regarding to AWS VPC networking, I want to access external API from my ECS task, I've configured ECS in a Private subnet and the only way to access this ECS is trough an load balancer, in some services like Lambda is working (I can access external resources) but in ECS I can't access to them, I tried modifying the security group rules + modify ACL rules but isn't working, If anyone know how I can do, I be very grateful, thanks.
ps: I created the VPC on the UI that AWS has.
For resources in a private subnet to access the Internet, the only option is to send that traffic through a NAT Gateway.
You will have to create an AWS NAT Gateway in at least one of the public subnets of your VPC, and then add a route in each of the private subnets to that NAT Gateway.

Do I need NAT gateway and route table in a VPC if I don't want internet access?

I am going to build a Lambda and a RDS aurora for my application. The RDS aurora needs to be inside a VPC and it doesn't need internet access. I have read a lot articles about VPC setup for database and all of them mentioned that need to create VPC, public/private subnets, route table, NAT gateway and internet gateway.
However, in my case, I don't need internet access in the database VPC. So my question is do I need NAT gateway and route table at all? I know each VPC has a default route table, is the default route table good enough? If I just create a VPC with 3 private subnets and attach the VPC to my lambda. Does it work?
Your understanding is correct and you don't need any NAT.
NAT is specifically used for accessing public internet from private subnet, but it doesn't seem to be required here.
Just make sure your Lambda doesn't need to access any external entity or AWS Service as well (Like S3). If you are required to access an AWS Service, you may create a VPC Endpoint for it. (Linked example is for S3)

Why can't an AWS lambda function inside a public subnet in a VPC connect to the internet?

I've followed the tutorial here to create a VPC with public and private subnets.
Then I set up an AWS lambda function inside the public subnet to test if it could connect to the outside internet.
Here's my lambda function written in python3
import requests
def lambda_handler(event, context):
r = requests.get('http://www.google.com')
print(r)
The function above failed to fetch the content of http://www.google.com when I set it inside the public subnet in a VPC.
Here's the error message:
"errorMessage": "HTTPConnectionPool(host='www.google.com', port=80):
Max retries exceeded with url: / (Caused by
NewConnectionError(': Failed to establish a new connection: [Errno 110]
Connection timed out',))", "errorType": "ConnectionError",
I don't understand why.
The route table of the public subnet looks like this:
The GET request to http://www.google.com should match igw-XXXXXXXXX target. Why can't the internet-gateway(igw) deliver the request to http://www.google.com and get back the website content?
This article says that I must set the lambda function inside the private subnet in order to have internet access.
If your Lambda function needs to access private VPC resources (for
example, an Amazon RDS DB instance or Amazon EC2 instance), you must
associate the function with a VPC. If your function also requires
internet access (for example, to reach a public AWS service endpoint),
your function must use a NAT gateway or instance.
But it doesn't explain why I can't set the lambda function inside the public subnet.
Lambda functions connected to a VPC public subnet cannot typically access the internet.
To access the internet from a public subnet you need a public IP or you need to route via a NAT that itself has a public IP. You also need an Internet Gateway (IGW). However:
Lambda functions do not, and cannot, have public IP addresses, and
the default route target in a VPC public subnet is the IGW, not a NAT
So, because the Lambda function only has a private IP and its traffic is routed to the IGW rather than to a NAT, all packets to the internet from the Lambda function will be dropped at the IGW.
Should I Configure my Lambda Function for VPC Access?
If your Lambda function does not need to reach private resources inside your VPC (e.g. an RDS database or Elasticsearch cluster) then do not configure the Lambda function to connect to the VPC.
If your Lambda function does need to reach private resources inside your VPC, then configure the Lambda function to connect to private subnets (and only private subnets).
NAT or Not?
If the Lambda function only needs access to resources in the VPC (e.g. an RDS database in a private subnet) then you don't need to route through NAT.
If the Lambda function only needs access to resources in the VPC and access to AWS services that are all available via private VPC Endpoint then you don't need to route through NAT. Use VPC Endpoints.
If your Lambda function needs to reach endpoints on the internet then ensure a default route from the Lambda function's private subnets to a NAT instance or NAT Gateway in a public subnet. And configure an IGW, if needed, without which internet access is not possible.
Be aware that NAT gateway charges per hour and per GB processed so it's worth understanding how to reduce data transfer costs for NAT gateway.
Best Practices
When configuring Lambda functions for VPC access, it is an HA best practice to configure multiple (private) subnets across different Availability Zones (AZs).
Intermittent Connectivity
Be sure that all the subnets you configure for your Lambda function are private subnets. It is a common mistake to configure, for example, 1 private subnet and 1 public subnet. This will result in your Lambda function working OK sometimes and failing at other times without any obvious cause.
For example, the Lambda function may succeed 5 times in a row, and then fail with a timeout (being unable to access some internet resource or AWS service). This happens because the first launch was in a private subnet, launches 2-5 reused the same Lambda function execution environment in the same private subnet (the so-called "warm start"), and then launch 6 was a "cold start" where the AWS Lambda service deployed the Lambda function in a public subnet where the Lambda function has no route to the internet.
You can make a lambda function access the public internet from within your VPC. Solution A is the actual answer, Solution B is a more elegant alternative solution.
Solution A - Lambda in VPC + Public IP associated with ENI
For accessing resources external to AWS such as Google API (like OP's example) you do need a Public IP. For other cases like RDS or S3 you don't need a Public IP, you can use a VPC Endpoint, so communication between your Lambda and the desired AWS Service doesn't leave AWS network.
By default some AWS Services are indeed reached via public internet, but it doesn't have to be.
Now if you want an actual external resource (e.g. google), you need to assign Elastic Public IPs to the Network Interfaces for each subnet linked to your lambda. First let's figure which subnets and security groups are linked to your lambda:
Next, go to EC2 Service, find the Public IPs menu under Network & Security. Allocate one IP for each subnet (in the example above there are two subnets).
Go to Network Interfaces menu, find the network interfaces attached to your lambda (same subnet and security group).
Associate the Public IPs in the actions menu for each one:
That's it, now your Lambda can reach out to public internet.
[EDIT]
Someone was concerned about Solution's A scalability issues saying each lambda instance has a new network interface but they missed this from AWS Docs:
"Multiple Lambda functions can share a network interface, if the functions share the same subnet and security group"
So whatever scalability issues you may face has nothing to do with this solution and how Lambda uses ENI, you'd face the same issues using EC2, ECS, EKS, not just Lambda.
Solution B - Decompose into multiple Lambdas
Requiring access both to external resources and VPC resources would seem like too much responsibility for a single function. You may want to rethink your design and decompose your single lambda function into at least two lambda functions:
Lambda A goes to external resources (e.g. Google API), fetches whatever data you need, add to SQS. No need to attach to VPC, no need to manually associate Elastic Public IP to ENI.
Lambda B processes the message from SQS, stores results to a storage (db, s3, efs, another queue, etc). This one lives within your VPC, and don't need external access.
This way seems more scalable, more secure, each individual lambda is less complex and more maintainable, the architecture looks better overall.
Of course life is not always rainbows and butterflies, so Solution A is good and scalable enough, but improving the architecture is even better.

How do I configure a NAT for use with my Lambdas that already have an Internet Gateway for S3 access?

My setup is:
S3 (website) -> API Gateway -> Lambda -> RDS
-> S3 (configuration)
-> Shopify
-> Transactional Mail
I have an Internet Gateway set up to allow access to my S3 configurations and I need to hook up a NAT to allow me to make my calls out to 3rd parties. I've attempted to only use the NAT (per this question) by changing my Routing Table entry for 0.0.0.0/0 -> {my NAT}, but that just results in not being able to access my S3 configuration bucket.
Any help would be greatly appreciated!
Edit: To be clear I've read the documentation, what I'm having issues understanding is the relationships between the Security Group my Lambdas and RDS share, and the Subnets they're associated with.
When I configure my lambda to be part of the security group my RDS instances is in, I need to associate it with at least 2 subnets... Should those be new subnets, and not the ones associated with my RDS instances? AKA does a lambda need to share a subnet with an RDS in order to access it?
If the Lambda function only needs to access VPC resources and S3, then the easiest way to configure this is to add an S3 Endpoint to your VPC. If your Lambda function needs to access VPC resources plus other resources besides S3 and DynamoDB (the only 2 services that currently support VPC endpoints) then your Lambda function has to be in a private subnet with a NAT Gateway.
Instances in a public subnet have the option of having a public IP address, but it isn't a requirement. Lambda functions in a VPC do not ever get public IP addresses, which is why Lambda functions inside a VPC have to be in a private subnet with NAT gateway in order to have Internet access.
The only time Lambda functions get a public IP is when they are not in a VPC at all. In that instance they can access anything except resources in your VPC.
A note about your "same security group" comment: Being in the same security group does not allow resources to access each other. The Lambda function needs to be in a security group that the RDS security group has granted access to. Regarding subnets, the Lambda simply needs to be in any subnet in the same VPC, it does not need to be in the same subnet as the RDS instance.

How to connect elasticache and dynamoDb from aws-lambda without using NAT Gateway

I need to connect dynamoDb and elasticache from aws-lambda (otherthan using NAT Gateway).
ElastiCache provides essential caching methods along with help in making the Lambda state-ful. The concern is that for Lambda to work nice with DynamoDB it should be set to NoVPC.
If we have to use ElastiCache, Lambda and both have to be in the same VPC.TO use Both ElastiCache and DynamoDB together is quite a challenge specially with Lambda. Given the VPC challenges.Do you have any suggestions to make this easier?
A Lambda function would have to have VPC access to connect to ElastiCache, and it would have to have access to resources outside the VPC to access DynamoDB so it would require a NAT gateway. There is no way to provide access to both of those services to a single Lambda function without enabling VPC access and setting up a NAT gateway.
If you just need a Redis server and aren't required specifically to use ElasiCache, then you could use a RedisLabs instance which wouldn't require you to enable VPC access on your Lambda function.
There is now a relatively easy solution for DynamoDb access from a VPC: VPC Endpoints.
"Previously, if you wanted your EC2 (elroy: or lambda) instances in your VPC to be able to access DynamoDB, you had two options. You could use an Internet Gateway (with a NAT Gateway or assigning your instances public IPs) or you could route all of your traffic to your local infrastructure via VPN or AWS Direct Connect and then back to DynamoDB."
"A VPC endpoint for DynamoDB enables Amazon EC2 instances in your VPC to use their private IP addresses to access DynamoDB with no exposure to the public Internet...Your EC2 instances do not require public IP addresses, and you do not need an Internet gateway, a NAT device, or a virtual private gateway in your VPC. You use endpoint policies to control access to DynamoDB. Traffic between your VPC and the AWS service does not leave the Amazon network. "
The above quotes come from the links below. Note the the references to "EC2 instances" apply to lambda contexts as well.
See https://aws.amazon.com/blogs/aws/new-vpc-endpoints-for-dynamodb/
and
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/vpc-endpoints-dynamodb.html