AWS VPC with internet access sometimes timeouts with outside requests - amazon-web-services

I've created a VPC (due to the RDS connectivity needs inside the lambdas) in AWS which has internet access most of the time, but some times my outside requests timeout (mostly these happen with SES as they're the majority of outside requests). I've configured my VPC the following way (sorry, not in the created order, just reading them off AWS):
VPC with 172.30.0.0/16 CIDR
3 private subnets with 172.30.0.0/24, 172.30.1.0/24, 172.30.2.0/24 and a different availability zone for each (1a, 1b, 1c) with 0.0.0.0/0 route targeting my NAT
1 public subnet with 172.30.3.0/24 to 1a availability zone with a 0.0.0.0/0 route targeting my IGW
2 route tables (private and public) with the 3 private subnets in the private route table and the public one in it's own
Security groups for lambdas directing all outbound traffic to 0.0.0.0/0
Lambdas are configured to use these subnets and the given security group.
I'm not understanding why my internet requests some times fail from inside the VPC, it's almost as if the lambda gets started at some availability zone and that specific one does not have access to the internet inside the vpc.
EDIT: Resolved! I had the public subnet listed in my lambda function which caused the timeouts

AWS Lambda functions that are connected to a VPC should always be configured to use private subnets.
If those Lambda functions also require Internet access, they can use a NAT Gateway or NAT Instance to reach the Internet. These NAT services should be configured to use the public subnet(s).
When the Lambda function is connected to a private subnet, then traffic destined for the Internet will be routed from the private subnet, through the NAT Gateway/NAT Instance, and out to the Internet. This will not work if the Lambda function is connected to a public subnet. (And a Lambda function cannot connect directly from a VPC to the Internet.)

Related

AWS VPC with both internet gateway and NAT

I am lost on how to provide outbound internet access to AWS Lambda in our VPC while also having internet gateway to support inbound access (from Internet) to certain resources in our VPC.
From the documentation provided (below), I understand we need to create a private and public subnet (with NAT), and have one route table pointing to IGW, and another to the NAT.
https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/
Our setup is as follows.
VPC
Private subnet
Public subnet
Route Table
Table1
Public subnet
0.0.0.0/0 - IGW
Table2
Private subnet
0.0.0.0/0 - NAT
Lambda
VPC
Private subnet
RDS (Need access from outside of VPC)
Under VPC
With this setup, Lambda can access internet but the setup stops external inbound access to our resources in the VPC.
If we reroute our 0.0.0.0/0 in our private subnet to IGW, we can access our resources in VPC from external network but the Lambda loses connectivity to Internet.
Any one has clarity on how to set this up?
Appreciate any views on this.
Just move the resources that need to be publicly accessible into a public subnet (a subnet with a route to the Internet Gateway). The Lambda function has to remain in a private subnet (a subnet with a route to a NAT Gateway).
So in your case the RDS instance should be in the public subnet, and the Lambda function should be in the private subnet.

NAT Gateway with Internet Gateway and lambda

We use elastic beanstalk to run our main application out of EC2, we also have an RDS instance in that VPC. Those instances have public IPs so it can use a standard internet gateway to access the internet. No problems there.
Now I have created a lambda function, associated it with the 3 subnets corresponding to the 3 AZs the EC2 instances live in. Everything is still good. My lambda can connect to those resources just fine.
My problem is I need my lambda to reach the internet. Normally I'd route the subnets it's in to 0.0.0.0/0 and route it out through a NAT gateway. However, because the EC2 and RDS instances in the subnets of the VPC my lambda is associated with have public IPs putting a NAT gateway in breaks their internet connectivity. How should I go about giving my lambda internet access, without breaking the IGW for the other Ec2 instances?
I was thinking of maybe creating 3 new subnets within the 3 AZs, associating that with my lambda function, create a NAT gateway in each AZ subnet, make the corresponding routes for each subnet. If I did that would my lambda still be able to access the EC2\RDS instances within the other subnets? I have a lambda sg and an ec2 sg and the lambda sg is permitted access to the ec2 sg. Hopefully this makes sense!
As it is not possible to attach public IP addresses to Lambda functions, you have to launch them in private subnets and forward internet traffic to a NAT gateway/instance to let your functions access the Internet.
It looks like you have only created public subnets in your VPC. As you have already suggested, you need to create private subnets that hosts your lambda functions.
Create 1 public and 1 private subnet per AZ.
Launch NAT Gateways in public subnets.
Update the routing table of the private subnets and forward all internet traffic to the NAT GW.
Private subnet RT
0.0.0.0/0 --> NAT GW
Public subnet RT
0.0.0.0/0 --> IGW

Lambda function access rds instance (with Internet Gateway)

The link explains that need to use NAT Gateway for the public subnet to make it possible to access the internet and the lambda function access the RDS instance. First does it realy have to be NAT Gateway can't use instead a Internet Gateway for that purpose?
Second have two Route Tables one named PublicNetwork that haves two subnets and the route haves one Internet Gateway,
the second Route Table that haves only one subnet called PrivateNetwork.
when had only one Route Table with all 3 subnets could access the rds (db) instance with Microsoft SQL Server Management (security group with inbound rules of type MS SQL and Source my ip address) now with the changes can't access anymore.
My database in RDS haves in Subnet group my default-vpc that haves the 3 subnets is it needed to create another vpc and transfer the private subnet to it to be able to access my database again?
All subnets in a VPC can communicate with each other
An Internet Gateway connects the VPC to the Internet
Any subnet that has a Route Table pointing 0.0.0.0/0 to the Internet Gateway is called a Public Subnet (because it can directly communicate with the Internet)
Any subnet that does not have such a Route table entry is called a Private Subnet
If a resource in a private subnet needs to communicate with the Internet, it must send the network traffic via a NAT Gateway in the Public Subnet. The NAT Gateway will forward the traffic to the Internet, then return any response that is received.
If you are having difficulty connecting to resources within the same VPC, then the Security Group is the most likely cause of the problem.

AWS Lambda - NAT Gateway internet access results in timeout

I have a AWS Lambda function which:
checks a Redis Elasticache instance,
if the item is not found in the cache, goes to Google Places API service.
The Redis instance is in a private subnet; so, to fetch it, I added the VPC and the subnet in which the instance resides. I also specified the security group which allows all the outbound traffic. The Network ACL is the default one which is supposed to all the inbound and the outbound traffic.
When adding VPC to Lambda function like that via the console, it prompts:
When you enable VPC, your Lambda function will lose default internet access. If you require external internet access for your function, ensure that your security group allows outbound connections and that your VPC has a NAT gateway.
So, in the Route Table of the private subnet, I added a NAT gateway too. However, at the point where the Google Places API service call is made from the Lambda function it is always doomed to result in timeout.
In short, I doubt that the NAT gateway properly allows internet access of the Lambda function. How can I check what goes wrong with it?
Do NAT Gateways log the calls or the call attempts being tried through it somehow in CloudWatch etc.?
I want to elaborate on the answer from #vahdet. I was losing my mind trying to reconcile how the NAT Gateway was supposed to be in the public and private subnets simultaneously. It seemed like the official AWS documentation here was wrong, but of course it's not. There is a very subtle detail that myself and others have missed.
The NAT Gateway has to be "hot-wired" across two different subnets simultaneously in order to bridge private addresses to a public one that is internet facing.
First, I tried to put the NAT Gateway in the same route table as the IGW, but of course that doesn't work because you can't have two identical routes (0.0.0.0/0) with different targets.
The guide was saying to put the NAT Gateway in the route table for the Private Network, which I did, but that didn't seem to work.
The critical detail I was missing was that the NAT Gateway has to be created in a public subnet. The documentation actually says this, but it seems confusing because we are later told to put the route for NAT Gateway in the private table.
Both things are true. Create the NAT Gateway in the public subnet and then only add a route table entry in the private route table.
The documentation tells you to create the following network resources in the VPC:
two new subnets
two new route tables
one new NAT Gateway
I already had a route table and some subnets, so I tried to only add one new subnet and one new route table and this is where I got into trouble. It really was better to create two of each as documented.
Here's what it the subnets look like for me:
subnet-public 10.8.9.0/24 us-east-1a
subnet-private 10.8.8.0/24 us-east-1a
Then create the NAT Gateway in subnet-public.
It will be pending for a couple of minutes, which is important, because it must go to available status before it can be referenced in a route table entry.
Here are the route tables:
route-table-public
10.8.0.0/16 local
0.0.0.0/0 igw-xyz
subnet-association: subnet-public
route-table-private
10.8.0.0/16 local
0.0.0.0/0 nat-abc
subnet-association: subnet-private
Do you see what happened there? It's really subtle. The NAT Gateway is cross-wired. It "lives" in the public subnet it was created in, but all traffic in the private subnet gets routed to it.
If you create the NAT Gateway in the private subnet like I did at first, then the NAT Gateway is just as isolated as everything else in the private subnet, and has no way to route traffic out to the internet. It must be created in the public subnet to have internet access, because it must have an IP address inside the public subnet. My NAT Gateway got an internal IP of 10.8.9.127 and an external IP in the 54.X.X.X range.
By making the NAT Gateway the 0.0.0.0/0 route in the private routing table, we are telling all non-10.8.0.0/16 traffic to go straight to the NAT Gateway, even though it isn't actually inside the private subnet.
The VPC itself knows how to bridge traffic across its own subnets, and is able to send the 10.8.8.X traffic to the NAT Gateway's 10.8.9.X IP. It then acts as a bridge, and translates all of that traffic across it's internal IP to its external IP. Because it is in a public subnet that is in a route table with an internet gateway, the external IP has a clear path to the internet.
Now my private VPC lambda in subnet-private has verified internet access through the NAT Gateway.
The following steps are required
An IAM role with full VPC permission assigned to your lambda function.
VPC with public and private subnet
while creating a NAT Gateway
a)the subnet has to be public subnet
b)Elastic IP creat a new one or allocate one
Create the route table and add another route with target as our NAT gateway we created above.
And your lambda should be happy now
The problem for my case turned out to the fact that, I had created the NAT Gateway in the private subnet.
Make sure you place the NAT Gateways in the public subnet.
By the way, there are metrics but no direct logging records available in CloudWatch for NAT Gateway.

Trouble getting bastion instance to jump to RDS/Lambda instances

I am trying to setup a nice and secure VPC for my lambda and RDS work. Essentially, I want my lambda to hit a site, get some data, and shove it into a database.
In isolation the parts all work. However the second I go to harden everything it all falls apart. Here is what I do:
Disable "Publically Available" from the RDS instance
Change the RDS instance to only accept connections from inside the VPC using the security group
Associate the lambda with a VPC (this kills the internet access)
Following this tutorial I created a NAT gateway, deleted the internet gateway from the VPC subnet, and replaced it with the NAT. Now, as expected, nothing can talk inbound, but things can talk outbound.
At this point I knew I needed a bastion instance, so I fired up an EC2 instance.
The EC2 instance is set to the same subnet the RDS and Lambda are on, and unfortunately this means that I have a problem - the NAT gateway is currently soaking up all the traffic via 0.0.0.0/0, which means there's no room for the internet gateway. Without the internet gateway I (obviously) can't SSH into my bastion instance so I can jump to access my RDS database.
How can I configure this all correctly? My guess is that I need to split the subnet up somehow and make a private and public subnet, the public having the bastion and internet gateway in it. However, I'm not sure how this will all work so the bastion instance can still properly jump to the RDS.
I'm really quite new to setting up AWS services so I'm hoping I didn't mess anything up long the way.
Following this tutorial I created a NAT gateway, deleted the internet
gateway from the VPC subnet, and replaced it with the NAT. Now, as
expected, nothing can talk inbound, but things can talk outbound.
Short Answer
The short answer is you shouldn't have "Killed the Internet Gateway"; thats not a step in the link you provided :) Leave the internet gateway as is in your current subnet. You're going to need a public subnet and the one that was routing 0.0.0.0/0 to IGW is an example of one you can could use.
The work involved is placing your NAT gateway in the Public Subnet, placing your bastion host in the public subnet, placing your lambda function in the private subnet, routing traffic in the private subnet to the NAT gateway in the public subnet, and providing your lambda function with access to your security group by putting it in its own lambda security group and "white listing" the lambda security group in the inbound rules for the security group protecting your database.
Background
Below I have an expanded answer providing background as to public/private subnets, granting internet access to private subnets, and allowing lambda access through security groups. If you don't feel like reading the background then jump to very end where I give a bullet point summary of the steps you'll need.
Public Subnet
A public subnet is one in which traffic originating outside your VPC, or destined for a target outside your VPC (internet), is routed through an internet gateway (IGW). AWS gives you initial default public subnets configured this way; you can identify them in the console by looking at their route table and seeing that under "destination" you find "0.0.0.0/0" targeting an IGW. This means a public subnet is more of a design pattern for "internet accessible" subnet made possible by simply configuring its default route to point to an IGW. If you wish to create a new public subnet you can create a new route table as well that point internet traffic at an IGW and link that route table to your new subnet. This is fairly easy in the console.
Private Subnet
A private subnet is a subnet with no IGW and not directly reachable from the internet, meaning you cannot connect to a public IP address of a system on a private subnet. With the exception of the AWS pre-configured default subnets, this is how new subnets your manually create are setup, as black boxes till you specify otherwise.
Granting Internet Access to Private Subnet
When you want things in your private subnet to be able to reach out to external internet services you can do this by using an intermediary known as a NAT gateway. Configure a route table the same as in the public subnet with the only difference being traffic destined for 0.0.0.0/0(Internet) you target for a NAT gateway sitting inside the public subnet. That last part is critical. Your NAT gateway needs to be in the public subnet but your private subnet is using it as the target for external traffic.
Security Group Access for Lambda
One simple way to allow your lambda function through your security group/firewall is to create a security group just for your lambda function and configure the security group protecting your RDS so that it allows traffic from the lambda security group.
In other words, in security group settings you don't have to specify just IP addresses as sources, you can specify other security groups and this is a pretty neat way of grouping items without having to know their IP address. Your lambda functions can run in the "Lambda Security Group" and anything protected by a security group that you want them to access can be configured to accept traffic from the "Lambda Security Group". Just make sure you actually associate your lambda function with the lambda security group as well as place it in the private subnet.
Lambda VPC Steps in a Nutshetll
Create a new NAT gateway and place it in the public subnet. This
point is important, the NAT gateway goes in the public subnet ( a
subnet whose route table routes 0.0.0.0/0 to an IGW)
Create a new subnet, you can call it Private-Lambda-Subnet. Unlike
the
default pre-configured subnets AWS gives you, this new subnet is
immediately private out of the box.
Create a new route table and link it to your Private-Lambda-Subnet
In the new route table for your private subnet add an entry that
routes 0.0.0.0/0 to a target of the NAT gateway. This is how your
private subnet will indirectly access the internet, by forwarding
traffic to the NAT which will then forward it to the IGW.
Your bastion host and anything else you want to be be publicly
accessible will need to be in the public subnet. This is probably
where you already have your RDS instances, which is fine if they are
firewalled/security group protected.
Create a new security group for your lambda function(s). You can
call it LambdaSecurityGroup.
Configure the inbound rules of your RDS guarding security group to
allow traffic from the LambdaSecurityGroup. This is possible because
you can use other security groups as sources in the firewall
settings, not just ip addresses.
You need a public subnet (default route is the Internet Gateway) and a private subnet (default route is the NAT Gateway). The NAT Gateway, itself, goes on the public subnet, so that it can access the Internet on behalf of the other subnets for which it is providing services. The bastion also goes on the public subnet, but Lambda and RDS go on the private subnet.
Anything can talk to anything on any subnet within a VPC as long as security groups allow it (and Network ACLs, but don't change these unless you have a specific reason to -- if you aren't sure, then the default settings are sufficient).