I've been trying to connect to an S3 bucket from a Lambda function residing in a private subnet. I did exactly the same thing for an EC2 instance and it worked like a charm, so I'm not sure why it's such an issue with Lambda. My Lambda times out after the configured timeout interval.
Here's my lambda's VPC configuration
Here's the security group output configuration:
Below are the outbound rules of the subnet associated with lambda
As you can see, I created a VPC endpoint to route my traffic through the VPC, but it doesn't work. I'm not sure what I am missing here. Below is the VPC endpoint configuration.
I've given full access to S3 in the policy like this:
{
    "Statement": [
        {
            "Action": "*",
            "Effect": "Allow",
            "Resource": "*",
            "Principal": "*"
        }
    ]
}
When I run my Lambda code, I get a timeout error as below:
You can access Amazon S3 objects using VPC endpoint only when the S3 objects are in the same Region as the Amazon S3 gateway VPC endpoint. Confirm that your objects and endpoint are in the same Region.
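The same-Region requirement can be checked mechanically. This is a minimal sketch, assuming the gateway endpoint's service name follows the com.amazonaws.<region>.s3 pattern; the function names are hypothetical and not part of any AWS SDK.

```python
def endpoint_region(service_name: str) -> str:
    """Extract the Region from a name such as 'com.amazonaws.us-east-1.s3'."""
    parts = service_name.split(".")
    if len(parts) != 4 or parts[:2] != ["com", "amazonaws"]:
        raise ValueError(f"unexpected service name: {service_name!r}")
    return parts[2]


def same_region(service_name: str, bucket_region: str) -> bool:
    """True when the endpoint and the bucket live in the same Region."""
    return endpoint_region(service_name) == bucket_region
```

If same_region returns False, the request cannot use the gateway endpoint and will need another path to S3.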
To reproduce your situation, I performed the following steps:
Created an AWS Lambda function that calls ListBuckets(). Tested it without attaching to a VPC. It worked fine.
Created a VPC with just a private subnet
Added an Amazon S3 Endpoint Gateway to the VPC and subnet
Reconfigured the Lambda function to use the VPC and subnet
Tested the Lambda function -- it worked fine
I suspect your problem might lie with the Security Group attached to the Lambda function. I left my Outbound rules as "All Traffic 0.0.0.0/0" rather than restricting it. Give that a try and see if it makes things better.
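To test that suspicion offline, here is a rough sketch that inspects outbound rules. It assumes the rules are shaped like the IpPermissionsEgress list returned by EC2 DescribeSecurityGroups, and it deliberately ignores prefix-list and security-group targets. The function name and the sample destination address are made up for illustration.

```python
import ipaddress


def allows_https_egress(egress_rules, dest_ip):
    """Return True if any CIDR-based rule permits TCP 443 to dest_ip."""
    dest = ipaddress.ip_address(dest_ip)
    for rule in egress_rules:
        proto = rule.get("IpProtocol")
        if proto == "-1":  # "-1" means all protocols and all ports
            port_ok = True
        elif proto == "tcp":
            port_ok = rule.get("FromPort", 0) <= 443 <= rule.get("ToPort", 65535)
        else:
            port_ok = False
        if not port_ok:
            continue
        for ip_range in rule.get("IpRanges", []):
            if dest in ipaddress.ip_network(ip_range["CidrIp"]):
                return True
    return False
```

An "All Traffic 0.0.0.0/0" rule passes for any destination, while a rule limited to the VPC CIDR does not cover S3 addresses.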
Slightly tearing my hair out with this one... I am trying to run a Docker image on Fargate in a VPC in a Public subnet. When I run this as a Task I get:
ResourceInitializationError: unable to pull secrets or registry auth: pull
command failed: : signal: killed
If I run the Task in a Private subnet, through a NAT, it works. It also works if I run it in a Public subnet of the default VPC.
I have checked through the advice here:
Aws ecs fargate ResourceInitializationError: unable to pull secrets or registry auth
In particular, I have security groups set up to allow all traffic. Also Network ACL set up to allow all traffic. I have even been quite liberal with the IAM permissions, in order to try and eliminate that as a possibility:
The task execution role has:
{
    "Action": [
        "kms:*",
        "secretsmanager:*",
        "ssm:*",
        "s3:*",
        "ecr:*",
        "ecs:*",
        "ec2:*"
    ],
    "Resource": "*",
    "Effect": "Allow"
}
With trust relationship to allow ecs-tasks to assume this role:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ecs-tasks.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
The security group is:
sg-093e79ca793d923ab All traffic All traffic All 0.0.0.0/0
And the Network ACL is:
Inbound
Rule number  Type         Protocol  Port range  Source       Allow/Deny
100          All traffic  All       All         0.0.0.0/0    Allow
*            All traffic  All       All         0.0.0.0/0    Deny
Outbound
Rule number  Type         Protocol  Port range  Destination  Allow/Deny
100          All traffic  All       All         0.0.0.0/0    Allow
*            All traffic  All       All         0.0.0.0/0    Deny
I set up flow logs on the subnet, and I can see that traffic is Accept Ok in both directions.
I do not have any Interface Endpoints set up to reach AWS services without going through the Internet Gateway.
I also have a public IP address assigned to the Fargate task upon creation.
This should work, since the Public subnet should have access to all needed services through the Internet Gateway. It also works in the default VPC or a Private subnet.
Can anyone suggest what else I should check to debug this?
One potential cause of "ResourceInitializationError: unable to pull secrets or registry auth: pull command failed: : signal: killed" is a disabled "Auto-assign public IP" setting. After I enabled it (recreating the service from scratch), the task ran properly without issues.
For those unlucky souls, there is one more thing to check.
I already had an internet gateway in my VPC, DNS was enabled for that VPC, all containers were getting public IPs, and the execution role already had access to ECR. Even so, I was still getting the same error.
It turned out the problem was the route table. The route table of my VPC didn't include a route directing outbound traffic to the internet gateway, so my subnet had no internet access.
Adding a second entry to the table that routes 0.0.0.0/0 traffic to the internet gateway solved the issue.
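The missing route can be detected with a small check. This sketch assumes the routes are shaped like the Routes list that EC2 DescribeRouteTables returns (DestinationCidrBlock and GatewayId keys); the function name is hypothetical.

```python
def has_default_route_to_igw(routes):
    """True if 0.0.0.0/0 is routed to an internet gateway (igw-...)."""
    for route in routes:
        if (route.get("DestinationCidrBlock") == "0.0.0.0/0"
                and route.get("GatewayId", "").startswith("igw-")):
            return True
    return False
```

A table that contains only the "local" route fails this check, which matches the situation described above.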
Edited answer based on feedback from #nathan and #howard-swope
Checklist:
The VPC has "DNS hostnames" and "DNS resolution" enabled.
The "Task execution role" has access to ECR, e.g. it has the AmazonECSTaskExecutionRolePolicy policy attached.
If the task is running in a PUBLIC subnet:
The subnets have internet access, i.e. their route table sends 0.0.0.0/0 to an internet gateway.
"Assign public IP" is enabled when creating the task.
If the task is running in a PRIVATE subnet:
The subnets have internet access, i.e. their route table sends 0.0.0.0/0 to a NAT gateway.
... and the NAT gateway resides in a public subnet.
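The checklist can be encoded as a small function so that every item is evaluated in one pass. This is only a sketch: the TaskSetup fields are hypothetical names for values you would look up yourself, not the output of any AWS API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskSetup:
    dns_hostnames: bool
    dns_resolution: bool
    execution_role_has_ecr_access: bool
    public_subnet: bool
    default_route_target: str = ""  # target of 0.0.0.0/0, e.g. "igw-..." or "nat-..."
    assign_public_ip: bool = False


def failed_checks(setup: TaskSetup) -> List[str]:
    """Return the checklist items that the given setup does not satisfy."""
    problems = []
    if not (setup.dns_hostnames and setup.dns_resolution):
        problems.append("enable DNS hostnames and DNS resolution on the VPC")
    if not setup.execution_role_has_ecr_access:
        problems.append("give the task execution role access to ECR")
    if setup.public_subnet:
        if not setup.default_route_target.startswith("igw-"):
            problems.append("route 0.0.0.0/0 to an internet gateway")
        if not setup.assign_public_ip:
            problems.append("enable 'assign public IP' on the task")
    elif not setup.default_route_target.startswith("nat-"):
        problems.append("route 0.0.0.0/0 to a NAT gateway in a public subnet")
    return problems
```

An empty list means every item on the checklist passed.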
I was facing the same issue. In my case, I was triggering the Fargate container from a Lambda function using the RunTask operation, and I was not passing the parameter below in the RunTask call:
assignPublicIp: ENABLED
After adding this, the container started without any issues.
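For reference, this is roughly what the network configuration passed to RunTask looks like. The helper name and the IDs in the usage note are placeholders; the dictionary keys follow the awsvpcConfiguration structure of the ECS RunTask API as I understand it.

```python
def build_network_configuration(subnets, security_groups, public=True):
    """Build the networkConfiguration argument for an ECS RunTask call."""
    return {
        "awsvpcConfiguration": {
            "subnets": list(subnets),
            "securityGroups": list(security_groups),
            # Must be the string "ENABLED" for a task in a public subnet
            "assignPublicIp": "ENABLED" if public else "DISABLED",
        }
    }
```

With boto3 the result would be passed as networkConfiguration=build_network_configuration(["subnet-..."], ["sg-..."]) in the run_task call.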
It turns out that I did not have DNS support enabled for the VPC. Once this is enabled, it works.
I did not see DNS support explicitly mentioned in any docs for Fargate. I guess it's pretty obvious, or how else would it look up the various AWS services it needs? But I thought it worth noting in an answer against this error message.
For AWS Batch using Fargate, this error was triggered by the 'Assign public IP' setting being disabled.
This setting is configurable during the Job Definition step. However, it is not configurable in the UI after the Job Definition has been created.
The AWS container runner needs access to the container repositories and to AWS services.
If you're in a public subnet, the easiest option is to enable "Auto-assign public IP" so that your containers have internet access, even if your app does not need egress access to the internet.
Otherwise, if you're using only AWS services (ECR, and no images pulled from docker.io), you could use VPC endpoints to access ECR/S3/CloudWatch, and enable the DNS options on your VPC.
For a private subnet, it's the same.
If you're using docker.io images, then you need egress access to the internet from your subnet anyway.
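As a rough guide to the endpoint-only option, the sketch below lists the endpoint service names I would expect an ECR image pull with CloudWatch logging to need. The exact set is an assumption on my part (check the ECR documentation for your platform version), and the function name is hypothetical.

```python
def ecr_pull_endpoint_services(region):
    """Map assumed VPC endpoint service names to their endpoint type."""
    return {
        f"com.amazonaws.{region}.ecr.api": "Interface",  # ECR API calls
        f"com.amazonaws.{region}.ecr.dkr": "Interface",  # Docker registry traffic
        f"com.amazonaws.{region}.s3": "Gateway",         # image layers are stored in S3
        f"com.amazonaws.{region}.logs": "Interface",     # CloudWatch Logs
    }
```

Tasks that pull secrets would presumably need further endpoints, such as one for Secrets Manager.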
In my case, I hit the above error while running the run-task command (yes, not via the Service route) because I was not specifying the security group in aws ecs run-task --network-configuration. This resulted in the default SG of the task's VPC being picked up. The default SG in that VPC had no inbound or outbound rules defined. I added ONLY an outbound rule allowing all traffic to everywhere, and the error went away.
My setup is an ECS/Fargate task running in a private subnet with ECR connectivity via VPC interface endpoints. I had gone through the checklist mentioned above, and in addition I added the SG rule.
I am receiving "Could not connect to the endpoint URL: https://s3.amazonaws.com/" from inside an EC2 instance running in a private subnet.
Note: We are using our corporate shared AWS account instead of Federated account for this exercise.
Here is a configuration:
Created one VPC with 1 private subnet (attached to VPC endpoints for S3 and DynamoDB) and 1 public subnet (attached to an Internet Gateway). There is no NAT gateway or NAT instance.
Launched one EC2 instance (Amazon Linux AMI) inside each subnet.
Attached IAM roles for DynamoDB and S3 access to both EC2 instances.
Connected to EC2 from the terminal. Configured my access keys using aws configure.
Policy for S3 VPC endpoint:
{
    "Statement": [
        {
            "Action": "*",
            "Effect": "Allow",
            "Resource": "*",
            "Principal": "*"
        }
    ]
}
Routing is automatically added to the VPC routing where destination is pl-xxxxxxxx(com.amazonaws.us-east-1.s3) and target is the endpoint created in
Opened all traffic in the outbound rules in Security Group for the private subnet to destination prefix s3 endpoint starting with pl-xxxxxxxx
I then ran the following command in a terminal on the private EC2 instance:
aws s3 ls --debug --region us-west-2
I got the following error:
"ConnectionTimeout: Could not connect to the endpoint URL https://sts.us-west-2.amazonaws.com:443"
I have read almost every resource I could find on Google, and they follow the same steps that I have been following, but it is not working for me.
The only difference is that they are using federated AWS account whereas I am using a shared AWS account.
Same goes for dynamodb access.
Similar stackoverflow issue: Connecting to S3 bucket thru S3 VPC Endpoint inside EC2 instance timing out
But I could not benefit from it much.
Thanks a lot in advance.
Update: I was able to resolve the STS endpoint issue by creating an STS interface endpoint in the private subnet and then accessing DynamoDB and S3 by assuming a role inside the EC2 instance.
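The error URL itself shows which service and Region the CLI was trying to reach, and that can be compared with the endpoints that exist in the VPC. This is a simplified sketch: the parsing only handles hosts of the form <service>.<region>.amazonaws.com or <service>.amazonaws.com, and both function names are hypothetical.

```python
from urllib.parse import urlparse


def service_and_region(url):
    """Split e.g. 'https://sts.us-west-2.amazonaws.com:443' into ('sts', 'us-west-2')."""
    host = urlparse(url).hostname or ""
    suffix = ".amazonaws.com"
    if not host.endswith(suffix):
        raise ValueError(f"not an amazonaws.com host: {host!r}")
    labels = host[: -len(suffix)].split(".")
    region = labels[1] if len(labels) > 1 else None  # global hosts carry no Region
    return labels[0], region


def covered_by_endpoints(url, endpoint_services):
    """True if a VPC endpoint exists for the service and Region in the URL."""
    service, region = service_and_region(url)
    return f"com.amazonaws.{region}.{service}" in endpoint_services
```

With only S3 and DynamoDB endpoints in us-east-1, a call to sts.us-west-2.amazonaws.com is not covered, which is consistent with the timeout in the question.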
I've created some private documentation for my infra team, uploaded it to an S3 bucket, and would like to make it private, accessible only from our VPN.
I tried to allow these VPN IP ranges: 173.12.0.0/16 and 173.11.0.0/16, but I keep getting 403 Forbidden (inside the VPN).
Can someone help me debug this or find where I'm messing up?
My bucket policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "vpnOnly",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::calian.io/*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": [
                        "173.12.0.0/16",
                        "173.11.0.0/16"
                    ]
                }
            }
        }
    ]
}
By default, S3 requests go via the Internet, so the requests would 'appear' to be coming from a public IP address.
Alternatively, you could add a VPC Endpoint for S3, which would make the request come 'from' the private IP addresses.
You might also consider using Amazon S3 Access Points to control the access to the bucket.
Since VPC endpoints are only accessible from Amazon EC2 instances inside a VPC, a local instance must proxy all remote requests before they can utilize a VPC endpoint connection. The following sections outline a DNS-based proxy solution that directs appropriate traffic from a corporate network to a VPC endpoint for Amazon S3 as depicted in the following diagram.
From one of the machines you are attempting to access the S3 bucket from, go to the "AWS Check My IP" endpoint at https://checkip.amazonaws.com/
From there, confirm that the IP address you're seeing is inside the range you have defined in your policy. My guess is that it'll be different: you'll see the public IP address of your VPN or NAT Gateway/Instance instead, as your traffic is likely going over the internet to get to S3.
Once you've identified the IP address you're using, you can either update the bucket policy to include it, or look into solutions such as a VPC Endpoint to keep traffic on your private network.
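The comparison can be done with Python's standard ipaddress module. The ranges below are the ones from the bucket policy in the question; the function name and the sample addresses are illustrative only.

```python
import ipaddress

ALLOWED_RANGES = ["173.12.0.0/16", "173.11.0.0/16"]


def ip_in_ranges(ip, cidrs):
    """True if ip falls inside at least one of the CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)
```

Pass in the address reported by checkip.amazonaws.com. If the result is False, the aws:SourceIp condition does not match that request.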
Is it possible to call a Lambda function that lives within a VPC from another Lambda in another VPC?
I'm trying to do it with an AWS VPC Endpoint but I can't get it to work. It returns a 403 error. I am following these steps: https://aws.amazon.com/es/blogs/compute/introducing-amazon-api-gateway-private-endpoints/.
And https://cedrus.digital/aws-privatelink-with-api-gateway-and-lambda-functions/
I am not sure whether the VPC Endpoint should be created in the VPC of the calling Lambda or in the VPC of the Lambda that receives the request.
I have even set the API Gateway resource policy like this:
{
    "Statement": [
        {
            "Principal": "*",
            "Action": [
                "execute-api:Invoke"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}
And the VPC endpoint policy is set to Full access.
To invoke an AWS Lambda function via an API call, the calling entity must have access to the Internet. It doesn't matter whether the calling entity is in the same VPC, a different VPC, or even not in a VPC. All that matters is that the request can be sent to the AWS Lambda API endpoint.
If the calling Lambda function is in a VPC, make sure that it has access to the Internet. This typically requires:
The Lambda function is in a private subnet
There is a NAT Gateway in a public subnet
The Route Table for the private subnet directs 0.0.0.0/0 traffic to the NAT Gateway
Alternatively, if the calling Lambda function is not connected to a VPC, then it automatically receives access to the Internet.
It also does not matter to what the "called" Lambda function is connected (VPC or not). The control plane that activates this Lambda function is on the Internet, which is unrelated to where the Lambda function itself is connected.
There are a few ways that you can invoke a Lambda from another Lambda.
Lambda invokes the other Lambda directly
When you invoke a Lambda (the callee) from another Lambda (the caller) using the aws-sdk's invoke function, the caller should have internet connectivity, as mentioned in another answer, because aws-sdk calls are made over the internet by default.
Therefore the Lambda should be attached to a private subnet that routes through a NAT Gateway (or a NAT instance, which is cheaper), so that it can invoke the other Lambda over the internet. Deploying it in a public subnet does not help, because a VPC-attached Lambda does not receive a public IP address.
Lambda invokes the other Lambda through API Gateway
You don't even need to consider this option if the calling Lambda has internet connectivity.
You can indeed create a private VPC endpoint for API Gateway on the destination Lambda's side. The calling Lambda can then make an HTTPS call via the VPC endpoint's DNS URL.
For this to work, your VPC endpoint must be reachable from the other VPC, the one from which you are going to make the HTTPS call.
VPC peering between the two VPCs makes this possible. The good news is that VPC endpoints are now accessible through VPC peering.
Hope this helps.
Reference:
https://aws.amazon.com/about-aws/whats-new/2019/03/aws-privatelink-now-supports-access-over-vpc-peering/
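For the API Gateway route, a resource policy that admits only requests arriving through a specific VPC endpoint might look like the sketch below. The aws:SourceVpce condition key exists in IAM, but the function name and the ARN and endpoint ID used with it are placeholders, and this is not presented as the fix for the 403 in the question.

```python
import json


def private_api_policy(api_arn, vpce_id):
    """Build a resource policy allowing Invoke only via the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "execute-api:Invoke",
                "Resource": api_arn,  # placeholder, supply your API's ARN
                "Condition": {"StringEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


def policy_json(api_arn, vpce_id):
    """Serialize the policy for pasting into the console."""
    return json.dumps(private_api_policy(api_arn, vpce_id), indent=4)
```

The endpoint ID to use is that of the execute-api interface endpoint that the calling Lambda's traffic passes through.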
I want to set up an EC2 instance running in a private VPC. It can connect to the Internet from the private VPC but cannot be accessed from outside. There is also a Lambda function that triggers the EC2 instance to initiate some interactions with external resources (S3, DynamoDB, Internet).
I have set up a VPC as follows:
An EC2 instance running Docker in a private VPC subnet
An ALB (Application Load Balancer) configured as internal and in private subnets (the same as the EC2 subnet)
A NAT Gateway, which is working
A Lambda function that makes HTTPS GET and POST requests to the Internet and the ALB
A Route53 private hosted zone with a record set that routes "abcd.internal/api" to the ALB.
Here is the problem. The Lambda function can connect to the Internet over HTTPS, but it fails when it sends an HTTPS GET to the ALB using the private hosted zone record ("abcd.internal").
My understanding is that since my ALB, EC2 instance, Lambda, NAT Gateway and Route53 zone are configured in the same VPC, they should be able to talk to each other using the private DNS name. I don't know why it fails.
Note: Before setting up an internal ALB, I tried setting up an internet-facing ALB in a public subnet and then configured a public hosted zone record set "abcd.public" for this ALB. It can talk to the EC2 instance, and the EC2 instance can interact with the Internet through the NAT Gateway. So the "EC2 to Internet" part is working.
Update:
I finally dug up some error messages in the Lambda log, as follows:
Error: Hostname/IP doesn't match certificate's altnames: "Host: abcd.internal. is not in the cert's altnames: DNS:.public"]
reason: 'Host: abcd.internal. is not in the cert\'s altnames: DNS:.public',
host: 'abcd.internal.',
That is interesting. I do have a public hosted zone that co-exists with the private hosted zone, but the public hosted zone is for another purpose. I don't know why the Lambda function uses the public DNS rather than the private DNS, since it is configured inside a private subnet.
Thanks to everyone who posted comments and gave suggestions.
To solve this problem, I tried almost every solution I could find online. I put everything in the right place. The Lambda function, ELB and EC2 instance are in the same private VPC subnet. Route53, NAT and IGW are properly set up. I did try playing with the DHCP options set, but it didn't work. Maybe I don't fully understand DHCP options, and I couldn't find an example.
It turns out that HTTPS is what's not working. Before I moved to the private VPC, I had the same setup in a public VPC with resources communicating over HTTPS. For example, the Lambda function would GET/POST to the EC2 instance or ELB. After I moved things into a private VPC, HTTPS requests could not use the internal DNS names.
However, if I use HTTP, the resources can finally find each other by internal DNS names.
I still don't know why HTTPS can't be used in the private VPC, but I can live with this solution.
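One plausible reading of the logged error is that DNS resolved fine and the TLS handshake failed on hostname verification: the certificate served by the load balancer only lists public names, so "abcd.internal" cannot match it. The sketch below imitates that check in simplified form. The function name is hypothetical, and "*.example.public" stands in for the redacted name in the log.

```python
def hostname_matches(hostname, alt_names):
    """Simplified check of a hostname against a certificate's DNS altnames."""
    host = hostname.rstrip(".").lower()  # drop the trailing dot seen in the log
    for pattern in alt_names:
        pattern = pattern.lower()
        if pattern == host:
            return True
        if pattern.startswith("*."):
            # A wildcard covers exactly one left-most label
            _, _, rest = host.partition(".")
            if rest and rest == pattern[2:]:
                return True
    return False
```

Under that reading, plain HTTP works because no certificate is checked, and HTTPS would need a certificate that also covers the internal name.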
I had the same problem.
The ALB was not added as a trigger for the Lambda, which was causing a similar certificate issue for me.
The security group was configured incorrectly in my case.
I noticed that the role I assigned to the Lambda should include a policy with create/delete ENI permissions.
Sometimes the ALB updates were not quick, so I recreated it with the same settings and it started to work.
Did you check whether the IAM role attached to your Lambda has access to the EC2 network-related actions? Here's an example IAM policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:CreateNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DeleteNetworkInterface",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets"
            ],
            "Resource": [
                "*"
            ],
            "Effect": "Allow"
        }
    ]
}
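To check an existing role policy against that list, something like the following sketch could be used. It only looks at Allow statements, treats "*" and "ec2:*" as granting everything, and ignores Deny statements, resources and conditions, so treat the result as a hint rather than a full policy evaluation. The names are hypothetical.

```python
REQUIRED_ENI_ACTIONS = {
    "ec2:CreateNetworkInterface",
    "ec2:DescribeNetworkInterfaces",
    "ec2:DeleteNetworkInterface",
}


def missing_actions(policy, required=REQUIRED_ENI_ACTIONS):
    """Return the required actions that no Allow statement grants."""
    granted = set()
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # Action may be a single string
            actions = [actions]
        granted.update(actions)
    if "*" in granted or "ec2:*" in granted:
        return set()
    return set(required) - granted
```

An empty set means the policy document grants all three network interface actions.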