I have a VPC with two ISOLATED subnets: one for my RDS Serverless cluster, and one for my Lambda functions. But my Lambda functions all time out when they call my RDS cluster.
My question is: is this VPC + isolated subnets a workable structure for API Gateway -> Lambda -> RDS, or am I attempting something impossible?
Lambda:

import * as AWS from 'aws-sdk';

const rdsDataService = new AWS.RDSDataService();
const query = `SELECT * FROM information_schema.tables;`;

export const handler = async (event) => {
  const params = {
    secretArn: `secret arn`,
    resourceArn: "rds arn",
    sql: query,
    database: 'db name'
  };
  const res = await rdsDataService.executeStatement(params).promise();
  return {
    statusCode: 200,
    body: {
      message: 'ok',
      result: res
    }
  };
};
My RDS cluster and Lambda functions share a security group that I've opened to ALL traffic (I know this isn't ideal), and my Lambda has a role with admin rights (also not ideal), but it still only times out.
You are using the Aurora Serverless Data API. That API does not exist inside your VPC, and you have chosen isolated subnets, which have no route to anything outside your VPC. You will either need to switch to private subnets (which reach the internet through a NAT gateway) or add an interface endpoint for the RDS Data API to your VPC.
It is important to call out that the RDS API != the RDS Data API; the two are different. You use the RDS API to manage standard RDS instances; for something like Aurora Serverless, you query through the RDS Data API.
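As a minimal illustration of the distinction, here is a sketch using the aws-sdk v2 client names (the same library as the question's code):

import * as AWS from 'aws-sdk';

// RDS API: manages instances and clusters (create, modify, describe, ...).
const rds = new AWS.RDS();

// RDS Data API: executes SQL against Aurora Serverless over HTTPS via
// rds-data.<region>.amazonaws.com, an endpoint that lives outside your
// VPC unless you add an interface endpoint for it.
const rdsDataService = new AWS.RDSDataService();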
For anyone running across this in the future, there is now some helpful documentation describing how to create an Amazon VPC endpoint to allow access to the RDS Data API.
If you're using Terraform to create the VPC endpoint, here's a snippet that essentially replicates the instructions from the tutorial above:
resource "aws_vpc_endpoint" "rds-data" {
vpc_id = <your-vpc-id-here>
service_name = "com.amazonaws.<your-region-here>.rds-data"
vpc_endpoint_type = "Interface"
private_dns_enabled = true
security_group_ids = [
<your-security-group-ids-here>
]
subnet_ids = [
<your-subnet-ids-here>
]
}
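If you're using the CDK rather than Terraform, an equivalent sketch in TypeScript (assuming your VPC is available as vpc; recent versions of aws-cdk-lib expose the rds-data service as a named constant):

import * as ec2 from 'aws-cdk-lib/aws-ec2';

// Interface endpoint for com.amazonaws.<region>.rds-data, giving
// isolated subnets a private path to the RDS Data API.
vpc.addInterfaceEndpoint('RdsDataEndpoint', {
  service: ec2.InterfaceVpcEndpointAwsService.RDS_DATA,
  privateDnsEnabled: true, // resolve the public endpoint name to the endpoint ENIs
});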
According to the GCP documentation, if a Cloud Function is connected to a VPC through a Serverless VPC Access connector and all egress traffic from the function is routed through the connector, the function should not have internet access. This works as expected for other APIs, but Google APIs are still accessible.
I was connecting to BigQuery from a Cloud Function (Node.js) privately using a Serverless VPC Access connector, so I expected the function to fail when I routed all its egress traffic through the connector, because I haven't added Cloud NAT. If I try to access any API other than Google APIs (using node-fetch), it fails as expected.
Please note: the default internet gateway is present in the VPC subnet.
Cloud Function code:
Cloud Function accessing BigQuery (expected to fail, but working):
const {BigQuery} = require('@google-cloud/bigquery');

const bigquery = new BigQuery();

exports.getCountries = async (req, res) => {
  const query = "SELECT DISTINCT country_name FROM `bigquery-358204.covid_19_data.covid19_open_data` ORDER BY country_name;";
  const options = {
    query: query,
    location: 'US',
  };
  // Run the query as a job
  const [job] = await bigquery.createQueryJob(options);
  console.log(`Job ${job.id} started.`);
  // Wait for the query to finish
  const [rows] = await job.getQueryResults();
  let message = JSON.stringify(rows);
  res.status(200).send(message);
};
Cloud Function accessing an API other than Google APIs (failing as expected):
const fetch = require('node-fetch');

exports.helloWorld = async (req, res) => {
  let resp = await fetch('https://www.7timer.info/bin/astro.php?lon=113.2&lat=23.1&ac=0&unit=metric&output=json&tzshift=0');
  let body = await resp.json();
  res.status(200).send(body);
};
I am building a system using Python-flavored AWS CDK.
I have a Lambda function with an attached EFS. To use EFS, the Lambda function is required to be inside a VPC. The problem is, I also want this Lambda function to retrieve files from a particular S3 bucket (in the same region). I am getting timeout errors when doing the retrieval, and upon some research it seems that I need either a NAT gateway (too expensive) or a VPC endpoint to allow access.
How can I build a VPC endpoint in CDK to allow my lambda function to talk to my S3 bucket?
Edit: The comment below from @gshpychka is correct - only the gateway_endpoint in the VPC definition is required.
Here is what I came up with that seems to work after following the ideas in this guide.
You need to create both an S3 access point and a VPC endpoint.
You create the VPC endpoint when creating the VPC. This makes S3 buckets accessible from the VPC; you can later add a policy to restrict this access.
self.vpc = ec2.Vpc(
    scope=self,
    id="VPC",
    vpc_name="my_VPC",
    gateway_endpoints={
        "s3": ec2.GatewayVpcEndpointOptions(
            service=ec2.GatewayVpcEndpointAwsService.S3
        )
    },
    nat_gateways=0,
)
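With nat_gateways=0, this stays cheap: the S3 gateway endpoint carries the bucket traffic and, unlike a NAT gateway, has no hourly charge.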
You later create an S3 access point after creating the S3 bucket. This allows access to the bucket.
self.bucket_access = s3.CfnAccessPoint(
    scope=self,
    id="s3_access",
    bucket=self.my_bucket.bucket_name,
    name="bucket-access-point",
    vpc_configuration=s3.CfnAccessPoint.VpcConfigurationProperty(
        vpc_id=self.vpc.vpc_id
    ),
)
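The same wiring can also be done in TypeScript CDK, adding the gateway endpoint to an existing VPC and attaching an endpoint policy: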
export class YourStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const vpc = ec2.Vpc.fromLookup(this, 'vpc', { isDefault: true });

    const s3BucketAccessPoint = vpc.addGatewayEndpoint('s3Endpoint', {
      service: ec2.GatewayVpcEndpointAwsService.S3,
    });

    s3BucketAccessPoint.addToPolicy(
      new iam.PolicyStatement({
        principals: [new iam.AnyPrincipal()],
        actions: ['s3:*'],
        resources: ['*'],
      }),
    );
  }
}
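Note that this endpoint policy lets any principal perform any S3 action through the endpoint; in practice you would likely narrow principals, actions, and resources down to the specific bucket and roles involved.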
How can I create an Athena data source in AWS CDK that is a JDBC connection to a MySQL database using the AthenaJdbcConnector?
I believe I can use aws-sam's CfnApplication to create the AthenaJdbcConnector Lambda, but how can I connect it to Athena?
I notice a lot of Glue support in CDK which would transfer to Athena (data catalog), and there are several CfnDataSource types in other modules such as QuickSight, but I'm not seeing anything under Athena in CDK.
See the references below.
References:
https://docs.aws.amazon.com/athena/latest/ug/athena-prebuilt-data-connectors-jdbc.html
https://github.com/awslabs/aws-athena-query-federation/tree/master/athena-jdbc
https://serverlessrepo.aws.amazon.com/applications/us-east-1/292517598671/AthenaJdbcConnector
I have been playing with the same issue. Here is what I did to create the Lambda for federated queries (TypeScript):
const vpc = ec2.Vpc.fromLookup(this, 'my-project-vpc', {
  vpcId: props.vpcId
});

const cluster = new rds.ServerlessCluster(this, 'AuroraCluster', {
  engine: rds.DatabaseClusterEngine.AURORA_POSTGRESQL,
  parameterGroup: rds.ParameterGroup.fromParameterGroupName(this, 'ParameterGroup', 'default.aurora-postgresql10'),
  defaultDatabaseName: 'MyDB',
  vpc,
  vpcSubnets: {
    onePerAz: true
  },
  scaling: { autoPause: cdk.Duration.seconds(0) } // Optional. If not set, the instance will pause after 5 minutes.
});

let password = cluster.secret!.secretValueFromJson('password').toString();
let spillBucket = new Bucket(this, "AthenaFederatedSpill");

let lambdaApp = new CfnApplication(this, "MyDB", {
  location: {
    applicationId: "arn:aws:serverlessrepo:us-east-1:292517598671:applications/AthenaJdbcConnector",
    semanticVersion: "2021.42.1"
  },
  parameters: {
    DefaultConnectionString: `postgres://jdbc:postgresql://${cluster.clusterEndpoint.hostname}/MyDB?user=postgres&password=${password}`,
    LambdaFunctionName: "crossref_federation",
    SecretNamePrefix: `${cluster.secret?.secretName}`,
    SecurityGroupIds: `${cluster.connections.securityGroups.map(value => value.securityGroupId).join(",")}`,
    SpillBucket: spillBucket.bucketName,
    SubnetIds: vpc.privateSubnets[0].subnetId
  }
});
This creates the Lambda with a default connection string, as you would get from the AWS console wizard in Athena when connecting to a data source. Unfortunately, it is NOT possible to add an Athena-catalog-specific connection string via CDK. It has to be set as an environment variable on the Lambda, and I found no way to do that: the application template simply doesn't allow it, so this is a post-process done by hand. I would sure like to hear from anybody who has a solution for that!
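One possible way to script that post-processing step (a sketch I haven't verified against the connector; the variable name is illustrative, so check the AthenaJdbcConnector documentation for the exact key it expects) is to update the function configuration after deployment with the aws-sdk:

import * as AWS from 'aws-sdk';

const lambda = new AWS.Lambda();

async function setCatalogConnectionString() {
  // Caution: Environment.Variables replaces the function's entire
  // variable map, so merge in the existing variables first if the
  // connector already has some set.
  await lambda.updateFunctionConfiguration({
    FunctionName: 'crossref_federation',
    Environment: {
      Variables: {
        // Illustrative key name for a catalog-specific connection string.
        my_catalog_connection_string: 'postgres://jdbc:postgresql://<host>/MyDB?user=...&password=...',
      },
    },
  }).promise();
}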
Also notice that I put the user/password in the JDBC URL directly. I wanted to use Secrets Manager instead, but because the Lambda is deployed in a VPC, it simply refuses to connect to Secrets Manager. I think this might be solvable by adding a VPC interface endpoint for Secrets Manager. Again, I would like to hear from anybody who has tried that.
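For the Secrets Manager problem specifically, the same interface-endpoint pattern might do it (a sketch I haven't verified, reusing the vpc object from the snippet above):

// Interface endpoint so a Lambda inside the VPC can reach Secrets
// Manager without internet access or a NAT gateway.
vpc.addInterfaceEndpoint('SecretsManagerEndpoint', {
  service: ec2.InterfaceVpcEndpointAwsService.SECRETS_MANAGER,
});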
I am running a Node.js (12.x) Lambda in AWS. The purpose of this Lambda is to interact with CloudFormation stacks, and I'm doing that via the aws-sdk. When testing this Lambda locally using lambda-local, it executes successfully and the stack can be seen in a CREATING state in the AWS console.
However, when I push and run this Lambda in AWS, it fails after 15 seconds, and I get this error:
{"errorType":"TimeoutError","errorMessage":"Socket timed out without establishing a connection","code":"TimeoutError","message":"Socket timed out without establishing a connection","time":"2020-06-29T03:10:27.668Z","region":"us-east-1","hostname":"cloudformation.us-east-1.amazonaws.com","retryable":true,"stack":["TimeoutError: Socket timed out without establishing a connection"," at Timeout.connectTimeout [as _onTimeout] (/var/task/node_modules/aws-sdk/lib/http/node.js:69:15)"," at listOnTimeout (internal/timers.js:549:17)"," at processTimers (internal/timers.js:492:7)"]}
This led me to investigate the Lambda timeout and the possible configuration changes I could make, found in https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-retry-timeout-sdk/ and https://aws.amazon.com/premiumsupport/knowledge-center/lambda-vpc-troubleshoot-timeout/, but nothing worked.
I found a couple of similar issues, such as "AWS Lambda: Task timed out", which include possible suggestions such as Lambda timeout and memory issues, but I've set mine to 30 seconds, and the logs show max memory used is 88 MB out of a possible 128 MB. I tried with an increase anyway, and no luck.
The curious part is that it fails without establishing a connection to hostname cloudformation.us-east-1.amazonaws.com. How is that possible when the role assigned to the Lambda has full CloudFormation privileges? I'm completely out of ideas, so any help would be greatly appreciated. Here's my code:
TEST EVENT:
{
  "stackName": "mySuccessfulStack",
  "app": "test"
}
Function my handler calls (createStack):
const AWS = require('aws-sdk');

const templates = {
  "test": {
    TemplateURL: "https://<bucket>.s3.amazonaws.com/<path_to_file>/test.template",
    Capabilities: ["CAPABILITY_IAM"],
    Parameters: {
      "HostingBucket": "test-hosting-bucket"
    }
  }
};

async function createStack(event) {
  AWS.config.update({
    maxRetries: 2,
    httpOptions: {
      timeout: 30000,
      connectTimeout: 5000
    }
  });
  const cloudformation = new AWS.CloudFormation();

  const { app, stackName } = event;
  let stackParams = templates[app];
  stackParams['StackName'] = app + "-" + stackName;

  // CloudFormation expects Parameters as a list of key/value objects.
  let formattedTemplateParams = [];
  for (let [key, value] of Object.entries(stackParams.Parameters)) {
    formattedTemplateParams.push({ "ParameterKey": key, "ParameterValue": value });
  }
  stackParams['Parameters'] = formattedTemplateParams;

  const result = await cloudformation.createStack(stackParams).promise();
  return result;
}
A Lambda function in a VPC has no public IP address and no internet access. From the docs:
Connect your function to private subnets to access private resources. If your function needs internet access, use NAT. Connecting a function to a public subnet does not give it internet access or a public IP address.
There are two common solutions for that:
Place the Lambda function in a private subnet and set up a NAT gateway in a public subnet, with a route from the private subnet to the NAT device. This gives the Lambda internet access and, with it, access to the CloudFormation service.
Set up a VPC interface endpoint for CloudFormation. This allows your Lambda function in a private subnet to reach CloudFormation without the internet (see the sketch below).
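For the second option, a minimal CDK TypeScript sketch of such an endpoint (assuming a vpc construct for the VPC the function runs in; the construct id is illustrative):

import * as ec2 from 'aws-cdk-lib/aws-ec2';

// Interface endpoint for cloudformation.<region>.amazonaws.com, letting
// a Lambda in a private subnet call CloudFormation without a NAT.
vpc.addInterfaceEndpoint('CloudFormationEndpoint', {
  service: ec2.InterfaceVpcEndpointAwsService.CLOUDFORMATION,
});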
I'm using Terraform to build an API + corresponding lambda functions.
I have some other infrastructure, which I'd like to think is nicely set up (maybe I'm wrong?):
2 VPCs (let's just call 'em test and prod)
Private & public subnets in each VPC
RDS DBs launched in the private subnets
All resources are identical on both VPCs; e.g. there's a test-private-subnet and a prod-private-subnet with exactly the same specs, same for DBs, etc.
Now, I'm working on the API and the lambdas that will power said API.
I don't feel like I need a test & prod API gateway and test & prod lambdas:
lambda code will be the same, just acting on different DB
you can use API stage_variables, with different IPs, to achieve a test vs prod environment for the API
But when I try to set up a Lambda with the vpc_config block (because I need it to be associated with the security group that's allowed ingress on the DBs), I get the following error:
Error applying plan:
1 error(s) occurred:
* module.lambdas.aws_lambda_function.api-lambda-users: 1 error(s) occurred:
* aws_lambda_function.api-lambda-users: Error creating Lambda function: InvalidParameterValueException: Security Groups are required to be in the same VPC.
status code: 400, request id: xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx
My lambda config looks like this:
resource "aws_lambda_function" "api-lambda-users" {
provider = "PROVIDER"
function_name = "users"
s3_key = "users/${var.lambda-package-name}"
s3_bucket = "${var.api-lambdas-bucket}"
role = "${aws_iam_role.lambda-role.arn}"
handler = "${var.handler-name}"
runtime = "${var.lambda-runtime}"
vpc_config {
security_group_ids = [
//"${data.aws_security_group.prod-lambda.id}",
"${data.aws_security_group.test-lambda.id}"
]
subnet_ids = [
//"${data.aws_subnet.prod-primary.id}",
"${data.aws_subnet.test-primary.id}"
]
}
}
Notice I'd ideally like to just specify them, together, in their corresponding lists.
Am I missing something?
Suggestions?
Any help, related or not, is much appreciated.
Lambda running inside a VPC is subject to the same networking "rules" as EC2 instances, so it can't "exist" in two VPCs. If the Lambda function needs to talk to VPC resources in two separate VPCs, you could use something like VPC peering, or just run two copies of the function, one in each VPC.
When you add VPC configuration to a Lambda function, it can only access resources in that VPC. If a Lambda function needs to access both VPC resources and the public internet, the VPC needs to have a Network Address Translation (NAT) instance or NAT gateway inside it.