I am building a system using Python-flavored AWS CDK.
I have a lambda function with an attached EFS. To use EFS, I am required to put the lambda function inside a VPC. The problem is, I also want this lambda function to retrieve files from a particular S3 bucket (in the same region). I am getting Timeout errors when doing the retrieval, and upon some research it seems that I need either a NAT Gateway (too expensive) or a VPC endpoint to allow access.
How can I build a VPC endpoint in CDK to allow my lambda function to talk to my S3 bucket?
Edit: The comment below from @gshpychka is correct - only the gateway_endpoints option in the VPC definition is required.
Here is what I came up with that seems to work after following the ideas in this guide.
You need to create both an S3 access point and a VPC endpoint.
You create the VPC endpoint when defining the VPC. This makes S3 buckets reachable from inside the VPC; you can later add a policy to restrict that access.
self.vpc = ec2.Vpc(
    scope=self,
    id="VPC",
    vpc_name="my_VPC",
    gateway_endpoints={
        "s3": ec2.GatewayVpcEndpointOptions(
            service=ec2.GatewayVpcEndpointAwsService.S3
        )
    },
    nat_gateways=0,
)
You later create an S3 access point after creating the S3 bucket. This allows access to the bucket.
self.bucket_access = s3.CfnAccessPoint(
    scope=self,
    id="s3_access",
    bucket=self.my_bucket.bucket_name,
    name="bucket-access-point",
    vpc_configuration=s3.CfnAccessPoint.VpcConfigurationProperty(
        vpc_id=self.vpc.vpc_id
    ),
)
Alternatively, in TypeScript, you can add the gateway endpoint to an existing VPC and attach a policy to it:
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as iam from 'aws-cdk-lib/aws-iam';

export class YourStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const vpc = ec2.Vpc.fromLookup(this, 'vpc', { isDefault: true });

    const s3GatewayEndpoint = vpc.addGatewayEndpoint('s3Endpoint', {
      service: ec2.GatewayVpcEndpointAwsService.S3,
    });

    s3GatewayEndpoint.addToPolicy(
      new iam.PolicyStatement({
        principals: [new iam.AnyPrincipal()],
        actions: ['s3:*'],
        resources: ['*'],
      }),
    );
  }
}
In our environment there is a dedicated AWS account that contains a registered domain as well as the hosted zone in Route53. An IAM role is also created that allows a specific set of other accounts to create records in that hosted zone.
Using AWS CDK (v2), is there a way to create an API Gateway in one account, with the DNS record (an A record?) created for it in that dedicated account?
This is an example of the setup:
export class CdkRoute53ExampleStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const backend = new lambda.Function(this, 'HelloHandler', {
      runtime: lambda.Runtime.NODEJS_14_X,
      code: lambda.Code.fromAsset('src'),
      handler: 'hello.handler'
    });

    const restApi = new apigw.LambdaRestApi(this, 'Endpoint', {
      handler: backend,
      domainName: {
        domainName: `cdk53.${Config.domainName}`,
        certificate: acm.Certificate.fromCertificateArn(
          this,
          "my-cert",
          Config.certificateARN
        ),
      },
      endpointTypes: [apigw.EndpointType.REGIONAL]
    });

    new route53.ARecord(this, "apiDNS", {
      zone: route53.HostedZone.fromLookup(this, "baseZone", {
        domainName: Config.domainName,
      }),
      recordName: "cdk53",
      target: route53.RecordTarget.fromAlias(
        new route53targets.ApiGateway(restApi)
      ),
    });
  }
}
Basically I need that last ARecord construct to be created under credentials from an assumed role in another account.
As far as I am aware, a CDK stack is built and deployed entirely within the context of a single IAM identity; you can't run different parts of the stack as different identities. (As an aside, code that uses the regular AWS SDK - such as Lambda functions - can switch identities using STS.)
The solution therefore is to do as much as you can using the CDK (in account B). After this is complete, the final step - registering the DNS record - is done using a different identity which operates within account A.
Registering the DNS record could be done using AWS CLI commands, or you could even create another (mini) stack just for this purpose.
Either way, you would execute the second step as an identity which is allowed to write records to the hosted zone in account A.
This could be achieved by using a different --profile with your CLI or CDK commands. Or you could use STS to assume a role which is allowed to create the DNS record in account A.
Using STS has the advantage that you don't need to know the credentials of account A, but I've found STS to have a steep learning curve, and it can be a little confusing to get right.
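For illustration, here is a minimal sketch of what such a mini stack in account A might look like, deployed with a separate --profile. The zone name and the API's regional domain name are placeholders you would substitute with your own values (e.g. from outputs of the account B stack):
import { Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as route53 from 'aws-cdk-lib/aws-route53';

// Deployed into account A, e.g. `cdk deploy DnsRecordStack --profile account-a`
export class DnsRecordStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // The hosted zone that lives in account A (requires env to be set on the stack)
    const zone = route53.HostedZone.fromLookup(this, 'baseZone', {
      domainName: 'example.com',
    });

    // Point cdk53.example.com at the API's regional custom-domain target,
    // taken from the account B stack (placeholder value below)
    new route53.CnameRecord(this, 'apiDNS', {
      zone,
      recordName: 'cdk53',
      domainName: 'd-abc123.execute-api.eu-central-1.amazonaws.com',
    });
  }
}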
EDIT: it seems the CDK stack in account B can actually switch roles when registering a DNS record by virtue of the CrossAccountZoneDelegationRecord construct and its delegationRole attribute - see https://stackoverflow.com/a/72097522/226513 for details. This means that you can keep all your code in the account B CDK stack.
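For reference, a minimal sketch of that approach inside the account B stack, assuming a subdomain zone created in account B and a delegation role that already exists in account A (the role ARN and zone names are placeholders):
import * as iam from 'aws-cdk-lib/aws-iam';
import * as route53 from 'aws-cdk-lib/aws-route53';

// Hosted zone for the subdomain, owned by account B
const subZone = new route53.PublicHostedZone(this, 'SubZone', {
  zoneName: 'cdk53.example.com',
});

// Role in account A that is allowed to write to the parent zone (placeholder ARN)
const delegationRole = iam.Role.fromRoleArn(
  this,
  'DelegationRole',
  'arn:aws:iam::111111111111:role/Route53DelegationRole'
);

// Writes the NS delegation records into the parent zone in account A,
// assuming the delegation role across accounts
new route53.CrossAccountZoneDelegationRecord(this, 'Delegation', {
  delegatedZone: subZone,
  parentHostedZoneName: 'example.com',
  delegationRole,
});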
How can I create an Athena data source in AWS CDK that is a JDBC connection to a MySQL database using the AthenaJdbcConnector?
I believe I can use aws-sam's CfnApplication to create the AthenaJdbcConnector Lambda, but how can I connect it to Athena?
I notice a lot of Glue support in CDK which would transfer to Athena (data catalog), and there are several CfnDataSource types in other modules such as QuickSight, but I'm not seeing anything under Athena in CDK.
See the references below.
References:
https://docs.aws.amazon.com/athena/latest/ug/athena-prebuilt-data-connectors-jdbc.html
https://github.com/awslabs/aws-athena-query-federation/tree/master/athena-jdbc
https://serverlessrepo.aws.amazon.com/applications/us-east-1/292517598671/AthenaJdbcConnector
I have been dealing with the same issue. Here is what I did to create the Lambda for federated queries (TypeScript):
const vpc = ec2.Vpc.fromLookup(this, 'my-project-vpc', {
  vpcId: props.vpcId
});

const cluster = new rds.ServerlessCluster(this, 'AuroraCluster', {
  engine: rds.DatabaseClusterEngine.AURORA_POSTGRESQL,
  parameterGroup: rds.ParameterGroup.fromParameterGroupName(this, 'ParameterGroup', 'default.aurora-postgresql10'),
  defaultDatabaseName: 'MyDB',
  vpc,
  vpcSubnets: {
    onePerAz: true
  },
  scaling: { autoPause: cdk.Duration.seconds(0) } // Optional. If not set, the instance will pause after 5 minutes
});

let password = cluster.secret!.secretValueFromJson('password').toString();
let spillBucket = new Bucket(this, "AthenaFederatedSpill");

let lambdaApp = new CfnApplication(this, "MyDB", {
  location: {
    applicationId: "arn:aws:serverlessrepo:us-east-1:292517598671:applications/AthenaJdbcConnector",
    semanticVersion: "2021.42.1"
  },
  parameters: {
    DefaultConnectionString: `postgres://jdbc:postgresql://${cluster.clusterEndpoint.hostname}/MyDB?user=postgres&password=${password}`,
    LambdaFunctionName: "crossref_federation",
    SecretNamePrefix: `${cluster.secret?.secretName}`,
    SecurityGroupIds: `${cluster.connections.securityGroups.map(value => value.securityGroupId).join(",")}`,
    SpillBucket: spillBucket.bucketName,
    SubnetIds: vpc.privateSubnets[0].subnetId
  }
});
This creates the Lambda with a default connection string, just as if you had used the AWS Console wizard in Athena to connect to a data source. Unfortunately it is NOT possible to add an Athena-catalog-specific connection string via CDK. It would have to be set as an environment variable on the Lambda, and I found no way to do that; the application template simply doesn't allow it, so this is a manual post-processing step. I would sure like to hear from anybody who has a solution for that!
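If you also need to register the connector Lambda as a data source in Athena, the low-level CfnDataCatalog resource might work; a rough sketch, assuming CDK v2 import paths and the connector function name used above (the catalog name and ARN construction are placeholders):
import * as athena from 'aws-cdk-lib/aws-athena';

// Registers the federated connector Lambda as an Athena data catalog
new athena.CfnDataCatalog(this, 'JdbcDataCatalog', {
  name: 'my_jdbc_catalog',
  type: 'LAMBDA',
  parameters: {
    // Placeholder ARN for the connector Lambda created by the CfnApplication above
    function: `arn:aws:lambda:${this.region}:${this.account}:function:crossref_federation`,
  },
});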
Also notice that I put the user/password in the JDBC URL directly. I wanted to use Secrets Manager, but because the Lambda is deployed in a VPC, it simply refuses to connect to Secrets Manager. I think this might be solvable by adding a private connection (VPC endpoint) to Secrets Manager. Again, I would like to hear from anybody who has tried that.
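For the Secrets Manager problem, an interface VPC endpoint for Secrets Manager might be what's missing; an untested sketch, assuming CDK v2 and the vpc object from above:
import * as ec2 from 'aws-cdk-lib/aws-ec2';

// Lets resources in private/isolated subnets reach Secrets Manager without a NAT gateway
vpc.addInterfaceEndpoint('SecretsManagerEndpoint', {
  service: ec2.InterfaceVpcEndpointAwsService.SECRETS_MANAGER,
});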
I have a VPC with two ISOLATED subnets, one for my RDS Serverless cluster, and one for my Lambda functions.
But my Lambda functions all time out when they call my RDS cluster.
My question is: is this VPC + isolated subnets a working structure for API Gateway -> Lambda -> RDS, or am I trying something impossible?
Lambda:
import * as AWS from 'aws-sdk';

const rdsDataService = new AWS.RDSDataService();
const query = `SELECT * FROM information_schema.tables;`;

export const handler = async (event) => {
  const params = {
    secretArn: `secret arn`,
    resourceArn: "rds arn",
    sql: query,
    database: 'db name'
  };
  const res = await rdsDataService.executeStatement(params).promise();
  return {
    statusCode: 200,
    body: {
      message: 'ok',
      result: res
    }
  };
};
My RDS and Lambda share a security group that I've opened to ALL traffic (I know this isn't ideal), and my Lambda has a role with admin rights (also not ideal), but it still times out.
You are using the Aurora Serverless Data API. The API does not exist inside your VPC. You have chosen isolated subnets, which have no access to anything that exists outside your VPC. You will either need to switch to private subnets, or add a VPC endpoint for the RDS Data API to your VPC.
It is important to call out that RDS API != RDS Data API; the two are different. You use the RDS API for standard RDS instances; for something like Aurora Serverless, you use the RDS Data API.
For anyone running across this in the future, there is now some helpful documentation describing how to create an Amazon VPC endpoint to allow access to the RDS Data API.
If you're using Terraform to create the VPC endpoint, here's a snippet that essentially replicates the instructions from the tutorial above:
resource "aws_vpc_endpoint" "rds-data" {
vpc_id = <your-vpc-id-here>
service_name = "com.amazonaws.<your-region-here>.rds-data"
vpc_endpoint_type = "Interface"
private_dns_enabled = true
security_group_ids = [
<your-security-group-ids-here>
]
subnet_ids = [
<your-subnet-ids-here>
]
}
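Since the rest of this thread is CDK, here is a rough CDK equivalent of the snippet above, assuming an existing vpc object (the endpoint service is built from the 'rds-data' suffix rather than a built-in constant):
import * as ec2 from 'aws-cdk-lib/aws-ec2';

// Interface endpoint for the RDS Data API (com.amazonaws.<region>.rds-data),
// so Lambdas in isolated subnets can call ExecuteStatement without a NAT gateway
vpc.addInterfaceEndpoint('RdsDataEndpoint', {
  service: new ec2.InterfaceVpcEndpointAwsService('rds-data'),
  privateDnsEnabled: true,
});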
I have created a VPC using CDK, and now need to change the number of subnets and NAT gateways without destroying the (now in production) EC2 instances in it or the (now in production) EIPs that are whitelisted in other systems.
I tried this using CDK, and it tried to destroy the VPC and re-provision everything, which is undesirable in production.
Since it is apparently not possible to do what I want by changing the stack definition, I need to get rid of the stack without removing the VPC.
https://docs.aws.amazon.com/cdk/api/latest/docs/#aws-cdk_core.RemovalPolicy.html
I found this document about the RemovalPolicy enum in CDK, but there is nothing saying where it can be applied. Google searches turned up only references to S3 and RDS, even though this enum is in CDK core. I tried applying it to the following CDK resources: stack, VPC, security group, etc., but none of those constructs accept the removalPolicy parameter.
e.g.:
const vpc = new ec2.Vpc(this, 'PRODVPC', {
  maxAzs: 2,
  removalPolicy: cdk.RemovalPolicy.RETAIN
});

new VPCPRODStack(app, 'RDVPCPRODStack', {
  env: { region: 'ca-central-1', account: '12345' },
  removalPolicy: cdk.RemovalPolicy.RETAIN
});
What am I missing? Is it even possible to set a CloudFormation removal policy at the stack level using CDK?
I think the feature you're asking for is https://github.com/aws/aws-cdk-rfcs/issues/72
For now:
You can define a custom resource backed by a Lambda that calls the CloudFormation SetStackPolicy API and updates the policy.
You also have applyRemovalPolicy (https://docs.aws.amazon.com/cdk/api/latest/docs/#aws-cdk_core.CfnResource.html#apply-wbr-removal-wbr-policypolicy-options), which you can use to either destroy or retain a resource.
For example, to apply a DESTROY removal policy to all CfnResources:
import { CfnResource, IAspect, IConstruct, RemovalPolicy } from '@aws-cdk/core';

// apply_destroy_removal_policy_aspect
export class ApplyDestroyRemovalPolicyAspect implements IAspect {
  public visit(construct: IConstruct): void {
    if (CfnResource.isCfnResource(construct)) {
      construct.applyRemovalPolicy(RemovalPolicy.DESTROY);
    }
  }
}
and apply it to all the resources in a stack:
const destroyResourcesAspect = new ApplyDestroyRemovalPolicyAspect();
core.Aspects.of(safetyStack).add(destroyResourcesAspect);
I've seen a lot of people use this to ensure that any resources created as part of personal stacks in CDK get destroyed.
I am intermittently getting the following error while running an AWS Glue job:
Error downloading script: fatal error: An error occurred (404) when calling the HeadObject operation:
Not sure why it would be intermittent, but this is likely an issue connecting to S3. A few things to check:
Glue Jobs run with an IAM role. You can check your Job details to see what it's currently set to. You should make sure that role has privileges to access the S3 bucket that has your job code in it.
Glue Jobs require a VPC endpoint. You should check to make sure that you have one properly created for the VPC you're using.
It is possible to configure a VPC endpoint without associating it with any subnets. Check your VPC Endpoint for the correct routing.
Below is a bit of reference code written with AWS CDK, in case it's helpful.
IAM Role
new iam.Role(this, `GlueJobRole`, {
  assumedBy: new iam.ServicePrincipal(`glue.amazonaws.com`),
  managedPolicies: [
    iam.ManagedPolicy.fromAwsManagedPolicyName(
      `service-role/AWSGlueServiceRole`
    ),
  ],
});
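Note that the managed policy may not by itself grant access to the bucket that holds your job script; a sketch of adding that, assuming the role above is captured in a variable (e.g. const glueJobRole = new iam.Role(...)) and a placeholder bucket name:
import * as s3 from 'aws-cdk-lib/aws-s3';

// Hypothetical bucket that stores the Glue job script
const scriptBucket = s3.Bucket.fromBucketName(this, 'ScriptBucket', 'my-glue-scripts-bucket');
scriptBucket.grantRead(glueJobRole);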
VPC Endpoint
const vpc = ec2.Vpc.fromLookup(this, `VPC`, { vpcId: VPC_ID });

new ec2.GatewayVpcEndpoint(this, `S3VpcEndpoint`, {
  service: ec2.GatewayVpcEndpointAwsService.S3,
  subnets: [{ subnets: vpc.publicSubnets }],
  vpc,
});
Maybe your bucket has a customer-managed KMS key enabled. You need to grant the Glue role access to that KMS key to fix this issue.
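A minimal sketch of that grant, assuming the key and the Glue role are available as objects in your stack (the key ARN and variable names are placeholders):
import * as kms from 'aws-cdk-lib/aws-kms';

// Hypothetical customer-managed key that encrypts the script bucket
const bucketKey = kms.Key.fromKeyArn(
  this,
  'BucketKey',
  'arn:aws:kms:us-east-1:123456789012:key/1234abcd-12ab-34cd-56ef-1234567890ab'
);

// Adds kms:Decrypt on the key to the Glue role's IAM policy
bucketKey.grantDecrypt(glueJobRole);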