Currently I'm working on an app that collects data and processes it. All data collection is handled by AWS Lambda. The first Lambda gets data from APIs, then the data is processed and sent to SQS. Everything works fine, but unfortunately NAT costs are higher than expected. Lambda downloads around 10 TB monthly (I'm planning to increase that number), so I'm charged about $500 monthly just for NAT traffic. Other services don't use NAT. Is there any NAT alternative or a way to reduce costs? What I thought about is replacing SQS with ElastiCache in a private subnet, but that's 'just' 3 TB of traffic less.
If an AWS Lambda function is connected to a VPC, it can communicate with resources in the VPC. For example, it might need to connect to an Amazon RDS database in the VPC.
To access the Internet, the Lambda function would need to be connected to a private subnet and then communicate with the Internet via a NAT Gateway or NAT Instance. (A NAT Instance is cheaper than a NAT Gateway, but is less reliable.)
However, if the Lambda function does not require access to resources inside the VPC, then do not associate a VPC with the Lambda function. This will provide direct access to the Internet.
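To put the numbers from the question in perspective, here is a rough estimate of the NAT Gateway data processing charge. The rate of $0.045 per GB is an assumption (it varies by region and over time), as is treating 1 TB as 1024 GB, so check both against your own bill.

```python
# Assumed price: NAT Gateway data processing per GB (region dependent, may change).
NAT_PROCESSING_USD_PER_GB = 0.045


def nat_processing_cost(terabytes: float,
                        usd_per_gb: float = NAT_PROCESSING_USD_PER_GB) -> float:
    """Return the monthly NAT data processing charge for a traffic volume in TB."""
    gigabytes = terabytes * 1024  # assumption: 1 TB billed as 1024 GB
    return gigabytes * usd_per_gb


current = nat_processing_cost(10)      # all 10 TB passes through the NAT Gateway
reduced = nat_processing_cost(10 - 3)  # 3 TB of SQS traffic moved off the NAT path

print(f"10 TB via NAT: ${current:,.2f}")
print(f" 7 TB via NAT: ${reduced:,.2f}")
```

Under these assumptions the 10 TB case comes to roughly $460 a month, which is in line with the "about $500" in the question. A function with no VPC attached would not incur this charge at all.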
Related
I have two lambdas. LambdaA is the parent lambda that invokes LambdaB in parallel using the Event InvocationType (boto3). In every invocation, LambdaA sends a payload of 5MB to LambdaB. Both the lambdas are in the same VPC and in the same two private subnets (and same security group).
Now, assuming that LambdaA invokes LambdaB 5000 times in parallel for further invocation, a total payload of 25GB would be transferred between LambdaA and LambdaB.
I am trying to find out if I would be charged for the 50GB of data transfer as a data transfer cost, given that the data transfer is within the same VPC and the same two private subnets (and same security group).
Would I also be charged if they are in the same VPC and in the same private subnet (only one subnet and the same security group)?
When an AWS Lambda function invokes another AWS Lambda function, it would be sending traffic to the endpoint of the AWS Lambda service (not to the other Lambda function itself). Since your first Lambda function is connected to a VPC and the AWS Lambda service endpoint is on the Internet, the request would need to exit the VPC to access the Internet.
From EC2 On-Demand Instance Pricing – Amazon Web Services:
Data transferred “in” to and “out” from public or Elastic IPv4 address is charged at $0.01/GB in each direction.
However, if your first Lambda function was not connected to a VPC, then there would be no such charge since the Lambda function would be directly connected to the Internet. Typically, you should only connect an AWS Lambda function to a VPC if it specifically needs to access resources in that VPC (eg an Amazon RDS database).
Alternatively, you could use a VPC Endpoint to connect directly to the AWS Lambda service. From Configuring interface VPC endpoints for Lambda - AWS Lambda:
If you use Amazon Virtual Private Cloud (Amazon VPC) to host your AWS resources, you can establish a connection between your VPC and Lambda. You can use this connection to invoke your Lambda function without crossing the public internet.
This would allow your Lambda function to connect to the VPC, but also connect to the AWS Lambda service without 'exiting' the VPC, thereby avoiding the 1c/GB charge.
The main thing to realise is that the two Lambda functions are not directly communicating. Rather, the communication is to the AWS Lambda service, which is then responsible for provisioning and invoking the second Lambda function.
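As a minimal sketch of the VPC Endpoint approach, the following builds the arguments for the EC2 `create_vpc_endpoint` call for an interface endpoint to the Lambda service. The function names and all IDs are hypothetical placeholders, and `boto3` is imported lazily so the parameter builder can be inspected without AWS credentials.

```python
def lambda_interface_endpoint_params(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for EC2 create_vpc_endpoint (interface endpoint for Lambda)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # Service name pattern for the Lambda API in a given region.
        "ServiceName": f"com.amazonaws.{region}.lambda",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # With private DNS, the SDK's default Lambda hostname resolves to the endpoint.
        "PrivateDnsEnabled": True,
    }


def create_lambda_endpoint(region, vpc_id, subnet_ids, security_group_ids):
    """Create the endpoint. Requires boto3 and AWS credentials."""
    import boto3  # lazy import: only needed when actually calling AWS

    ec2 = boto3.client("ec2", region_name=region)
    params = lambda_interface_endpoint_params(
        region, vpc_id, subnet_ids, security_group_ids
    )
    return ec2.create_vpc_endpoint(**params)


# Placeholder IDs, for illustration only.
params = lambda_interface_endpoint_params(
    "us-east-1", "vpc-0123456789abcdef0", ["subnet-aaa", "subnet-bbb"], ["sg-ccc"]
)
print(params["ServiceName"])
```

Interface endpoints are billed per hour and per GB processed, so it is worth checking current pricing before assuming a saving over the NAT path.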
Yes, you will be charged the EC2 AZ to AZ ingress and egress cost.
If the data was downloaded via S3 there would be no cost.
I have a lambda function which runs every 15 minutes and saves some data in DynamoDB.
Now I want to secure the DynamoDB call made by my lambda so that the request does not go via the Internet, rather through Amazon internal network. There is no EC2 instance involved here though.
I have seen a few recommendations for using PrivateLink which binds the Dynamo to VPC endpoints so that calls made from EC2 instances always go via internal network bypassing Internet.
I was wondering whether such a configuration is possible for Lambda calling DynamoDB, since Lambda itself does not run in any EC2 instance and is rather serverless?
The first thing I would say is that all of your traffic between Lambda and DynamoDB is signed and encrypted, so that's typically sufficient.
There are use cases, most typically compliance reasons, when this is not sufficient. In that case you can deploy the Lambda function into a VPC of your making and configure the VPC with a private VPC endpoint for DynamoDB. Typically, the VPC would be configured without an internet gateway or NAT so that it has no egress route to the public internet. Be aware that your Lambda function startup latency will be higher than usual, because each Lambda function environment needs to attach an ENI for access to the private endpoint.
See Configuring a Lambda Function to Access Resources in an Amazon VPC.
If you don't need to access resources in a VPC, AWS recommends not to run AWS Lambda functions in a VPC. From AWS Lambda Best Practices:
Don't put your Lambda function in a VPC unless you have to. There is no benefit outside of using this to access resources you cannot expose publicly, like a private Amazon Relational Database instance. Services like Amazon Elasticsearch Service can be secured over IAM with access policies, so exposing the endpoint publicly is safe and wouldn't require you to run your function in the VPC to secure it.
Running Lambda functions in a VPC adds additional complexity, which can negatively affect scalability and performance. Each Lambda function in a VPC needs an Elastic Network Interface (ENI). Provisioning ENIs is slow and the number of ENIs you can have is limited, so when you scale up you can run into a shortage of ENIs, preventing your Lambda functions from scaling up further.
This is one way to do it.
Step 1) Deploy your lambda inside VPC.
Step 2) Create VPC Endpoint to the DynamoDB.
This should help: https://aws.amazon.com/blogs/aws/new-vpc-endpoints-for-dynamodb/
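The two steps above can be sketched with boto3 roughly as follows. The helper names and all IDs are hypothetical; the underlying calls are Lambda `update_function_configuration` and EC2 `create_vpc_endpoint`. `boto3` is imported lazily so the parameter builders work without it.

```python
def vpc_config_params(function_name, subnet_ids, security_group_ids):
    """Step 1: kwargs for Lambda update_function_configuration (attach function to a VPC)."""
    return {
        "FunctionName": function_name,
        "VpcConfig": {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    }


def dynamodb_gateway_endpoint_params(region, vpc_id, route_table_ids):
    """Step 2: kwargs for EC2 create_vpc_endpoint (gateway endpoint for DynamoDB)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.dynamodb",
        # Gateway endpoints are attached to route tables, not subnets.
        "RouteTableIds": list(route_table_ids),
    }


def apply_both_steps(region, function_name, vpc_id,
                     subnet_ids, security_group_ids, route_table_ids):
    """Run both steps against AWS. Requires boto3 and credentials."""
    import boto3  # lazy import: only needed when actually calling AWS

    boto3.client("lambda", region_name=region).update_function_configuration(
        **vpc_config_params(function_name, subnet_ids, security_group_ids)
    )
    return boto3.client("ec2", region_name=region).create_vpc_endpoint(
        **dynamodb_gateway_endpoint_params(region, vpc_id, route_table_ids)
    )


# Placeholder values, for illustration only.
print(vpc_config_params("my-function", ["subnet-aaa"], ["sg-bbb"]))
print(dynamodb_gateway_endpoint_params("us-east-1", "vpc-ccc", ["rtb-ddd"]))
```

The function's execution role also needs permission to manage network interfaces (for example via the AWS managed policy AWSLambdaVPCAccessExecutionRole), and the route tables passed in step 2 should be the ones associated with the Lambda's subnets.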
I have a Lambda running within a VPC which accesses secure resources (ex: RDS), but I also need it to publish an SNS notification. Is there a way to do this without having a NAT gateway?
Alternatively, I'm thinking of writing to a DynamoDB table which triggers another lambda but wanted to know if there's a simpler approach.
The simple answer is no. SNS is not currently available as a VPC endpoint so you will need to continue doing what you are already doing in order to reach RDS via lambda (NAT gateway in the private subnet).
In other words, this answer from 2016 is still relevant today -> How to let AWS lambda in a VPC to publish SNS notification?
Option A: using a NAT gateway
Your lambda is in a private subnet which means it cannot have communication with the outside world (the internet), so unless you make that a public subnet, which is of course not recommended, you cannot access the outside world. A NAT gateway allows your resources to have that access through it, and it's really not that hard to implement.
Here's a handy tutorial on how to do that:
https://github.com/naguibihab/aws-goodies/blob/master/how-to-setup-lambda-to-talk-to-internet-and-vpc.md
Option B: using a NAT instance
Similar to using a NAT gateway, you can also use a NAT instance, which requires a bit more administration and might have less availability, but it can also be cheaper. A NAT instance works in the same way as a NAT gateway: it needs to sit in a public subnet, and any lambda functions in the private subnet can access the internet through it.
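For either option, the private subnet's route table needs a default route pointing at the NAT device. Here is a minimal sketch of that routing step, assuming the NAT gateway or NAT instance already exists in a public subnet. The helper names and IDs are hypothetical; the boto3 calls used are EC2 `create_route` and `modify_instance_attribute`.

```python
def default_route_params(route_table_id, nat_gateway_id=None, nat_instance_id=None):
    """Build kwargs for EC2 create_route sending 0.0.0.0/0 to a NAT gateway or instance."""
    if (nat_gateway_id is None) == (nat_instance_id is None):
        raise ValueError("pass exactly one of nat_gateway_id or nat_instance_id")
    params = {
        "RouteTableId": route_table_id,
        "DestinationCidrBlock": "0.0.0.0/0",
    }
    if nat_gateway_id is not None:
        params["NatGatewayId"] = nat_gateway_id   # Option A
    else:
        params["InstanceId"] = nat_instance_id    # Option B
    return params


def apply_route(region, params):
    """Create the route in AWS. Requires boto3 and credentials."""
    import boto3  # lazy import: only needed when actually calling AWS

    ec2 = boto3.client("ec2", region_name=region)
    if "InstanceId" in params:
        # A NAT instance forwards traffic that is not addressed to itself,
        # so the source/destination check has to be turned off.
        ec2.modify_instance_attribute(
            InstanceId=params["InstanceId"], SourceDestCheck={"Value": False}
        )
    return ec2.create_route(**params)


# Placeholder IDs, for illustration only.
print(default_route_params("rtb-private", nat_gateway_id="nat-0abc"))
print(default_route_params("rtb-private", nat_instance_id="i-0def"))
```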
As for your DynamoDB alternative, you can create a VPC Endpoint for DynamoDB as Khalid T suggested: https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-ddb.html
Edit
I have been corrected, edited my answer.
I have an AWS Lambda function that makes use of an ElastiCache Redis cluster.
Since the Redis cluster is "locked" in a VPC, the Lambda function must reside in that VPC too.
For some reason, if the Lambda is allocated an IP of a public subnet, which has an Internet gateway - it still cannot make connections to the outside (the internet), thus making it impossible to use Kinesis.
For that, they suggest using a NAT gateway which lets the Lambda connect to the outside.
Basically, this works for me - but my issue is the money.
This solution is expensive for large amount of data transfers and I'm looking for some way to make it cheaper.
For a small POC that I've made, I paid ~$10.
This is too much for ~30GB as my production pipeline will run hundreds of gigabytes / month.
How do you suggest I let the Lambda function connect the outside (specifically Kinesis) without using a NAT gateway?
Thank you!
without using a NAT gateway?
Use a NAT instance.
You have to have one of these two things for anything in VPC to access the Internet from a private IP address.
NAT instances were exactly how this was always done in VPC, until the relatively new NAT Gateway service was rolled out.
You can also use a NAT gateway, which is a managed NAT service that provides better availability, higher bandwidth, and requires less administrative effort. For common use cases, we recommend that you use a NAT gateway rather than a NAT instance.
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_NAT_Instance.html
Sure, it's easier, but it costs more. A lot more. The most significant difference in this case is that with a NAT instance, you pay a flat rate for use of the hardware, which could be an inexpensive t2.nano, $5/mo.
The NAT Gateway service is a high powered solution with nearly infinite scaling capacity, and is priced accordingly. A NAT instance is only as good as the hardware you choose to run it on, but I find t2.nano and t2.micro quite adequate for workloads requiring less than 250 Mbit/s of Internet connectivity.
Use the link, above, to learn more.
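To make the "flat rate versus per-GB" difference concrete, here is a rough comparison. All prices are assumptions: $0.045 per hour and $0.045 per GB for the NAT Gateway (typical published figures, region dependent), and the $5/mo flat figure quoted above for a t2.nano. Ordinary data transfer charges apply in both cases and are left out.

```python
HOURS_PER_MONTH = 730  # average month length used for the estimate


def nat_gateway_monthly(gb, hourly_usd=0.045, per_gb_usd=0.045):
    """Assumed NAT Gateway cost: hourly charge plus per-GB data processing."""
    return hourly_usd * HOURS_PER_MONTH + gb * per_gb_usd


def nat_instance_monthly(flat_usd=5.0):
    """Assumed NAT instance cost: flat instance price, no per-GB NAT charge."""
    return flat_usd


for gb in (30, 300, 3000):
    gateway = nat_gateway_monthly(gb)
    instance = nat_instance_monthly()
    print(f"{gb:>5} GB/month  gateway ${gateway:8.2f}  instance ${instance:6.2f}")
```

The gateway's cost grows with traffic while the instance's stays flat, so the gap widens as volume increases. The trade-off is that the instance's throughput and availability are your responsibility.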
Lambda function instances will never be assigned a public IP address, regardless of the type of VPC subnet you place them in. A NAT gateway is the only solution to provide a Lambda function inside a VPC with access to resources that reside outside the VPC (like Kinesis).
If that isn't going to work for you due to cost, you might look into running a Redis server on an EC2 instance with an Elastic IP, which would allow the Lambda function to connect without being inside the VPC. A similar alternative would be to use RedisLabs instead of ElastiCache.
I need to connect to DynamoDB and ElastiCache from AWS Lambda (other than by using a NAT Gateway).
ElastiCache provides essential caching methods and also helps make the Lambda stateful. The concern is that for Lambda to work nicely with DynamoDB it should be set to No VPC.
If we have to use ElastiCache, it and Lambda both have to be in the same VPC. Using both ElastiCache and DynamoDB together is quite a challenge, especially with Lambda, given the VPC challenges. Do you have any suggestions to make this easier?
A Lambda function would have to have VPC access to connect to ElastiCache, and it would have to have access to resources outside the VPC to access DynamoDB so it would require a NAT gateway. There is no way to provide access to both of those services to a single Lambda function without enabling VPC access and setting up a NAT gateway.
If you just need a Redis server and aren't required specifically to use ElastiCache, then you could use a RedisLabs instance which wouldn't require you to enable VPC access on your Lambda function.
There is now a relatively easy solution for DynamoDB access from a VPC: VPC Endpoints.
"Previously, if you wanted your EC2 (elroy: or lambda) instances in your VPC to be able to access DynamoDB, you had two options. You could use an Internet Gateway (with a NAT Gateway or assigning your instances public IPs) or you could route all of your traffic to your local infrastructure via VPN or AWS Direct Connect and then back to DynamoDB."
"A VPC endpoint for DynamoDB enables Amazon EC2 instances in your VPC to use their private IP addresses to access DynamoDB with no exposure to the public Internet...Your EC2 instances do not require public IP addresses, and you do not need an Internet gateway, a NAT device, or a virtual private gateway in your VPC. You use endpoint policies to control access to DynamoDB. Traffic between your VPC and the AWS service does not leave the Amazon network. "
The above quotes come from the links below. Note that the references to "EC2 instances" apply to Lambda contexts as well.
See https://aws.amazon.com/blogs/aws/new-vpc-endpoints-for-dynamodb/
and
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/vpc-endpoints-dynamodb.html
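The quote mentions endpoint policies as the way to control access to DynamoDB. As an illustration only, here is a sketch that builds such a policy document restricted to one table. The table ARN, account ID and action list are made-up placeholders. The resulting JSON string is the kind of value accepted by the `PolicyDocument` parameter of EC2 `create_vpc_endpoint` or `modify_vpc_endpoint`.

```python
import json


def dynamodb_endpoint_policy(table_arn,
                             actions=("dynamodb:GetItem",
                                      "dynamodb:PutItem",
                                      "dynamodb:Query")):
    """Build an endpoint policy allowing only the given actions on one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": list(actions),
                "Resource": table_arn,
            }
        ],
    }


# Placeholder ARN, for illustration only.
policy = dynamodb_endpoint_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable"
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

An endpoint policy only limits what can pass through the endpoint. The Lambda function's IAM role still needs its own DynamoDB permissions.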