AWS Lambda - use Kinesis under VPC - amazon-web-services

I have an AWS Lambda function that makes use of an ElastiCache Redis cluster.
Since the Redis cluster is "locked" in a VPC, the Lambda function must reside in that VPC too.
For some reason, if the Lambda is allocated an IP of a public subnet, which has an Internet gateway - it still cannot make connections to the outside (the internet), thus making it impossible to use Kinesis.
For that, they suggest using a NAT gateway which lets the Lambda connect to the outside.
Basically, this works for me - but my issue is the money.
This solution is expensive for large amount of data transfers and I'm looking for some way to make it cheaper.
For a small POC that I've made, I paid ~$10.
This is too much for ~30GB as my production pipeline will run hundreds of gigabytes / month.
How do you suggest I let the Lambda function connect the outside (specifically Kinesis) without using a NAT gateway?
Thank you!

without using a NAT gateway?
Use a NAT instance.
You have to have one of these two things for anything in VPC to access the Internet from a private IP address.
NAT instances were exactly how this was always done in VPC, until the relatively new NAT Gateway service was rolled out.
You can also use a NAT gateway, which is a managed NAT service that provides better availability, higher bandwidth, and requires less administrative effort. For common use cases, we recommend that you use a NAT gateway rather than a NAT instance.
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_NAT_Instance.html
Sure, it's easier, but it costs more. A lot more. The most significant difference in this case is that with a NAT instance, you pay a flat rate for use of the hardware, which could be an inexpensive t2.nano, $5/mo.
The NAT Gateway service is a high powered solution with nearly infinite scaling capacity, and is priced accordingly. A NAT instance is only as good as the hardware you choose to run it on, but I find t2.nano and t2.micro quite adequate for workloads requiring less than 250 Mbit/s of Internet connectivity.
Use the link, above, to learn more.

Lambda function instances will never be assigned a public IP address, regardless of the type of VPC subnet you place them in. A NAT gateway is the only solution to provide a Lambda function inside a VPC with access to resources that reside outside the VPC (like Kinesis).
If that isn't going to work for you due to cost, you might look into running a Redis server on an EC2 instance with an Elastic IP, which would allow the Lambda function to connect without being inside the VPC. A similar alternative would be to use RedisLabs instead of ElastiCache.

Related

How to reduce NAT traffic costs - Lambda and SQS

Currently i'm working on app that collects data and processes them. All data collection is handled by AWS Lambda.First lambda get data from APIs, data is processed and sent to SQS. Everything works fine, but unfortunately NAT costs are higher than expected. Lambda downloads around 10TB monthly (I'm planning to increase that number), so i'm charged about 500$ monthly just for NAT traffic. Other services don't use NAT. Is there any NAT alternative or way to reduce costs? What i thought about is to replace SQS with ElasticCache in private subnet, but that's 'just' 3 TB of traffic less.
If an AWS Lambda function is connected to a VPC, it can communicate with resources in the VPC. For example, it might need to connect to an Amazon RDS database in the VPC.
To access the Internet, the Lambda function would need to be connected to a private subnet and then communicate with the Internet via a NAT Gateway or NAT Instance. (A NAT Instance is cheaper than a NAT Gateway, but is less reliable.)
However, if the Lambda function does not require access to resources inside the VPC, then do not associate a VPC with the Lambda function. This will provide direct access to the Internet.

How to choose Elastic IP when my aws lambda function execute

I want to select specific Elastic IP my own when my lambda function executed.
my service has to respond to several situations, and by user's attributes.
Could I write code in a lambda function, that can choose specific my own elastic IP?
I had searched for this. but old information says it cannot do.
but recently I heard about it is possible by using Network Load Balancer or Application Load Balancer.
But I don't know how to use this for the problem.
No. You cannot associate an Elastic IP (EIP) address with an AWS Lambda function.
Well, actually you can, but I wouldn't recommend it. When a Lambda function is associated with a VPC, it connects via an Elastic Network Interface (ENI). It is possible to attach an EIP to an ENI. This also grants access to the Internet if it is attached to a public subnet.
So why avoid it? Because Lambda might create additional ENIs, especially if the Lambda function is frequently invoked and run in parallel. This means it will not have a consistent ENI.
An alternative method is:
Attach the AWS Lambda function to a private subnet
Put a NAT Gateway in a public subnet
Associate an Elastic IP address with the NAT Gateway
All traffic from the Lambda function to the Internet will then come from the NAT Gateway's EIP (however, I don't think you can change that EIP)
Looking at #John Rotenstein's reply: for small systems, with limited calls to the same lambda adding an EIP to the ENI for a lambda could work - if you put a queue in front of the lambda to handle the requests and limit the concurrency of the lambda to 1. That's cheaper than a NAT Gateway (saves around $30) per month. For larger systems, this may not be an issue and you may need the concurrency to be more than one - in that case the NAT gateway is the only way out.

Static IP for outbound API calls

A new api service we use requires that we give them a list of all the IP addresses our calls will be coming from; if we make an api call from any other IP address, the call will fail.
This question has been asked before here, but I'm wondering if in 2019 there is any simpler/easier/lower cost solution.
Our Setup
Elastic Beanstalk, which currently scales to anywhere from 5 - 50 ec2 instances for our web application based on traffic
An Application Load Balancer
Also have a worker tier, which would be available for use if that might be helpful
Typically these api calls would be coming from any of our web tier ec2 instances, as the calls will be based on a user interaction. We can of course set up something different, e.g. have the worker tier make the calls
Solutions I've Found
Give each ec2 instance an elastic (static) ip address. This is not a great solution for us, because as we hopefully continue to scale the number of ip addresses needed will continue to grow {ref}
Set up two NAT instances (one not being sufficient as it would be a single point of failure). I'm hoping there is something simpler and lower cost than this option. {ref} {ref}
Create new ec2 instances and put them behind a Network Load Balancer. Again, complex and costly. {ref}
Are there any new, easier, less costly solutions? I have never used AWS Lambda before; maybe it is be possible to run Lambda functions all from one IP address? I don't have many ideas beyond that at this point. Thanks for your time.
A NAT is the best solution, and shouldn't cost you much more than a web-server.
The simplest way to use a NAT is the NAT Gateway. Pricing depends on region, but it's around $0.05/hour, which is a little more than the price of a t3.medium EC2 instance. You're also charged a per-GB rate for data, which can add up quickly. On the positive side, Amazon manages the infrastructure for you, including patches and high-availability.
A NAT Instance is an EC2 instance running a specially-configured AMI. You could probably get away with running this on a t3.micro instance, at $0.01 per hour, which is probably much less than any of your webservers. You will be responsible for applying patches and waking up in the middle of the night if anything goes wrong.
You can probably get away with a single NAT, of either type. You will pay for cross-AZ traffic by doing this ($0.01/GB), so it will be false economy if you move a lot of data across the NAT. It's a tossup on whether you'll get higher availability from two NATs, because you can only reference one at a time in your routing tables. So if one goes down you'll have to update the routing tables to point at the other, which will probably take as much time as bringing up a new instance.
You can't use a Lambda, because it needs to have a permanent IP address assignment and you can't control that with Lambda. You could write your own proxy server, running on EC2, but the costs for that are the same as a NAT Instance.
Here is prescriptive guidance from AWS: https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/generate-a-static-outbound-ip-address-using-a-lambda-function-amazon-vpc-and-a-serverless-architecture.html
"This pattern describes how to generate a static outbound IP address in the Amazon Web Services (AWS) Cloud by using a serverless architecture..."
Essentially, you have an AWS Lambda function that uses an Elastic IP address as the outbound IP address. In the guidance, you will create "a Lambda function and a virtual private cloud (VPC) that routes outbound traffic through an internet gateway with a static IP address. To use the static IP address, you attach the Lambda function to the VPC and its subnets. "

AWS Lambda potential alternatives to connect to RDS in VPC

I am using a lambda function in a VPC to connect to an RDS instance in the same VPC. I am considering removing the lambda from the VPC to massively reduce the cold-start time but I want to keep my RDS instance in the VPC.
Can anyone foresee major problems with making the lambda function use an SSH tunnel to connect to a bastion instance within the VPC and subsequently to the RDS instance? Or something similar with a VPN?
There will obviously be some over-head as the traffic has an extra 'jump' so to speak, but would it be significant enough to make this approach non-feasible? Or is the only current approach to keep the Lambda in the same VPC and try to keep and few invocations running?
I also pay for a NAT gateway so my Lambda in a VPC can access the internet. If I can get it out of the VPC by using an SSH tunnel to connect to the RDS instance it will also simplify my architecture here & reduce my operating costs.
Cold starts because of Lambda's in VPC are a big issue, especially when you want to use a relational database. Luckily, AWS has acknowledged this issue and there is hope on the horizon;
Aurora Serverless now supports the Data API that allows to run SQL queries using the AWS SDK over https. This is released on Nov 20 ('18) and is in beta and only in us-east-1, but it's a start.
During re:Invent '18 an improvement on the VPC-cold-start issue was announced (but no release date yet) in which they basically create an ENI for a group of Lambda's and have that ENI ready even if there are no Lambda's warm.

AWS lambda call dynamo db through private network or bypass internet traffic

I have a lambda function which runs every 15 minutes and saves some data in DynamoDB.
Now I want to secure the DynamoDB call made by my lambda so that the request does not go via the Internet, rather through Amazon internal network. There is no EC2 instance involved here though.
I have seen a few recommendations for using PrivateLink which binds the Dynamo to VPC endpoints so that calls made from EC2 instances always go via internal network bypassing Internet.
I was wondering such a configuration is possible for lamda calling DynamoDB since lamda itself does not run in any EC2 instance and is rather serverless?
The first thing I would say is that all of your traffic between Lambda and DynamoDB is signed and encrypted, so that's typically sufficient.
There are use cases, most typically compliance reasons, when this is not sufficient. In that case you can deploy the Lambda function into a VPC of your making and configure the VPC with a private VPC endpoint for DynamoDB. Typically, the VPC would be configured without an internet gateway or NAT so that it has no egress route to the public internet. Be aware that your Lambda function startup latency will be higher than usual, because each Lambda function environment needs to attach an ENI for access to the private endpoint.
See Configuring a Lambda Function to Access Resources in an Amazon VPC.
If you don't need to access resources in a VPC, AWS recommends not to run AWS Lambda functions in a VPC. From AWS Lambda Best Practices:
Don't put your Lambda function in a VPC unless you have to. There is no benefit outside of using this to access resources you cannot expose publicly, like a private Amazon Relational Database instance. Services like Amazon Elasticsearch Service can be secured over IAM with access policies, so exposing the endpoint publicly is safe and wouldn't require you to run your function in the VPC to secure it.
Running Lambda functions in VPC adds additionally complexity, which can negatively effect scalability and performance. Each Lambda function in a VPC needs an Elastic Network Interface (ENI). Provisioning ENI's is slow and the amount of ENI's you can have is limited, so when you scale up you can run into a shortage of ENI's, preventing your Lambda functions to scale up further.
This is one way to do it.
Step 1) Deploy your lambda inside VPC.
Step 2) Create VPC Endpoint to the DynamoDB.
This should help: https://aws.amazon.com/blogs/aws/new-vpc-endpoints-for-dynamodb/