How does an AWS Lambda function scale inside a VPC subnet? - amazon-web-services

I understand the AWS Lambda is a serverless concept wherein a piece of code can be triggered on some event.
I want to understand how does the Lambda handle scaling?
For eg. if my Lambda function sits inside a VPC subnet as it wants to access VPC resources, and that the subnet has a CIDR of 192.168.1.0/24, which would result in 251 available IPs after subtracting the AWS reserved 5 IPs
Would that mean if my AWS Lambda function gets 252 invocations at the exact same time,Only 251 of the requests would be served and 1 would either timeout or will get executed once one of the 252 functions completes execution?
Does the Subnet size matter for the AWS Lambda scaling?
I am following this reference doc which mentions concurrent execution limits per region,
Can I assume that irrespective of whether an AWS Lambda function is No VPC or if it's inside a VPC subnet, it will scale as per mentioned limits in the doc?

Vladyslav's answer is still technically correct (Subnet size does matter), but things have changed significantly since it was written and subnet size is much less of a consideration. See aws' announcement:
Because the network interfaces are shared across execution environments, typically only a handful of network interfaces are required per function. Every unique security group:subnet combination across functions in your account requires a distinct network interface. If a combination is shared across multiple functions in your account, we reuse the same network interface across functions.
Your function scaling is no longer directly tied to the number of network interfaces and Hyperplane ENIs can scale to support large numbers of concurrent function executions

Yes, you are right. Subnet size definitely does matter, you have to be careful with your CIDR blocks. With that one last invocation (252nd), it depends on the way your lambda is invoked: synchronously (e.g. API Gateway) or asynchronously (e.g. SQS). If it is called synchronously, it'll be just throttled and your API will respond with 429 HTTP status, which stands for "too many requests". If it is asynchronous, it'll be throttled and will be retried within a six hour period window. More detailed description you can find on this page.
Also I recently published a post in my blog, which is related to your question. You may find it useful.

Related

AWS Lambda inside VPC. 504 Gateway Timeout (ENI?)

I have a Serverless .net core web api lambda application deployed on AWS.
I have this sitting inside a VPC as I access ElasticSearch service inside that same VPC.
I have two API microservices that connect to the Elasticsearch service.
After a period of non use (4 hours, 6 hours, 18 hours - I'm not sure exactly but seems random), the function becomes unresponsive and I get a 504 gateway timeout error, "Endpoint cannot be found"
I read somewhere that if "idle" for too long, the ENI is released back into the AWS system and that triggering the Lambda again should start it up.
I can't seem to "wake" up the function by calling it as it keeps timing out with the above error (I have also increased the timeouts from default).
Here's the kicker - If I make any changes to the specific lambda function, and save those changes (this includes something as simple as changing the timeout value) - My API calls (BOTH of them, even though different lambdas) start working again like it has "kicked" back in. Obviously the changes do this, but why?
Obviously I don't want timeouts in a production environment regardless of how much, OR how little the lambda or API call is used.
I need a bulletproof solution to this. Surely it's a config issue of some description but I'm just not sure where to look.
I have altered Route tables, public/private subnets, CIDR blocks, created internet gateways, NAT etc. for the VPC. This all works, but these two lambdas, that require VPC access, keeps falling "asleep" somehow.
The is because of Cold Start of Lambda.
There is a new feature which was release in reInvent 2019, where in there is a provisioned concurrency for lambda (don't get confused with reserved concurrency).
Ensure the provisioned concurrency to minimum 1 (or the amount of requests to be served in parallel) to have lambda warm always and serve requests
Ref: https://aws.amazon.com/blogs/aws/new-provisioned-concurrency-for-lambda-functions/
To get more context, Lambda in VPC uses hyperplane ENI and functions in the same account that share the same security group:subnet pairing use the same network interfaces.
If Lambda functions in an account go idle for sometime (typically no usage for 40 mins across all functions using that ENI, as I got this time info from AWS support), the service will reclaim the unused Hyperplane resources and so very infrequently invoked functions may still see longer cold-start times.
Ref: https://aws.amazon.com/blogs/compute/announcing-improved-vpc-networking-for-aws-lambda-functions/

Lambda function timing out

I have a Lambda function (initiated by API Gateway) that accesses an Aurora Cluster in private subnets (no errors in CloudWatch, just that the function timed out).
If I invoke the function several times, at around 5 concurrent executions it is timing out and returning 502 errors from API Gateway.
I know there can be issues with cold start times when accessing a VPC, due to ENI's being created, and I need to make sure to have enough IP's for the ENI's to be created. But using the formula from AWS (my function has 512MB, and the private subnets has a /24 range, meaning 254 usable IP's):
IP's for ENI's = Projected peak concurrent executions * (Memory in GB / 3GB)
254 = Projected peak concurrent executions * (512/3000)
Projected peak concurrent executions = 1400+
Way higher than ~5. Have I missed something? Do I somehow need to manually create the ENI's, or is making sure I have enough available IP's enough?
Any guidance would be appreciated.

Lambda was throttled while using the Lambda Execution Role to set up for the Lambda function

I use API GATEWAY with Lambda for my application.
In one of my functions I suddenly get 502 with this error:
{ "Message": "Lambda was throttled while using the Lambda Execution
Role to set up for the Lambda function. ", "Type": "User" }
I checked online and did not find anything related to that error.
I checked my ConcurrentExecutions and did not over the limit.
My lambdas use VPC in every lambda, and Maybe there is a connection?
Will be glad for any help.
Thank you
To enable your Lambda function to access resources inside your private VPC, you must provide additional VPC-specific configuration information that includes VPC subnet IDs and security group IDs.
AWS Lambda uses this information to set up elastic network interfaces (ENIs) that enable your function to connect securely to other resources within your private VPC.
If your VPC does not have sufficient ENIs or subnet IPs, your Lambda function will not scale as requests increase, and you will see an increase in invocation errors with EC2 error types like EC2ThrottledException.
one of the option to avoid this exception is , You can Specifying Multiple subnets in each of the Availability Zones, your Lambda function can run in another Availability Zone if one goes down or runs out of IP addresses.
This error means your request got throttled by EC2 rate limit while connecting to your VPC.
As per lambda documentation
"Because Lambda depends on Amazon EC2 to provide Elastic Network Interfaces for VPC-enabled Lambda functions, these functions are also subject to Amazon EC2's rate limits as they scale. If your Amazon EC2 rate limits prevent VPC-enabled functions from adding 500 concurrent invocations per minute, please request a limit increase by following the instructions on the AWS Lambda Limits page.
Beyond this rate (i.e. for applications taking advantage of the full Immediate concurrency increase), your application should handle Amazon EC2 throttling (502 EC2ThrottledException) through client-side retry and backoff. For more details, see Error Retries and Exponential Backoff in AWS."
Ref : https://docs.aws.amazon.com/lambda/latest/dg/scaling.html#scaling-behavior

Amazon Web Service Lambda Low Invocations from SQS Trigger

I have an AWS Lambda Function setup with a trigger from a SQS queue. Current the queue has about 1.3m messages available. According to CloudWatch the Lambda function has only ever reached 431 invocations in a given minute. I have read that Lambda supports 1000 concurrent functions running at a time, so I'm not sure why it would be maxing out at 431 in a given minute. As well it looks like my function only runs for about 5.55s or so on average, so each one of those 1000 available concurrent slots should be turning over multiple times per minute, therefor giving a much higher rate of invocations.
How can I figure out what is going on here and get my Lambda function to process through that SQS queue in a more timely manner?
The 1000 concurrent connection limit you mention assumes that you have provided enough capacity.
Take a look at this, particularly the last bit.
https://docs.aws.amazon.com/lambda/latest/dg/vpc.html
If your Lambda function accesses a VPC, you must make sure that your
VPC has sufficient ENI capacity to support the scale requirements of
your Lambda function. You can use the following formula to
approximately determine the ENI capacity.
Projected peak concurrent executions * (Memory in GB / 3GB)
Where:
Projected peak concurrent execution – Use the information in Managing Concurrency to determine this value.
Memory – The amount of memory you configured for your Lambda function.
The subnets you specify should have sufficient available IP addresses
to match the number of ENIs.
We also recommend that you specify at least one subnet in each
Availability Zone in your Lambda function configuration. By specifying
subnets in each of the Availability Zones, your Lambda function can
run in another Availability Zone if one goes down or runs out of IP
addresses.
Also read this article which points out many things that might be affecting you: https://read.iopipe.com/5-things-to-know-about-lambda-the-hidden-concerns-of-network-resources-6f863888f656
As a last note, make sure your SQS Lambda trigger has a batchSize of 10 (max available).

API Gateway+Lambda+VPC timeout issue

Good morning, Could you please help us with next problem:
I have an API Gateway + Java Lambda Handler. this Lambda uses httpconnection to get some Internet REST API.
when we use this Lambda without VPC it works fine. but when we are using VPC with configured internet access - sometimes Lambda fails with timeout errors. it fails in 20% of all requests (80% requests works fine) with next errors at log.
REPORT RequestId: 16214561-b09a-11e6-a762-7546f12e61bd Duration: 15000.26 ms Billed Duration: 15000 ms Memory Size: 512 MB Max Memory Used: 47 MB
09:57:49
2016-11-22T09:57:49.245Z 16214561-b09a-11e6-a762-7546f12e61bd Task timed out after 15.00 seconds
According to my logs lambda cannot send GET request. I'm not sure where the problem at. Is this Lambda issue, VPC issue or some cofiguration issue.
Also I did try many different REST Api endpoints, so it's definetly not an endpoint issue.
Appreciate any help.
When you place a Lambda function inside your VPC it will not have access to anything outside the VPC. To enable your Lambda function to access resources outside the VPC you have to add a NAT Gateway to your VPC.
The problem is solved.
Lambda VPC configuration had public subnet attached.
Thanks to #Michael-sqlbot
I had pretty much the same issue a few months ago, and here is my solution:
Assuming you set up your Lambda manually, in the Configuration -> Advanced settings you will find the VPC and then choose subnet and security groups.
The Subnet you selected should be in the same subnet with other services the lambda function invokes. In your case, your lambda service uses httpconnection to Internet rest API, that's fine, but you may need DB connection with RDS or triggered by SQS or SNS. So make sure the subnet is correct.
The Security Groups is more important. Again, in your case, you need the access to Internet, so ensure the security group's outbound rules has external connections. Normally, I give all ports and all destination available for simplicity, and of course, you can limit to use port 80 and the API's IP address you need.
Since the executor is "locked" behind a VPC - all internet
communications are blocked.
That results in any http(s) calls to be timed out as they request
packet never gets to the destination.
That is why all actions done by aws-sdk result in a timeout.
Please refer to https://stackoverflow.com/a/39206646
From your log,
Billed Duration: 15000 ms
Memory Size: 512 MB
Max Memory Used: 47 MB
Solution:
It is timeout issue. You need to increase the execution time 15 seconds to 30 seconds or more if necessary.
In some cases, you also need to increase the memory size. It may also make effect. But I think time is the main fact for you, not memory size.
For timing issue and testing issue, you can go through the followings:
Q: How long can an AWS Lambda function execute?
Soluiton: All calls made to AWS Lambda must complete execution within 300 seconds. The default timeout is 3 seconds, but you can set the timeout to any value between 1 and 300 seconds.
To determine why your Lambda function is not working as expected:
You can test your code locally as you would any other Node.js function, or you can test it within the Lambda console using the console's test invoke functionality, or you can use the AWS CLI Invoke command. Each time the code is executed in response to an event, it writes a log entry into the log group associated with a Lambda function, which is /aws/lambda/.
If you see a timeout exceeded error in the log, your timeout setting
exceeds the run time of your function code. This may be because the
timeout is too low, or the code is taking too long to execute.
For solution:
Test your code with different memory settings.
If your code is taking too long to execute, it could be that it does not have enough compute resources to execute its logic. Try increasing the memory allocated to your function and testing the code again, using the Lambda console's test invoke functionality. You can see the memory used, code execution time, and memory allocated in the function log entries. Changing the memory setting can change how you are charged for duration. For information about pricing, see AWS Lambda.
Resource Link:
Troubleshooting and Monitoring AWS Lambda Functions with Amazon
CloudWatch
For testing, a full code example is given here: http://qiita.com/c9katayama/items/b9a30cdfaaa91cba23ad