I have an architectural question about the design and organisation of AWS Serverless resources using CloudFormation.
Currently I have multiple stacks organised by domain-specific purpose, and this works well. Most of the stacks that contain Lambdas have to be transformed using Serverless (I use SAM for all of them). Async communication is handled by a combination of EventBridge and S3+Events, and it works well. The issue I have is with synchronous communication.
I don't want to reference Lambdas from other stacks by their exported names and invoke them directly, as this causes issues with updates and versions (if an output export is referenced in another stack, I cannot change the resource unless the reference is removed first, which is not ideal for CI/CD or for keeping concerns separate).
I have been using API Gateway as an abstraction, but that feels rather heavy-handed. It is nice to have that separation, but having to set up a domain and DNS resolution, plus having the API GW exposed externally, doesn't feel right. Maybe there is a better way to configure API GW to be internal only. If you have had success with this, could you please point me in the right direction?
Is there a better way to abstract invocation of Lambda functions from different stacks in a synchronous way? (Common template patterns for CF or something along those lines?)
I see two questions:
Alternatives for synchronous Lambda functions with API Gateway.
API Gateway is one easy way, with IAM authentication to make it secure. An HTTP API is a much simpler and cheaper option than a REST API. With a REST API we can choose a Private endpoint rather than Regional/Edge, which is not exposed outside the VPC, to make it even more secure.
We can have a private ALB with Lambda functions as targets, for a simple use case that doesn't need any API Gateway features (this will cost some amount every month).
We can always call Lambda functions directly with the AWS SDK invoke call.
Alternatives to share resources between templates.
Exporting and importing will be a bit of a problem if we need to delete and recreate the resource; it shouldn't be a problem if we are just updating it, though.
We can always store the ARN of the Lambda function in an SSM parameter in the source template and resolve the ARN from that SSM parameter in the destination template. This is completely decoupled, and it is better than simply hard-coding the ARN.
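To make the SSM idea concrete, here is a minimal sketch that combines it with a direct SDK invoke. It assumes boto3-style clients are passed in; the parameter name and payload used in the usage note are hypothetical and not from the original answer.

```python
import json


def invoke_via_ssm(ssm_client, lambda_client, parameter_name, payload):
    """Look up a function ARN in SSM Parameter Store, then invoke it synchronously."""
    # Assumption: the producing stack has written the function ARN to this parameter.
    arn = ssm_client.get_parameter(Name=parameter_name)["Parameter"]["Value"]
    response = lambda_client.invoke(
        FunctionName=arn,
        InvocationType="RequestResponse",  # synchronous: wait for the result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # The "Payload" in the response is a stream; read() returns the raw bytes.
    return json.loads(response["Payload"].read())
```

With boto3 the two clients would be boto3.client("ssm") and boto3.client("lambda"), and a call might look like invoke_via_ssm(ssm, lam, "/orders/lambda-arn", {"orderId": "123"}). Because the consumer only knows the parameter name, the producing stack can replace the function without the consumer stack blocking the update.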
I used to think AWS Lambda was best suited to background tasks that did not require immediate results. However, more and more I have seen AWS Lambda being used to handle real-time requests as well, for example fetching users from a DB in an HTTP GET.
API Gateway -> AWS Lambda -> Results
Is this a standard approach, or is this an improper use of Lambda?
Using API Gateway as a front end for Lambda function invocation is the standard way of executing Lambda function code on the fly. If you are concerned about cold starts on the function and want to minimize latency, you can consider Provisioned Concurrency to keep 'n' active containers at a small cost.
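As an illustration of the API Gateway -> Lambda -> Results flow, here is a minimal handler sketch. It assumes a REST API proxy integration (where the event carries "httpMethod"), and the USERS list is a hypothetical stand-in for the database query.

```python
import json

# Hypothetical in-memory stand-in for the database mentioned in the question.
USERS = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]


def list_users_handler(event, context):
    """Handle an API Gateway proxy request and return a proxy-style response."""
    if event.get("httpMethod") != "GET":
        return {"statusCode": 405, "body": json.dumps({"error": "method not allowed"})}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(USERS),  # proxy integrations expect the body as a string
    }
```

The caller gets the JSON list back in the same HTTP request, which is what makes this a synchronous use of Lambda.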
Is there any way to connect two different AWS Lambda layers?
Usually, we could invoke one lambda function by another lambda function. Is that possible in the lambda layer as well?
Lambda layers are used for dependencies only and do not include application code that can be directly invoked. They let you create one set of dependencies and share it across Lambda functions, which reduces the chance of dependency versioning issues and reduces the overall amount of Lambda code storage used by your account in the region. Per this link, AWS provides 75GB of storage for Lambda layers and function code per region.
https://docs.aws.amazon.com/lambda/latest/dg/limits.html
You can attach more than one layer to a Lambda function. The layers are applied in the order listed until all of them have been added. This can be done using the web console: there is a "Layers" button in the center of the console. Select it, then select a layer you have created and the version of the layer code.
To learn how to create a Lambda layer for Python, or to see an example of Lambda layers in use, please see these step-by-step video instructions: https://geektopia.tech/post.php?blogpost=Create_Lambda_Layer_Python
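For a rough idea of what a Python layer contains, here is a small sketch that builds a layer archive in memory. The module name shared_utils and its contents are hypothetical; the one fixed convention it relies on is that importable code sits under a top-level python/ directory in the zip.

```python
import io
import zipfile


def build_layer_zip(modules):
    """Return the bytes of a layer archive; `modules` maps file names to source text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for filename, source in modules.items():
            # Layers are unpacked under /opt, and /opt/python is on the import
            # path, so Python code goes inside a top-level "python/" directory.
            archive.writestr(f"python/{filename}", source)
    return buffer.getvalue()


# Hypothetical shared module that several functions could import.
layer_bytes = build_layer_zip(
    {"shared_utils.py": "def greet(name):\n    return f'hello {name}'\n"}
)
```

Once such an archive is published as a layer and attached, function code can simply import shared_utils. The layer itself is never invoked; it only supplies files to the functions that use it.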
I'm looking for a specific piece of documentation about the scaling of AWS Lambda.
How I think the scaling works:
Scenario: high traffic
AWS spins up multiple instances of the same Lambda Function
AWS distributes the events (probably evenly) among the instances
So what am I looking for specifically?
Is there a document where AWS states how Lambda works internally, or any information that concerns the process I described above? (I need something to quote.)
Thank you.
Officially, none of the implementation details of how AWS Lambda operates should impact your usage of the service. All you need to know is that something triggers a Lambda function, it runs and exits.
There is a limit on the number of simultaneous functions that can run (but you can ask for an increase in this limit). There is no guarantee that the functions run in a specific order.
The reality, however, is that Lambda functions are deployed as containers and those containers might be reused. For example, if you have a function that runs once per second for 200ms, it is quite likely that the container will be reused. The benefit of this is that there is no initialization time for the container if it is reused. This is particularly beneficial for Java functions that require creation of a JVM.
It also means that your function should assume that the environment will be reused: it should clean up temporary files and reset global variables.
For more details, see: Understanding Container Reuse in AWS Lambda | AWS Compute Blog
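The reuse behaviour described above can be sketched as follows. This is only an illustration: the counter and the scratch file are hypothetical, and whether an environment is actually reused is up to the Lambda service.

```python
import os
import tempfile

# Module-level code runs once per execution environment (the cold start);
# anything created here survives across invocations while it is reused.
INVOCATION_COUNT = 0


def counting_handler(event, context):
    global INVOCATION_COUNT
    INVOCATION_COUNT += 1

    # Scratch files should be removed before returning, since a reused
    # environment would otherwise accumulate them.
    fd, scratch_path = tempfile.mkstemp(suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as scratch:
            scratch.write(str(event))
    finally:
        os.remove(scratch_path)

    return {"invocations_in_this_environment": INVOCATION_COUNT}
```

Calling the handler twice in the same process returns 1 and then 2 (assuming a fresh module), which mirrors what happens when a warm container serves consecutive events.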
I'm serving static JS files from my S3 bucket over CloudFront and I want to monitor whoever accesses them. I don't want this to be done through CloudWatch and the like; I want to log it on my own.
For every request to the CloudFront I'd like to trigger a lambda function that inserts data about the request to my MySQL RDS instance.
However, CloudFront restricts Viewer Request and Viewer Response triggers too much: a 1-second timeout (which is too short to connect to MySQL), no VPC configuration for the Lambda (so I can't even access the RDS subnet), and so on.
What is the best way to achieve that? Should I set up an API Gateway, and if so, how would I send a request to it?
The typical method to process static content (or any content) accessed from CloudFront is to enable logging and then process the log files.
To enable CloudFront edge events, which can include processing and changing an event, look into Lambda@Edge.
Lambda@Edge
I would enable logging first and monitor the traffic for a while. When bad actors hit your web site (CloudFront distribution) they will generate massive traffic. This could result in some sizable bills using Lambda@Edge. I would also recommend looking into AWS WAF to help mitigate denial-of-service attacks, which may reduce the amount of Lambda processing.
This seems like a suboptimal strategy, since CloudFront suspends request/response processing while the trigger code is running -- the Lambda code in a Lambda@Edge trigger has to finish executing before processing of the request or response continues, hence the short timeouts.
CloudFront provides logs that are dropped multiple times per hour (depending on the traffic load) into a bucket you select, which you can capture from an S3 event notification, parse, and insert into your database.
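As a sketch of the parsing step, assuming the standard CloudFront log format (gzip-compressed, with a "#Fields:" header line followed by tab-separated rows). Fetching the object from S3 and inserting the rows into MySQL are left out, and the function name is hypothetical.

```python
import gzip


def parse_cloudfront_log(raw_bytes):
    """Parse a gzip-compressed CloudFront standard log file into a list of dicts."""
    fields = []
    rows = []
    for line in gzip.decompress(raw_bytes).decode("utf-8").splitlines():
        if line.startswith("#Fields:"):
            # The header names the columns, separated by spaces.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            # Data lines are tab-separated, in the same order as the header.
            rows.append(dict(zip(fields, line.split("\t"))))
    return rows
```

In an S3-triggered function, raw_bytes would be the body of the newly delivered log object, and each resulting dict (keys such as c-ip, cs-uri-stem, sc-status) maps onto one row to insert.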
However...
If you really need real-time capture, your best bet might be to create a second Lambda function, inside your VPC, that accepts the data structures provided to the Lambda@Edge trigger.
Then, inside the code for the viewer request or viewer response trigger, all you need to do is use the built-in AWS SDK to invoke your second Lambda function asynchronously, passing the event to it.
That way, the logging task is handed off, you don't wait for a response, and the CloudFront processing can continue.
I would suggest that if you really want to take this route, this will be the best alternative. One Lambda function can easily invoke a second one, even if the second function is not in the same account, region, or VPC, because the invocation is done by communicating with the Lambda service's endpoint API.
But, there's still room for some optimization, because you have to take another aspect of Lambda@Edge into account, and it's indirectly related to this:
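A minimal sketch of that hand-off, written in Python rather than Node.js. It assumes a boto3-style Lambda client created for the second function's region; the function names and the ARN passed in are hypothetical.

```python
import json


def forward_to_logger(lambda_client, function_arn, event):
    """Hand the CloudFront event to a second function without waiting for it to run."""
    response = lambda_client.invoke(
        FunctionName=function_arn,
        InvocationType="Event",  # asynchronous: Lambda queues the event and returns
        Payload=json.dumps(event).encode("utf-8"),
    )
    # An accepted asynchronous invocation is reported with HTTP status 202.
    return response["StatusCode"] == 202


def make_viewer_request_handler(lambda_client, function_arn):
    """Build a viewer request handler that logs via the second function."""
    def handler(event, context):
        forward_to_logger(lambda_client, function_arn, event)
        # Returning the request unchanged lets CloudFront continue processing.
        return event["Records"][0]["cf"]["request"]
    return handler
```

The trigger still pays for one API round trip to the Lambda service, but it does not wait for the database write itself.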
no VPC configuration to the lambda
There's an important reason for this. Your Lambda@Edge trigger code is run in the region closest to the edge location that is handling traffic for each specific viewer. Your Lambda@Edge function is provisioned in us-east-1, but it's then replicated to all the regions, ready to run if CloudFront needs it.
So, when you are calling that 2nd Lambda function mentioned above, you'll actually be reaching out to the Lambda API in the 2nd function's region -- from whichever region is handling the Lambda@Edge trigger for this particular request.
This means the delay will be more, the further apart the two regions are.
Thus your truly optimal solution (for performance purposes) is slightly more complex: instead of the L@E function invoking the 2nd Lambda function asynchronously, by making a request to the Lambda API... you can create one SNS topic in each region, and subscribe the 2nd Lambda function to each of them. (SNS can invoke Lambda functions across regional boundaries.) Then, your Lambda@Edge trigger code simply publishes a message to the SNS topic in its own region, which will immediately return a response and asynchronously invoke the remote Lambda function (the 2nd function, which is in your VPC in one specific region). Within your Lambda@Edge code, the environment variable process.env.AWS_REGION gives you the region where you are currently running, so you can use this to identify how to send the message to the correct SNS topic, with minimal latency. (When testing, this is always us-east-1.)
Yes, it's a bit convoluted, but it seems like the way to accomplish what you are trying to do without imposing substantial latency on request processing -- Lambda@Edge hands off the information as quickly as possible to another service that will assume responsibility for actually generating the log message in the database.
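The per-region SNS idea could look roughly like this in Python, where os.environ["AWS_REGION"] plays the role of process.env.AWS_REGION. The account ID and topic name are hypothetical, and the sketch assumes a topic with the same name exists in every region where the trigger may run.

```python
import json
import os


def regional_topic_arn(account_id, topic_name, region=None):
    """Build the ARN of the same-named topic in the region we are running in."""
    region = region or os.environ.get("AWS_REGION", "us-east-1")
    return f"arn:aws:sns:{region}:{account_id}:{topic_name}"


def publish_access_record(sns_client, account_id, topic_name, event):
    """Publish the event to the local region's topic; SNS delivers it asynchronously."""
    return sns_client.publish(
        TopicArn=regional_topic_arn(account_id, topic_name),
        Message=json.dumps(event),
    )
```

The SNS client itself should also be created for the current region, for example boto3.client("sns", region_name=os.environ["AWS_REGION"]), so that the publish call stays within the region handling the request.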
Lambda and relational databases pose a serious challenge around concurrency, connections and connection pooling. See this Lambda databases guide for more information.
I recommend using Lambda@Edge to talk to a service built for higher concurrency as the first step of recording access. For example you could have your Lambda@Edge function write access records to SQS, and then have a background worker read from SQS to RDS.
Here's an example of Lambda@Edge interacting with STS to read some config. It could easily be refactored to write to SQS.
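The worker side of the SQS suggestion might look like the sketch below, written as an SQS-triggered Lambda handler. The message fields (clientIp, uri) and the insert_rows callable are assumptions for illustration; a real worker would pass in something that performs a batched INSERT into RDS.

```python
import json


def sqs_worker_handler(event, context, insert_rows=print):
    """Consume a batch of SQS messages and write them to the database in one call."""
    rows = []
    for record in event["Records"]:
        # Each record's body is the string that the edge function sent to the queue.
        access = json.loads(record["body"])
        rows.append((access["clientIp"], access["uri"]))
    # One batched insert per invocation keeps the number of database
    # connections and round trips low.
    insert_rows(rows)
    return len(rows)
```

Because the queue absorbs bursts and the worker processes messages in batches, the database sees far fewer concurrent connections than it would with one connection per viewer request.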