Using aws lambda to handle real time requests

Using aws lambda to handle real time requests - amazon-web-services

I used to think aws lambda was best suited to handle background tasks which did not require immediate results. However more and more I have seen aws lambda being used to handle real-time requests as well, for example fetch users from a db in a http get.
API Gateway -> AWS Lambda -> Results
Is this a standard approach or is this the improper use of lambda ?

Use of API Gateway to provide a front end for the Lambda function invocation is the standard way of executing Lambda function code on the fly. If you are concerned about the cold starts on the function; and want to minize the latency, you can consider Provisioned Concurrency to keep 'n' active containers at a small cost.

Related

Using AWS API in order to invoke Lambda functions Asynchronously

I have been researching AWS Documentation on how to invoke lambda functions, and I've come across different ways to do that. Mainly, Lambda invocation is done by calling Invoke() function which can be used to invoke lambda functions synchronously or asynchronously.
Currently I am invoking my Lambda functions via HTTP Request (as REST API), but, HTTP Request times out after 30 seconds, while asynchronous calls as far as I know times out after 15min.
What are the advantages, besides time that I have already mentioned, of asynchronous lambda invocation compared to invoking lambda with HTTP Request. Also, what are best (recommended) ways to invoke lambdas in production? On AWS docs (SDK for Go - https://docs.aws.amazon.com/sdk-for-go/api/service/lambda/#InvokeAsyncInput) I see that InvokeAsyncInput and InvokeAsyncOutput have been depricated. So I am wondering how async implementation would actually look like.

Lambda really is about event-driven-computing. This means Lambda always gets triggered in response to an event. This event can originate from a wide range of AWS Services as well as the AWS CLI and SDK.
All of these events invoke the Lambda function and pass some kind of information in the form of an event and context object. How this event looks like depends on the service that triggered lambda. You can find more information about the context in this documentation.
There is no real "best" way to invoke Lambda - this mostly depends on your use case - if you're building a webservice, let API Gateway invoke Lambda for you. If you want to process new files on S3 - let S3 trigger Lambda. If you're just testing the Lambda function you can invoke it via the CLI. If you have custom software that needs to trigger a Lambda function you can use the SDK. If you want to run Lambda on a schedule, configure CloudWatch events...
Please provide more information about your use case if you require a more detailed evaluation of the available options - right now this is very broad.

Is AWS Lambda good for real-time API Rest?

I'm learning about AWS Lambda and I'm worried about synchronized real-time requests.
The fact the lambda has a "cold start" it doesn't sounds good for handling GET petitions.
Imagine a user is using the application and do a GET HTTP Request to get a Product or a list of Products, if the lambda is sleeping, then it will take 10 seconds to respond, I don't see this as an acceptable response time.
Is it good or bad practice to use AWS Lambda for classic (sync responses) API Rest?

Like most things, I think you should measure before deciding. A lot of AWS customers use Lambda as the back-end for their webapps quite successfully.
There's a lot of discussion out there on Lambda latency, for example:
2017-04 comparing Lambda performance using Node.js, Java, C# or Python
2018-03 Lambda call latency
2019-09 improved VPC networking for AWS Lambda
2019-10 you're thinking about cold starts all wrong
In December 2019, AWS Lambda introduced Provisioned Concurrency, which improves things. See:
2019-12 AWS Lambda announces Provisioned Concurrency
2020-09 AWS Lambda Cold Starts: Solving the Problem
You should measure latency for an environment that's representative of your app and its use.
A few things that are important factors related to request latency:
cold starts => higher latency
request patterns are important factors in cold starts
if you need to deploy in VPC (attachment of ENI => higher cold start latency)
using CloudFront --> API Gateway --> Lambda (more layers => higher latency)
choice of programming language (Java likely highest cold-start latency, Go lowest)
size of Lambda environment (more RAM => more CPU => faster)
Lambda account and concurrency limits
pre-warming strategy
Update 2019-12: see Predictable start-up times with Provisioned Concurrency.
Update 2021-08: see Increasing performance of Java AWS Lambda functions using tiered compilation.

As an AWS Lambda + API Gateway user (with Serverless Framework) I had to deal with this too.
The problem I faced:
Few requests per day per lambda (not enough to keep lambdas warm)
Time critical application (the user is on the phone, waiting for text-to-speech to answer)
How I worked around that:
The idea was to find a way to call the critical lambdas often enough that they don't get cold.
If you use the Serverless Framework, you can use the serverless-plugin-warmup plugin that does exactly that.
If not, you can copy it's behavior by creating a worker that will invoke the lambdas every few minutes to keep them warm. To do this, create a lambda that will invoke your other lambdas and schedule CloudWatch to trigger it every 5 minutes or so. Make sure to call your to-keep-warm lambdas with a custom event.source so you can exit them early without running any actual business code by putting the following code at the very beginning of the function:
if (event.source === 'just-keeping-warm) {
console.log('WarmUP - Lambda is warm!');
return callback(null, 'Lambda is warm!');
}
Depending on the number of lamdas you have to keep warm, this can be a lot of "warming" calls. AWS offers 1.000.000 free lambda calls every month though.

We have used AWS Lambda quite successfully with reasonable and acceptable response times. (REST/JSON based API + AWS Lambda + Dynamo DB Access).
The latency that we measured always had the least amount of time spent in invoking functions and large amount of time in application logic.
There are warm up techniques as mentioned in the above posts.

How To Prevent AWS Lambda Abuse by 3rd-party apps

Very interested in getting hands-on with Serverless in 2018. Already looking to implement usage of AWS Lambda in several decentralized app projects. However, I don't yet understand how you can prevent abuse of your endpoint from a 3rd-party app (perhaps even a competitor), from driving up your usage costs.
I'm not talking about a DDoS, or where all the traffic is coming from a single IP, which can happen on any network, but specifically having a 3rd-party app's customers directly make the REST calls, which cause your usage costs to rise, because their app is piggy-backing on your "open" endpoints.
For example:
I wish to create an endpoint on AWS Lambda to give me the current price of Ethereum ETH/USD. What would prevent another (or every) dapp developer from using MY lambda endpoint and causing excessive billing charges to my account?

When you deploy an endpoint that is open to the world, you're opening it to be used, but also to be abused.
AWS provides services to avoid common abuse methods, such as AWS Shield, which mitigates against DDoS, etc., however, they do not know what is or is not abuse of your Lambda function, as you are asking.
If your Lambda function is private, then you should use one of the API gateway security mechanisms to prevent abuse:
IAM security
API key security
Custom security authorization
With one of these in place, your Lambda function can only by called by authorized users. Without one of these in place, there is no way to prevent the type of abuse you're concerned about.

Unlimited access to your public Lambda functions - either by bad actors, or by bad software developed by legitimate 3rd parties, can result in unwanted usage of billable corporate resources, and can degrade application performance. It is important to you consider ways of limiting and restricting access to your Lambda clients as part of your systems security design, to prevent runaway function invocations and uncontrolled costs.
Consider using the following approach to preventing execution "abuse" of your Lambda endpoint by 3rd party apps:
One factor you want to control is concurrency, or number of concurrent requests that are supported per account and per function. You are billed per request plus total memory allocation per request, so this is the unit you want to control. To prevent run away costs, you prevent run away executions - either by bad actors, or by bad software cause by legitimate 3rd parties.
From Managing Concurrency
The unit of scale for AWS Lambda is a concurrent execution (see
Understanding Scaling Behavior for more details). However, scaling
indefinitely is not desirable in all scenarios. For example, you may
want to control your concurrency for cost reasons, or to regulate how
long it takes you to process a batch of events, or to simply match it
with a downstream resource. To assist with this, Lambda provides a
concurrent execution limit control at both the account level and the
function level.
In addition to per account and per Lambda invocation limits, you can also control Lambda exposure by wrapping Lambda calls in an AWS API Gateway, and Create and Use API Gateway Usage Plans:
After you create, test, and deploy your APIs, you can use API Gateway
usage plans to extend them as product offerings for your customers.
You can provide usage plans to allow specified customers to access
selected APIs at agreed-upon request rates and quotas that can meet
their business requirements and budget constraints.
What Is a Usage Plan? A usage plan prescribes who can access one or
more deployed API stages— and also how much and how fast the caller
can access the APIs. The plan uses an API key to identify an API
client and meters access to an API stage with the configurable
throttling and quota limits that are enforced on individual client API
keys.
The throttling prescribes the request rate limits that are applied to
each API key. The quotas are the maximum number of requests with a
given API key submitted within a specified time interval. You can
configure individual API methods to require API key authorization
based on usage plan configuration. An API stage is identified by an
API identifier and a stage name.
Using API Gateway Limits to create Gateway Usage Plans per customer, you can control API and Lambda access prevent uncontrolled account billing.

#Matt answer is correct, yet incomplete.
Adding a security layer is a necessary step towards security, but doesn't protect you from authenticated callers, as #Rodrigo's answer states.
I actually just encountered - and solved - this issue on one of my lambda, thanks to this article: https://itnext.io/the-everything-guide-to-lambda-throttling-reserved-concurrency-and-execution-limits-d64f144129e5
Basically, I added a single line on my serverless.yml file, in my function that gets called by the said authirized 3rd party:
reservedConcurrency: 1
And here goes the whole function:
refresh-cache:
handler: src/functions/refresh-cache.refreshCache
# XXX Ensures the lambda always has one slot available, and never use more than one lambda instance at once.
# Avoids GraphCMS webhooks to abuse our lambda (GCMS will trigger the webhook once per create/update/delete operation)
# This makes sure only one instance of that lambda can run at once, to avoid refreshing the cache with parallel runs
# Avoid spawning tons of API calls (most of them would timeout anyway, around 80%)
# See https://itnext.io/the-everything-guide-to-lambda-throttling-reserved-concurrency-and-execution-limits-d64f144129e5
reservedConcurrency: 1
events:
- http:
method: POST
path: /refresh-cache
cors: true
The refresh-cache lambda was invoked by a webhook triggered by a third party service when any data change. When importing a dataset, it would for instance trigger as much as 100 calls to refresh-cache. This behaviour was completely spamming my API, which in turn was running requests to other services in order to perform a cache invalidation.
Adding this single line improved the situation a lot, because only one instance of the lambda was running at once (no concurrent run), the number of calls was divided by ~10, instead of 50 calls to refresh-cache, it only triggered 3-4, and all those call worked (200 instead of 500 due to timeout issue).
Overall, pretty good. Not yet perfect for my workflow, but a step forward.
Not related, but I used https://epsagon.com/ which tremendously helped me figuring out what was happening on AWS Lambda. Here is what I got:
Before applying reservedConcurrency limit to the lambda:
You can see that most calls fail with timeout (30000ms), only the few first succeed because the lambda isn't overloaded yet.
After applying reservedConcurrency limit to the lambda:
You can see that all calls succeed, and they are much faster. No timeout.
Saves both money, and time.
Using reservedConcurrency is not the only way to deal with this issue, there are many other, as #Rodrigo stated in his answer. But it's a working one, that may fit in your workflow. It's applied on the Lambda level, not on API Gateway (if I understand the docs correctly).

Trigger RDS lambda on CloudFront access

I'm serving static JS files over from my S3 Bucket over CloudFront and I want to monitor whoever accesses them, and I don't want it to be done over CloudWatch and such, I want to log it on my own.
For every request to the CloudFront I'd like to trigger a lambda function that inserts data about the request to my MySQL RDS instance.
However, CloudFront limits Viewer Request Viewer Response triggers too much, such as 1-second timeout (which is too little to connect to MySQL), no VPC configuration to the lambda (therefore I can't even access the RDS subnet) and such.
What is the most optimal way to achieve that? Setup an API Gateway and how would I send a request to there?

The typical method to process static content (or any content) accessed from CloudFront is to enable logging and then process the log files.
To enable CloudFront Edge events, which can include processing and changing an event, look into Lambda#Edge.
Lambda#Edge
I would enable logging first and monitor the traffic for a while. When the bad actors hit your web site (CloudFront Distribution) they will generate massive traffic. This could result in some sizable bills using Lambda Edge. I would also recommend looking in Amazon WAF to help mitigate Denial of Service attacks which may help with the amount of Lambda processing.

This seems like a suboptimal strategy, since CloudFront suspends request/response processing while the trigger code is running -- the Lambda code in a Lambda#Edge trigger has to finish executing before processing of the request or response continues, hence the short timeouts.
CloudFront provides logs that are dropped multiple times per hour (depending on the traffic load) into a bucket you select, which you can capture from an S3 event notification, parse, and insert into your database.
However...
If you really need real-time capture, your best bet might be to create a second Lambda function, inside your VPC, that accepts the data structures provided to the Lambda#Edge trigger.
Then, inside the code for the viewer request or viewer response trigger, all you need to do is use the built-in AWS SDK to invoke your second Lambda function asynchronously, passing the event to it.
That way, the logging task is handed off, you don't wait for a response, and the CloudFront processing can continue.
I would suggest that if you really want to take this route, this will be the best alternative. One Lambda function can easily invoke a second one, even if the second function is not in the same account, region, or VPC, because the invocation is done by communicating with the Lambda service's endpoint API.
But, there's still room for some optimization, because you have to take another aspect of Lambda#Edge into account, and it's indirectly related to this:
no VPC configuration to the lambda
There's an important reason for this. Your Lambda#Edge trigger code is run in the region closest to the edge location that is handling traffic for each specific viewer. Your Lambda#Edge function is provisioned in us-east-1, but it's then replicated to all the regions, ready to run if CloudFront needs it.
So, when you are calling that 2nd Lambda function mentioned above, you'll actually be reaching out to the Lambda API in the 2nd function's region -- from whichever region is handling the Lambda#Edge trigger for this particular request.
This means the delay will be more, the further apart the two regions are.
This your truly optimal solution (for performance purposes) is slightly more complex: instead of the L#E function invoking the 2nd Lambda function asynchronously, by making a request to the Lambda API... you can create one SNS topic in each region, and subscribe the 2nd Lambda function to each of them. (SNS can invoke Lambda functions across regional boundaries.) Then, your Lambda#Edge trigger code simply publishes a message to the SNS topic in its own region, which will immediately return a response and asynchronously invoke the remote Lambda function (the 2nd function, which is in your VPC in one specific region). Within your Lambda#Edge code, the environment variable process.env.AWS_REGION gives you the region where you are currently running, so you can use this to identify how to send the message to the correct SNS topic, with minimal latency. (When testing, this is always us-east-1).
Yes, it's a bit convoluted, but it seems like the way to accomplish what you are trying to do without imposing substantial latency on request processing -- Lambda#Edge hands off the information as quickly as possible to another service that will assume responsibility for actually generating the log message in the database.

Lambda and relational databases pose a serious challenge around concurrency, connections and connection pooling. See this Lambda databases guide for more information.
I recommend using Lambda#Edge to talk to a service built for higher concurrency as the first step of recording access. For example you could have your Lambda#Edge function write access records to SQS, and then have a background worker read from SQS to RDS.
Here's an example of Lambda#Edge interacting with STS to read some config. It could easily be refactored to write to SQS.

AWS Lambda - sync vs async

I've been playing with Lambda recently and am working on creating an API using API Gateway and Lambda. I have a lambda function in place that returns a JSON and an API Gateway endpoint that invokes the function. Everything works well with this simple setup.
I tried loadtesting the API gateway endpoint with the loadtest npm module. While Lambda processes the concurrent requests (albeit with an increase in mean latency over the course of execution), when I send it 40 requests per second or so, it starts throwing errors, only partially completing the requests.
I read in the documentation that by default, Lambda invocation is of type RequestResponse (which is what the API does right now) which is synchronous in nature, and it looks like it is non-blocking. For asynchronous invocation, the invocation type is Event. But lambda discards the return type for async invocations and the API returns nothing.
Is there something I am missing either with the sync, async or concurrency definitions in regards to AWS? Is there a better way to approach this problem? Any insight is helpful. Thank you!

You will have to use Synchronous execution if you want to get a return response from API Gateway. It doesn't make sense to use Async execution in this scenario. I think what you are missing is that while each Lambda execution is blocking, single threaded, there will be multiple instances of your function running in multiple Lambda server environments.
The default number of concurrent Lambda executions is fairly low, for safety reasons. This is to prevent you from accidentally writing a run-away Lambda process that would cost lots of money while you are still learning about Lambda. You need to request an increase in the Lambda concurrent execution limit on your account.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js