Prevent AWS Lambda flooding - amazon-web-services

I'm considering moving my service from a VPS to AWS Lambda + DynamoDB to run it as FaaS, because it's basically 2 API GET calls that fetch info from the database and serve it, and normal use of those API calls is really rare (about 50 times a week).
But it makes me wonder... As I can't set up a limit on how many calls I want to serve each month, an attacker could theoretically flood my service by calling it a couple of thousand times a day and make my AWS bill extremely expensive. Setting up a monthly limit wouldn't be a good idea either, because the attacker could exhaust it on the first day and I would have no requests left to serve. The ideal thing would be to set up a limit on the request rate per client.
Does anyone know how I could protect it? I've seen that AWS also offers a firewall, but that's for CloudFront. Isn't there any way to make it work with Lambda directly?

You can put AWS CloudFront in front of API Gateway and Lambda, so that traffic from outside is served through CloudFront.
In addition, by configuring AWS WAF with rate-based blocking, you can block high-frequency access by attackers.
However, when configuring AWS CloudFront in front of API Gateway and Lambda, you also need to restrict direct access to API Gateway (since API Gateway is publicly accessible by default). This can be achieved in the following ways:
Enable API keys for API Gateway and send the API key from CloudFront as a custom header on the origin.
Use a token header and verify it using a custom authorizer Lambda function.
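For the second option, a minimal sketch of a token-type custom authorizer for a REST API might look like the following. It assumes CloudFront is configured to add the secret as the header the authorizer reads, and that the secret is stored in an environment variable; the variable name ORIGIN_SHARED_SECRET and the principal ID are made up for illustration.

```python
import hmac
import os


def build_policy(effect, method_arn):
    """Return the policy shape API Gateway expects from a Lambda authorizer."""
    return {
        "principalId": "cloudfront",  # hypothetical identifier for the caller
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


def handler(event, context):
    # ORIGIN_SHARED_SECRET is an assumed environment variable name.
    expected = os.environ.get("ORIGIN_SHARED_SECRET", "")
    token = event.get("authorizationToken", "")
    # Constant-time comparison; an unset secret never authorizes anyone.
    allowed = bool(expected) and hmac.compare_digest(
        token.encode("utf-8"), expected.encode("utf-8")
    )
    return build_policy("Allow" if allowed else "Deny", event["methodArn"])
```

Requests that reach API Gateway without going through CloudFront lack the header value and get a Deny policy, so the backend Lambda is never invoked for them.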

Two options spring to mind:
place API Gateway in front of Lambda so that API requests have to be authenticated. API Gateway also has built-in throttles and other useful features.
invoke the Lambda directly, which will require the client invoking the Lambda to have the relevant IAM credentials.
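As a rough sketch of the second option, a trusted client could invoke the function through the AWS SDK. The helper below takes the client as a parameter so it can be exercised without AWS access; the function name and payload in the comment are placeholders.

```python
import json


def invoke_lambda(client, function_name, payload):
    """Synchronously invoke a Lambda function and decode its JSON result."""
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # The SDK returns the function's result as a streaming body.
    return json.loads(response["Payload"].read())


# Real usage needs boto3 and credentials that allow lambda:InvokeFunction:
#   import boto3
#   result = invoke_lambda(boto3.client("lambda"), "my-function", {"id": "42"})
```

Callers without valid IAM credentials are rejected by AWS before the function runs, so they do not generate invocations.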

Related

How can I prevent bots from generating costs on my AWS Lambda function

I was planning a concept for my mobile game using AWS Lambda (or Firebase Functions, where the same applies). Couldn't a bot permanently send requests to my Lambda function and generate massive costs just by spamming my endpoint?
Is there any protection from Amazon / Google, or how would you secure your endpoint against this kind of attack?
See Protecting API Endpoints, and more generally read AWS Best Practices for DDoS Resiliency.
You would use a combination of:
API Gateway (with authenticated clients and, potentially, throttling)
CloudFront
WAF
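For the WAF part, a rate-based rule is what blocks a single spamming client. Here is a sketch of such a rule in the shape used by the WAFv2 API; the rule name and the limit are arbitrary examples, and by default the limit is counted per source IP over a rolling window of a few minutes.

```python
def rate_limit_rule(name, requests_per_window, priority=0):
    """Build a WAFv2 rule that blocks IPs exceeding the given request count."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": requests_per_window,
                "AggregateKeyType": "IP",  # count requests per source IP
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


# Hypothetical rule: block an IP once it exceeds 100 requests in the window.
rule = rate_limit_rule("block-flooding-clients", 100)
```

The rule would go into the Rules list of a web ACL (for example via boto3's wafv2 create_web_acl), and the web ACL is then associated with the CloudFront distribution.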

Is it possible to remove API Gateway from the equation to serve Lambda over public internet?

Currently, my application resides in Lambda, which I serve using an HTTP API (API Gateway V2). This setup exists in multiple regions, meaning API Gateway invokes the Lambda in the same region, which accesses a DynamoDB Global Table in the same region. I use Route 53 to route each user to the nearest API Gateway.
The problem I faced: API Gateway doesn't support redirection from HTTP to HTTPS. I can achieve this with CloudFront, but it'll increase cost as well as latency.
Can I remove API Gateway from the equation and use Lambda@Edge to access DynamoDB Table near the user? Can CloudFront be used to replace API Gateway?
Yes, you can. The docs say:
Functions triggered by origin request and response events as well as functions triggered by viewer request and response events can make network calls to resources on the internet, and to AWS services such as Amazon S3 buckets, DynamoDB tables, or Amazon EC2 instances.
However, there are many limitations to what Lambda@Edge can do compared to a regular Lambda. Examples are:
only Python and Node.js,
difficulty in debugging, as Lambda logs will be in the region where the function runs, not in one central region,
timeout limits on calls to DynamoDB (5 or 30 seconds), depending on whether it's a viewer or an origin function,
no Lambda layers,
max memory of 128 MB for viewer-side functions,
deployment package size can be max 1 MB for viewer-side functions.
Thus, if you can work with these and other limitations of Lambda@Edge, then you can use it to work with DynamoDB.
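To give an idea of what this looks like, here is a minimal sketch of an origin-request function that answers directly from the edge instead of forwarding to an origin. The lookup is passed in as `fetch_item`, a hypothetical callable standing in for the DynamoDB read, and the URI layout (/items/<id>) is an assumption too.

```python
import json

STATUS_TEXT = {200: "OK", 404: "Not Found"}


def build_response(status, body):
    """Build a generated response in the CloudFront Lambda@Edge format."""
    return {
        "status": str(status),
        "statusDescription": STATUS_TEXT[status],
        "headers": {
            "content-type": [{"key": "Content-Type", "value": "application/json"}]
        },
        # default=str copes with Decimal values returned by DynamoDB.
        "body": json.dumps(body, default=str),
    }


def make_handler(fetch_item):
    """fetch_item(item_id) returns a dict, or None when the item is missing."""

    def handler(event, context):
        request = event["Records"][0]["cf"]["request"]
        item_id = request["uri"].rsplit("/", 1)[-1]
        item = fetch_item(item_id)
        if item is None:
            return build_response(404, {"error": "not found"})
        return build_response(200, item)

    return handler


# With boto3, fetch_item could wrap something like (table and key are made up):
#   table = boto3.resource("dynamodb", region_name="eu-west-1").Table("items")
#   fetch_item = lambda item_id: table.get_item(Key={"id": item_id}).get("Item")
```

Keeping the lookup injectable also helps with the debugging limitation mentioned above, since the handler logic can be tested locally.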

How do we address/what are good practices for "serverless" resource abuse?

If I create a public endpoint using AWS API Gateway, the entire world could access it. This would be a problem because the end point would trigger an AWS Lambda function. If we assume that I can't query a data source to determine the frequency that the incoming IP address queried the resource in the past, what would be the best practice for protecting this end point from abuse? Do I have any other security options?
I realize I could use a reCaptcha but this would still invoke the AWS Lambda function and would incur costs if done a million times over a short window of time.
A very simple way of protecting your API Gateway:
Use AWS CloudFront with TTL 0 and pass custom headers from AWS CloudFront to API Gateway
Use AWS WAF with AWS CloudFront
AWS API Gateway also handles some basic level of DDoS attacks.
See also these blog posts on securing AWS API Gateway:
https://aws.amazon.com/blogs/compute/protecting-your-api-using-amazon-api-gateway-and-aws-waf-part-i/
https://aws.amazon.com/blogs/compute/protecting-your-api-using-amazon-api-gateway-and-aws-waf-part-2/
You are probably looking for throttling limit configuration or usage plan definition:
To prevent your API from being overwhelmed by too many requests,
Amazon API Gateway throttles requests to your API using the token
bucket algorithm, where a token counts for a request. Specifically,
API Gateway sets a limit on a steady-state rate and a burst of request
submissions against all APIs in your account. In the token bucket
algorithm, the burst is the maximum bucket size.
When request submissions exceed the steady-state request rate and
burst limits, API Gateway fails the limit-exceeding requests and
returns 429 Too Many Requests error responses to the client. Upon
catching such exceptions, the client can resubmit the failed requests
in a rate-limiting fashion, while complying with the API Gateway
throttling limits.
As an API developer, you can set the limits for individual API stages
or methods to improve overall performance across all APIs in your
account. Alternatively, you can enable usage plans to restrict client
request submissions to within specified request rates and quotas. This
restricts the overall request submissions so that they don't go
significantly past the account-level throttling limits.
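To make the token bucket idea concrete, here is a small self-contained sketch of the algorithm as described in the quote. It is not API Gateway's implementation, just an illustration where `rate` is the steady-state rate in requests per second and `burst` is the bucket size.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # the bucket starts full
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill for the elapsed time, never beyond the bucket size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request consumes one token
            return True
        return False  # the caller would answer with 429 Too Many Requests
```

With rate=2 and burst=3, three back-to-back requests pass and the fourth is rejected; one second later two more tokens are available.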
References:
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-request-throttling.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-create-usage-plans-with-console.html#api-gateway-usage-plan-create
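A usage plan combines both kinds of limit: a throttle (rate and burst) and a quota per period. As a sketch, the parameters for creating one with boto3 could be assembled like this; the plan name, API ID, stage and numbers are placeholders.

```python
def usage_plan_params(name, api_id, stage, rate_limit, burst_limit, daily_quota):
    """Assemble keyword arguments for API Gateway's create_usage_plan call."""
    return {
        "name": name,
        "apiStages": [{"apiId": api_id, "stage": stage}],
        # Steady-state requests per second and the token bucket size.
        "throttle": {"rateLimit": float(rate_limit), "burstLimit": int(burst_limit)},
        # Hard cap on requests per day for each API key attached to the plan.
        "quota": {"limit": int(daily_quota), "period": "DAY"},
    }


# Hypothetical values for a low-traffic API:
params = usage_plan_params("low-traffic", "abc123", "prod", 1, 5, 200)
# boto3.client("apigateway").create_usage_plan(**params)
```

Usage plans are enforced per API key, so each client needs its own key for the limits to apply per client.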

AWS serverless architecture – Why should I use API gateway?

Here is my use case:
Static react frontend hosted on s3
Python backend on Lambda conducting long-running data analysis
Postgres database on rds
Backend and frontend communicate exclusively with JSON
Occasionally backend creates and stores powerpoint files in s3 bucket and then serves them up by sending s3 link to frontend
Convince me that it is worthwhile going through all the headaches of setting up API gateway to connect the frontend and backend rather than invoking lambda directly from the frontend!
Especially given the 29s timeout, which is not long enough for my app. That means I need to implement asynchronous processing and add a whole other layer of AWS architecture (messaging, queuing and polling with SNS and SQS), which increases cost, time and potential for problems. I understand there are some security concerns, but is there no way to securely invoke a Lambda function?
You are talking about invoking a lambda directly from JavaScript running on a client machine.
I believe the only way to do that would be embedding the AWS SDK for JavaScript in your react frontend. See:
https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/browser-invoke-lambda-function-example.html
There are several security concerns with this, only some of which can be mitigated.
First off, you will need to hardcode AWS credentials into your frontend for the world to see. The access those credentials have can be limited in scope, but be very careful to get this right, or otherwise you'll be paying for someone's cryptomining operation.
Assuming you only want certain people to upload files to a storage service you are paying for, you will need some form of authentication and authorisation. API Gateway doesn't really do authentication, but it can do authorisation, albeit by connecting to other AWS services like Cognito or Lambda (custom authorizers). You'll have to build this into your backend Lambda yourself. Absolutely doable and probably not much more effort than using a custom authorizer from the API Gateway.
The main issue with connecting to Lambda directly is that Lambda has the ability to scale rapidly, which can be an issue if someone tries to hit you with a denial of service attack. Lambda is cheap, but running 1000 concurrent instances 24 hours a day is going to add up.
API Gateway allows you to rate limit per second/minute/hour/etc., whereas Lambda only allows you to limit the number of concurrent instances at any given time. So if you were to set that limit at 1, an attacker could cause that 1 instance to run for 24 hours a day.
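To put a rough number on "going to add up", here is a back-of-the-envelope sketch. The price per GB-second is an assumption (check the current Lambda pricing page for your region and architecture), and per-request charges are left out.

```python
def daily_compute_cost(concurrent_instances, memory_mb, price_per_gb_second):
    """Estimate the compute cost of instances that run non-stop for one day."""
    seconds_per_day = 24 * 60 * 60
    gb_seconds = concurrent_instances * seconds_per_day * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second


# ASSUMED price in USD per GB-second; verify against current pricing.
ASSUMED_PRICE = 0.0000166667

flooded = daily_compute_cost(1000, 128, ASSUMED_PRICE)  # 1000 busy instances
capped = daily_compute_cost(1, 128, ASSUMED_PRICE)      # concurrency limit of 1
print(f"1000 instances: ${flooded:.2f}/day, 1 instance: ${capped:.2f}/day")
```

Under these assumptions, 1000 instances at 128 MB come to about 180 USD per day, while a single instance kept busy all day costs well under a dollar, which is why a concurrency limit caps the damage even though it does not stop the abuse.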

Deploying AWS Global infrastructure with API Gateway, Lambda, Cognito, S3, Dynamodb

Let's say I need an API Gateway that is going to run Lambdas and I want to build the best-performing globally distributed infrastructure. Also, I will use Cognito for authentication, and DynamoDB and S3 for user data and frontend statics.
My app is located at myapp.com
First the user gets the static front end from the nearest location:
user ===> edge location at CloudFront <--- S3 at any region (with static front end)
After that we need to communicate with API Gateway.
user ===> API Gateway ---> Lambda ---> S3 || Cognito || Dynamodb
API Gateway can be located in several regions, and even though it is distributed with CloudFront, each endpoint points to a Lambda located in a given region. Let's say I deploy an API at eu-west-1. If a request is sent from the USA, even if my API is on CloudFront, the Lambda it runs is located at eu-west-1, so latency will be high anyway.
To avoid that, I need to deploy another API at us-east-1, and all my Lambdas too. That API will point to those Lambdas.
If I deploy one API for every single region, I would need one endpoint for each one of them, and the frontend should decide which one to request. But how could we know which one is the nearest location?
The ideal scenario is a single global endpoint at api.myapp.com, which is going to go to the nearest API Gateway which runs the Lambdas located in that region too. Can I configure that using Route 53 latency routing with multiple A records pointing to each api gateway?
If this is not the right way to do this, can you point me in the right direction?
AWS recently announced support for regional API endpoints, with which you can achieve this.
Below is an AWS blog post that explains how to achieve this:
Building a Multi-region Serverless Application with Amazon API Gateway and AWS Lambda
Excerpt from the blog:
The default API endpoint type in API Gateway is the edge-optimized API
endpoint, which enables clients to access an API through an Amazon
CloudFront distribution. This typically improves connection time for
geographically diverse clients. By default, a custom domain name is
globally unique and the edge-optimized API endpoint would invoke a
Lambda function in a single region in the case of Lambda integration.
You can’t use this type of endpoint with a Route 53 active-active
setup and fail-over.
The new regional API endpoint in API Gateway moves the API endpoint
into the region and the custom domain name is unique per region. This
makes it possible to run a full copy of an API in each region and then
use Route 53 to use an active-active setup and failover.
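As a sketch of the Route 53 side, each region gets its own latency-based alias record for the same name, pointing at that region's regional custom domain. The helper below builds one such change; the target DNS names and hosted zone IDs in the example are placeholders, not real values.

```python
def latency_alias_change(domain, region, target_dns_name, target_hosted_zone_id):
    """Build one UPSERT change for a latency-routed alias A record."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": domain,
            "Type": "A",
            "SetIdentifier": region,  # must be unique among records with this name
            "Region": region,         # enables latency-based routing
            "AliasTarget": {
                "HostedZoneId": target_hosted_zone_id,
                "DNSName": target_dns_name,
                "EvaluateTargetHealth": True,
            },
        },
    }


# Placeholder targets; the real ones come from each regional custom domain name.
changes = [
    latency_alias_change("api.myapp.com", "eu-west-1", "d-eu.example.com", "ZEXAMPLEEU"),
    latency_alias_change("api.myapp.com", "us-east-1", "d-us.example.com", "ZEXAMPLEUS"),
]
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="<your hosted zone id>", ChangeBatch={"Changes": changes})
```

Route 53 then answers each DNS query with the record for the region that has the lowest latency to the user.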
Unfortunately, this is not currently possible. The primary blocker here is CloudFront.
MikeD@AWS provides the info on their forums:
When you create a custom domain name it creates an associated CloudFront distribution for the domain name and CloudFront enforces global uniqueness on the domain name.
If a CloudFront distribution with the domain name already exists, then the CreateCloudFrontDistribution will fail and API Gateway will return an error without saving the domain name or allowing you to define its associated API(s).
Thus, there is currently (Jun 29, 2016) no way to get API Gateway in multiple regions to handle the same domain name.
AWS has posted no update since confirming the existence of an open feature request on July 4, 2016. See the AWS forum thread for updates.
Check out Lambda@Edge
Q: What is Lambda@Edge? Lambda@Edge allows you to run code across AWS
locations globally without provisioning or managing servers,
responding to end users at the lowest network latency. You just upload
your Node.js code to AWS Lambda and configure your function to be
triggered in response to Amazon CloudFront requests (i.e., when a
viewer request lands, when a request is forwarded to or received back
from the origin, and right before responding back to the end user).
The code is then ready to execute across AWS locations globally when a
request for content is received, and scales with the volume of
CloudFront requests globally. Learn more in our documentation.
Use case: minimizing latency for globally distributed users
Q: When should I use Lambda@Edge? Lambda@Edge is optimized for latency
sensitive use cases where your end viewers are distributed globally.
Ideally, all the information you need to make a decision is available
at the CloudFront edge, within the function and the request. This
means that use cases where you are looking to make decisions on how to
serve content based on user characteristics (e.g., location, client
device, etc) can now be executed and served right from the edge in
Node.js-6.10 without having to be routed back to a centralized server.