How to protect Serverless Framework endpoints from abuse / DoS?

I plan to have the following setup:
Completely STATIC front-end web interface (built with AngularJS or the like)
Serverless Framework back-end APIs
I want to store my front-end in S3 and my back-end in Lambda.
Since I'm charged every time the Lambda function executes, I don't want everyone to be able to make requests directly to it. On the other hand, I want to serve my front-end simply from S3 as opposed to a server.
How do I go about protecting my back-end API from abuse or DoS?

I'm not sure you can protect your back-end from people calling it more than they should, since that's extremely hard to determine.
However, for real DDoS or DoS protection you would probably want to use the features of API Gateway (see its FAQ entry on threats and abuse) or AWS's WAF. I know WAF has the ability to block ranges of IP addresses and the like.

What @Boushley said, plus:
you may want to check out Cloudflare: https://www.cloudflare.com/ddos

Actually, Amazon API Gateway automatically protects your backend systems from distributed denial-of-service (DDoS) attacks, whether they use counterfeit requests (Layer 7) or SYN floods (Layer 4).

In your serverless.yml you can now provide a provider.usagePlan property, assuming you are deploying to AWS:
provider:
  ...
  usagePlan: # limit expenditures
    quota:
      limit: 5000
      period: DAY
    throttle:
      burstLimit: 200
      rateLimit: 100
While this does not mean that you cannot be DDoSed (as @mrBorna mentioned, AWS tries to prevent this by default), it does mean that if you are DDoSed, you will not be significantly affected from a financial perspective.
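When the usage plan's throttle limit is exceeded, API Gateway rejects requests with HTTP 429 (Too Many Requests). A well-behaved client can recover with exponential backoff; here is a minimal sketch, written so the request function is passed in and nothing depends on a real endpoint:

```javascript
// Retry a request with exponential backoff when the server throttles (HTTP 429).
// `doRequest` is any async function resolving to an object with a `status` field,
// so the strategy can be exercised without a live API.
async function fetchWithBackoff(doRequest, { maxRetries = 4, baseDelayMs = 100 } = {}) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await doRequest();
    if (res.status !== 429) return res;      // success or a non-throttle error
    if (attempt === maxRetries) return res;  // retries exhausted, surface the 429
    const delay = baseDelayMs * 2 ** attempt; // 100 ms, 200 ms, 400 ms, ...
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```

In practice you would pass a function that calls fetch() against your API and returns the real response.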

Related

DDoS attack handling strategies

I am building a website to carry out a survey, which will store some answers and user data. Obviously, I want to keep costs low and within what the free tier offers. I am trying to build a low-cost solution for mitigating DDoS attacks. Here is what I have come up with, but I'm not sure if I am going in the right direction. I plan to put my front-end as well as my backend service behind CloudFront, and put AWS WAF and Shield on that CloudFront distribution. Along with that, I plan to add two WAF rules:
Every request should have a "user-agent" header
Requests should originate only from a specific country, i.e. the one with my target audience
Along with this, I plan to add a reCAPTCHA to ensure only human users interact with my application, as a deterrent from a cost perspective. Any other suggestions or feedback are really appreciated. Please note: cost is a huge factor.
AWS Shield, CloudFront and WAF should be sufficient for your use case. Use the geo restrictions, but I don't think a header check will add any value, as it's so easy to spoof. Additionally, you might consider auto scaling for your backend to achieve more resilience, but be careful with the scaling cost: have a proper scaling policy, and set alarms (especially a billing alarm, if you don't have one already) and notifications for scaling events.
Check this whitepaper for more information: https://docs.aws.amazon.com/whitepapers/latest/aws-best-practices-ddos-resiliency/aws-best-practices-ddos-resiliency.pdf
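As a sketch of what those two rules look like in AWS WAFv2 terms, here is a hedged example of the Rules array you would pass to the wafv2 CreateWebACL call. The rule names, priorities, and metric names are illustrative, not prescribed:

```javascript
// Sketch of WAFv2 rule definitions matching the two checks above.
// The first rule blocks requests whose User-Agent header is empty (size 0);
// the second wraps a GeoMatchStatement in a NotStatement to block everything
// outside the target country. Names and priorities are illustrative.
function buildSurveyWafRules(allowedCountry) {
  return [
    {
      Name: 'block-missing-user-agent',
      Priority: 0,
      Action: { Block: {} },
      Statement: {
        SizeConstraintStatement: {
          FieldToMatch: { SingleHeader: { Name: 'user-agent' } },
          ComparisonOperator: 'EQ',
          Size: 0,
          TextTransformations: [{ Priority: 0, Type: 'NONE' }],
        },
      },
      VisibilityConfig: {
        SampledRequestsEnabled: true,
        CloudWatchMetricsEnabled: true,
        MetricName: 'missingUserAgent',
      },
    },
    {
      Name: 'block-outside-target-country',
      Priority: 1,
      Action: { Block: {} },
      Statement: {
        NotStatement: {
          Statement: { GeoMatchStatement: { CountryCodes: [allowedCountry] } },
        },
      },
      VisibilityConfig: {
        SampledRequestsEnabled: true,
        CloudWatchMetricsEnabled: true,
        MetricName: 'geoBlock',
      },
    },
  ];
}
```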
FluxCDN, DDoS-Guard, Cloudflare or Stackpath would be good options.
Use geo restrictions (China, India, ...) and challenge requests (captcha).
Block empty user agents, and bot user agents (for example "python-requests") if you don't need them.
Block an IP from accessing your site if it reaches a threshold (rate limit).
Block bad ASNs from accessing your site.
Use a JS challenge to test the legitimacy of the request.
Cache static files (PNGs, HTML files, ...).
Block HTTP/1.0 and HTTP/1.1 if not needed (many unsophisticated DDoS tools still use these older protocol versions).
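The rate-limit item above can be sketched as a fixed-window counter keyed by client IP. This is an illustrative in-process version; a real deployment would use WAF rate-based rules or a shared store such as Redis, since serverless containers do not share memory:

```javascript
// Minimal fixed-window rate limiter keyed by client IP.
// Each IP gets a counter that resets when its window expires.
function createRateLimiter({ windowMs = 60_000, maxRequests = 100 } = {}) {
  const windows = new Map(); // ip -> { windowStart, count }
  return function isAllowed(ip, now = Date.now()) {
    const entry = windows.get(ip);
    if (!entry || now - entry.windowStart >= windowMs) {
      windows.set(ip, { windowStart: now, count: 1 }); // open a fresh window
      return true;
    }
    entry.count += 1;
    return entry.count <= maxRequests; // block once the threshold is exceeded
  };
}
```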

AWS ElastiCache vs API Gateway Cache

I am new to serverless architecture using AWS Lambda and still trying to figure out how some of the pieces fit together. I have converted my website from EC2 (React client and Node API) to a serverless architecture. The React client now uses S3 static web hosting, and the API has been converted to use AWS Lambda and API Gateway.
In my previous implementation I was using Redis as a cache for responses from third-party APIs.
API Gateway has the option to enable a cache, but I have also looked into ElastiCache as an option. They are comparable in price, with API Gateway cache being slightly costlier.
The one issue I have run into when trying to use ElastiCache is that it needs to run in a VPC, and I can then no longer call out to my third-party APIs.
I am wondering if there is any benefit to using one over the other? Right now the main purpose of my cache is to reduce requests to the API, but that may change over time. Would it make sense to have a Lambda dedicated to checking ElastiCache first to see if there is a value stored, and if not, triggering another Lambda to retrieve the information from the API? Is this even possible? Or for my use case, would API Gateway cache be the better option?
Or possibly a completely different solution altogether. It's a bit of a shame that nearly everything else will qualify for the free tier, but having some sort of cache will add around $15 a month.
I am still very new to this kind of setup so any kind of help or direction would be greatly appreciated. Thank you!
I am wondering if there is any benefit to using one over the other?
API Gateway internally uses ElastiCache to support caching, so functionally they behave the same way. The advantage of API Gateway caching is that the cache is checked before invoking the backend Lambda, so you save the cost of a Lambda invocation for every response served from the cache.
Another difference is that with API Gateway caching, the cache lookup time will not count toward the 29-second integration timeout in cache-miss cases.
Right now the main purpose of my cache is to reduce requests to the API but that may change over time.
I suggest making your decision about the cache based on your current use case. You can always use a new cache or a different solution for future caching requirements.
Would it make sense to have a Lambda dedicated to checking Elasticache first to see if there is a value stored and if not triggering another Lambda to retrieve the information from the API or is this even possible. Or for my use case would API Gateway cache be the better option?
In general, I would not suggest having an additional Lambda just for checking the cache value (it adds latency and aggravates Lambda's cold-start problem). Either way, as mentioned above, you would end up paying for a Lambda invocation even for requests that are served by the cache. If you use the API Gateway cache, cached requests will not even reach Lambda.
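For completeness, the "check the cache first, then call the third-party API" flow can live inside a single Lambda handler rather than two. Here is a hedged sketch where a plain Map with TTLs stands in for an ElastiCache/Redis client, and fetchOrigin is a placeholder for the real upstream call:

```javascript
// Cache-first lookup inside one handler: check the cache, fall back to the origin.
// The Map is a stand-in for an ElastiCache/Redis client; `fetchOrigin` is the
// upstream third-party API call.
function createCachedFetcher(fetchOrigin, { ttlMs = 60_000 } = {}) {
  const cache = new Map(); // key -> { value, expiresAt }
  return async function get(key, now = Date.now()) {
    const hit = cache.get(key);
    if (hit && hit.expiresAt > now) return hit.value; // served from cache
    const value = await fetchOrigin(key);             // cache miss: call origin
    cache.set(key, { value, expiresAt: now + ttlMs });
    return value;
  };
}
```

Note that with ElastiCache (unlike this in-memory stand-in) the cache survives across Lambda containers, which is the whole point of paying for it.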

AWS API Gateway + Lambda - how to handle 1 million requests per second

we would like to create a serverless architecture for our startup, and we would like to support up to 1 million requests per second and 50 million active users. How can we handle this use case with an AWS architecture?
According to the AWS documentation, API Gateway can handle only 10K requests/s and Lambda can process 1K concurrent invocations, and for us this is unacceptable.
How can we overcome this limitation? Can we request this throughput from AWS support, or can we connect to other AWS services (queues) somehow?
Thanks!
Those numbers you quoted are the default account limits. Lambda and API Gateway can handle more than that, but you have to send a request to Amazon to raise your account limits. If you are truly going to receive 1 million API requests per second then you should discuss it with an AWS account rep. Are you sure most of those requests won't be handled by a cache like CloudFront?
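One back-of-the-envelope check before requesting limit increases: required Lambda concurrency is roughly arrival rate times average execution duration (Little's law). A quick sketch, where the 100 ms average duration is an assumption:

```javascript
// Little's law applied to Lambda: concurrency ~= arrival rate x average duration.
// At 1,000,000 req/s and an assumed 100 ms average duration, you need about
// 100,000 concurrent executions, far above the default account limit.
function requiredConcurrency(requestsPerSecond, avgDurationSeconds) {
  return Math.ceil(requestsPerSecond * avgDurationSeconds);
}

console.log(requiredConcurrency(1_000_000, 0.1)); // 100000
```

Shortening the average duration (or serving more traffic from a cache, as suggested above) reduces the concurrency you need to request.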
The gateway is NOT your API server. Lambdas are the bottleneck.
While the gateway can handle 100,000 messages/sec (because it goes through a message queue), Lambdas top out at around 2,200 rps even with scaling (https://amido.com/blog/azure-functions-vs-aws-lambda-vs-google-cloud-functions-javascript-scaling-face-off/)
This differs dramatically from actual API framework implementations, where the scale goes up to 3,500+ rps...
I think you should go with an Application Load Balancer.
It has no limit in terms of RPS and can potentially be even cheaper for a large number of requests. It does have fewer integrations with AWS services, but in general it has everything you need for a gateway.
https://dashbird.io/blog/aws-api-gateway-vs-application-load-balancer/

Is significant latency introduced by API Gateway?

I'm trying to figure out where the latency in my calls is coming from, please let me know if any of this information could be presented in a format that is more clear!
Some background: I have two systems--System A and System B. I manually (through Postman) hit an endpoint on System A that invokes an endpoint on System B.
System A is hosted on an EC2 instance.
When System B is hosted on a Lambda function behind API Gateway, the latency for the call is 125 ms.
When System B is hosted on an EC2 instance, the latency for the call is 8 ms.
When System B is hosted on an EC2 instance behind API Gateway, the latency for the call is 100 ms.
So, my hypothesis is that API Gateway is the reason for increased latency when it's paired with the Lambda function as well. Can anyone confirm if this is the case, and if so, what is API Gateway doing that increases the latency so much? Is there any way around it? Thank you!
It might not be exactly what the original question asks for, but I'll add a comment about CloudFront.
In my experience, both CloudFront and API Gateway will add at least 100 ms each to every HTTPS request on average, sometimes more.
This is because, in order to secure your API call, API Gateway enforces SSL in all of its components. If you are also using SSL on your backend, your first API call will have to negotiate three SSL handshakes:
Client to CloudFront
CloudFront to API Gateway
API Gateway to your backend
It is not uncommon for these handshakes to take over 100 milliseconds, meaning that a single request to an inactive API could see over 300 milliseconds of additional overhead. Both CloudFront and API Gateway attempt to reuse connections, so over a large number of requests you’d expect to see that the overhead for each call would approach only the cost of the initial SSL handshake. Unfortunately, if you’re testing from a web browser and making a single call against an API not yet in production, you will likely not see this.
In the same discussion, it was eventually clarified what the "large number of requests" should be to actually see that connection reuse:
Additionally, when I meant large, I should have been slightly more precise in scale. 1000 requests from a single source may not see significant reuse, but APIs that are seeing that many per second from multiple sources would definitely expect to see the results I mentioned.
...
Unfortunately, while I cannot give you an exact number, you will not see any significant connection reuse until you approach closer to 100 requests per second.
Bear in mind that this is a thread from mid-to-late 2016, and there should be some improvements already in place. But in my own experience this overhead is still present: a load test on a simple API at 2,000 rps still gives me >200 ms of extra latency as of 2018.
source: https://forums.aws.amazon.com/thread.jspa?messageID=737224
Heard from Amazon support on this:
With API Gateway it requires going from the client to API Gateway, which means leaving the VPC and going out to the internet, then back to your VPC to go to your other EC2 instance, then back to API Gateway, which means leaving your VPC again and then back to your first EC2 instance.
So this additional latency is expected. The only way to lower the latency is to add API caching, which is only useful if the content you are requesting is static and not constantly updating. You will still see the longer latency when the item is removed from the cache and needs to be fetched from the backend, but it will lower most calls.
So I guess the latency is normal, which is unfortunate, but hopefully not something we'll have to deal with constantly moving forward.
In the direct case (#2) are you using SSL? 8 ms is very fast for SSL, although if it's within an AZ I suppose it's possible. If you aren't using SSL there, then using APIGW will introduce a secure TLS connection between the client and CloudFront which of course has a latency penalty. But usually that's worth it for a secure connection since the latency is only on the initial establishment.
Once a connection is established all the way through, or when the API has moderate, sustained volume, I'd expect the average latency with APIGW to drop significantly. You'll still see the ~100 ms latency when establishing a new connection though.
Unfortunately the use case you're describing (EC2 -> APIGW -> EC2) isn't great right now. Since APIGW is behind CloudFront, it is optimized for clients all over the world, but you will see additional latency when the client is on EC2.
Edit:
And the reason why you only see a small penalty when adding Lambda is that APIGW already has lots of established connections to Lambda, since it's a single endpoint with a handful of IPs. The actual overhead (not connection related) in APIGW should be similar to Lambda overhead.

AWS Lambda w/ API Gateway for Angular back-end?

I'm still trying to wrap my mind around the limitations of AWS Lambda, especially now that AWS API Gateway opens up a lot of options for serving REST requests with Lambda.
I'm considering building a web app in Angular with Lambda serving as the back-end.
For simple CRUD stuff it seems straightforward enough, but what about authentication? Would I be able to use something like Passport within Lambda to do user authentication?
Yes, you can do pretty much anything; just store your session in an AWS-hosted database (RDS, DynamoDB, etc.). But be aware of exactly what you are buying with Lambda. It has a lot of trade-offs.
Price: An EC2 server costs a fixed price per month, but lambda has a cost per call. Which is cheaper depends on your usage patterns. Lambda is cheaper when nobody is using your product, EC2 is most likely cheaper as usage increases.
Scale: EC2 can scale (in many ways), but it's more "manual" and "chunky" (you can only run 1 server or 2, not 1.5). Lambda has fine-grained scaling. You don't worry about it, but you also have less control over it.
Performance: Lambda is a certain speed, and you have very little control. It may have huge latencies in some cases, as they spin up new containers to handle traffic. EC2 gives you many more options for performance tuning. (Box size, on-box caches, using the latest node.js, removing un-needed services from the box, being able to run strace, etc) You can pay for excess capacity to ensure low latency.
Code: The way you code will be slightly different in Lambda vs EC2. Lambda forces you to obey some conventions that are mostly best practice. But EC2 allows you to violate them for performance, or just speed of development. Lambda is a "black box" where you have less control and visibility when you need to troubleshoot.
Setup: Lambda is easier to setup and requires less knowledge overall. EC2 requires you to be a sysadmin and understand acronyms like VPC, EBS, VPN, AMI, etc.
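To make the price point above concrete, here is a rough sketch of the Lambda cost model. The per-request and per-GB-second prices are illustrative assumptions (check current AWS pricing, these numbers drift), but the shape of the calculation is the point:

```javascript
// Rough monthly Lambda cost: a per-request fee plus a fee per GB-second of
// compute. Prices below are illustrative assumptions, not current AWS rates.
const PER_MILLION_REQUESTS = 0.20; // USD per 1M Lambda requests (assumed)
const PER_GB_SECOND = 0.0000167;   // USD per GB-second of compute (assumed)

function lambdaMonthlyCost(requestsPerMonth, avgDurationSec, memoryGb) {
  const requestCost = (requestsPerMonth / 1e6) * PER_MILLION_REQUESTS;
  const computeCost = requestsPerMonth * avgDurationSec * memoryGb * PER_GB_SECOND;
  return requestCost + computeCost;
}

// 1M requests/month at 200 ms average and 128 MB memory: well under a dollar,
// which is why Lambda wins at low traffic and EC2 tends to win at sustained load.
console.log(lambdaMonthlyCost(1e6, 0.2, 0.125));
```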
Posting this here, since this is the first thread I found when searching for running Node.js Passport authentication on Lambda.
Since you can run Express apps on Lambda, you really could run Passport on Lambda directly. However, Passport is middleware specifically for Express, and if you're designing for Lambda in the first place you probably don't want the bloat of Express (since API Gateway basically does all of that).
As @Jason mentioned, you can utilize a custom authorizer. This seems pretty straightforward, but who wants to build all the possible auth methods? That's one of the advantages of Passport: people have already done this for you.
If you're using the Serverless Framework, someone has built out the "serverless-authentication" project. This includes modules for many of the standard auth providers: Facebook, Google, Microsoft. There is also a boilerplate for building out more auth providers.
It took me a good bunch of research to run across all of this, so hopefully it will help someone else out.
but what about authentication?
The most modular approach is to use API Gateway's Custom Authorizers (new since Feb '16) to supply an AWS Lambda function that implements authentication and authorization.
I wrote a generic Custom Authorizer that works with Auth0, a third-party single-sign-on service.
See this question also: How to - AWS Rest API Authentication
Would I be able to use something like Passport within Lambda to do user authentication?
Not easily. Passport relies on callback URLs, which you would have to create and configure.