Deploying AWS global infrastructure with API Gateway, Lambda, Cognito, S3, and DynamoDB

Let's say I need an API Gateway that is going to run Lambdas, and I want to build the best-performing globally distributed infrastructure. I will also use Cognito for authentication, and DynamoDB and S3 for user data and frontend static assets.
My app is located at myapp.com.
First, the user gets the static frontend from the nearest location:
user ===> edge location at CloudFront <--- S3 at any region (with static front end)
After that, we need to communicate with API Gateway.
user ===> API Gateway ---> Lambda ---> S3 || Cognito || DynamoDB
API Gateway can be deployed in several regions, and even though it is distributed via CloudFront, each endpoint points to a Lambda located in a given region. Let's say I deploy an API in eu-west-1. If a request is sent from the USA, even though my API is behind CloudFront, the Lambda it runs is located in eu-west-1, so latency will be high anyway.
To avoid that, I need to deploy another API in us-east-1 along with all my Lambdas; that API will point to those Lambdas.
If I deploy one API in every region, I would need one endpoint for each of them, and the frontend would have to decide which one to request. But how could we know which is the nearest location?
The ideal scenario is a single global endpoint at api.myapp.com that resolves to the nearest API Gateway, which runs the Lambdas located in that region too. Can I configure that using Route 53 latency-based routing with multiple A records pointing to each API Gateway?
If this is not the right way to do this, can you point me in the right direction?

AWS recently announced support for regional API endpoints, which you can use to achieve this.
The following AWS blog post explains how:
Building a Multi-region Serverless Application with Amazon API Gateway and AWS Lambda
Excerpt from the blog:
The default API endpoint type in API Gateway is the edge-optimized API
endpoint, which enables clients to access an API through an Amazon
CloudFront distribution. This typically improves connection time for
geographically diverse clients. By default, a custom domain name is
globally unique and the edge-optimized API endpoint would invoke a
Lambda function in a single region in the case of Lambda integration.
You can’t use this type of endpoint with a Route 53 active-active
setup and fail-over.
The new regional API endpoint in API Gateway moves the API endpoint
into the region and the custom domain name is unique per region. This
makes it possible to run a full copy of an API in each region and then
use Route 53 to use an active-active setup and failover.
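To make the active-active idea concrete, here is a minimal boto3 sketch of the latency-based Route 53 records the original question asks about. It assumes a regional custom domain named api.myapp.com already exists in each region; the hosted zone ID and the region list are placeholders, and the alias targets are read back from API Gateway.

```python
# Minimal sketch: latency-based alias records for api.myapp.com, one per region
# that hosts a full copy of the API. Hosted zone ID and region list are
# placeholders; a regional custom domain "api.myapp.com" is assumed to already
# exist in each region.
import boto3

HOSTED_ZONE_ID = "Z1EXAMPLE"          # hosted zone for myapp.com (placeholder)
REGIONS = ["eu-west-1", "us-east-1"]  # regions with a deployed copy of the API

route53 = boto3.client("route53")
changes = []

for region in REGIONS:
    apigw = boto3.client("apigateway", region_name=region)
    domain = apigw.get_domain_name(domainName="api.myapp.com")
    changes.append({
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "api.myapp.com",
            "Type": "A",
            "SetIdentifier": f"api-{region}",  # one record per region
            "Region": region,                  # enables latency-based routing
            "AliasTarget": {
                "HostedZoneId": domain["regionalHostedZoneId"],
                "DNSName": domain["regionalDomainName"],
                "EvaluateTargetHealth": True,  # pair with health checks for failover
            },
        },
    })

route53.change_resource_record_sets(
    HostedZoneId=HOSTED_ZONE_ID,
    ChangeBatch={"Changes": changes},
)
```

Each regional record shares the same name, so Route 53 answers with whichever regional API Gateway has the lowest latency for the caller, which is exactly the single global api.myapp.com endpoint the question describes.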

Unfortunately, this is not currently possible. The primary blocker here is CloudFront.
MikeD@AWS provides the info on their forums:
When you create a custom domain name it creates an associated CloudFront distribution for the domain name and CloudFront enforces global uniqueness on the domain name.
If a CloudFront distribution with the domain name already exists, then the CreateCloudFrontDistribution call will fail and API Gateway will return an error without saving the domain name or allowing you to define its associated API(s).
Thus, there is currently (Jun 29, 2016) no way to get API Gateway in multiple regions to handle the same domain name.
AWS has provided no update since confirming the existence of an open feature request on July 4, 2016. See the AWS forum thread for updates.

Check out Lambda@Edge.
Q: What is Lambda@Edge? Lambda@Edge allows you to run code across AWS
locations globally without provisioning or managing servers,
responding to end users at the lowest network latency. You just upload
your Node.js code to AWS Lambda and configure your function to be
triggered in response to Amazon CloudFront requests (i.e., when a
viewer request lands, when a request is forwarded to or received back
from the origin, and right before responding back to the end user).
The code is then ready to execute across AWS locations globally when a
request for content is received, and scales with the volume of
CloudFront requests globally. Learn more in our documentation.
Use case: minimizing latency for globally distributed users
Q: When should I use Lambda@Edge? Lambda@Edge is optimized for
latency-sensitive use cases where your end viewers are distributed globally.
Ideally, all the information you need to make a decision is available
at the CloudFront edge, within the function and the request. This
means that use cases where you are looking to make decisions on how to
serve content based on user characteristics (e.g., location, client
device, etc) can now be executed and served right from the edge in
Node.js-6.10 without having to be routed back to a centralized server.
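As a rough illustration of that kind of edge decision, here is a small origin-request handler in Python (Lambda@Edge also supports Python today, not only the Node.js runtime mentioned in the excerpt). It assumes the CloudFront-Viewer-Country header has been whitelisted in the cache behavior, and the "/eu" path prefix is purely illustrative.

```python
# Sketch of a Lambda@Edge origin-request handler that adjusts what is served
# based on viewer location, entirely at the edge. Assumes the
# CloudFront-Viewer-Country header is whitelisted in the cache behavior.
def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    country = ""
    if "cloudfront-viewer-country" in headers:
        country = headers["cloudfront-viewer-country"][0]["value"]

    # Decide at the edge, with no round trip to a centralized server.
    if country in ("DE", "FR", "ES", "IT"):
        request["uri"] = "/eu" + request["uri"]

    return request
```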

Related

How to route requests to the right tenant API Gateway?

I am creating a multi-tenant, silo-model architecture to support a SaaS application, following this link.
I am able to register new tenants and create their respective stacks like this:
So far so good. The next step is to give each tenant its own domain, for example tenant1.admin.foo.com, to access the same CloudFront distribution (the web frontend must be the same for all). I can do this by creating a wildcard record *.admin.foo.com in Route 53 that points to CloudFront.
THE PROBLEM:
I need to route every request to its respective tenant stack; for example, tenant1.api.foo.com/whatever should route to the API Gateway created for tenant1.
At first I thought of creating an origin in CloudFront that routes to each API Gateway, but the problem is that CloudFront origins are limited to 25 per distribution.
I also thought of creating a Route 53 record pointing to each tenant's API Gateway, but then I would have to use a custom domain on each API Gateway, and those are limited to 120, while I expect to have more than 120 tenants.
How can I make this routing?
Here is an illustration of a use case:
PS: Any advice is welcome.
You can set up a distribution with a wildcard (*.api.foo.com) as the Alternate Domain Name (CNAME). If you attach a Lambda@Edge function to the Origin Request event (under the Cache Behavior settings), you can dynamically modify the host header to point to the appropriate API Gateway host (xxxxxx.execute-api.us-east-1.amazonaws.com).
Here is an AWS blog post where they did this with S3 buckets as the origin; it should translate fairly closely to API Gateway hostnames instead:
https://aws.amazon.com/blogs/networking-and-content-delivery/dynamically-route-viewer-requests-to-any-origin-using-lambdaedge/
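Here is a rough sketch of such an origin-request function, in Python for consistency with the other examples. The tenant-to-API-Gateway mapping is a hypothetical in-memory dict (in practice you might load it from DynamoDB or SSM and cache it between invocations), and it assumes the viewer's Host header is whitelisted in the cache behavior so it reaches the origin-request event.

```python
# Sketch: rewrite the origin to the tenant's API Gateway based on the
# subdomain of the requested host. Mapping values are placeholders.
TENANT_ORIGINS = {
    "tenant1": "abc123.execute-api.us-east-1.amazonaws.com",
    "tenant2": "def456.execute-api.us-east-1.amazonaws.com",
}


def handler(event, context):
    request = event["Records"][0]["cf"]["request"]

    # Host header of the viewer request, e.g. "tenant1.api.foo.com"
    # (assumes the Host header is forwarded to the origin request).
    host = request["headers"]["host"][0]["value"]
    tenant = host.split(".")[0]

    api_host = TENANT_ORIGINS.get(tenant)
    if api_host is None:
        # Unknown tenant: answer directly from the edge.
        return {"status": "404", "statusDescription": "Not Found"}

    # Point the custom origin at the tenant's API Gateway and set the Host
    # header so API Gateway routes the request to the right API.
    request["origin"] = {
        "custom": {
            "domainName": api_host,
            "port": 443,
            "protocol": "https",
            "path": "",
            "sslProtocols": ["TLSv1.2"],
            "readTimeout": 30,
            "keepaliveTimeout": 5,
            "customHeaders": {},
        }
    }
    request["headers"]["host"] = [{"key": "Host", "value": api_host}]
    return request
```

Because the routing decision happens per request at the edge, a single wildcard cache behavior can fan out to far more than the 25-origin or 120-custom-domain limits mentioned in the question.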

AWS HTTP API Gateway as a proxy to private S3 bucket

I have a private S3 bucket with lots of small files. I'd like to expose the contents of the bucket (read-only access) using AWS API Gateway as a proxy. Both the S3 bucket and the API Gateway belong to the same AWS account and are in the same VPC and Availability Zone.
AWS API Gateway comes in two types: HTTP API and REST API. The configuration options of the REST API are more advanced, and it supports many more AWS service integrations than the HTTP API. In fact, the use case I described above is fully covered in one of the documentation tabs for the REST API. However, the REST API has one huge disadvantage: it's about 70% more expensive than the HTTP API. The extra price buys more configuration options, but for now I need only one of them, integration with the S3 service, which is why I believe that type is not well suited for my use case. I started searching for a way to integrate the HTTP API with S3, and so far I haven't found one.
I tried creating/editing the service-linked roles associated with the HTTP API Gateway instance, but those roles can't be edited (read-only access). For now, I have no idea where to search next, or whether my goal is even achievable with the HTTP API.
I am a fan of AWS's HTTP APIs.
I work daily with an API that serves a very similar purpose. The way I have done it is by using AWS Lambda functions integrated with the API's paths.
What works for me is this:
Define your API paths, and integrate them with AWS Lambda functions.
Have your integrated Lambda function return a signed URL for any objects you want to provide access to through API calls.
There are several different ways to pass the name of the object(s) you want to the Lambda function servicing the API call.
This is the short answer. I plan to give a longer answer at a later time. But this has worked for me.
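As a rough sketch of that approach, assuming an HTTP API route such as GET /objects/{key} with the default payload format, and a bucket name that is just a placeholder, the integrated Lambda could look like this:

```python
# Minimal sketch of the Lambda behind GET /objects/{key}: return a short-lived
# presigned GET URL for the requested object in the private bucket.
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "my-private-bucket"  # placeholder


def handler(event, context):
    # HTTP API (payload v2.0) puts path parameters here.
    key = event["pathParameters"]["key"]

    url = s3.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=300,  # URL valid for 5 minutes
    )

    # Redirect the caller to the object; the URL is also returned in the body.
    return {
        "statusCode": 302,
        "headers": {"Location": url},
        "body": json.dumps({"url": url}),
    }
```

The Lambda's execution role needs s3:GetObject on the bucket, and the presigned URL inherits that permission for its short lifetime, so the bucket itself can stay fully private.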

Is it possible to remove API Gateway from the equation to serve Lambda over the public internet?

Currently, my application resides in Lambda, which I serve using an HTTP API (API Gateway v2). This setup exists in multiple regions, meaning API Gateway invokes a Lambda in the same region, which accesses a DynamoDB global table in the same region. I use Route 53 to serve the nearest API Gateway to the user.
The problem I face: API Gateway doesn't support redirection from HTTP to HTTPS. I can achieve this with CloudFront, but it will increase cost as well as latency.
Can I remove API Gateway from the equation and use Lambda@Edge to access the DynamoDB table nearest the user? Can CloudFront be used to replace API Gateway?
Yes, you can. The docs state:
Functions triggered by origin request and response events as well as functions triggered by viewer request and response events can make network calls to resources on the internet, and to AWS services such as Amazon S3 buckets, DynamoDB tables, or Amazon EC2 instances.
However, there are many limitations to what Lambda@Edge can do compared to a regular Lambda. Examples are:
only Python and Node.js are supported,
debugging is harder, as the Lambda logs end up in the region where the function runs, not in one central region,
timeout limits (5 or 30 seconds) constrain calls to DynamoDB, depending on whether it is an origin or viewer function,
no Lambda layers,
max memory of 128 MB for viewer-side functions,
deployment package size of at most 1 MB for viewer-side functions.
Thus, if you can work within these and other limitations of Lambda@Edge, you can use it to work with DynamoDB.
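For illustration, here is a minimal origin-request sketch that reads from a DynamoDB global table replica in whatever region the replicated function happens to run in and answers directly from the edge; the table name, key schema, and URI-to-key mapping are assumptions.

```python
# Sketch: Lambda@Edge origin-request handler that queries the nearest
# global-table replica and generates the response at the edge, so no
# origin (and no API Gateway) is involved.
import json
import os

import boto3
from botocore.config import Config

# The Lambda runtime sets AWS_REGION to wherever this replica executes,
# so the client talks to the closest global-table replica.
REGION = os.environ.get("AWS_REGION", "us-east-1")

# Keep client timeouts well under the Lambda@Edge limits listed above.
dynamodb = boto3.client(
    "dynamodb",
    region_name=REGION,
    config=Config(connect_timeout=1, read_timeout=2, retries={"max_attempts": 1}),
)

TABLE = "my-global-table"  # placeholder


def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    item_id = request["uri"].lstrip("/")  # e.g. "/item-42" -> "item-42"

    resp = dynamodb.get_item(TableName=TABLE, Key={"id": {"S": item_id}})
    item = resp.get("Item", {})

    # Generate the HTTP response at the edge instead of forwarding the request.
    return {
        "status": "200",
        "statusDescription": "OK",
        "headers": {
            "content-type": [{"key": "Content-Type", "value": "application/json"}]
        },
        "body": json.dumps({k: list(v.values())[0] for k, v in item.items()}),
    }
```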

Application Load Balancers vs API Gateway

AWS offers a service called Application Load Balancer (ALB), and it can act as a trigger for a Lambda function. The way to call such a Lambda function is by sending an HTTP/HTTPS request to the ALB.
Now my question is: how is this any different from using API Gateway? And when should one use an ALB over API Gateway (or the other way around)?
One of the biggest reasons we use API Gateway in front of our Lambda functions instead of an ALB is its native IAM (Identity and Access Management) integration. We don't have to do any of the identity work ourselves; it's all delegated to IAM. In addition, API Gateway has built-in request validation, including validation of query string parameters and headers. In a nutshell, there are so many out-of-the-box integrations that come with API Gateway that you wind up having to do a lot more work if you go the ALB route.
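As a small illustration of that built-in request validation (a REST API feature), here is a boto3 sketch; the API ID, resource ID, and the required query parameter are hypothetical.

```python
# Sketch: have API Gateway reject requests with missing query parameters
# before they ever reach the Lambda integration.
import boto3

apigw = boto3.client("apigateway")

API_ID = "abc123"       # placeholder REST API id
RESOURCE_ID = "res456"  # placeholder resource id for GET /items

# Create a validator that checks query string parameters and headers.
validator = apigw.create_request_validator(
    restApiId=API_ID,
    name="params-and-headers",
    validateRequestBody=False,
    validateRequestParameters=True,
)

# Attach the validator to the method and mark ?limit= as required.
apigw.update_method(
    restApiId=API_ID,
    resourceId=RESOURCE_ID,
    httpMethod="GET",
    patchOperations=[
        {"op": "replace", "path": "/requestValidatorId", "value": validator["id"]},
        {"op": "add", "path": "/requestParameters/method.request.querystring.limit",
         "value": "true"},
    ],
)
```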
It seems that the request/response size limit is lower when using an ALB, and WebSockets are not supported:
The maximum size of the request body that you can send to a Lambda
function is 1 MB. For related size limits, see HTTP Header Limits.
The maximum size of the response JSON that the Lambda function can
send is 1 MB.
WebSockets are not supported. Upgrade requests are rejected with an
HTTP 400 code.
See: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/lambda-functions.html
Payload limit with API Gateway is discussed here: Request payload limit with AWS API Gateway
Also, the article already mentioned by @matesio provides information about additional things to consider when choosing between ALB and API Gateway.
Notable tweet referenced in the mentioned article:
If you are building an API and want to leverage AuthN/Z, request
validation, rate limiting, SDK generation, direct AWS service backend,
use #APIGateway. If you want to add Lambda to an existing web app
behind ALB you can now just add it to the needed route.
(From: Dougal Ballantyne, the Head of Product for Amazon API Gateway)
API gateways are usually richer in functionality than load balancers. In addition to load balancing, API gateways are often capable of the following:
Content-based routing (some calls to v1, some calls to v2, and so on, based on certain criteria)
IAM-related functionality (e.g. access validation)
Security (e.g. SSL offloading, DDoS attack prevention, and security credential translation, such as translating one type of token to another)
Payload translation (e.g. XML to JSON)
Additionally, API gateways may be available in appliance form, and appliances are usually lower latency, far more secure, and so on.
I am not aware of the specific features of AWS API Gateway, but the above are general features of any API gateway. Nevertheless, when you have the option to offer a service on the internet through either a load balancer or an API gateway, the API gateway is usually the better choice, unless there are specific reasons to do otherwise.

Prevent AWS Lambda flooding

I'm considering moving my service from a VPS to AWS Lambda + DynamoDB to use it as FaaS, because it's basically two GET API calls that fetch info from the database and serve it, and normal use of those API calls is really rare (about 50 times a week).
But it makes me wonder: since I can't set a limit on how many calls I want to serve each month, an attacker could theoretically flood my service by calling it a couple of thousand times a day and make my AWS bill extremely expensive. Setting a monthly limit wouldn't be a good idea either, because the attacker could exhaust it on the first day and I would have no requests left to serve. The ideal thing would be to set a limit on the request rate per client.
Does anyone know how I could protect it? I've seen that AWS also offers a firewall, but that's for CloudFront. Isn't there any way to make it work with Lambda directly?
You can put Amazon CloudFront in front of API Gateway and Lambda so that traffic is served to the outside world through CloudFront.
In addition, by configuring AWS WAF with rate-based blocking, you can block high-frequency access from attackers.
However, when putting CloudFront in front of API Gateway and Lambda, you also need to restrict direct access to API Gateway (since API Gateway is publicly accessible by default). This can be achieved in the following ways (a sketch of the second option follows the list):
Enable API keys for API Gateway and send the API key from CloudFront as a custom origin header.
Use a token header and verify it with a custom authorizer Lambda function.
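Here is a minimal sketch of that second option: a Lambda TOKEN authorizer that only allows requests carrying a shared secret which CloudFront injects as a custom origin header. The header wiring, secret source, and names are assumptions.

```python
# Sketch of a TOKEN authorizer: allow the request only if the token header
# (configured as the authorizer's identity source) matches the shared secret
# that CloudFront adds as a custom origin header.
import os

# In practice, store the secret in Secrets Manager or SSM rather than an env var.
EXPECTED_TOKEN = os.environ.get("EDGE_SHARED_SECRET", "change-me")


def handler(event, context):
    # For a TOKEN authorizer, API Gateway passes the configured header here.
    token = event.get("authorizationToken", "")
    effect = "Allow" if token == EXPECTED_TOKEN else "Deny"

    return {
        "principalId": "cloudfront-edge",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": event["methodArn"],
            }],
        },
    }
```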
Two options spring to mind:
Place API Gateway in front of Lambda so that API requests have to be authenticated. API Gateway also has built-in throttling and other useful features (see the usage-plan sketch below).
Invoke the Lambda directly, which will require the client invoking the Lambda to have the relevant IAM credentials.
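For the first option, API Gateway's built-in throttling can be configured through a usage plan. A boto3 sketch, with placeholder API and stage identifiers:

```python
# Sketch: usage plan with rate/burst limits and a monthly quota, tied to an
# API key that the client must present.
import boto3

apigw = boto3.client("apigateway")

plan = apigw.create_usage_plan(
    name="basic-plan",
    apiStages=[{"apiId": "abc123", "stage": "prod"}],  # placeholder API/stage
    throttle={"rateLimit": 5.0, "burstLimit": 10},     # requests/second and burst
    quota={"limit": 10000, "period": "MONTH"},         # hard monthly cap
)

key = apigw.create_api_key(name="client-key", enabled=True)
apigw.create_usage_plan_key(
    usagePlanId=plan["id"],
    keyId=key["id"],
    keyType="API_KEY",
)
```

With roughly 50 legitimate calls a week, even a very low rate limit like this leaves plenty of headroom while capping what a flood of requests can cost.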