AWS HTTP API Gateway as a proxy to private S3 bucket

AWS HTTP API Gateway as a proxy to private S3 bucket - amazon-web-services

I have a private S3 bucket with lots of small files. I'd like to expose the contents of the bucket (only read-only access) using AWS API Gateway as a proxy. Both S3 bucket and AWS API Gateway belong to the same AWS account and are in the same VPC and Availability Zone.
AWS API Gateway comes in two types: HTTP API, REST API. The configuration options of REST API are more advanced, additionally, REST API supports much more AWS services integrations than the HTTP API. In fact, the use case I described above is fully covered in one of the documentation tabs of REST API. However, REST API has one huge disadvantage - it's about 70% more expensive than the HTTP API, the price comes with more configuration options but as for now, I need only one - integration with the S3 service that's why I believe this type of service is not well suited for my use case. I started searching if HTTP API can be integrated with S3, and so far I haven't found any way to achieve it.
I tried creating/editing service-linked roles associated with the HTTP API Gateway instance, but those roles can't be edited (only read-only access). As for now, I don't have any idea where I should search next, or if my goal is even achievable using HTTP API.

I am a fan of AWSs HTTP APIs.
I work daily with an API that serves a very similar purpose. The way I have done it is by using AWS Lambda functions integrated with the APIs paths.
What works for me is this:
Define your API paths, and integrate them with AWS Lambda functions.
Have your integrated Lambda function return a signed URL for any objects you want to provide access to through API calls.
There are several different ways to pass the name of the object(s) you want to the Lambda function servicing the API call.
This is the short answer. I plan to give a longer answer at a later time. But this has worked for me.

Related

Does AWS API Gateway RestAPI allow the creation of path based versioning at the API RestAPI level natively?

For example, Azure API Management service allows the creation of an API "proxy" front end and the ability to create an api version such as
https://baseapi.com/apiName1/v1
Here is screenshot of that in Azure platform.
https://learn.microsoft.com/en-us/azure/api-management/api-management-get-started-publish-versions
Does AWS API Gateway RestAPIs allow this type of versioning natively?
If it does, how can I setup for example "v1" of a restAPI?
And if the AWS RestAPI "Stage" is the way to accomplish this, how would I still support the idea of creating stages per environment, while still doing versioning? To me stage seems more associated with environments, whereas versioning is a completely separate concept.
Note: The rest APIs are private

First limitation of private APIs from the documentation is the following:
Custom domain names are not supported for private APIs.
This limits what you can do with API Gateway to support multiple versions.
Run an API Gateway per version - This option grants you complete separation between API versions, however, unfortunately you will need to call a separate endpoint per API.
Deploy a single private API Gateway containing all API versions - This option scopes everything to a single endpoint but as a limitation could become quite complex and hard to manage depending on the number of verbs.
Hopefully in the future this feature from custom domains detailed below will be added.
Previous Answer - Before known using Private API
This can be done in API Gateway through the combination of stages and custom domain configuration.
If you deploy each version of API Gateway to either its own stage i.e. v1, v2 stage you have seperated the schema and actions between the versions.
Alternatively you can have a separate API and stage for each version of your specific API.
Then create a custom domain name for your API endpoint, under base path mappings you can map a specific subfolder v1 to the API and select the v1 stage in your API Gateway endpoint.

Application Load Balancers vs API Gateway

AWS comes with a service called Application Load Balancer and it could be a trigger to a lambda function. The way to call such a lambda function is by sending an HTTP/HTTPS request to ALB.
Now my question is how this is any different from using the API Gateway? And when should one use ALB over API Gateway (or the way around)?

One of the biggest reasons we use API gateway in front of our lambda functions instead of using an ALB is the native IAM (Identity and Access Management) integration that API GW has. We don't have to do any of the identity work ourselves, it's all delegated to IAM, and in addition to that, API GW has built-in request validation including validation of query string parameters and headers. In a nutshell, there are so many out of the box integrations what come with API GW, you wind up having to do a lot more work if you go the route of using an ALB.

It seems that the request/response limit is lower when using ALB, and WebSockets are not supported:
The maximum size of the request body that you can send to a Lambda
function is 1 MB. For related size limits, see HTTP Header Limits.
The maximum size of the response JSON that the Lambda function can
send is 1 MB.
WebSockets are not supported. Upgrade requests are rejected with an
HTTP 400 code.
See: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/lambda-functions.html
Payload limit with API Gateway is discussed here: Request payload limit with AWS API Gateway
Also the article already mentioned by #matesio provides information about additional things to consider when choosing between ALB and API Gateway.
Notable tweet referenced in the mentioned article:
If you are building an API and want to leverage AuthN/Z, request
validation, rate limiting, SDK generation, direct AWS service backend,
use #APIGateway. If you want to add Lambda to an existing web app
behind ALB you can now just add it to the needed route.
(From: Dougal Ballantyne, the Head of Product for Amazon API Gateway)

API gateways usually are richer in functionality than Load balancers. In addition to load balancing, API gateways often capable to do the following:
Content based based routing (some calls to v1 and some calls to v2 and so on, based on certain criteria)
IAM related functionality (eg: access validation )
Security (eg: SSL offloading, DDOS attack prevention, security credentials translation - eg: translating particular type of token to another, etc)
Payload translation (eg: XML to Json, etc)
Additionally, API gateways may be available in appliance form - and appliances are usually of low-latency, far more secure, etc.
I am not aware of specific features of AWS API gateway, but the above ones are general features of any API gateway. Nevertheless, when you have an option to use either LB or API gateway to offer a service on internet, API gateway is usually a better option, unless there are specific reasons to choose otherwise.

Auth between AWS API Gateway and Elastic Cloud hosted Elasticsearch

We're configuring an AWS API Gateway proxy in front of Elasticsearch deployed on Elastic Cloud (for throttling, usage plans, and various other reasons). In order to authenticate between the Gateway and ES, one idea is to configure an integration request on the API Gateway resource to add an Authorization header with creds created in ES. Is this the best strategy? It seems inferior to IAM roles, but that option isn't available as they're not accessible for the ES instance (Elastic Cloud hosts our deployment on AWS, but it's not a resource under our control). The API Gateway itself will require an API key.

I am not an expert at Elasticsearch, but it sounds like you want to securely forward a request from API gateway to another REST web service. Because Elasticsearch is an external REST web service to AWS, you will not have access to IAM roles. I had a similar integration to another cloud rest service (not elasticsearch) will do my best to review the tools in AWS that are available to complete the request.
One idea is to configure an integration request on the API Gateway resource to add an Authorization header with creds created in ES. Is this the best strategy?
This is the most straightforward strategy. In API Gateway, you can map custom headers in the Integration Request. This is where you will map your Authorization header for Elastic Search.
Similarly you can map your Authorization Header as a "Stage Variable" which will make it easier to maintain if the Authorization Header will change across different Elasticsearch environments.
In both strategies, you are storing your Authorization Header in API Gateway. Since the request to Elasticsearch should be HTTPS, the data will be secure in transit. This thread has more information about storing credentials in API Gateway.
From MikeD#AWS: There are currently no known issues with using stage variables to manage credentials; however, stage variables were not explicitly designed to be a secure mechanism for credentials management. Like all API Gateway configuration information, stage variables are protected using standard AWS permissions and policies and they are encrypted when transmitted over the wire. Internally, stage variables are treated as confidential customer information.
I think this applies to your question. You can store the Authorization Header in the API Gateway Proxy, however you have to acknowledge that API Gateway Configuration information was not explicitly designed for sensitive information. That being said, there are no known issues with doing so. This approach is the most straightforward to configure if you are willing to assume that risk.
What is a more "AWS" Approach?
An "AWS" approach would be to use the services designed for the function. For example, using the Key Management Service to store your Elasticsearch Authorization Header.
Similarly to the tutorial referenced in the comments, you will want to forward your request from API Gateway to Lambda. You will be responsible for creating the HTTPS request to Elasticsearch in the language of your choice. There are several tutorials on this but this is the official AWS documentation. AWS provides blueprints as a template to start a Lambda Function. The Blueprint https-request will work.
Once the request is being forwarded from API Gateway to Lambda, configure the authorization header for the Lambda request as an Environment Variable and implement Environment Variable Encryption. This is a secure recommended way to store sensitive data, such as the Elasticsearch authorization header.
This approach will require more configuration but uses AWS services for intended purposes.
My Opinion: I initially used the first approach (Authorization Headers in API Gateway) to authenticate with a dev instance because it was quick and easy, but as I learned more I decided the second approach was more aligned with the AWS Well Architected Framework

AWS API Gateway Lambda as a proxy for microservices

As my project is going to be deployed on AWS, we started thinking about AWS API Gateway as a way to have one main entry point for all of our microservices(frankly speaking, we also would like to use by some other reasons like security). I was playing with API Gateway REST API and I had feeling that it it a bit incovinient if we have to register there every REST service we have.
I found very good option of using AWS API Gateway and lambda function as a proxy. It is described here:
https://medium.com/wolox-driving-innovation/https-medium-com-wolox-driving-innovation-building-microservices-api-aws-e9a455cc3456
https://aws.amazon.com/blogs/compute/using-api-gateway-with-vpc-endpoints-via-aws-lambda
I would like to know your opinion about this approach. May be you could also share some other approaches that can simplify API Gateway configuration for REST API?

There are few considerations when you proxy your existing services through API Gateway.
If your backend is not publicly then you need to setup a VPC and a site to site VPN connection from the VPC to your backend Network and use Lambda's to proxy your services.
If you need do any data transformations or aggregations, you need to use Lambda's(Inside VPC is optional unless VPN connection is needed).
If you have complex integrations behind the API gateway for your services, you can look into having ESB or Messaging Middleware running in your on-premise or AWS then proxy to API Gateway.
You can move data model schema validations to API Gateway.
You can move service authentication to API Gateway by writing a Custom Authorizer Lambda.
If you happen to move your User pool and identity service to AWS, you can migrate to AWS Cognito Manage Service and use AWS Cognito Authorizer in API Gateway to authenticate.

For usecases when you adopt dumb pipes (as described on martinfowler.com) AWS API Gateway is a reasonable option.
For AWS API Gateway I'd suggest to describe/design your API first with RAML or OpenAPI/Swagger and then import into AWS using AWS API Importer.
As soon as you plan to move logic in there, such as dynamic routing, detailed monitoring, alerting, etc, I'd suggest considering other approaches, such as:
Apigee
Mulesoft
WSO2
You can also host them on an EC2 within your VPC or opt-in for the hosted version. (which does have a significant pricetag in some cases)
For describing APIs you can use RAML (for Mulesoft) or OpenAPI (ex-Swagger, for Apigee and WSO2). You can also convert between them using APIMATIC which enables you to migrate your specification across various API Gateways (even AWS).

How to secure the APIGateway generated URL?

I have a serverless backend that operates with APIGateway and Lambda. Here is my architecture:
Currently, anyone with my APIGateway's URL can query or mutate the data. How do I protect the URL, so that only the client(react app) can access it. So, here is my concern, anyone can open the network tab in chrome console and get my APIGateway's URL and can use it using curl or postman. I want to prevent that.
Solutions I had in my mind:
Set up a CORS, so that only the origin can access it. But, I have a different lambda that invokes this URL. So, CORS wont work out.
I am sure there are some methods with the APIGateway itself. I am not getting right search term to get it from AWS documentation. I would also like to know what are the best practices to prevent accessing the backend URL apart from the Client(React App)
Update after #Ashan answer:
Thank you #Ashan for the answer. In my case, I use Auth0, so custom authoriser should work for me. I just came across this https://www.youtube.com/watch?v=n4hsWVXCuVI, which pretty much explains all the authorization and authentication possible with APIGateway. I am aware that authentication is possible either by Cognito/Auth0, but I have some simple websites, that has form, whose backend is handled by APIGateway. I can prevent the abuse from scraping bots using captcha, but once the attacker has got the URL, header and request parameters, he can invoke that million times. One thing, we can do is having an API-Key, but it is a static string with no expiration. Once the headers are with him, he can abuse it. So, any idea, how to prevent this in APIGateway. If not any other service apart from AWS that I can look for? Would be glad, If I get an answer for this.

Currently API Gateway does not support private urls, so it will be publicly available.
To restrict access you need to use a authorizer to authenticate and authorize the request using IAM policies. There are two options available at the moment.
IAM authorizer
Custom authorizer
If your authentication flow can directly (AWS STS, IAM user access keys or roles) or indirectly(Using AWS Cognito Userpools or any other SSO provider) can get temporary security credentials, then you can use IAM authorizer. From API Gateway side no code involved and its a matter of selecting the IAM check box for each API Gateway resource. You can use the API Gateway SDKs to invoke API Gateway requests where the SDK will handle the heavy liftings in setting up authentication headers.
If you use your own authentication mechanism, then you can write a seperate Lambda function to validate the tokens. This Lambda function name can be specified at API Gateway with the http hearder name to access the custom token to verify the requests.
To control API usage by authorized consumers, using API Key is the only way native to AWS at the moment.
Since you are using S3 for the react app hosting, you can further reduce the attack surface by using AWS WAF and CloudFront infront your application stack. The API Key can be added to CloudFront headers to forward to your APIGateway origin and since CloudFront and APIGateway communication happens using SSL, its nearly impossible for someone to find the API key. Using AWS WAF you can limit malicious access for common attacks. This includes rate based blocking to limit someone from repeatedly invoking the API.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js