I'm a bit confused by how API Gateway and CloudFront work together. Ultimately, I want to be able to have a custom header and value be considered part of my cache key. I know this can be done by whitelisting (if I'm using CloudFront).
So when I make the following request:
GET /pagesRead/4
Some-Header: fizz
This returns, for instance, '29 pages'
Then there's a post that updates id 4 to '45 pages'
If I make this request
GET /pagesRead/4
Some-Header: buzz
It will now return '45 pages'
But I'm using API Gateway, which obviously has its own CloudFront behind the scenes. Is there a way I can configure API Gateway to use its behind-the-scenes CloudFront to whitelist my custom header? Does this even need to be done?
According to this documentation: AWS-API-Gateway, it seems like I can just enable API caching in API Gateway, and it will consider my headers as part of the cache key.
Am I understanding this correctly? If all I want is for my headers to be a part of the cache key, what's the difference between 'Enabling API Caching' in API Gateway and adding a CloudFront instance on top of API Gateway and whitelisting in CloudFront?
UPDATE:
I've added a header like this in API Gateway:
But on GET, I am getting stale data from the cache.
GET /pagesRead/4
test-header: buzz
The difference is that API Gateway doesn't actually use the CloudFront cache. CloudFront does provide some front-end services for edge-optimized API endpoints¹, but caching does not appear to be one of them, based on the following:
API Gateway enables caching by creating a dedicated cache instance.
...and...
You should not use the X-Cache header from the CloudFront response to determine if your API is being served from your API Gateway cache instance.
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-caching.html
It is possible to cascade an Edge-Optimized API Gateway endpoint behind a CloudFront distribution that you create, but it's not without certain inconveniences. Latency increases somewhat, since you're passing through more systems. In that configuration, the CloudFront-Is-*-Viewer and CloudFront-Viewer-Country headers, and probably any notion of the client IP, will be invalid, because the API Gateway deployment will see attributes of the additional CloudFront distribution in front of it, rather than of the real client. X-Forwarded-For will still be right, but it will have to be handled with care, because it will contain one extra hop.
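For instance, the extra hop in X-Forwarded-For can be handled by stripping a known number of trusted proxy entries from the right of the header. A minimal sketch; the trusted-hop count of 2 (your CloudFront distribution plus the edge-optimized front end) is an assumption for illustration, and the IP addresses are placeholders:

```python
def client_ip(xff_header: str, trusted_hops: int = 2) -> str:
    """Return the client IP from an X-Forwarded-For header.

    Each proxy appends to X-Forwarded-For, so the rightmost entries
    belong to infrastructure we trust; skip those and take the last
    remaining entry as the client address.
    """
    hops = [ip.strip() for ip in xff_header.split(",")]
    if trusted_hops >= len(hops):
        return hops[0]  # fewer hops than expected; fall back to leftmost
    return hops[-(trusted_hops + 1)]

# With your own CloudFront in front of an edge-optimized endpoint,
# the header gains one extra proxy entry on the right:
print(client_ip("203.0.113.10, 198.51.100.7, 198.51.100.99"))  # → 203.0.113.10
```

Only count hops you actually control; trusting too many entries lets a client spoof its address by pre-populating the header.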
For an application where you want to put API Gateway behind your own CloudFront distribution, use one of the new Regional endpoints to deploy your API stage.
it will consider my headers as part of the cache key.
You do have to configure the cache key explicitly, based on the document you cited, but yes, the API Gateway cache will then cache responses based on the value of that header, and other attributes in the cache key.
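As a sketch of that explicit configuration with boto3 (the API id, resource id, stage name, and the Some-Header name are placeholders, not values taken from a real deployment):

```python
def cache_key_patch_ops(resource_path: str, method: str) -> list:
    """Stage patch operations that turn on caching for one method.

    In stage patch paths, '/' inside the resource path is escaped
    as '~1' (JSON Pointer escaping).
    """
    escaped = resource_path.replace("/", "~1")
    return [{
        "op": "replace",
        "path": f"/{escaped}/{method}/caching/enabled",
        "value": "true",
    }]

def configure(rest_api_id: str, resource_id: str, stage: str) -> None:
    """Make a custom header part of the cache key, then enable caching."""
    import boto3  # imported here so the helper above needs no AWS SDK
    apigw = boto3.client("apigateway")
    # 1. Declare the custom header on the method request.
    apigw.update_method(
        restApiId=rest_api_id, resourceId=resource_id, httpMethod="GET",
        patchOperations=[{
            "op": "add",
            "path": "/requestParameters/method.request.header.Some-Header",
            "value": "false",  # "false" = header is optional
        }])
    # 2. Add it to the integration's cache key parameters.
    apigw.update_integration(
        restApiId=rest_api_id, resourceId=resource_id, httpMethod="GET",
        patchOperations=[{
            "op": "add",
            "path": "/cacheKeyParameters",
            "value": "method.request.header.Some-Header",
        }])
    # 3. Enable caching for the method on the deployed stage.
    apigw.update_stage(
        restApiId=rest_api_id, stageName=stage,
        patchOperations=cache_key_patch_ops("/pagesRead/{id}", "GET"))
```

After this, requests that differ only in the Some-Header value should be cached as separate entries.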
¹ edge optimized endpoints. API Gateway now has two different kinds of endpoints. The original design is now called edge-optimized, and the new option is called regional. Regional endpoints do not use front-end services from CloudFront, and may offer lower latency when accessed from EC2 within the same AWS region. All existing endpoints were categorized as edge-optimized when the new regional capability was rolled out. With a regional endpoint, the CloudFront-* headers are not present in the request, unless you use your own CloudFront distribution and whitelist those headers for forwarding to the origin.
When you enable caching in API Gateway, you can also optionally add the following to the cache key:
RequestPath
QueryStringParameters
HTTP Headers
E.g.,
http://example.com/api/{feature}/?queryparam=queryanswer [ with header customheader=value1 ]
The URL above gives you the option to cache based on:
Just the URL without PathParameters: http://example.com/api/
Optionally include PathParameter: http://example.com/api/{feature}/
Optionally include QueryStrings: http://example.com/api/{feature}/?queryparam=queryanswer
Optionally include HTTP headers: you can include either a regular header like User-Agent or custom headers
Whatever caching configuration you have in API Gateway, you can also replicate under CloudFront as well.
Also, to invalidate a cache entry, the client can send Cache-Control: max-age=0 in the request (the caller needs to be authorized to invalidate the cache).
Hope it helps.
Related
I am new to Lambda so I would like to understand how the following scenario can be deployed:
Lambda connected to API Gateway (which in turn is connected to a reverse proxy)
The request from API Gateway to Lambda needs to be routed to 3 different ALBs, each in private subnets in different VPCs.
What configuration changes do I need to make to achieve this, apart from writing the Lambda function (in Python) to rewrite the URLs?
It would be nice if someone can explain the message flow here.
Thanks
Abhijit
This to me seems to be multiple issues; I'll try to break down what's being attempted.
Using ALBs with API Gateway
There are many options for how API Gateway can use load balancers to serve http traffic. The solution really depends on which type of API Gateway you are trying to use.
Assuming your API is either REST or WebSocket, you are left with 2 choices for enabling HTTP traffic inbound to a load balancer:
Directly, as an HTTP or HTTP_PROXY integration, listing publicly accessible hostnames to which API Gateway will forward the traffic.
If you want to keep transit private, your only option is to create a Network Load Balancer and make use of VPCLink to create a private connection between API Gateway and your network resource.
If you're creating an HTTP API (sometimes referred to as API Gateway v2) then you can make use of a direct connection to a private ALB; however, be aware that at this time HTTP APIs do not support all the features of REST APIs, so you would want to compare feature sets before doing this.
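The VPCLink option above can be sketched with boto3; the VPC link id and NLB DNS name below are placeholders:

```python
def vpc_link_integration_params(vpc_link_id: str, nlb_dns: str) -> dict:
    """Parameters for put_integration that route a proxy resource
    through a VPC link to a private Network Load Balancer."""
    return {
        "type": "HTTP_PROXY",
        "integrationHttpMethod": "ANY",
        "connectionType": "VPC_LINK",
        "connectionId": vpc_link_id,
        "uri": f"http://{nlb_dns}/{{proxy}}",
    }

def attach(rest_api_id: str, resource_id: str) -> None:
    """Attach the VPC link integration to a {proxy+} resource."""
    import boto3  # imported here so the helper above needs no AWS SDK
    boto3.client("apigateway").put_integration(
        restApiId=rest_api_id,
        resourceId=resource_id,
        httpMethod="ANY",
        **vpc_link_integration_params("abc123", "my-nlb.elb.amazonaws.com"))
```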
Using multiple load balancers to direct traffic
Each resource/method combination is assigned its own integration endpoint; for example, POST /example would be assigned its target endpoint, but only one.
My suggestion, if you're using a REST API, would be to make use of stage variables to specify any endpoints you're forwarding traffic to, for the following reasons:
Prevents mistyping of domain names
Allows quick replacement of a hostname
Provides functionality for canary deployments to shift traffic proportionally between 2 variable values (these could be anything as long as the type is the same, e.g. a Lambda to another Lambda, not a Lambda to a load balancer).
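For example, an integration URI can defer the hostname to a stage variable; the variable name albHost here is hypothetical, and each stage would give it a different value (e.g. an internal ALB DNS name):

```python
def stage_variable_uri(path: str = "{proxy}") -> str:
    """Integration URI whose hostname is resolved per stage.

    API Gateway substitutes ${stageVariables.albHost} at request time
    with the value configured on the current stage.
    """
    return "http://${stageVariables.albHost}/" + path

print(stage_variable_uri())  # → http://${stageVariables.albHost}/{proxy}
```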
Using a Lambda to redirect
Technically a Lambda can perform a redirect by returning a response using the syntax below:
{
    statusCode: 302,
    headers: {
        Location: 'https://api.example.com/new/path',
    }
}
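Since the question mentions Python, the equivalent proxy-integration handler might look like this (the target URL is illustrative):

```python
def handler(event, context):
    """Lambda proxy integration response that redirects the caller."""
    return {
        "statusCode": 302,
        "headers": {
            "Location": "https://api.example.com/new/path",
        },
    }
```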
However, be aware this will change the request to become a GET request, and the body payload will be dropped when the redirect occurs. Additionally, you would need to set this up for every resource/method combination that you wanted to redirect.
There are 2 options available to get around these issues; both involve using CloudFront combined with a Lambda@Edge function.
The first solution can act as a workaround for the request type changing: in the Origin Request event you could modify the request URI property to match the new URI structure. By doing this your clients would still be able to use the API, while you notify them of the deprecation of the paths you are migrating.
The second solution acts as a workaround for the need to add redirects to each resource/method combo, which can create a mess of methods that exist just for redirects. You could create a Lambda@Edge function to perform the same redirect on an Origin Response event, with mappings in your Lambda function to work out which URL it should redirect to.
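A minimal origin-request rewrite of the first kind might look like this in Python; the path mapping is hypothetical:

```python
# Hypothetical mapping from deprecated paths to their replacements
REWRITES = {
    "/old/path": "/new/path",
}

def handler(event, context):
    """Lambda@Edge origin-request handler that rewrites the URI before
    CloudFront forwards the request to the origin, so the client keeps
    its original HTTP method and body (no redirect round-trip)."""
    request = event["Records"][0]["cf"]["request"]
    request["uri"] = REWRITES.get(request["uri"], request["uri"])
    return request
```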
There are many great examples on the Lambda@Edge example functions page.
I have an application where the client is hosted on S3 with a CloudFront distribution. The API is behind an API Gateway with a WAF, and the client makes http requests to the API to fetch and post data.
I want to restrict the access to the API such that it's only available from the client, and it should return an error when someone tries to access the API directly.
The trick is that the API is exposed to a 3rd party, so I cannot use API Gateway authorizers, because they must have direct access.
I set up a Custom Origin Header (My-Secret-Header: 1234567890qwertyuiop) in CloudFront, and I thought that I could create a rule in WAF to allow requests with this header (plus the 3rd party based on other criteria, but that part is working well, and it's not an issue), and block everything else.
The problem is that My-Secret-Header never makes it to the WAF, and it doesn't get added to the http requests originated from the client application.
I also tried to add the custom header with Lambda@Edge, with no success. I created heaps of logs with Lambda@Edge, and event.Records[0].cf.request.origin.s3.customHeaders shows My-Secret-Header (which is expected).
What is the best way to add a custom header to the client request, so that it would be possible to create a rule in WAF?
I want to restrict the access to the API such that it's only available from the client, and it should return an error when someone tries to access the API directly.
The short answer is: there is no way to do this. There is no way to tell if a request originates from JavaScript in the browser, a Postman call, a user typing the URL in the address bar, etc.
Custom Headers in CloudFront are not headers that are added onto API requests that the user makes from the served files. They are headers that CloudFront uses to retrieve the static source that it is serving. In the case that the source is in an S3 bucket, these custom headers are on the request that CloudFront uses to retrieve files from your S3 bucket.
Once a user has the files that CloudFront serves (HTML, CSS, JavaScript, assets, etc.), CloudFront is no longer a part of the process. Any API calls made on the frontend do not go through CloudFront.
There are a few very weak ways to do what you are trying to do, but all are easily bypassable and absolutely cannot be used when security is in any way necessary. For instance, you can make an API Key and hard-code it into the application, but it is completely exposed to anyone who can access the page. Same for hard-coded access key ID and secret access key.
Ultimately what you need is an authentication system, some way to make sure that users are allowed to make the API calls that they are making. I don't know if this fits your use case, but Amazon Cognito is an excellent service that handles user authentication and federation.
from the docs here,
CloudFront adds the header to the viewer request before forwarding the request to your origin. The header value contains an encrypted string that uniquely identifies the request.
Now, when inspecting my AppSync response on the client (Postman), I find x-amz-cf-id as a response header. But I am sure my system has nothing to do with CloudFront.
My questions:
Does AppSync use CloudFront (by default, somewhere internally)?
In what scenario(s) is this header added to my AppSync response?
What does it tell me (the client) with respect to AppSync (other than the CloudFront request id, of course)?
AppSync, API Gateway, Cognito, and lots of other services use CloudFront internally for performance and to reduce latency. CloudFront is playing just the role of a reverse proxy, nothing else.
CloudFront always serves the x-amz-cf-id header in its responses; the string is unique and can be used to track an individual request. There is no use for it with AppSync other than providing it to AWS to track some failed requests if you contact support.
When you resolve the AppSync endpoint and perform a reverse DNS lookup on that IP address, you'll see it actually uses CloudFront.
Is it possible to enable/disable caching a request through the AWS API Gateway in the response of the request?
According to this document: http://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-caching.html it appears that the most granular one can get in defining cache settings is enabling/disabling caching for a specific API function. What I want is to allow the response for the API request to dictate whether or not it is to be cached (i.e. I want my end API program to be able to determine if a response for a given request should be cached).
Is this possible, and if so how can it be accomplished?
Configure your own CloudFront distribution, with the API Gateway endpoint as the origin server. CloudFront web distributions respect Cache-Control headers from the origin server. If you customize that response, this should accomplish your objective.
API Gateway, as you may already know, already runs behind some of the CloudFront infrastructure, so this might seem redundant, but it appears to be the only way to take control of the caching behavior.
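With your own distribution in front, the origin (e.g. a Lambda proxy integration) can decide per response whether CloudFront should cache it. A sketch; the lookup data and the 300-second TTL are illustrative:

```python
import json

def lookup(event):
    # Hypothetical data access; replace with your real lookup.
    record = {"4": "45 pages"}.get(event.get("pathParameters", {}).get("id"))
    return record, record is not None

def handler(event, context):
    """Set Cache-Control based on the response itself: cache successful
    lookups briefly, never cache misses/errors."""
    body, found = lookup(event)
    return {
        "statusCode": 200 if found else 404,
        "headers": {
            "Cache-Control": "max-age=300" if found else "no-store",
        },
        "body": json.dumps(body),
    }
```

CloudFront honors these origin Cache-Control directives (within the behavior's min/max TTL bounds), which is what gives you per-response control.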
I have an AWS API Gateway endpoint (Invoke URL).
I created a Custom Domain to map to my API Gateway, as the Invoke URL is made of non-user-friendly characters.
I mapped the Custom Domain to the API Gateway.
I followed these steps -
http://docs.aws.amazon.com/apigateway/latest/developerguide/how-to-custom-domains.html
Both the default Invoke URL and the Custom Domain endpoint are responding with correct data,
So far so good.
On further testing I found that my default Invoke URL has caching enabled on it.
I enabled the API Gateway cache by following this -
http://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-caching.html
The response was in milliseconds.
Weirdly, the Custom Domain-mapped endpoint is responding slower, and it looks like it is not caching previous responses, even though caching is properly enabled on the API Gateway.
I need to enable caching on the Custom Domain as well.
Do I need to add CloudFront in front of the API Gateway or something?
How do I achieve this?
I am not able to find my Invoke URL in CloudFront origin,
I couldn't understand these solutions either -
1. http://www.davekonopka.com/2016/api-gateway-domain.html
2. How do you add CloudFront in front of API Gateway