Implement Lambda@Edge authentication for CloudFront


I am looking to add Lambda@Edge to one of our services. The goal is to match the URL against a regex for certain values and compare those against a header value to ensure authorization. If the value is present, it is compared, and if the comparison fails the user should immediately get a 403. If the compared value matches, or the URL doesn't contain the particular value, the request continues on as an authorized request.
Initially I was thinking that this would occur with a "viewer request" event. Some of the posts and comments on SO suggest that the "origin request" is better suited to this check. Right now I'm trying out the examples from the documentation on one of our CF endpoints, but I'm not seeing the expected results. The code is the following:
'use strict';
exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    request.headers["edge-test"] = [{
        key: 'edge-test',
        value: Date.now().toString()
    }];
    console.log(require('util').inspect(event, { depth: null }));
    callback(null, request);
};
I would expect a logged value in CloudWatch and a new header value in the request, yet I'm not seeing any logs, nor am I seeing the header value when the request comes in.
Can someone shed some light on why things don't seem to be executing the way I expect? Is my understanding of the expected output wrong? Is there configuration that I may be missing? (My distribution ID on the trigger is set to the instance we want, and the behavior was set to '*'.) Any help is appreciated :)

First, a few notes:
CloudFront is (among other things) a web cache.
A web cache's purpose is to serve content directly to the browser instead of sending the request to the origin server.
However, one of the most critical things a cache must do correctly is not return the wrong content. One of the ways a cache can return the wrong content is by not realizing that certain request headers may cause the origin server to vary the response it returns for a given URI.
CloudFront has no perfect way of knowing this, so its solution -- by default -- is to remove almost all of the headers from the request before forwarding it to the origin. Then it caches the received response against exactly the request that it sent to the origin, and will only use that cached response for future identical requests.
Injecting a new header in a Viewer Request trigger will cause that header to be discarded after it passes through the matching Cache Behavior, unless the cache behavior specifically is configured to whitelist that header for forwarding to the origin. This is the same behavior you would see if the header had been injected by the browser, itself.
So, your solution to get this header to pass through to the origin is to whitelist it in the cache behavior settings.
If you tried this same code as an Origin Request trigger, without the header whitelisted, CloudFront would actually throw a 502 Bad Gateway error, because you're trying to inject a header that CloudFront already knows you haven't whitelisted in the matching Cache Behavior. (In Viewer Request, the Cache Behavior match hasn't yet occurred, so CloudFront can't tell if you're doing something with the headers that will not ultimately work. In Origin Request, it knows.) The flow is Viewer Request > Cache Behavior > Cache Check > (if cache miss) Origin Request > send to Origin Server. Whitelisting the header would resolve this, as well.
Any header you want the origin to see, whether it comes from the browser, or a request trigger, must be whitelisted.
Note that some headers are inaccessible or immutable, particularly those that could be used to co-opt CloudFront for fraudulent purposes (such as request forgery and spoofing) and those that simply make no sense to modify.

Related

Cloudfront Lambda@Edge set cookie on Viewer Request

Update: Collected my thoughts better
I'm generating a unique identifier (UUID) for each user in the Viewer Request Lambda, and then selecting a cached page to return based upon that UUID. This works.
Ideally, this user would always have the same UUID.
I must generate that UUID in the Viewer Request if it is not present in a cookie on that Viewer Request. I also need that UUID to be set as a cookie, which of course happens in the response not the request.
Without caching, my server simply handles taking a custom header and creating a Set-Cookie in the response header.
I am not finding a way to handle this if I want to cache the page. I can ignore the request header for caching and serve the correct cached page, but then the user does not persist that UUID as no cookie is set to be utilized in their next request.
Has anyone accomplished something like this?
Things I'm trying
There are a few angles I'm working on with this, but haven't been able to get to work yet:
Some sort of setting in Cloudfront I'm unaware of that handles the header or other data pass-through from Viewer Request to Viewer Response, which could be used in a second lambda in Cloudfront.
Modify the response object headers preemptively in the Viewer Request. I don't think this is possible, as the return headers are not yet created, unless there's some built-in Cloudfront methodology I'm missing.
An existing pass-through header of some sort, I don't know if that's even a thing since I'm not intimately familiar with this aspect of request-response handling, but worth a shot.
Possibly (haven't tried yet though) I could create the entire response object in the Viewer Request lambda and somehow serve the cached page from there, modifying the response headers then passing it into the callback method.
Tobin's answer actually works, but is not a solid solution. If the user is not storing or serving their cookies it becomes an infinite loop, plus I'd rather not throw a redirect up in front of all of my pages if I can avoid it.
Somewhat-working concept
Viewer Request Lambda, when UUID not present in cookies, generates UUID
Viewer Request Lambda sets UUID in cookies on header in request object. Callback with updated request object passed in
Presence of UUID cookie busts Cloudfront cache
Origin Request Lambda is triggered with UUID present
Origin Request Lambda calls original request URL again via http.get with UUID cookie set (40KB limit makes doing this in the Viewer Request Lambda impractical)
Second scenario for Viewer Request Lambda, seeing UUID now present, strips the UUID cookie then continues the request normally
The second request is served by a second Origin Request if not yet cached, or from the cache if cached (the cache-busting UUID is not present), and returns the actual page HTML to the first Origin Request
First Origin Request receives response from http.get containing HTML
First Origin Request creates custom response object containing response body from http.get and Set-Cookie header set with our original UUID
Subsequent calls, having the UUID already set, will strip the UUID from the cookie (to prevent cache busting) and skip directly to the second-scenario in the Viewer Request Lambda which will directly load the cached version of the page.
I say "somewhat" because when I try to hit my endpoint, I get a binary file downloaded.
EDIT
This is because I was not setting the content-type header. I now have only a 302 redirect problem... if I overcome this I'll post a full answer.
Original question
I have a function on the Viewer Request that picks an option and sets some things in the request before it's retrieved from the cache or server.
That works, but I want it to remember that choice for future users. The thought is to simply set a cookie I can read the next time that user comes through. As this is on the Viewer Request and not the Viewer Response I haven't figured out how to make that happen, or if it even is possible via the Lambda itself.
Viewer Request ->
Lambda picks options (needs to set cookie) ->
gets corresponding content ->
returns to Viewer with set-cookie header intact
I have seen the examples and been able to set cookies successfully in the Viewer Response via a Lambda. That doesn't help me much as the decision needs to be made on the request. Quite unsurprisingly adding this code into the Viewer Request shows nothing in the response.

I would argue that the really correct way to set a nonexistent cookie would be to return a 302 redirect to the same URI with Set-Cookie, and let the browser redo the request. This probably would not have much of an impact since the browser can reuse the same connection to "follow" the redirect.
But if you insist on not doing it that way, then you can inject the cookie into the request with your Viewer Request trigger and then emit a Set-Cookie with the same value in your Viewer Response trigger.
The request object, in a viewer response event, can be found at the same place where it's found in the original request event, event.Records[0].cf.request.
In a viewer-response trigger, this part of the structure contains the "request that CloudFront received from the viewer and that might have been modified by the Lambda function that was triggered by a viewer request event."
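The two-trigger approach can be sketched roughly as follows. This is an illustration under assumptions, not a tested deployment: the cookie name uuid and the cookie attributes are hypothetical, and in practice the two functions would be deployed separately, each exporting its own handler.

```javascript
'use strict';
const { randomUUID } = require('crypto');

const COOKIE_NAME = 'uuid'; // hypothetical cookie name

// Look for a cookie across all Cookie headers; returns null when absent.
function readCookie(headers, name) {
    for (const header of headers.cookie || []) {
        for (const pair of header.value.split(/;\s*/)) {
            const eq = pair.indexOf('=');
            if (eq > 0 && pair.slice(0, eq) === name) {
                return pair.slice(eq + 1);
            }
        }
    }
    return null;
}

// Viewer Request: inject the cookie into the request when the browser sent none.
function onViewerRequest(event) {
    const request = event.Records[0].cf.request;
    if (readCookie(request.headers, COOKIE_NAME) === null) {
        const cookies = request.headers.cookie || (request.headers.cookie = []);
        cookies.push({ key: 'Cookie', value: `${COOKIE_NAME}=${randomUUID()}` });
    }
    return request;
}

// Viewer Response: read the (possibly injected) cookie back from cf.request
// and emit a Set-Cookie with the same value.
function onViewerResponse(event) {
    const { request, response } = event.Records[0].cf;
    const value = readCookie(request.headers, COOKIE_NAME);
    if (value !== null) {
        const setCookies = response.headers['set-cookie'] || (response.headers['set-cookie'] = []);
        setCookies.push({ key: 'Set-Cookie', value: `${COOKIE_NAME}=${value}; Path=/; Secure; HttpOnly` });
    }
    return response;
}

exports.viewerRequest = async (event) => onViewerRequest(event);
exports.viewerResponse = async (event) => onViewerResponse(event);
```

As written, the response trigger re-sends the cookie on every response, including for browsers that already had it; that keeps the sketch short at the cost of a redundant header.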
Use caution to ensure that you handle the cookie header correctly. The Cookie request header requires careful and accurate manipulation because the browser can use multiple formats when multiple cookies exist.
Once upon a time, cookies were required to be sent as a single request header.
Cookie: foo=bar; buzz=fizz
Parse these by splitting the values on ; followed by <space>.
But the browser may also split them with multiple headers, like this:
Cookie: foo=bar
Cookie: buzz=fizz
In the latter case, the array event.Records[0].cf.request.headers.cookie will contain multiple members. You need to examine the value attribute of each object in that array, check for multiple values within each, as well as accommodating the fact that the array will be completely undefined (not empty) if no cookies exist.
Bonus: Here's a function I wrote, that I believe correctly handles all cases including the case where there are no cookies. It will extract the cookie with the name you are looking for. Cookie names are case-sensitive.
// extract a cookie value from request headers, by cookie name
// const my_cookie_value = extract_cookie(event.Records[0].cf.request.headers,'MYCOOKIENAME');
// returns null if the cookie can't be found
// https://stackoverflow.com/a/55436033/1695906
function extract_cookie(headers, cname) {
    const cookies = headers['cookie'];
    if(!cookies)
    {
        console.log("extract_cookie(): no 'Cookie:' headers in request");
        return null;
    }
    // iterate through each Cookie header in the request, last to first
    for (var n = cookies.length; n--;)
    {
        // examine all values within each header value, last to first
        const cval = cookies[n].value.split(/;\ /);
        const vlen = cval.length;
        for (var m = vlen; m--;)
        {
            const cookie_kv = cval[m].split('=');
            if(cookie_kv[0] === cname)
            {
                return cookie_kv[1];
            }
        } // for m (each value)
    } // for n (each header)
    // we have no match if we reach this point
    console.log('extract_cookie(): cookies were found, but the specified cookie is absent');
    return null;
}

Are you able to add another directory: with the first cookie setter request, return (from the lambda) a redirect which includes the cookie-set header, that redirects to your actual content?
OK, long way round but:
Take cookie instruction from the incoming request
Set this somewhere (cache, etc)
Let the request get your object
on the Response, also call a function that reads the (cache) and sets the set-cookie header on the response if needed?

It's been more than one year since the question was published. I hope you found a solution and you can share it with us!
I am facing the same problem and I've also been thinking about the infinite loop... What about this?
The viewer request event sends back a 302 response with the cookie set, e.g. uuid=whatever and a GET parameter added to the URL in the Location header, e.g. _uuid_set_=1.
In the next viewer request where the GET parameter _uuid_set_ is set (and equals 1, but this is not needed), there will be two options:
Either the cookie uuid is not set, in which case you can send back a response 500 to break the loop, or whatever fits your needs,
or the cookie is set, in which case you send another 302 back with the parameter _uuid_set_ removed, so that it is never seen by the end user and cannot be copy-pasted and shared and we all can sleep at night.
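The loop-breaking idea above could look something like this in a Viewer Request trigger. It is a sketch under assumptions: the cookie name uuid and the marker parameter _uuid_set_ come from the description above, while the helper names and the 500 body are made up for illustration. Note that URLSearchParams may re-encode the original query string.

```javascript
'use strict';
const { randomUUID } = require('crypto');

const COOKIE_NAME = 'uuid';
const MARKER = '_uuid_set_';

// True if any Cookie header carries the named cookie.
function hasCookie(headers, name) {
    return (headers.cookie || []).some((h) =>
        h.value.split(/;\s*/).some((pair) => pair.startsWith(name + '=')));
}

// Build a generated 302 response, optionally with extra headers.
function redirect(location, extraHeaders = {}) {
    return {
        status: '302',
        statusDescription: 'Found',
        headers: { location: [{ key: 'Location', value: location }], ...extraHeaders },
    };
}

function handleViewerRequest(request) {
    const params = new URLSearchParams(request.querystring);
    const marked = params.has(MARKER);
    const cookiePresent = hasCookie(request.headers, COOKIE_NAME);

    if (!marked) {
        if (cookiePresent) return request; // normal case: continue to the cache
        // First visit: set the cookie and redirect with the marker added.
        params.set(MARKER, '1');
        return redirect(`${request.uri}?${params}`, {
            'set-cookie': [{ key: 'Set-Cookie', value: `${COOKIE_NAME}=${randomUUID()}; Path=/` }],
        });
    }
    if (!cookiePresent) {
        // Marker present but no cookie: the browser refuses cookies, so break the loop.
        return { status: '500', statusDescription: 'Internal Server Error', body: 'Cookies are required' };
    }
    // Cookie stuck: redirect once more to hide the marker from the user.
    params.delete(MARKER);
    const qs = params.toString();
    return redirect(qs ? `${request.uri}?${qs}` : request.uri);
}

exports.handler = async (event) => handleViewerRequest(event.Records[0].cf.request);
```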

Intermittent 403 CORS Errors (Access-Control-Allow-Origin) With Cloudfront Using Signed URLs To GET S3 Objects

In Brief
In order to keep the uploaded media (S3 objects) private for all the clients on my multi-tenant system I implemented a Cloudfront CDN deployment and configured it (and its Origin S3 Bucket) to force the use of signed URLs in order to GET any of the objects.
The Method
First, the user is authenticated via my system, and then a signed URL is generated and returned to them using the AWS.CloudFront.Signer.getSignedUrl() method provided by the AWS JS SDK, so they can make the call to CF/S3 to download the object (image, PDF, docx, etc). Pretty standard stuff.
The Problem
The above method works 95% of the time. The user obtains a signed URL from my system and then when they make an XHR to GET the object it's retrieved just fine.
But, 5% of the time a 403 is thrown with a CORS error stating that the client origin is not allowed by Access-Control-Allow-Origin.
This bug (error) has been confirmed across all environments: localhost, dev.myapp.com, prod.myapp.com. And across all platforms/browsers.
There's such a lack of rhyme or reason to it that I'm actually starting to think this is an AWS bug (they do happen, from time-to-time).
The Debugging Checklist So Far
I've been going out of my mind for days now trying to figure this out. Here's what I've attempted so far:
Have you tried a different browser/platform?
Yes. The issue is present across all client origins, browsers (and
versions), and all platforms.
Is your S3 Bucket configured for CORS correctly?
Yes. It's wide-open in fact. I've even set <MaxAgeSeconds>0</MaxAgeSeconds> in
order to prevent caching of any pre-flight OPTIONS requests by the
client:
Is the signed URL expired?
Nope. All of the signed URLs are set to expire 24hrs after generation. This problem has shown up even seconds
after any given signed URL is generated.
Is there an issue with the method used to generate the signed URLs?
Unlikely. I'm simply using the AWS.CloudFront.Signer.getSignedUrl()
method of their JS SDK. The signed URLs do work most of the time, so
it would seem very strange that it would be an issue with the signing
process. Also, the error is clearly a CORS error, not a signature
mis-match error.
Is it a timezone/server clock issue?
Nope. The system does serve users across many timezones, but that
theory proved to be false given that the signed URLs are all generated
on the server-side. The timezone of the client doesn't matter, it gets
a signed URL good for 24hrs from the time of generation no matter what
TZ it's in.
Is your CF distro configured properly?
Yes, so far as I can make out by following several AWS guides,
tutorials, docs and such.
Here's a screenshot for brevity. You can see that I've disabled
caching entirely in an attempt to rule that out as a cause:
Are you seeing this error for all mime-types?
No. This error hasn't been seen for any images, audio, or video files
(objects). With much testing already done, this error only seems to
show up when attempting to GET a document or PDF file (.doc, .docx,
.pdf). This led me to believe that this was simply an Accept header
mismatch error: the client was sending an XHR with the header
Accept: pdf, but really the signature was generated for Accept: application/pdf.
I haven't yet been able to fully rule this out as a
cause. But it's highly unlikely given that the errors are
intermittent. If it were an Accept header mismatch problem, then it
should be an error every time.
Also, the XHR is sending Accept: */* so it's highly unlikely this is where the issue is.
The Question
I've really hit a wall on this one. Can anyone see what I'm missing here? The best I can come up with is that this is some sort of "timing" issue. What sort of timing issue, or if it even is a timing issue, I've yet to figure out.
Thanks in advance for any help.

Found the solution to the same problem on Server Fault.
https://serverfault.com/questions/856904/chrome-s3-cloudfront-no-access-control-allow-origin-header-on-initial-xhr-req
You apparently cannot successfully fetch an object from HTML and then
successfully fetch it again as a CORS request with Chrome and S3
(with or without CloudFront), due to peculiarities in the
implementations.
Adding the answer from original post so that it does not get lost.
Workaround:
This behavior can be worked around with CloudFront and Lambda@Edge, using the following code as an Origin Response trigger.
This adds Vary: Access-Control-Request-Headers, Access-Control-Request-Method, Origin to any response from S3 that has no Vary header. Otherwise, the Vary header in the response is not modified.
'use strict';
// If the response lacks a Vary: header, fix it in a CloudFront Origin Response trigger.
exports.handler = (event, context, callback) => {
    const response = event.Records[0].cf.response;
    const headers = response.headers;
    if (!headers['vary'])
    {
        headers['vary'] = [
            { key: 'Vary', value: 'Access-Control-Request-Headers' },
            { key: 'Vary', value: 'Access-Control-Request-Method' },
            { key: 'Vary', value: 'Origin' },
        ];
    }
    callback(null, response);
};

API Gateway Lambda CORS handler. Getting Origin securely

I want to implement CORS for multiple origins, and I understand I need to do so via a Lambda function as I cannot do that via the MOCK method:
exports.handler = async (event) => {
    const corsUrls = (process.env.CORS_URLS || '').split(',')
    const requestOrigin = (event.headers && event.headers.origin) || ''
    if (corsUrls.includes(requestOrigin)) {
        return {
            statusCode: 204,
            headers: {
                "Access-Control-Allow-Headers": 'Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token,X-Requested-With',
                'Access-Control-Allow-Origin': requestOrigin,
                'Access-Control-Allow-Methods': 'POST,DELETE,OPTIONS'
            }
        }
    }
    return {
        statusCode: 403,
        body: JSON.stringify({
            status: 'Invalid CORS origin'
        })
    }
}
Firstly, does the above look OK? Then I am getting origin from headers event.headers.origin. But I find that I can just set that header manually to "bypass" cors. Is there a reliable way to detect the origin domain?

Firstly, does the above look OK?
Your code looks good to me at first glance. Other than the point you raise yourself (But I find that I can just set that header manually to "bypass" cors), I don't see any major problems with it.
Then I am getting origin from headers event.headers.origin. But I find that I can just set that header manually to "bypass" cors. Is there a reliable way to detect the origin domain?
The code you are currently using is the only way I can think of, off the top of my head, to detect the origin domain. As you said, though, you can just set that header manually, and there is no assurance that the header is correct or valid. It shouldn't be used as a layer of trust for security. Browsers restrict how this header can be set (see Forbidden header name). But if you control the HTTP client (e.g. curl, Postman), you can easily send whatever headers you want. There is nothing technology-wise preventing me from sending any headers with whatever values I want to your web server.
Therefore, at the end of the day, it might not be a huge concern. If someone tampers with that header, they are opening themselves up to security risks and unexpected behavior. There are a ton of ways to bypass CORS (see the Bypass CORS links under Further information), so it's possible to bypass CORS despite your best efforts to enforce it. All of those tricks are hacks, though, and probably won't be used by normal users. The same goes for changing the Origin header: it's not likely to be done by normal users.
There are a few other tricks you could look into to enforce it a little more. You could look at the Referer header and see if it matches the Origin header. Again, it's possible to send anything for any header, but this makes tampering a bit harder and enforces what you want a little more.
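A rough sketch of that Referer comparison, assuming an API Gateway proxy event whose headers object may use any capitalization (the function name is made up for illustration). It is an extra signal only, since both headers can be forged by a non-browser client.

```javascript
'use strict';

// Returns true only when both headers are present and the Referer URL's
// origin (scheme + host + port) equals the Origin header value.
function refererMatchesOrigin(headers) {
    // Normalize header names, since clients and API Gateway may vary the case.
    const lower = {};
    for (const [name, value] of Object.entries(headers || {})) {
        lower[name.toLowerCase()] = value;
    }
    const origin = lower['origin'];
    const referer = lower['referer'];
    if (!origin || !referer) {
        return false;
    }
    try {
        return new URL(referer).origin === origin;
    } catch (err) {
        // The Referer value was not a parseable absolute URL.
        return false;
    }
}
```

Treating a missing Referer as a mismatch is a design choice; some browsers and referrer policies omit the header, so a real deployment may prefer to skip the check in that case.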
If you assume that your origin header should always equal the domain of your API Gateway API then the other thing you can look into is the event.requestContext object that API Gateway gives you. That object has resourceId, stage, accountId, apiId, and a few other interesting properties attached to it. You could look into building a system that will also verify those and based on those values, determine which API in API Gateway is making the request. This might require ensuring that you have separated out each domain into a separate API gateway API.
I don't see any way those values in event.requestContext could be tampered with, though, since AWS sets them before passing the event object to you. They are derived by AWS and cannot easily be tampered with by a user (unless the entire makeup of the request changes). They are certainly a lot more tamper-resistant than headers, which are just sent with the request and passed through to you by AWS.
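As a sketch of that idea, the handler could compare event.requestContext against a list of APIs it expects to be called through. The apiId and stage values below are placeholders, and the allowlist itself is an assumption about how you would organize your APIs.

```javascript
'use strict';

// Hypothetical allowlist of API Gateway APIs allowed to invoke this function.
const EXPECTED_APIS = [
    { apiId: 'a1b2c3d4e5', stage: 'prod' },
    { apiId: 'f6g7h8i9j0', stage: 'dev' },
];

// True when the event was delivered by one of the expected API/stage pairs.
function isExpectedApi(event) {
    const ctx = (event && event.requestContext) || {};
    return EXPECTED_APIS.some(
        (api) => api.apiId === ctx.apiId && api.stage === ctx.stage
    );
}
```

This tells you which API delivered the request, not which website the caller came from, so it complements the Origin check rather than replacing it.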
Of course you can combine multiple of those solutions together to create a solution that enforces your policy more. Remember, security is a spectrum, so how far down that spectrum you go is up to you.
I would also encourage you to remember that CORS is not totally meant to hide information on the internet. Those methods I shared about how you can bypass CORS with a simple backend system, or plugin, show that it's not completely foolproof and if someone really wants to fake headers they will be able to. But of course at the end of the day you can make it as hard as possible for that to be achieved. But that requires implementing and writing a lot of code and doing a lot of checks to make that happen.
You really have to ask yourself what the objectives and goals are. I think that really determines your next steps. You might determine that your current setup is good enough and no further changes are necessary. You might determine that you are trying to protect sensitive data from being sent to unauthorized origins, which in that case CORS probably isn't a solid solution (due to the ability to set that header to anything). Or you might determine that you might wanna lock things down a bit more and use a few other signals to enforce your policy a bit more.
TL;DR: You can for sure set the Origin header to anything you want, therefore it should not be completely trusted. If you assume that your origin header should always equal the domain of your API Gateway API, you can try to use the event.requestContext object to get more information about the API in API Gateway that received the request. You could also look into the Referer header to see if you can compare that against the Origin header.
Further information:
Is HTTP Content-Length Header Safe to Trust? (disclaimer: I posted this question on Stack Overflow a while back)
Example Amazon API Gateway AWS Proxy Lambda Event
Bypass CORS
https://github.com/Rob--W/cors-anywhere
https://chrome.google.com/webstore/detail/allow-control-allow-origi/nlfbmbojpeacfghkpbjhddihlkkiljbi
https://www.thepolyglotdeveloper.com/2014/08/bypass-cors-errors-testing-apis-locally/
Referer Header
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer
https://en.wikipedia.org/wiki/HTTP_referer

The only way to validate multiple origins is as you did, with your lambda read the Origin header, compare that to the list of domains you would like to allow, and if it matches, return the value of the Origin header back to the client as the Access-Control-Allow-Origin header in the response.
Note: in a browser, the Origin header is one of the headers set automatically by the user agent, so page scripts cannot alter it programmatically. For more details, look at MDN.

AWS Lambda@Edge. Access browser cookie in origin response triggered function

(My setup: CloudFront + S3 Origin)
Hi everyone!
This is what I’m trying to do:
Step 1. Trigger a Lambda function on viewer request. Get cookie with user preferred language if available (this cookie is set when the user chooses site language).
Step 2. Trigger a Lambda function on origin response. If response is an error (ex. 404), return an error page to the viewer based on the preferred language cookie from step 1.
My question is: how do I make information gotten in step 1 available in step 2? In general, how do I process a response based on user request AND origin response information? I would appreciate any suggestion. Thank you!

You shouldn't need step 1.
Whitelist the cookie for forwarding to the origin in the cache behavior. This causes CloudFront to cache a separate copy of each page, based on the value of the cookie. You'd need this anyway if your origin is going to be able to see the cookie.
In Lambda@Edge, there are viewer-side triggers (in front of the cache) and origin-side triggers (behind the cache).
An Origin Response trigger can see the response returned from the origin, but can also see the request that was sent to the origin.
request
Origin response – The request that CloudFront forwarded to the origin and that might have been modified by the Lambda function that was triggered by an origin request event
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-event-structure.html#lambda-event-structure-response
There's not a straightforward way to send information from a viewer request trigger to an origin response trigger, because they are on opposite sides of the cache and not able to communicate directly.
Your handler will be passed an event.
Everything you need is in event.Records[0].cf.
const cf = event.Records[0].cf;
The response is in cf.response and the request is in cf.request.
If the response status isn't 404, bail out of the origin response trigger and allow CloudFront to continue processing.
if(cf.response.status != "404")
{
    return callback(null, cf.response);
}
Otherwise, extract the cookie from cf.request.headers.cookie (you'll need to parse this array after verifying that it exists -- it will not, if the browser didn't supply cookies), generate your custom response based on the cookie, and return it.
See Generated Responses - Examples for how to return a generated response.
Since you are generating the response in an origin response trigger, it will be stored in the cache according to the value of the Error Caching Minimum TTL (default 5 minutes).

AWS nginx as a service?

I'm looking for a service that allows me to proxy/modify incoming requests inside AWS.
Currently I am using cloudfront, but that has limited functions.
I need to be able to see user agent strings and make proxy decisions based on that - like reverse proxying to another domain, or routing all requests to /index.html.
Anyone know of a service that within AWS - or outside of AWS.

It sounds like you are describing Lambda@Edge, which is a CloudFront enhancement that allows you to define Lambda functions that will fire at any of 4 hook points in the CloudFront signal flow, and modify the request or generate a dynamic response.
Viewer Request triggers allow inspection/modification of requests and dynamic generation of small responses before the cache lookup.
Origin Request triggers are similar, but fire after the cache is checked. They allow you to inspect and modify the request, including changing the origin server, path, and/or query string, or to generate a response instead of allowing CloudFront to proceed with the connection to the origin.
If the request goes to the origin, then once it returns, an Origin Response trigger can fire to modify the response headers or replace the response body with a different body you generate. The response after this trigger is finished with it is what gets stored in the cache, if cacheable.
Once a response is cached, further firing of the Origin Request and Origin Response triggers doesn't occur for subsequent requests that can be served from the cache.
Finally, when the response is ready, whether it came from the cache or the origin, a Viewer Response trigger can modify it further, if desired.
Response triggers can also inspect many of the headers from the original request.
Lambda@Edge functions are written in Node.js, and are presented with the request or responses as simple structured objects that you inspect and/or modify.
