Understanding Server/Client Routing: How Can Amazon(?) Be Redirecting My SPA ... Without a Redirect (or History Entry)? - amazon-web-services

NOTE: I'm providing details of my setup, but really this is a "how is this possible" question, not a "please debug my setup" question.
I have a "singe page application" (ie. an HTML file that uses the History API to simulate URLs). I'm serving this app on AWS S3, behind an AWS Cloudfront ... front.
I had successfully configured things so that if someone went to www.example.com/foo (let's pretend I own example.com), Cloudfront would serve an "error page" of my index.html. My index.html would then see the URL, and use its routing to show the user the correct page.
That all worked great ... until it didn't. Now for some reason when I go to www.example.com/foo, I get redirected to www.example.com. I'm trying to debug things, but what I can't understand is how I'm going from /foo to the main page.
When I look in the Network panel of my developer tools, I can see the request made to the original (/foo). Then I can see the chain of requests (for images, css files, etc.), and they all have a referrer of www.example.com/foo.
Then all of the sudden I see a request for React Developer tools (why it needs to make a request is beyond me) ... and it's from referrer www.example.com. After that I get one last image request from /foo, and then all subsequent requests come from www.example.com.
Can anyone explain how this could be working? I know that if a server returns a redirect (either type) that could change my URL ... but every request has a 200 status (ie. no server redirects).
I know Javascript could "push" a new URL to my browser ... but that would leave a history entry right? When I go "back" (either with my browser or history.back()) I go to the page before; I don't go "back" to /foo.
So somehow I'm not making a history entry, but I am switching my URL, and the URL I make requests from, and this all happens within milliseconds on page load ... without any redirects. How?
P.S. When I use my dev tools to add an beforeunload breakpoint, then try to navigate from example.com to example.com/foo I don't hit that break point (either for going to /foo, or when I'm "redirected" back to example.com).
When I check the box for any Load event, I do see some happen ... after my URL has already switched. In other words, I type example.com/foo, hit enter, and by the time any event fires I'm back on example.com. Whatever mechanism is doing the "redirection" here ... it doesn't trigger any load events.

I figured out my (AWS-specific) problem, thanks to a bit of Gatsby documentation. I'll include the details below in case it helps others, but I won't accept this answer, as I still don't understand how AWS did what it did (and I'd still welcome an answer for that).
What happened was that I had my Cloudfront "Origin Domain Name and Path" pointing to:
example.com.s3.amazonaws.com
However, as explained on https://www.gatsbyjs.com/docs/deploying-to-s3-cloudfront/:
There are two ways that you can connect CloudFront to an S3 origin. The most obvious way, which the AWS Console will suggest, is to type the bucket name in the Origin Domain Name field. This sets up an S3 origin, and allows you to configure CloudFront to use IAM to access your bucket. Unfortunately, it also makes it impossible to perform serverside (301/302) redirects, and it also means that directory indexes (having index.html be served when someone tries to access a directory) will only work in the root directory. You might not initially notice these issues, because Gatsby’s clientside JavaScript compensates for the latter and plugins such as gatsby-plugin-meta-redirect can compensate for the former. But just because you can’t see these issues, doesn’t mean they won’t affect search engines.
In order for all the features of your site to work correctly, you must instead use your S3 bucket’s Static Website Hosting Endpoint as the CloudFront origin. This does (sadly) mean that your bucket will have to be configured for public-read, because when CloudFront is using an S3 Static Website Hosting Endpoint address as the Origin, it’s incapable of authenticating via IAM.
Once I changed my Cloudfront "Origin Domain Name and Path" to the bucket's static hosting URL:
http://example.com.s3-website-us-west-1.amazonaws.com
Everything worked!
But again, I still don't understand how AWS did what it did when I mis-set my "Origin Domain Name and Path". It redirected me to my root domain, seemingly without either a redirect response OR a client-side redirect, and I'd love to hear how that was accomplished.

Related

AWS Route 53 Domain Point To Github Pages

I am new to working with AWS and route 53 so any help is appreciated.
I have created an organization on GitHub, and then created a simple repository for a static site to display with Github pages. this is working as expected and I can see the static site at the URL generated by Github (something like: https://<githubOrgName>.github.io/<repoName>/)
I got a domain from AWS and now I'm trying to set it up so the apex domain (e.g. "my-domain.com") points to the Github pages site.
I followed the instructions found at: https://docs.github.com/en/pages/configuring-a-custom-domain-for-your-github-pages-site/about-custom-domains-and-github-pages ... but it doesn't seem to be working.
I am trying to make it so that the apex domain points to the repository Github page. something like:
https://my-domain.com -> https://<githubOrgName>.github.io/<repoName>/
... but this only shows a blank screen when I go to the root domain ("my-domain.com"). I have also tried to go to https://my-domain.com/<repoName>/... but this shows me a Github 404 page (so it seems to be correctly forwarding something to Github):
my AWS route 53 configuration is similar to the following (i have tried to remove sensitive details):
can anyone explain to me what I am doing wrong? I am new to working with domains so any help is appreciated.
Using Route53 alone won't help you there, because your target URL contains a URL path i.e. /<repoName>/.
DNS is a name resolution system and knows nothing about HTTP
Furthermore, the origin server (github.io) might be running a reverse proxy which might be parsing the request headers, among which is the Host header. You browser automatically sets this header to the url you feed it. Eventually, you send it the wrong header (i.e. https://my-domain.com/), which Github cannot process. You can explicitly set this header (e.g. via curl) to what Github is expecting, but I believe it's not what you and your users would like.
Instead, you could try using layer 7 redirects (301/302) with the help of Lambda#Edge (provided by AWS CloudFront). I have created a simple solution using the Serverless framework, which does the following redirects:
https://maslick.tech -> https://github.com/maslick
https://maslick.tech/cv -> https://www.linkedin.com/in/maslick/
https://maslick.tech/qa -> https://stackoverflow.com/users/2996867/maslick
https://maslick.tech/ig -> https://www.instagram.com/maslick/
But you can customize it by adjusting handler.js according to your needs. You might also need to create a free TLS certificate using AWS Certificate Manager in the us-east-1 region and attach it to your CloudFront distribution. But this is optional.
Lambda#Edge will give low latencies, since your redirects will be served from CloudFront's edge locations across the globe.
How I got it to work was:
Set a CNAME record from example.org to <USERNAME>/github.io. in the Route 53 console
Set Custom domain to example.org in the Github Pages settings for github.com/<USERNAME>/<REPO>
Note: You shouldn't be setting the CNAME record to <USERNAME>/github.io/<REPO>
Source: https://deanattali.com/blog/multiple-github-pages-domains/

Naked domain and http to https redirects

Hope you're all doing well!
I have a question I'm hoping to get some help with. I have a static site served through S3 with CloudFront distributions in front.
My main site is served on www.xyz.xyz and the cloudfront distribution connected ha a behavior http to https redirect.
Then I also want people to be able to access http://xyz.xyz, therefore I have created another bucket for the naked domain, with a redirect policy to www.xyz.xyz with http as protocol. In the CloudFront distribution connected to this the origin is the direct S3 website link, and not the bucket.
In the end this ensures all guests end at https://www.xyz.xyz, however when running Google Lighthouse for a SEO check, if I enter http://xyz.xyz it seems to go through 2 redirects, one to https and one to www and I'm assuming, according to Lighthouse, that this has some negative effects in that regard, both in terms of time to serve, but also SEO.
Am I doing something wrong? I hope you can help me. I really thought it was simpler, also with all the buckets and such :-)
I noticed in AWS Amplify you need to setup redirect/rewrites, but I guess in S3 + CloudFront terms, that's what I'm already doing.
Best,
To maintain compatibility with HSTS, you must perform your redirection in two steps. The first redirect should upgrade the request to https. The second can canonicalize the domain (add or remove www). So this behavior is desirable.

AWS S3 Redirect only works on bucket as a subdomain not bucket as a directory

Many people have received 100s of links to PoCs that are on an internal facing bucket and the links are in this structure.
https://s3.amazonaws.com/bucket_name/
I added a redirect using AWS's Static website hosting section in Properties and it ONLY redirects when the domain is formatted like this:
https://bucket_name.s3-website-us-east-1.amazonaws.com
Is this a bug with S3?
For now, how do I make it redirect using both types of links? My current workaround is to add a meta redirect tag in each html file.
The s3-website is the only endpoint that supports redirects unfortunately. Using the s3.amazonaws.com supposes that you will be using S3 as a storage layer, instead of a website. If the link is to a specific object, you can place an HTML file at that url with a JS redirect, but other than that there is really no way to achieve what you are trying to do.
In the future, i would recommend always setting up a Cloudfront distribution for those kinds of usecases, as that will allow you to change the origin later on.

AWS Cloudfront Signed Cookie Local Setup

I've been attempting to get cloudfront signed cookies setup for a site to make getting HLS manifest segment files easier to authenticate. Setting up the cloudfront origin and code in a live environment seems simple enough looking at resources like
https://mnm.at/markus/2015/04/05/serving-private-content-through-cloudfront-using-signed-cookies/
http://www.spacevatican.org/2015/5/1/using-cloudfront-signed-cookies/
What I'm trying to figure out is if it's possible to to have this working in a local environment (localhost) prior to deploying the initial solution. Cloudfront itself will forward to the live origin which will set the cookies for cloudfront and continue on as normal, but since the code isn't live this will not work until deployed.
Seems like a chicken and egg problem here where I need it live to use it, but can not test it (with code or manually) without it deployed.
Any thoughts here?
You'd not be able to test/run it properly on your localhost. When you try to set cookies for your CloudFront URL, you'll encounter cross domain issue. I'd recommend you to try generate signed URL first. If signed URL works, that means you're on the right direction. Setting up a cookie cannot go wrong as long as you've properly set CNAME in the CloudFront Web distribution and CloudFront URL records are set within your domain provider.

Using Meteor browser-policy package allowOriginForAll for AWS works on http site but not https

So we are using the Meteor browser-policy package, and using Amazon S3 to store content.
On the server we have setup the browser policy as follows:
BrowserPolicy.content.allowOriginForAll('*.amazonaws.com');
BrowserPolicy.content.allowOriginForAll('*.s3.amazonaws.com');
This works fine in local dev and in production when visiting our http:// site. However when using the https:// address to our site the AWS content no longer passes this policy.
The following error is put on the console
Refused to load the image 'http://our-bucket-name.s3.amazonaws.com/asset-stored-in-s3.png' because it violates the following Content Security Policy directive: "img-src data: 'self' *.google-analytics.com *.zencdn.net *.filepicker.io *.uservoice.com *.amazonaws.com *.s3.amazonaws.com".
As you can see we have some other origins allowed in the browser policy, these all seem to work fine in both http and https. AWS S3 is the only one that is failing.
I've tried Chrome, Firefox, and Safari and they all have the same issue.
Whats going on?
I may not have the exact answer to this question but I have some information which the community may find helpful.
First, you should avoid serving mixed content. I'm unclear if that would set off the browser policy alerts but you just shouldn't do it anyway. The easiest solution is to use a protocol-relative-url or just explicitly specify https in your url.
Second, I too assumed that the wildcard worked like a glob. However, I've been told that it works the same way as an ssl certificate rule - i.e. for all subdomains or for a specific subdomain. In other words, *.example.com and www.example.com, are valid but *.foo.example.com, isn't meaningful. I think you want to explicitly add your bucket like so:
BrowserPolicy.content.allowOriginForAll('our-bucket-name.s3.amazonaws.com')
unless you literally want to trust all of amazonaws.com.