Google Cloud HTTPS Load Balancer URL Rewrite To Remove .html Extension - google-cloud-platform

I've read the Traffic management overview for global external HTTP(S) load balancers and the URL maps overview, but I do not see how to do the following:
https://example.com/page ----> https://example.com/page.html
Is it possible to "remove" the .html extension from my URL with Google's global external HTTP(S) load balancer?
My website is hosted on Google Cloud Storage (bucket). I understand that I can use gsutil to set the Content-Type metadata on files to text/html, and that is a viable workaround. However, I would need to script that, and I spent a couple of hours on it without getting it figured out. The script would basically need to recursively list all files with the .html extension, rename them to remove the file extension, and then set the metadata.

URL rewrites allow you to present external users with URLs that are different from the URLs that your services use. Although the documentation mentions URL shortening as a use case, extension removal isn't done through the load balancer. Instead, you either set the file's Content-Type metadata to "text/html", or use App Engine or Firebase Hosting to serve the static HTML website and hide the .html extension. The latter suggestion was discussed in another Stack Overflow post, using an app.yaml handler such as:
- url: /contact
  static_files: www/contact.html
  upload: www/contact.html
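
The Content-Type workaround described in the question (rename every .html object to drop the extension, then set its metadata) could be scripted roughly as follows. This is only a sketch: the function name plan_commands, the bucket name and the keep parameter are hypothetical, the object names are assumed to come from a recursive listing of the bucket, and the script only builds gsutil mv and gsutil setmeta command lines so they can be reviewed before anything is run.

```python
import shlex


def plan_commands(bucket, object_names, keep=("index.html",)):
    """Build gsutil command lines that drop '.html' and set Content-Type.

    `bucket`, `object_names` and `keep` are illustrative inputs; nothing
    is executed here, the commands are only returned as strings.
    """
    commands = []
    for name in object_names:
        if not name.endswith(".html"):
            continue  # leave CSS, JS, images, etc. untouched
        if name.rsplit("/", 1)[-1] in keep:
            continue  # assumption: index pages keep their name for MainPageSuffix
        new_name = name[: -len(".html")]
        if not new_name or new_name.endswith("/"):
            continue  # the object is literally named ".html"; skip it
        src = shlex.quote(f"gs://{bucket}/{name}")
        dst = shlex.quote(f"gs://{bucket}/{new_name}")
        commands.append(f"gsutil mv {src} {dst}")
        commands.append(f"gsutil setmeta -h Content-Type:text/html {dst}")
    return commands


if __name__ == "__main__":
    for line in plan_commands("my-bucket", ["page.html", "index.html", "docs/a.html"]):
        print(line)
```

Printing the plan first, rather than calling gsutil directly, makes it easier to check which objects would be renamed before touching a live site.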

Related

Configure Google Cloud Platform bucket to serve example.com/page.html when user accesses example.com/page

I'm hosting a static (NextJS) site on a GCP bucket, with my domain CNAME (let's say example.com) pointing to GCP. When JavaScript is disabled, NextJS links in my generated content point to URLs like:
<a href="/pages/1">Page 1</a>
but the actual file stored in the bucket is:
pages/1.html
which generates a 404 error when JavaScript is disabled and <Link> doesn't capture the click.
I'm aware of the specialty page option MainPageSuffix in GCP, but I have it set to index.html and I don't think it can be set to rewrite someaddress to someaddress.html (and even if it could, it would not serve my root index.html correctly when I point my browser to example.com).
I'm also aware of the as option in NextJS, but if I use it like:
<Link
href={`/pages/1`}
as={`/pages/1.html`}
>
it will not work when JavaScript is enabled and I'm serving the site locally with npm run dev (I suppose it confuses <Link>?).
Is there any way to make this work? I'm using Next.js v13.0.7
(Alternatively, is there any other free-tier option to host my site? I thought I could use Cloudflare Pages, but my static site has a lot of small pages, on the order of 100k, and Pages has a file limit of 20k.)

Create a redirect rule for Azure CDN in Verizon which adds /index.html to the end of URL

I created an Azure CDN under Verizon Premium Subscription in the Azure portal with an endpoint which points to my Azure Static Website URL.
I want to create a redirect rule in the Azure Verizon rules engine which adds /index.html to the end of the URL if no extension is specified and the last character of the URL is not a / symbol.
So far I have tried the regular expression (.+\/[^\.]\w+$); you can see an example of how it works here
My first approach:
In this case, if you type the URL https://blah.com/foo/bar in the web browser
it doesn't change the URL; however, you are able to view some of the content of the existing file from https://blah.com/foo/bar/index.html, but some of the links to resources are broken. I'm not sure why I'm not getting a 404 in this case, but maybe it's because I set the Index document name to index.html in the Static website panel of the Storage account in Azure. If I open the Network tab in Chrome's developer tools I can see a lot of 404 responses, e.g.
This is because the website tries to get resources from the https://blah.com/foo/ directory instead of https://blah.com/foo/bar/
So, for example, loadcsh.js is in fact located at https://blah.com/foo/bar/loadcsh.js, but the website looks for the file in the wrong directory: https://blah.com/foo/loadcsh.js
My second approach
In this case, if you type the URL https://blah.com/foo/bar
it makes a redirect to https://blah.com/foo/bar/foo/bar/index.html
so the foo/bar/ is redundant here.
My third approach
In this case, if you type the URL https://blah.com/foo/bar
it makes a redirect to https://blah.com/index.html
I have no idea how to apply the rule which makes a redirect from https://blah.com/foo/bar
to https://blah.com/foo/bar/index.html and is generic for all such cases.
Any ideas??
Cheers
I think your regular expression is OK. You can add a URL Redirect rule: in the Source text box, type your regular expression (.+\/[^\.]\w+$), and in the Destination text box add https://%{host}/$1/index.html. Here I used an HTTP variable for Azure CDN, which can be used with Verizon. You can read more about the variables here.
In short, %{host} returns the host name, e.g. www.contoso.com
Please keep in mind that all rule changes require a couple of hours of propagation before they take effect on the CDN.
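
To get a feel for which paths the source pattern (.+\/[^\.]\w+$) selects, it can be tried out locally. The sketch below uses Python's re module, which may not behave identically to the regex engine in the Verizon rules engine, and it applies the pattern to a bare request path, which is an assumption about how the CDN presents the URL. The function name redirect_target is hypothetical, and the leading slash of the captured group is stripped here only to avoid a double slash in the illustration.

```python
import re

# The source pattern from the question, unchanged.
SOURCE = re.compile(r"(.+\/[^\.]\w+$)")


def redirect_target(host, path):
    """Return the redirect URL for `path`, or None if the pattern does not match."""
    m = SOURCE.match(path)
    if m is None:
        return None
    # Mimics the destination https://%{host}/$1/index.html from the answer.
    return f"https://{host}/{m.group(1).lstrip('/')}/index.html"


if __name__ == "__main__":
    for p in ["/foo/bar", "/foo/bar/", "/foo/bar/index.html", "/foo"]:
        print(p, "->", redirect_target("blah.com", p))
```

Applied this way, /foo/bar is redirected, while /foo/bar/ and /foo/bar/index.html are left alone. A single-segment path such as /foo is also left alone, because .+ needs at least one character before the last slash, and a last segment containing a character outside \w (a hyphen, for instance) is not matched either.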

URLs of Azure Static Website should work without trailing /

I've created an Azure Static Website which works based on the Azure Blob Storage.
To be able to manage the automatic redirect from HTTP to HTTPS, I created an Azure CDN with an Azure Verizon Premium subscription and an endpoint which points to the URL of the static website. I followed the steps from this tutorial
If you hit the URL e.g.
https://blah.com/foo/
You will be automatically redirected to
https://blah.com/foo/index.html
This is because I set the Index document name to index.html in the Static website configuration panel.
What I want to achieve is to append /index.html to the very end of the URL if it doesn't have an extension, e.g.
https://blah.com/foo
https://blah.com/bar/foo
The expected result would be a redirect to:
https://blah.com/foo/index.html
https://blah.com/bar/foo/index.html
So my idea was to open https://cdn.windowsazure.com/http/rules/default.aspx and create a new rule with the URL Redirect feature. In the text box next to the Source label, I tried to specify the condition using the regular expression ^[^.]+$, which matches only if the path contains no . character. A match would mean the URL does not point to a file with an extension, so /index.html should be added to the end of the URL. I think my regular expression is wrong and should be different. Or maybe this is not the best way to achieve what I want?
Any ideas?
Cheers
So I tried almost everything and in the end, after adding this rule, the Azure Static Website worked as expected:
Further to this (I know it has an accepted answer): you won't need any redirect rule for index.html if you use a custom origin and point it at the static website's primary endpoint (it will be something like .z8.web.core.windows.net/). For whatever reason, the CDN then treats the origin as a web server rather than plain storage.
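
The ^[^.]+$ pattern from the question can be checked in the same way. This is a sketch in Python, not the CDN's own engine, and the helper name with_index is hypothetical. It treats any path without a dot as extensionless and appends /index.html to it.

```python
import re

# Matches only paths that contain no "." anywhere.
NO_DOT = re.compile(r"^[^.]+$")


def with_index(path):
    """Append /index.html to paths that have no dot; return others unchanged."""
    if NO_DOT.match(path) is None:
        return path
    # Drop a trailing slash first so "/foo/" does not become "/foo//index.html".
    return path.rstrip("/") + "/index.html"


if __name__ == "__main__":
    for p in ["/foo", "/bar/foo", "/foo/", "/foo/style.css"]:
        print(p, "->", with_index(p))
```

One limitation of this pattern is that a dot anywhere in the path prevents a match, so an extensionless page under a directory such as /v1.2/ would not be rewritten.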

AWS S3 Redirect only works on bucket as a subdomain not bucket as a directory

Many people have received hundreds of links to PoCs hosted in an internal-facing bucket, and the links have this structure:
https://s3.amazonaws.com/bucket_name/
I added a redirect using AWS's Static website hosting section in Properties and it ONLY redirects when the domain is formatted like this:
https://bucket_name.s3-website-us-east-1.amazonaws.com
Is this a bug with S3?
For now, how do I make it redirect using both types of links? My current workaround is to add a meta redirect tag in each html file.
Unfortunately, the s3-website endpoint is the only one that supports redirects. Using s3.amazonaws.com assumes that you are using S3 as a storage layer rather than as a website. If the link points to a specific object, you can place an HTML file at that URL with a JS redirect, but other than that there is really no way to achieve what you are trying to do.
In the future, I would recommend always setting up a CloudFront distribution for these kinds of use cases, as that will allow you to change the origin later on.
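
The per-file workaround mentioned above (an HTML page at the old URL that sends the browser elsewhere) could be generated rather than written by hand. The sketch below uses a meta refresh tag, as in the question, instead of a JS redirect. The function name redirect_page and the target URL are hypothetical, and the resulting file would still need to be uploaded to the old object key with a text/html content type.

```python
from html import escape


def redirect_page(target_url, delay_seconds=0):
    """Return a small HTML document that redirects to `target_url`."""
    safe = escape(target_url, quote=True)  # keep &, quotes, etc. valid in attributes
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={safe}">\n'
        "</head>\n<body>\n"
        f'<p>This page has moved to <a href="{safe}">{safe}</a>.</p>\n'
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    print(redirect_page("https://example.com/new-location/index.html"))
```

The visible link in the body gives users a fallback if their browser ignores the refresh.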

Difference between S3 public files and static websites

In AWS S3 you can upload a file and make it public, which gives you a URL to access it. You can also enable "Static Website Hosting". Can someone clarify the difference between these two approaches? If I can simply upload my HTML pages, make them public and access them over HTTP through browsers, why do I need to enable static website hosting?
Enabling Static Website Hosting on S3 allows you to use a custom domain name, custom error pages, index.html documents for paths that end in /, and 301 redirects.
For others who are just stumbling across this: one disadvantage of enabling Static Website Hosting is that the endpoint you get is HTTP-only.
See the relevant docs. If you can work with the limitations of simply making the files public, such as no custom domain name, you get TLS for free, which matters because some browsers block HTTP links on pages served over HTTPS.