How to get SSI variable REQUEST_URI without query parameters - regex

I'm trying to get the pathname part of the REQUEST_URI, without the query parameters. I need to do this in raw SSI, without any PHP or anything.
If I do something like <!--#echo var="REQUEST_URI" -->, that will output the pathname plus the query parameters, so if the browser URL shows http://example.com/foo.html?bar, that would return /foo.html?bar. But I need to return only /foo.html. Is there a way to do that directly inside an echo statement?
Note: It needs to use the requested uri only. The actual file paths on the server are very different and I cannot display those.

I don't have a running nginx with SSI around, so I am just guessing here.
But maybe you can try to use a regular expression to extract what you want.
Maybe something like this:
<!--# if expr="$REQUEST_URI = /(.+)\?.*/" -->
<!--# echo var="1" -->
<!--# endif -->
I am not sure about the \ before the ?.
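On the escaping question, here is a rough check written in Python's re module rather than nginx itself. The only assumption is that both engines treat \? as a literal question mark, whereas a bare ? would be a quantifier. The helper name path_before_query is made up for this sketch.

```python
import re

# "\?" is a literal question mark; the greedy ".+" captures everything before
# the LAST "?" in the string.
ESCAPED = re.compile(r"(.+)\?.*")

def path_before_query(uri):
    """Return the text captured before the '?', or None when there is no '?'."""
    match = ESCAPED.search(uri)
    return match.group(1) if match else None

print(path_before_query("/foo.html?bar"))    # /foo.html
print(path_before_query("/foo.html"))        # None
print(path_before_query("/foo.html?a=1?b"))  # /foo.html?a=1
```

The last call shows the greedy capture running past the first question mark, which is one reason to prefer a character class such as [^?]+ for the path part.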

You could try to use the DOCUMENT_URI variable instead:
<!--#echo var="DOCUMENT_URI" -->
SCRIPT_NAME seems to work too:
<!--#echo var="SCRIPT_NAME" -->

This code works for me:
<!--#if expr="$REQUEST_URI = /([^?]+)\?.*/" -->
<!--#set var="URL_WITHOUT_QUERY_STRING" value="$1" -->
<!--#echo var='URL_WITHOUT_QUERY_STRING' -->
<!--#endif -->
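One thing to watch with this pattern: it requires a literal ?, so a request without a query string fails the if and nothing gets echoed. Below is a small Python sketch (not SSI) of a pattern that covers both cases. The function name strip_query is hypothetical, and whether your SSI engine accepts the same pattern is something you would have to test.

```python
import re

def strip_query(request_uri):
    """Return the path part of request_uri, with or without a query string."""
    # [^?]* stops at the first '?', or consumes the whole string if there is none.
    return re.match(r"[^?]*", request_uri).group(0)

print(strip_query("/foo.html?bar"))  # /foo.html
print(strip_query("/foo.html"))      # /foo.html
```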

Related

Regex Custom Redirect in Blogger for every archive.html

I need to create a regex Custom Redirect in Blogger. The purpose is to redirect all HTML archives to somewhere else.
Currently I'm using the following in Settings / Search preferences / Custom Redirects:
From:/2018_11_21_archive.html
To:/p/somewhere_else.html
Permanent:Yes
The problem is that this method requires adding every date, and that's not acceptable.
/2016_10_21_archive.html
/2016_10_22_archive.html
/2016_10_23_archive.html
/2017_07_10_archive.html
/2017_07_10_archive.html
/2017_07_10_archive.html
/2018_11_21_archive.html
/2019_11_21_archive.html
...
So far I've tried this regex with no success:
From:/2018_(.*)
To:/p/somewhere_else.html
Permanent:Yes
Blogger Custom Redirects do not support regex.
But I have a workaround for you: put this code right after <head>:
<b:if cond='data:view.isArchive and data:view.url contains "_archive"'>
<b:with value='"https://www.example.com/p/somewhere_else.html"' var='destination'>
<script>window.location.replace("<data:destination/>")</script>
<noscript><meta expr:content='"0; URL=" + data:destination' http-equiv='refresh'/></noscript>
</b:with>
</b:if>
You have to escape the "/" character by inserting a "\" before it.
The line should look like this:
From:\/2018_.*
But be aware that this pattern only covers 2018, so from your list only /2018_11_21_archive.html will match.
If you need ALL dates as you mentioned, I recommend this regex below:
\/([12]\d{3}_(0[1-9]|1[0-2])_(0[1-9]|[12]\d|3[01]))_archive\.html
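To see what this pattern accepts, here is a quick check in Python. It only exercises the regex itself and says nothing about whether Blogger's redirect settings will honour it. The helper is_archive_url is a name invented for this sketch, and the escaped slash is dropped because Python does not need it.

```python
import re

# Same pattern as above: year 1000-2999, month 01-12, day 01-31.
ARCHIVE = re.compile(
    r"/([12]\d{3}_(0[1-9]|1[0-2])_(0[1-9]|[12]\d|3[01]))_archive\.html"
)

def is_archive_url(path):
    """Return True when the whole path is a dated archive URL."""
    return ARCHIVE.fullmatch(path) is not None

print(is_archive_url("/2018_11_21_archive.html"))  # True
print(is_archive_url("/2016_10_23_archive.html"))  # True
print(is_archive_url("/2018_13_21_archive.html"))  # False (month 13)
print(is_archive_url("/p/somewhere_else.html"))    # False
```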

Remove trailing / from URL in Jinja2 (regex?)

I have a template for an MkDocs site, which uses Jinja2. I am trying to add a link to a PDF version of each page. The PDF always has the same name as the markdown file. So I am trying to add a link in the template that will automatically target the correct PDF for each page. This feels cleaner than having the writers add a manual link to every page.
Download
The above is almost correct, but there is a '/' at the end of all the URLs, so the result is:
page/url/slug/.pdf
Neither MkDocs nor Jinja seems to provide a filter to remove trailing slashes, so I am wondering if it's possible to use regex to remove it. I believe the pattern would be as simple as \/$. However, I can't see from the docs how to apply a regex filter in Jinja.
You can do something like this:
{{ "string/".rstrip("/") }}
Worked for me.
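This works because Jinja2 expressions can call Python string methods, so the template above behaves like str.rstrip. A plain-Python sketch of the same thing, with pdf_link as a hypothetical helper name and the sample URL taken from the question:

```python
def pdf_link(page_url):
    """Build the PDF link for a page URL such as 'page/url/slug/'."""
    # rstrip("/") removes every trailing slash, not just one.
    return page_url.rstrip("/") + ".pdf"

print(pdf_link("page/url/slug/"))   # page/url/slug.pdf
print(pdf_link("page/url/slug//"))  # page/url/slug.pdf
```

Note that rstrip strips all trailing slashes, which is usually what you want for a link like this.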
So I found a workaround for my specific case, but it's nasty:
<a href='{{ config.site_url }}{{ page.url | reverse | replace("/", "", 1) | reverse }}.pdf'>Download</a>
Prepend the site URL
Get the current page URL, reverse it, use replace with the optional count parameter to remove the FIRST '/', then reverse it again to get it back in the right order
Append '.pdf'
According to one of the answers to the question linked by Jan above, you can't simply use regex in Jinja2 without getting into custom filters.
Download
where $ is the end of the line / end of the string.
Therefore, /$ means the / at the end.
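For reference, this is what the /$ pattern does when applied with Python's re module. It is only an illustration of the regex: stock Jinja2 has no regex filter, so using it in a template would mean wiring up a custom filter, and drop_trailing_slash is a hypothetical name.

```python
import re

def drop_trailing_slash(url):
    """Remove a single '/' if it is the last character of url."""
    return re.sub(r"/$", "", url)

print(drop_trailing_slash("page/url/slug/"))  # page/url/slug
print(drop_trailing_slash("page/url/slug"))   # page/url/slug
```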

How to add an extra parameter to the img source in HTML using perl

I have a situation where I need to differentiate two calls by the path in the source of an HTML image. This is what the img tag looks like:
<img src="/folder/12280218/160024536.images.jpg" />
I am planning to alter the source to
<img src="/folder/12280218/160024536.images.jpg/1" />
Note the "/1" at the end of src.
I need this so that I can change the flow in the controller when I am serving this image.
This is what I have tried until now.
my $string = '<p><img src="/folder/12280218/160024536.images.jpg" /></p>';
$string =~ s/<img\s+src\=\"(.*)"\s+\/><\/p>/<img src\=\"$1\/1" \><\/p>/g;
This works as long as $string looks like this.
In our application, users can alter the HTML input using CKEditor.
They can alter the image tag by adding width="800" before or after the src attribute. I want the regular expression to handle all these situations.
Please let me know how to proceed.
Thanks in advance.
Replace:
(<img.*src="[^"]*)(".*\/>)
by
$1/1$2
Edit: Changed the regex to handle situations with other attributes (like the "width" part)
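A rough way to try the idea outside Perl, sketched in Python. This is a variation rather than the exact pattern above: the .* parts are narrowed to [^>]* so that two img tags on one line are handled separately. The name tag_image_sources is made up for the example.

```python
import re

# Group 1: everything from "<img" up to the end of the src value.
# Group 2: the closing quote and the rest of the tag.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc="[^"]*)("[^>]*/>)')

def tag_image_sources(html):
    """Append '/1' to the src value of every self-closing img tag."""
    return IMG_SRC.sub(r"\g<1>/1\g<2>", html)

print(tag_image_sources('<img width="800" src="/a.jpg" />'))
# <img width="800" src="/a.jpg/1" />
print(tag_image_sources('<img src="/a.jpg" width="800" />'))
# <img src="/a.jpg/1" width="800" />
```

For anything messier than editor-generated markup, an HTML parser is likely to be more robust than a regex.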

How to use Regular Expression Extractor to get authenticity token with / = + signs?

I need to correctly parse authenticity token in JMeter which has +, / and spaces in it and looks like below…
<meta content="authenticity_token" name="csrf-param" />
<meta content="kJ+AzaV/saCxK+F4Ibh6LeEqH8rpiGZfyRKn3RGX960=" name="csrf-token" />
I have a “Regular Expression Extractor”, and the regular expression looks like this:
meta content="([^"]+)" name="csrf-token" />
The problem is that the / gets replaced with %2F and the = at the end gets replaced with %3D, so the value becomes:
kJ+AzaV%2FsaCxK+F4Ibh6LeEqH8rpiGZfyRKn3RGX960%3D
How can I parse the authenticity token correctly?
It looks like your attribute has been URI encoded, so you'll need to decode it before attempting to do more
window.decodeURIComponent('kJ+AzaV%2FsaCxK+F4Ibh6LeEqH8rpiGZfyRKn3RGX960%3D');
// "kJ+AzaV/saCxK+F4Ibh6LeEqH8rpiGZfyRKn3RGX960="
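The same decoding can be checked outside a browser. Here is a minimal Python sketch using the standard library's urllib.parse.unquote. The function name decode_token is hypothetical, and the token is the one from the question.

```python
from urllib.parse import unquote

def decode_token(encoded):
    """Percent-decode a token while leaving '+' characters untouched."""
    # unquote() keeps '+' as is; unquote_plus() would turn it into a space
    # and corrupt a base64-style token like this one.
    return unquote(encoded)

print(decode_token("kJ+AzaV%2FsaCxK+F4Ibh6LeEqH8rpiGZfyRKn3RGX960%3D"))
# kJ+AzaV/saCxK+F4Ibh6LeEqH8rpiGZfyRKn3RGX960=
```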
Further, using a RegExp to extract data from HTML or XML is not always the best idea. Perhaps you could try parsing it and accessing the Nodes and Attributes you want via a DOM Tree.
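To illustrate the parsing approach, here is a small sketch using Python's built-in html.parser. It is not something JMeter runs; it only shows the idea of reading the attribute from a parsed tag instead of matching text. CsrfTokenParser and extract_csrf_token are names invented for the example.

```python
from html.parser import HTMLParser

class CsrfTokenParser(HTMLParser):
    """Remember the content attribute of <meta name="csrf-token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "csrf-token":
            self.token = attributes.get("content")

def extract_csrf_token(html):
    """Return the csrf-token value found in html, or None if absent."""
    parser = CsrfTokenParser()
    parser.feed(html)
    return parser.token

page = (
    '<meta content="authenticity_token" name="csrf-param" />\n'
    '<meta content="kJ+AzaV/saCxK+F4Ibh6LeEqH8rpiGZfyRKn3RGX960=" name="csrf-token" />'
)
print(extract_csrf_token(page))
# kJ+AzaV/saCxK+F4Ibh6LeEqH8rpiGZfyRKn3RGX960=
```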
If you're passing it as a parameter, just untick the "Encode?" box for it.
If you need to decode this via JavaScript as Paul S. suggests, consider using the __javaScript function as follows:
${__javaScript(decodeURIComponent("${YOUR_VARIABLE_NAME_HERE}"),)}
See How to Use JMeter Functions post series for more details.

What's wrong with this SSI

I'm having trouble with SSI. I can't get even the most basic command to work:
<!--#if expr="${title}" -->
<!--#echo var="title" -->
<!--#endif -->
I think it's obvious what I want to do, and I can't find what's wrong with this piece of code. However, SSI says [an error occurred while processing this directive].
The echo without the if block works fine.
I've found the error:
Apache changed its syntax for SSI conditionals around version 2.3, so it now has to look like this:
<!--#if expr='-n v("title")' -->
<!--#echo var="title" -->
<!--#endif -->