I am looking for a proper way to handle URLs in Varnish (5.2.1). Here is what I do to redirect to lowercase URLs:
set req.url = std.tolower(req.url); //this is new.url
//if original.url != new.url => redirect
This produces a good URL, but quite a few client libraries convert %[hex] to %[HEX], as recommended by https://www.rfc-editor.org/rfc/rfc3986#section-2.1, and those clients end up in a redirect loop.
Example:
req.url = "/query=mythbusters%20-%20die%20wissensj%c3%A4ger"
is redirected to
"/query=mythbusters%20-%20die%20wissensj%c3%a4ger"
and client redirects it to
"/query=mythbusters%20-%20die%20wissensj%c3%A4ger"
I am trying to solve this with regular expressions, but for some reason I cannot get uppercase results. According to PCRE/PCRE2/Perl regex syntax, it should be possible like this:
set req.url = std.tolower(req.url);
set req.url = regsuball(req.url, "(%[0-9a-f][0-9a-f])", "\U\1");
Does anybody have an idea how to solve this?
I posted an issue on the Varnish GitHub; the answer was that this is not supported.
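The target form described above (everything lowercase except the hex digits of percent-escapes, which stay uppercase) can be sketched outside Varnish. This is Python, not VCL, and the function names `canonicalize` and `needs_redirect` are made up for illustration; it only shows the normalization that would stop the loop, not how to express it in VCL 5.2.1.

```python
import re


def canonicalize(url: str) -> str:
    """Lowercase the URL, then uppercase the hex digits of each percent-escape."""
    lowered = url.lower()
    # After lowering, every escape looks like %[0-9a-f]{2}; uppercase each one.
    return re.sub(r"%[0-9a-f]{2}", lambda m: m.group(0).upper(), lowered)


def needs_redirect(url: str) -> bool:
    """Redirect only when the URL differs from its canonical form."""
    return url != canonicalize(url)
```

With this rule, a client that rewrites %c3 to %C3 receives a URL that is already canonical, so no second redirect is issued.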
I have written an nginx rewrite rule to redirect all requests for /path/category except subcategory1. The regular expression below works fine in a regex tester. However, when I use the same regex in the Nginx conf, the negative lookahead does not work if the URL contains the # character. Do you have any suggestions?
Regex tried so far:
^\/path\/category(?!.*(\bsubcategory1\b)).*$
^\/path\/category(([\/#]*)(?!.*(subcategory1))).*$
Rewrite Rule:
rewrite ^\/path\/category(?!.*(\bsubcategory1\b)).*$ https://new.host.com permanent;
Path Details:
These should redirect to https://new.host.com, which works fine:
/path/category
/path/category/
/path/category#/
/path/category/#/
These should skip the redirection because of subcategory1. This does not work for the last 3 URLs, which contain a hash:
/path/category/subcategory1
/path/category/subcategory1/
/path/category/subcategory1/dsadasd
/path/category#/subcategory1
/path/category/#/subcategory1
/path/category#/subcategory1/dadsd
Anything in the URI after # is ignored by the server: it is meant for the client side, so it never reaches the HTTP server (Nginx, for instance).
A regex in Nginx will therefore not behave as expected when the URL it is tested against is supposed to contain a #.
The part after # is called the fragment.
The fragment can be processed on the client side.
You can use window.location.hash to access and process fragments.
This JavaScript example transforms the fragment into parameters of a request to process.html:
let param = window.location.hash;
param = param.substring(1); // remove #
param = '?' + param;
console.log('param=',param);
location.href = '/process.html' + param;
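To see why the last three URLs from the question are redirected, here is a rough Python sketch. It assumes the client strips the fragment before sending the request (modelled with `urlsplit`) and uses Python's `re` as a stand-in for the PCRE engine in Nginx; the helper names are hypothetical.

```python
import re
from urllib.parse import urlsplit

# First regex from the question, with the redundant slash escapes dropped.
RULE = re.compile(r"^/path/category(?!.*(\bsubcategory1\b)).*$")


def path_seen_by_server(typed_url: str) -> str:
    """Return the part of the URL that is actually sent in the HTTP request."""
    return urlsplit(typed_url).path


def would_redirect(typed_url: str) -> bool:
    """Apply the rewrite regex to what the server receives, not what was typed."""
    return RULE.match(path_seen_by_server(typed_url)) is not None
```

For /path/category#/subcategory1 the server only receives /path/category, so the negative lookahead has nothing to reject and the redirect fires.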
location ~* "/mypath/([a-zA-Z0-9_.-]{12}$)" {
    return 301 https://new-domain.com;
}
With the location block above, a user who types https://mywebsite.com/mypath/uy2hgy12jer2 in the browser is redirected to https://new-domain.com. The problem is that https://mywebsite.com/mypath/uy2hgy12jer2?params=1287612 is redirected as well. I want the redirect to apply only to https://mywebsite.com/mypath/uy2hgy12jer2. Please let me know how to do it. Thanks.
Location blocks in NGINX match only the path part of the URI, not the query string, which is why the URL with ?params=1287612 is matched as well.
To skip the redirect when a query string is present, you can put the following inside the location block, before the return:
if ($is_args) {
    break;
}
I found this behavior after a few trials at https://nginx.viraptor.info/. Any character typed after the 12th character prevents a match, except when it is part of a query string. I then found the approach above and the link below.
For more info - https://serverfault.com/questions/237517/nginx-query-keyword-matching-in-location
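The decision described in this answer can be modelled roughly in Python. This is only a sketch of the logic, not of Nginx internals: `should_redirect` is a hypothetical helper, the regex is matched against the path alone as a location block would, and a non-empty query string plays the role of `$is_args`.

```python
import re
from urllib.parse import urlsplit

# Same pattern as the location block; ~* means case-insensitive.
LOCATION = re.compile(r"/mypath/([a-zA-Z0-9_.-]{12}$)", re.IGNORECASE)


def should_redirect(url: str) -> bool:
    """Redirect only when the path matches and no query string is present."""
    parts = urlsplit(url)
    if LOCATION.search(parts.path) is None:
        return False  # the location block would not be selected at all
    return parts.query == ""  # mirrors skipping the return when $is_args is set
```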
I have a URL in the form of:
http://some-site.com/api/v2/portal-name/some/webservice/call
The data I want to fetch has to come from
http://portal-name.com/webservices/v2/some/webservice/call
(Yes, I can rewrite the application so it uses other URLs, but we are testing Varnish at the moment, so for now it cannot be intrusive.)
But I'm having trouble building the URL correctly in Varnish VCL. Replacing the api part with an empty string is no problem, but extracting the portal-name is.
Things I've tried:
if (req.url ~ ".*/(.*)/") {
    set req.http.portalhostname = re.group.0;
    set req.http.portalhostname = $1;
}
(Taken from https://docs.fastly.com/guides/vcl/vcl-regular-expression-cheat-sheet and from the question "Extracting capturing group contents in Varnish regex".)
And yes, std is imported.
But this gives me either a
Syntax error at
('/etc/varnish/default.vcl' Line 36 Pos 35)
set req.http.portalhostname = $1;
or a
Symbol not found: 're.group.0' (expected type STRING_LIST):
So: how can I do this? Once I have extracted the portalhostname, I should be able to simply use regsub to replace that value with an empty string, then prepend "webservices", and my URL is complete.
The Varnish version I'm using: varnish-4.1.8 revision d266ac5c6
Sadly, re.group seems to have been removed in some version. Similar functionality appears to be available via one of several vmods; see https://varnish-cache.org/vmods/
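Varnish's built-in regsub accepts backreferences such as \1 in the replacement string, so a capture group can be extracted by replacing the whole match with just that group. The Python sketch below mimics that idea with `re.sub`; the pattern anchored on /api/v2/, the ".com" suffix, and the name `to_backend` are assumptions taken from the example URLs, not a tested VCL configuration.

```python
import re

# Group 1: portal name, group 2: the rest of the path.
PATTERN = re.compile(r"^/api/v2/([^/]+)/(.*)$")


def to_backend(url: str):
    """Return (host, path) for the backend request, or None if the URL does not match."""
    if PATTERN.match(url) is None:
        return None
    # Equivalent in spirit to regsub(req.url, pattern, "\1") in VCL.
    host = PATTERN.sub(r"\1", url) + ".com"
    # Drop the portal name and prepend the webservices prefix.
    path = PATTERN.sub(r"/webservices/v2/\2", url)
    return host, path
```

For the URL in the question this yields the host portal-name.com and the path /webservices/v2/some/webservice/call.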
I'm using nginx to serve static news-like pages.
On the top-level there is
https://example.com/en/news/ with an overview of the articles.
Individual items have a URL similar to this: https://example.com/en/news/some-article
All URLs contain the language, i.e. /en/ or /de/.
I would like to create a rule that redirects requests that don't contain the language to the correct URL (the language is mapped based on IP and is available via $lang).
The following should work (example for en):
/news/ --- redirect ---> /en/news/
/news/some-article --- redirect ---> /en/news/some-article
My attempts looked something like this:
location ~* /news/.*$ {
    if ($request_uri !~* /(de|en)/$) {
        return 302 https://example.com/$lang/$request_uri;
    }
}
So far this resulted in infinite redirects.
Your solution seems overly complicated to me. And testing $request_uri with a trailing $ will never match the rewritten URIs (hence the loop).
You could use a prefix location to only match URIs that begin with /news/.
Assuming that you have calculated a value for $lang elsewhere, this may work for you:
location ^~ /news/ {
    return 302 /$lang$request_uri;
}
The ^~ modifier is only necessary if you have regular expression location blocks within your configuration that may conflict. See this document for more.
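The difference between the two approaches can be checked with a small Python model. It is a simplification that ignores query strings and other location blocks, and both function names are hypothetical; it only reproduces the two matching rules.

```python
import re


def original_attempt_redirects(uri: str) -> bool:
    """Model of the question's config: unanchored regex location plus the !~* test."""
    in_location = re.search(r"/news/.*$", uri, re.IGNORECASE) is not None
    ends_with_lang = re.search(r"/(de|en)/$", uri, re.IGNORECASE) is not None
    return in_location and not ends_with_lang


def prefix_location_target(uri: str, lang: str):
    """Model of the prefix location: only URIs that begin with /news/ are redirected."""
    if uri.startswith("/news/"):
        return f"/{lang}{uri}"
    return None
```

In the first model, /en/news/some-article still triggers a redirect because it does not end in /en/, which is the loop. In the second, the rewritten URI no longer begins with /news/, so it is served normally.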
I am out of my depth here; I am currently reading tutorials and using Python to learn regex.
I have a website where a PHP file http://www.example.com/showme.php?user=JOHN will load the visitor page of JOHN. However, I want to let John have his own vanity URL like john.example.com and rewrite it to http://www.example.com/showme.php?user=JOHN .
I know it can be done, and after fiddling with it, lighttpd's mod_rewrite seems to be the way to go. Now I am stumped trying to come up with a regex to match!
rewrite ("^![www]\.example\.com" => "www\.example\.com\?user=###");
I am playing with the Python re module to test several ways of getting the john from john.example.com, recognizing when the first segment of the URL is not www, and then redirecting. The above was my attempt. Am I even on the right continent?
Any help will be appreciated with:
recognizing when the first part of the URL, before the first ., is not www but something else, in a way that example.com won't trip it up;
getting the first part of the URL before the first . and passing it as user=###.
Thanks a bunch
Use lighttpd's mod_rewrite module. Add this to your lighttpd.conf file:
$HTTP["host"] != "www.example.com" {
    $HTTP["host"] =~ "^([^.]+)\.example\.com$" {
        url.rewrite-once = (
            "^/?$" => "/showme.php?user=%1"
        )
    }
}
For an href value like /dir/page.php, the browser automatically adds the domain part of the link from the current request, as shown in the address bar. So, if you accessed the page via www.example.com, the link would point to http://www.example.com/dir/page.php, and likewise for john.example.com.
For all your links to point at www.example.com, you need to access the page using www. This is possible only if you do an external redirect from the vanity URL to the actual one; users can still use the shortened URL, but they get redirected to the actual one.
$HTTP["host"] != "www.example.com" {
    $HTTP["host"] =~ "^([^.]+)\.example\.com$" {
        url.redirect = (
            "^/?$" => "http://www.example.com/showme.php?user=%1"
        )
    }
}
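Since the question mentions testing with Python's re module, the host matching used in both config blocks can be tried out as follows. This is only a sketch of the matching logic with a hypothetical helper name, not a replacement for the lighttpd configuration.

```python
import re

# Same regex as the $HTTP["host"] condition: one label, then .example.com.
VANITY = re.compile(r"^([^.]+)\.example\.com$")


def vanity_target(host: str):
    """Return the redirect URL for a vanity host, or None if no redirect applies."""
    if host == "www.example.com":
        return None  # excluded by the outer != condition
    match = VANITY.match(host)
    if match is None:
        return None  # bare example.com or an unrelated host
    # %1 in the lighttpd config corresponds to group 1 here.
    return f"http://www.example.com/showme.php?user={match.group(1)}"
```

The bare domain example.com does not match because the regex requires a label and a dot before example.com, which covers the first concern in the question.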