I need to invalidate the Varnish cache for several specific values of one parameter at once. Currently the code makes one call per value, following this pattern:
varnish_host/path?.*parameter=1
varnish_host/path?.*parameter=2
varnish_host/path?.*parameter=3
varnish_host/path?.*parameter=4
Following the Varnish 2.0 regex notes at https://kly.no/varnish/regex.txt, I found this rule for matching multiple alternatives:
Multiple matches
req.url ~ "\.(jpg|jpeg|css|js)$"
True if req.url ends with either "jpg", "jpeg", "css" or "js".
So I changed my code to use it in the following way:
varnish_host/path?.*parameter=(1|2|3|4)$
But it does not clear the cache as expected, even though it returns status 200.
Is there a way in Varnish 4.0 to do this multiple match on a parameter? If so, is there a limit on the number of alternatives that we should take into account?
Varnish doesn't offer an out-of-the-box HTTP-based invalidation mechanism.
What you can do is issue bans using varnishadm. This allows you to set up regex patterns that match multiple objects.
Varnishadm ban example
Here's such an example, where we will invalidate each PNG file in the cache for the example.com hostname:
varnishadm ban req.http.host == example.com '&&' req.url '~' '\\.png$'
HTTP-based banning & purging
varnishadm works fine, but isn't that easy to integrate into your logic. If you want to invalidate objects from the cache via purge or ban, you need to write some VCL.
Here's a VCL snippet that will facilitate HTTP-based invalidation:
vcl 4.0;
acl purge {
    "localhost";
    "192.168.55.0"/24;
}
sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return(synth(405, "Not allowed."));
        }
        if (!req.http.ban-url) {
            return(purge);
        }
        ban("obj.http.x-host == " + req.http.host + " && obj.http.x-url ~ " + req.http.ban-url);
        return(synth(200, "Ban added"));
    }
}
sub vcl_backend_response {
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;
}
sub vcl_deliver {
    unset resp.http.x-url;
    unset resp.http.x-host;
}
Important: you need to adjust the values of the ACL, which will prohibit unauthorized access to the invalidation interface. You can use IP addresses, IP ranges, and hostnames to limit access.
Here's how we perform the same PNG invalidation via HTTP:
curl -XPURGE -H"ban-url: '\.png$'" http://example.com/
You can also just invalidate a single URL:
curl -XPURGE http://example.com/my-page
Because the example above doesn't contain a ban-url request header, only the exact URL is invalidated, instead of a pattern being matched.
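For the original question (several values of one parameter in a single ban), an alternation can be written against the path and query string that the VCL stores in x-url. The following is a minimal sketch in Python, whose re module is used here only as a stand-in for Varnish's PCRE engine. The helper name build_ban_pattern and the sample URLs are hypothetical and not part of any Varnish API.

```python
import re


def build_ban_pattern(path, parameter, values):
    """Build a regex matching `path` with `parameter` set to any of `values`."""
    alternatives = "|".join(re.escape(str(v)) for v in values)
    # \?(.*&)? lets the parameter be the first field or follow other fields.
    # (&|$) stops "parameter=1" from also matching "parameter=10", and unlike
    # a bare "$" it does not require the parameter to be the last field.
    return (
        "^" + re.escape(path) + r"\?(.*&)?"
        + re.escape(parameter) + "=(" + alternatives + ")(&|$)"
    )


pattern = build_ban_pattern("/path", "parameter", [1, 2, 3, 4])

for url in ["/path?parameter=1", "/path?a=b&parameter=3&c=d", "/path?parameter=10"]:
    print(url, bool(re.search(pattern, url)))
```

The resulting string could then be sent as the value of the ban-url header in the PURGE request shown above.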
Related
Suppose my requested URL is www.example.com/foo/emplooyee?names = test1;test2.
Varnish stores this entire URL, including the query parameters, to uniquely identify the cached object.
In my backend I run a service that I need to configure so that, whenever one of the names changes (i.e. test1 or test2), it fires an HTTP ban with the old name (a single name per ban expression) to invalidate all cached URLs that contain that name.
Questions:
My client request URL could look like any of these:
www.example.com/foo/emplooyee?names = test1;test2
www.example.com/foo/emplooyee?names = test1;
www.example.com/foo/emplooyee?names = test2;test1;test3;test4
www.example.com/foo/emplooyee?names = test1;test4.
How do I write the VCL code and the ban expression to invalidate all objects that have test1 in the query parameter?
This is the VCL code you need for banning:
vcl 4.1;
acl purge {
    "localhost";
    "192.168.55.0"/24;
}
sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return(synth(405));
        }
        if (!req.http.x-invalidate-pattern) {
            return(purge);
        }
        ban("obj.http.x-url ~ " + req.http.x-invalidate-pattern
            + " && obj.http.x-host == " + req.http.host);
        return(synth(200, "Ban added"));
    }
}
sub vcl_backend_response {
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;
}
sub vcl_deliver {
    unset resp.http.x-url;
    unset resp.http.x-host;
}
Here's an example HTTP request that invalidates all cached objects whose query string contains a test1 value:
PURGE / HTTP/1.1
Host: www.example.com
X-Invalidate-Pattern: ^/foo/emplooyee\?names=([^;]+;)*test1(;[^;]+)*$
Here's the same request via curl:
curl -XPURGE -H'X-Invalidate-Pattern: ^/foo/emplooyee\?names=([^;]+;)*test1(;[^;]+)*$' http://www.example.com
This VCL snippet has the flexibility to remove multiple items from cache through banning, but if you don't set the pattern through the X-Invalidate-Pattern header, it will just remove the URL itself from cache.
Here's an example where we just remove http://www.example.com/foo/emplooyee?names=test1 from the cache:
curl -XPURGE 'http://www.example.com/foo/emplooyee?names=test1'
Don't forget to modify the acl block in the VCL code and add the right IP addresses or IP ranges.
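Before sending a ban, it can help to check the pattern against sample URLs. This is a minimal sketch in Python, using the re module as a stand-in for Varnish's PCRE engine. The function name is_banned and the negative examples are hypothetical; the pattern is copied from the X-Invalidate-Pattern header above.

```python
import re

# Pattern copied from the X-Invalidate-Pattern header above.
PATTERN = re.compile(r"^/foo/emplooyee\?names=([^;]+;)*test1(;[^;]+)*$")


def is_banned(url):
    """Return True if the ban pattern would match this cached URL."""
    return PATTERN.search(url) is not None


samples = [
    "/foo/emplooyee?names=test1;test2",
    "/foo/emplooyee?names=test2;test1;test3;test4",
    "/foo/emplooyee?names=test1;test4",
    "/foo/emplooyee?names=test2;test3",   # no test1 at all
    "/foo/emplooyee?names=test10;test2",  # test10 is not test1
    "/foo/emplooyee?names=test1;",        # trailing semicolon
]

for url in samples:
    print(url, is_banned(url))
```

In this sketch the URL with a trailing semicolon is not matched, because every ; in the pattern must be followed by at least one more character. If such URLs can occur, allowing an optional ; before the final $ is one possible adjustment.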
The Nginx documentation is a bit unclear, but it appears we can use a regex to match anything within the if statement:
http://nginx.org/en/docs/http/ngx_http_rewrite_module.html#if
So I wanted to tweak this so that Nginx checks the value of any cookie.
My understanding was that in the following string the first * matches any cookie name and the second * matches any cookie contents:
if ($http_cookie ~* "*=*") {
return 444;
}
But I get the error "pcre_compile() failed: nothing to repeat in "" at "" in"
What I am trying to achieve is to have Nginx check the cookie like a WAF (web application firewall), to make sure it only contains A-Z, a-z, 0-9 and the characters + - _ . My PHP app is Joomla, which does these checks too, but it would be useful to perform them in Nginx as well, since it could deny the request faster.
EDIT to show the half-solved issue / dilemma
set $block_cookie_exploits 0;
# If cookie name or contents does not contain the following
if ($http_cookie !~ "[a-zA-Z0-9\+\-\_]=[a-zA-Z0-9\+\-\_]") {
    set $block_cookie_exploits 1;
}
# Block the client request
if ($block_cookie_exploits = 1) {
    return 403;
}
The new problem with the above configuration is that it returns 403 when no cookie is present. And if you put characters such as {} in the name or contents of the cookie, it does not return 403.
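Both symptoms follow from how the expressions are written. In "*=*" the * is a quantifier with nothing in front of it to repeat, which is what the PCRE error reports. The second expression is unanchored, so a single allowed character on each side of any = is enough to match, and an empty header can never match, which makes the !~ test true. The sketch below reproduces this in Python, with the re module standing in for PCRE. The anchored STRICT pattern and the helper cookie_is_clean are assumptions for illustration, not a tested Nginx rule.

```python
import re

# "*=*" is rejected by the regex compiler: "*" has nothing to repeat.
try:
    re.compile("*=*")
    star_pattern_compiles = True
except re.error:
    star_pattern_compiles = False

# Unanchored pattern from the edit above: one allowed character on each
# side of "=" anywhere in the header is enough to match.
UNANCHORED = re.compile(r"[a-zA-Z0-9+\-_]=[a-zA-Z0-9+\-_]")

# Anchored variant (an assumption, not from the original post): the whole
# header must consist of name=value pairs built only from allowed characters.
TOKEN = r"[A-Za-z0-9+\-_.]"
STRICT = re.compile(r"^{t}+={t}*(; ?{t}+={t}*)*$".format(t=TOKEN))


def cookie_is_clean(header):
    """Accept an empty Cookie header; otherwise require the strict pattern."""
    return header == "" or STRICT.search(header) is not None


for header in ["", "session=abc123; lang=en-GB", "a=b{}", "session={evil}"]:
    print(repr(header), cookie_is_clean(header))
```

The design choice here is to anchor the expression with ^ and $ so that every character of the header is checked, and to treat the no-cookie case separately instead of letting it fall into the rejection branch.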
I have multiple subdomains pointing to one Varnish instance. I read in the documentation that PCRE regex should be used. I believe the regex below should return true when the request URL is "http://internal.my.com/any/thing", so that the 15s TTL is set. I've also tried just (req.url ~ "internal.my.com"), because I read that it should match if any part of the request URL contains that string. The vcl_fetch subroutine below always results in a 300s cache lifetime, despite the request being made to internal.my.com.
# Cache for a longer time if the internal.my.com URL isn't used
sub vcl_fetch {
    if (req.url ~ "^[(http:\/\/)|(https:\/\/)]*internal\.my\.com.*"){
        set beresp.ttl = 15 s;
    } else {
        set beresp.ttl = 300 s;
    }
}
Whoops... I should've used req.http.host instead of req.url. req.url contains only the path and query string, not the hostname. It was a simple misunderstanding, and correcting it produced the intended behavior.
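The difference between the two variables can be illustrated outside Varnish. This is a minimal sketch in Python, with the re module standing in for Varnish's PCRE engine. The variable names mirror the VCL ones, and the SIMPLE pattern is an assumed simplification, not something from the original post.

```python
import re

# Pattern copied from the vcl_fetch example above.
PATTERN = re.compile(r"^[(http:\/\/)|(https:\/\/)]*internal\.my\.com.*")

# For a request to http://internal.my.com/any/thing, Varnish sees:
req_url = "/any/thing"              # path and query string only
req_http_host = "internal.my.com"   # value of the Host header

url_matches = PATTERN.search(req_url) is not None
host_matches = PATTERN.search(req_http_host) is not None
print(url_matches, host_matches)

# Since the Host header never carries a scheme, a plain anchored pattern
# is enough (assumed simplification).
SIMPLE = re.compile(r"^internal\.my\.com$")
simple_matches = SIMPLE.search(req_http_host) is not None
print(simple_matches)
```

Note that the bracketed prefix in the original pattern is a character class, not an alternation of "http://" and "https://". It only happens to be harmless here because the * allows it to match nothing.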
I'm trying to disable caching in varnish for all subdomains. Our application allows users to create and manage their own website on a subdomain of our url, but varnish keeps caching their page when they're trying to edit it.
I know the basic format:
if (req.url ~ "[code here]") {
# Don't cache, pass to backend
return (pass);
}
but nothing I've tried seems to work for all subdomains.
Maybe it's a simple regex?
You can use req.http.host for this purpose. And yes, it can be a regex.
sub vcl_recv {
    /* your earlier definitions */
    if (req.http.host ~ "my\.subdomain\.example\.com") {
        // set the backend first
        set req.backend = localhost;
        return (pass);
    }
    /* your definitions */
}
In some cases you may need to return( pipe ):
https://www.varnish-cache.org/docs/2.1/faq/configuration.html
I think you need this to match any subdomain (note that this may be an issue if you use www, as it is also treated as a subdomain). It matches anything before the . in example.com:
sub vcl_recv {
    # Escape both dots and anchor the end so only hosts ending in .example.com match
    if (req.http.host ~ "\.example\.com$") {
        return (pass);
    }
}
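When matching hostnames, escaping and anchoring matter: an unescaped dot matches any character, and an unanchored pattern can match in the middle of a longer hostname. This is a minimal sketch in Python, with the re module standing in for Varnish's PCRE engine. The pattern names and sample hostnames are hypothetical.

```python
import re

# Unanchored, with an unescaped dot before "com": the dot matches any character.
LOOSE = re.compile(r".*\.example.com")

# Both dots escaped and the end anchored: the host must end in ".example.com".
ANCHORED = re.compile(r"\.example\.com$")

hosts = [
    "shop.example.com",
    "www.example.com",            # www counts as a subdomain too
    "example.com",                # bare domain, no subdomain
    "shop.exampleXcom.evil.net",  # unrelated host
]

for host in hosts:
    print(host, bool(LOOSE.search(host)), bool(ANCHORED.search(host)))
```

If www should still be cached, it would need to be excluded with a separate condition before the pass.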
I want to strip out values from a url in varnish, so I can take different actions based on the url, for example:
URL: /product/123/price/available.
I'd like to convert this to /products?id=123&sort=price&available=true
I would also like to be able to (if this is possible), set values on the request header, so instead of passing all the params on the URL, I could do the following:
/products?id=123&sort=price
with header: x-show-available-only: true
I appreciate the second example appears a bit odd, but this way we could pass new params back to our legacy application, and ensure none of the new params interfere with current params - we would just read new params via the header, until we migrate all our functionality to our new platform.
I'm sure it's a regex thing, but can't work out how to do it.
This is NOT tested, but should be close enough that you can tweak it to get it right.
Based on your URL:
/product/123/price/available
You'll need to specify two backends (change the IPs for your own ones):
backend old_app {
    .host = "127.0.0.1";
    .port = "80";
}
backend new_app {
    .host = "127.0.0.2";
    .port = "80";
}
And the code in your vcl_recv:
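if (req.url ~ "^/product/") {
    # Set the header before the rewrite removes "/available" from the URL.
    if (req.url ~ "/available$") {
        set req.http.x-show-available-only = "true";
    }
    # /product/123/price[/available] -> /products?id=123&sort=price
    set req.url = regsub(req.url, "^/product/([0-9]+)/([a-zA-Z]+)(/available)?$", "/products?id=\1&sort=\2");
    set req.backend = old_app;
} else {
    set req.backend = new_app;
}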
You can add more regex rules as if/else blocks.
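The mapping described in the question can be prototyped outside Varnish before committing it to VCL. This is a minimal sketch in Python, with the re module standing in for Varnish's PCRE engine and a plain dictionary standing in for request headers. The function name rewrite is hypothetical. The + quantifiers are there so that multi-digit IDs and multi-letter sort keys are captured.

```python
import re

# /product/<id>/<sort>[/available]
REWRITE = re.compile(r"^/product/([0-9]+)/([a-zA-Z]+)(/available)?$")


def rewrite(url):
    """Return (new_url, extra_headers) for a /product/... URL.

    URLs that do not match the expected shape are returned unchanged.
    """
    match = REWRITE.search(url)
    if match is None:
        return url, {}
    headers = {}
    if match.group(3):
        # Carry the flag in a header instead of the query string.
        headers["x-show-available-only"] = "true"
    new_url = "/products?id={}&sort={}".format(match.group(1), match.group(2))
    return new_url, headers


for url in ["/product/123/price/available", "/product/123/price", "/other"]:
    print(url, rewrite(url))
```

Capturing the optional /available segment in the same expression keeps the rewritten URL free of leftover path segments.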
Good luck!