varnish - regexp to edit X-Forwarded-For header inbound / outbound - regex

I am curious if anyone knows: is it possible to build a Varnish config which looks at inbound headers from clients, does a regexp match, and builds a certain header that is then passed on to the backend web server?
Context:
-- Varnish server deployed in front of a pool of 3 (Apache) web servers, acting as cache and load balancer
-- a 3rd-party DDoS-protection proxy service was dropped in front of Varnish recently
-- now all inbound requests to Varnish already have an X-Forwarded-For: header present, indicating the 'real' client IP address
-- the back-end PHP web app running on the Apache hosts wants to see only one IP address, not a 'proper' X-F-F header showing a comma-delimited list IP1, IP2, IP3 (external proxy; Varnish as proxy; original client IP - for example)
-- ideally, I would like Varnish to do a regexp pattern match and rebuild on the existing inbound X-F-F header, stripping away the external DDoS proxy IP address
I had tried something like this in the 'default.vcl' file,
set req.http.X-Forwarded-For = regsub(req.http.X-Forwarded-For, "123.123.123.123"," ");
where 123.123.123.123 is the IP address of the external DDoS proxy,
but this most certainly is not working / is not the correct syntax. I'm having trouble finding good resources online that explain regexp in Varnish, or clear examples of what people actually use.
If anyone can provide a kick in the right direction, it would certainly be greatly appreciated.
Thanks for the comment below - I have adjusted my config, which now reads as follows:
if (req.http.X-Forwarded-For) {
    set req.http.X-Forwarded-For =
        regsub(req.http.X-Forwarded-For, "\, 123\.123\.123\.123", "");
} else {
    set req.http.X-Forwarded-For = client.ip;
}
and now I find in my apache logs, for example,
client.ip.real.here, 123.123.123.123 - - [22/Dec/2013:07:45:17 -0800] "GET /PATH/TO/STUFF" "Mozilla/5.0.... like Gecko"
which suggests to me the regexp is not actually doing what I thought it would. Is there any way to 'test' Varnish interactively, i.e., feed a given VCL rule set some defined input and get Varnish to emit the resulting output - a kind of 'dry run' processing as a way to facilitate debugging?
Thanks!
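For what it's worth, both the stripping rule and the 'dry run' question can be exercised with varnishtest, which ships with Varnish and runs a scripted client and backend against a throwaway Varnish instance. A sketch, assuming a modern Varnish (4+, where client.ip is appended to X-Forwarded-For before vcl_recv runs) and keeping only the first, left-most entry - the original client; the IPs here are examples, not from the question:

```
varnishtest "keep only the first X-Forwarded-For entry"

server s1 {
    rxreq
    # the backend should see a single IP: the original client
    expect req.http.X-Forwarded-For == "198.51.100.7"
    txresp
} -start

varnish v1 -vcl+backend {
    sub vcl_recv {
        if (req.http.X-Forwarded-For) {
            set req.http.X-Forwarded-For =
                regsub(req.http.X-Forwarded-For, "^([^,]+).*", "\1");
        }
    }
} -start

client c1 {
    # simulate the DDoS proxy's header: real client first, proxy appended
    txreq -url "/" -hdr "X-Forwarded-For: 198.51.100.7, 123.123.123.123"
    rxresp
} -run
```

Run it with `varnishtest file.vtc`; it reports the test as passed only when the backend-side expectation holds, which gives you exactly the defined-input / defined-output loop asked about above.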

Related

Cannot setup Google Cloud CDN for an external website?

I am following the instructions at https://cloud.google.com/cdn/docs/setting-up-cdn-with-ex-backend-internet-neg and https://medium.com/the-innovation/how-to-enable-google-cdn-for-custom-origin-websites-google-cdn-for-external-websites-56e3fe66cca9 to set up Google Cloud CDN for my website www.datanumen.org.
For the "Fully qualified domain name and port" in "New network endpoint", I chose www.datanumen.org.
All other settings are the same as in the above two articles; I use the HTTP protocol for all communications. Finally I get a frontend IP address, 34.96.69.82. But when I try to visit http://34.96.69.82/, I get a default "SORRY" web page instead of the contents of www.datanumen.org. Why?
Also, later I plan to update the DNS A record for www.datanumen.org so that it points to 34.96.69.82 instead of its current IP address. I am just curious: if I do that, then since what I put in "Fully qualified domain name and port" in "New network endpoint" is www.datanumen.org, will it cause the following loop:
a user visits www.datanumen.org;
based on the DNS A record, he goes to 34.96.69.82 (the frontend);
the frontend requests data from the backend, and the endpoint is www.datanumen.org;
based on the DNS A record, the backend endpoint also resolves to 34.96.69.82.
Won't this cause an infinite loop?
Update:
For the 1st question, I found the solution. My website is hosted on a server with a shared IP. The article https://cloud.google.com/cdn/docs/setting-up-cdn-with-ex-backend-internet-neg asks me to add a "Host" request header, which is used to identify the actual site to be accessed when the request reaches the origin server. In my previous configuration I thought this step was unnecessary, so I skipped it. After adding the "Host" field, I can now visit my website properly via the IP address given by Google.
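That fix can be reproduced from the command line. A sketch, using the IP and hostname from the question: send the request to the load balancer's address but set the Host header explicitly, so the shared-IP origin knows which site to serve.

```shell
# 200 here means the origin served the right site; without the Host header
# a shared-IP server typically falls back to its default ("SORRY") page
curl -s -o /dev/null -w "%{http_code}\n" \
     -H "Host: www.datanumen.org" \
     http://34.96.69.82/
```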
You already fixed your first issue, so you're at a point where you can successfully access your site using an IP address.
Right now all you need is CDN enabled - if it's not, you can do that in the "backend" section of the "load balancing" page.
Have a look also at the Network Endpoint Groups documentation to see how to create a load balancer utilising them (the only way you can use GCP's CDN for an external site).
To answer your second question: there will be no loop - your site works properly.
Since you're not using HTTPS (only HTTP), you don't have to worry about SSL certificates; the only thing that remains is to point your domain at your load balancer's IP and you're done.
If you encounter any issues or just want to check if CDN is working correctly then have a look at CDN troubleshooting page.
The simplest way to verify it's working is to use curl - curl -s -D - -o /dev/null http://example.com/style.css - and check whether a Cache-Control line is present in the output:
HTTP/1.1 200 OK
Date: Tue, 16 Feb 2016 12:00:31 GMT
Content-Type: text/css
Content-Length: 1977
Cache-Control: max-age=86400,public
Via: 1.1 google
However, I recommend using HTTPS and SSL certificates for security reasons - it makes it much harder to spoof or eavesdrop on the traffic between the site and the client. It's not mandatory, though.

Direct IP Attacks, ElasticBeanstalk/NGINX

I have a bit of a problem with my site.
The setup is Elastic Beanstalk (NGINX) + Cloudflare,
but each day around 4 AM I get a direct-IP attack on my server:
around 300 requests in 1-2 minutes.
The bot tries to access resources like
GET /phpMyadmi/index.php HTTP/1.1
GET /shaAdmin/index.php HTTP/1.1
POST /htfr.php HTTP/1.1
For now all of them go to ports 80 or 8080
and are successfully handled by the Nginx configuration that redirects them to example:443:
server {
    listen 80 default_server;
    listen 8080 default_server;
    server_name _;
    return 301 https://example.com$request_uri;
}

server {
    listen 443 ssl;
    server_name example.com;
    ssl on;
    ...
So my questions are:
Have many site owners/DevOps engineers faced the same attack? What do you do to prevent such attacks?
For now it is handled very well and does not affect the server's operation - should I worry about it, or just filter the logs on the /phpmy/ pattern and forget about it?
Before these attacks I saw requests with the PROPFIND method - should I block it for security reasons? It is handled by the default server for now.
I know that I can use Cloudflare Argo Tunnel or ELB + WAF, but I don't really want to do that for now.
I have found one solution on Stack Overflow: whitelisting all Cloudflare IPs. But I don't think it is a good one.
Another solution that should work, I guess, is to check the Host header and compare it with 'example.com'.
To answer your specific questions:
Every public IP receives unwanted traffic like you describe; sadly it's pretty normal. This isn't really an attack as such - it's just a bot looking for signs of specific weaknesses, or otherwise trying to provoke a response that contains useful data. That data is no doubt later used in actual attacks, but it's basically automated reconnaissance on a potentially massive scale.
This kind of script likely isn't trying to do any damage, so as long as your server is well configured and fully patched it's not a big concern. However, these kinds of scans are the first step towards launching an attack - identifying services and application versions with known vulnerabilities - so it's wise to keep your logs for analysis.
You should follow the principle of least privilege. PROPFIND is related to WebDAV - if you don't use it, disable it (or, better, whitelist the verbs you do support and ignore the rest).
If your site is already behind Cloudflare then you really should firewall access to your IP so only Cloudflare's IPs can talk to your server. Those IPs do change, so I would suggest a script that downloads the latest list from https://www.cloudflare.com/ips-v4 and periodically updates your firewall. There's a slightly vague help article from Cloudflare on the subject here: https://support.cloudflare.com/hc/en-us/articles/200169166-How-do-I-whitelist-Cloudflare-s-IP-addresses-in-iptables-
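The suggested refresh script could look something like this - a sketch only; iptables, the INPUT chain, and HTTPS on port 443 are assumptions, not details from the answer. It prints the rules rather than applying them, so you can review before piping to sh:

```shell
#!/bin/sh
# Fetch Cloudflare's published IPv4 ranges and emit one ACCEPT rule per range,
# followed by a catch-all DROP for everything else hitting the web port.
curl -s https://www.cloudflare.com/ips-v4 | while read -r range; do
    echo "iptables -A INPUT -p tcp -s $range --dport 443 -j ACCEPT"
done
echo "iptables -A INPUT -p tcp --dport 443 -j DROP"
```

Review the output, pipe it to `sh` as root to apply, and schedule the script from cron since the published list changes occasionally.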
If for whatever reason you can't firewall the IP, your next best option is something like fail2ban (www.fail2ban.org) - a log parser that can manipulate the firewall to temporarily or permanently block an IP address based on patterns found in your log files.
A final thought - I'd advise against redirecting from your IP to your domain name: you're telling the bots/hackers your URL, which they can then use to bypass the CDN and attack your server directly. Unless you have some reason to allow HTTP/HTTPS traffic to your IP address, return a 4XX (maybe nginx's 444, "Connection Closed Without Response") instead of redirecting when requests hit your IP. You should then create a separate server block to handle your redirects, but have it respond only to the genuine named URLs.
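The "close instead of redirect" idea might look like this in Nginx - a sketch only; the certificate paths are placeholders, and 444 is Nginx-specific (it closes the connection without sending any response):

```nginx
# Catch-all for requests that arrive by bare IP or an unknown Host header:
# close the connection, revealing nothing about the real site.
server {
    listen 80 default_server;
    listen 443 ssl default_server;
    server_name _;
    # a certificate is still needed to accept the TLS handshake on 443;
    # a self-signed one is fine since we never send a response
    ssl_certificate     /etc/nginx/dummy.crt;
    ssl_certificate_key /etc/nginx/dummy.key;
    return 444;
}

# Redirects happen only for requests that name the genuine domain.
server {
    listen 80;
    server_name example.com;
    return 301 https://example.com$request_uri;
}
```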

Create an AWS Application Load Balancer rule which trims off the request's prefix, without using an additional reverse proxy like nginx or httpd

Basically, I have a couple of services. I want to forward every request with the prefix "/secured" to server1 port 80 and all other requests to server2 port 80. The problem is that server1 runs a service which accepts requests without the "/secured" prefix. In other words, I want to forward a request such as "http://example.com/secured/api/getUser" to server1 as "http://example.com/api/getUser" (removing /secured from the request's path).
With AWS ALB, currently the request is sent as http://example.com/secured/api/getUser, which forces me to update server1's code to handle requests with the /secured prefix, which doesn't look good.
Is there any easy way to solve this with the ALB?
Thanks.
I can confirm that this is unfortunately not possible with the ALB alone - and I agree it really should be.
AWS states:
"Note that the path pattern is used to route requests but does not alter them. For example, if a rule has a path pattern of /img/*, the rule would forward a request for /img/picture.jpg to the specified target group as a request for /img/picture.jpg."
I had the same issue and, as Mark pointed out, you can use a reverse proxy on your server and do something like this (this is an Nginx configuration):
server {
    listen 80 default_server;

    location /secured/ {
        proxy_pass http://localhost:{service_port}/;
    }
}
This will strip the /secured part and proxy everything else to your service. Just be sure to keep the trailing / after the service port.

How to tell AWS application load balancer to not forward the path pattern?

I have configured my AWS application load balancer to have the following rules:
/images/* forward to server A (https://servera.com)
/videos/* forward to server B (https://serverb.com)
And this is correctly forwarding to the respective servers. However, I don't want the load balancer to forward the requests as https://servera.com/images and https://serverb.com/videos. I just want the respective servers to be hit without the path pattern, as https://servera.com and https://serverb.com.
I don't want to modify my request parameters or change my server side code for this. Is there a way I can tell the application load balancer to not forward the path patterns?
Is there a way I can tell the application load balancer to not forward the path patterns?
No, there isn't. It's using the pattern to match the request, but it doesn't modify the request.
I don't want to modify my request parameters or change my server side code for this.
You'll have to change something.
You shouldn't have to change your actual code. If you really need this behavior, you should be able to accomplish it using the web server configuration - an internal path rewrite before the request is handed off to the application by the web server should be a relatively trivial reconfiguration in Nginx, Apache, HAProxy, or whatever is actually listening on the instances.
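Such an internal rewrite might look like this in Nginx - a sketch under assumptions: the application listens locally on port 8080, and the /images prefix from the question is the one to strip.

```nginx
server {
    listen 80;

    # the ALB forwards /images/picture.jpg unchanged; strip the prefix
    # before proxying so the app sees /picture.jpg
    location /images/ {
        rewrite ^/images/(.*)$ /$1 break;
        proxy_pass http://127.0.0.1:8080;
    }
}
```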
Also, it seems to me that you are making things difficult for yourself by wanting the server to respond to a path different from what the browser requests. Such a configuration tends to make it harder to ensure correct test results and correct handling of relative and absolute paths, since the application will have an inaccurate internal representation of what the browser is requesting or will need to request.

Browser DNS behaviour - multiple IP addresses for single domain

I have the following problem and I am struggling to find out whether a solution exists for it, or what the best practice is here.
I have a site, example.com, and multiple servers with different IP addresses around the world. I am seeing the following behaviour in my browser (Chrome) - for simplicity let's say I only have 2 IP addresses for now.
I connect to example.com and data is served from IP address A.B.C.D (server 1). After 40 seconds or so, any subsequent request (GET/POST) to example.com resolves to W.X.Y.Z (server 2). My issue is that I have a cookie-based web session on server 1, and server 2 knows nothing about that session. There is no kind of back-end replication I can do to sync state between the two servers.
Is there any way I can force the browser to keep connecting to a single server once that server has served the first page? I am using round-robin DNS with multiple A records at the moment. Would switching to a CNAME solve this problem?
One solution I was thinking of was having each server reply with a configured domain in the HTTP headers (e.g. server1 would reply with X-HEADER: server1.example.com, server2 with X-HEADER: server2.example.com) and then forcing the browser to make requests to those. I would then have a single IP address for server1.example.com and another for server2.example.com. Does this break the same-origin policy, though? If I am on example.com, can I send GET/POST/PUT etc. to server1.example.com?
I'd really appreciate any advice on this - I'm so confused!
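The per-server-hostname idea above boils down to pinning a base URL from a response header. A minimal sketch of the client-side half (the header name X-HEADER and the hostnames are the question's own examples; note that requests from a page on example.com to server1.example.com are cross-origin, so the pinned server would also need to send CORS headers such as Access-Control-Allow-Origin):

```javascript
// Given the current base URL and the value of the X-HEADER response header
// (or null when absent), return the base URL to use for subsequent requests.
function pinBase(currentBase, xHeaderValue) {
  if (xHeaderValue) {
    return "https://" + xHeaderValue; // e.g. "https://server1.example.com"
  }
  return currentBase;
}

console.log(pinBase("https://example.com", "server1.example.com"));
console.log(pinBase("https://server1.example.com", null));
```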