HAProxy 1.6+: rewrite host based on path - regex

I'm trying to redirect all requested of type:
static.domain.com/site1/resource.jpg
static.domain.com/site1/resource2.js
static.domain.com/site2/resource3.gif
static.domain.com/site2/someDir/resource4.txt
to
site1.domain.com/resource.jpg
site1.domain.com/resource2.js
site2.domain.com/resource3.gif
site2.domain.com/someDir/resource4.txt
Basically, if the host is static.domain.com:
New subdomain is based on the the first part of the original path, with same TLD
New path is the original path not including the first part
I am pretty sure regexps can solve this, just not sure how to modify one header based on another..

At first I thought this might work:
# Detect hosts of the format static.*
acl host_static hdr_beg(host) -i static.
# Style using reqirep
# -------------
# Replace "static.domain.com" with "someFolder.domain.com" if the host is static.* and the path has at least two / symbols
# This causes: static.domain.com ===> whatever3.domain.com
#reqirep ^([^\ :]*\ /)([^/]+)(/.*\n)(^(?:[a-zA-Z0-9()\-=\*\.\?;,+\/&_]+:\ .+\n)+)*Host:\ static\.([^/]+?)$ \1\2\3\4Host:\ \2.\5 if host_static
#
# Replace "/someFolder/" with "/" at the beginning of any request path, if the host is static.*
# This causes: /whatever3/another/long/path ===> /another/long/path
#reqirep ^([^\ :]*)\ /[^/]+/(.*) \1\ /\2 if host_static
#---------------
but it doesn't work as expected. The regexp works properly in controlled tests, but not in haproxy itself. Probably an issue of directive processing and execution order. (perhaps the modification of the request path screws the first regexp?)
I then tried this:
# Style using set-var, set-path etc
#---------------
#http-request set-var(req.first_path_part) path,field(2,/) if host_static
#http-request set-var(req.last_host_part) hdr(host),regsub(^static\.,) if host_static
#http-request replace-header Host .* %[var(req.first_path_part)].%[var(req.last_host_part)] if host_static
#http-request set-path %[path,regsub(^/.*?/,/)] if host_static
#---------------
Once again, it almost works, but for some reason the host doesn't get replaced properly.
Since this was only used by the QA env, and the behaviour is different from Production anyways (static.*, in my case, would point to a CDN), I decided this is a sufficient solution for now:
# New style, using set-var and redirection.
#---------------
http-request set-var(req.first_path_part) path,field(2,/) if host_static
http-request set-var(req.last_host_part) hdr(host),regsub(^static\.,) if host_static
http-request redirect location https://%[var(req.first_path_part)].%[var(req.last_host_part)]%[path,regsub(^/.*?/,/)] code 302 if host_static
#---------------

I'm not sure how HAProxy works, but I can help you with the regex.
Try: ^static\.([^/]+)/([^/]+)/(.*)$
Your new URL will be \2.\1/\3.
Note that you may need to escape the /s in the regex (which would make it \/).

Related

Regex for "wp-admin" "wp-login" entries in syslog trying on drupal sites

I am looking for a fail2ban regex (or two) to find the wp-admin and wp-login attemps on drupal sites.
The regex should find "drupal:" and "page not found" and ("wp-admin" or "wp-login")
the problem for me are the "and" conditions
The logfile entries:
Apr 7 10:59:23 webserver drupal: https://www.anywebsite.com|1617785962|page not found|123.456.789.112|https://www.anywebsite.com/wp-admin/admin-ajax.php?action=revslider_show_image&img=../wp-config.php|https://anywebsite.com/wp-admin/admin-ajax.php?action=revslider_show_image&img=../wp-config.php|0||wp-admin/admin-ajax.php
Apr 7 06:53:47 webserver drupal: https://www.anywebsite.com|1617771227|page not found|123.456.789.112|https://www.anywebsite.com/wp/wp-login.php||0||wp/wp-login.php
Here you go:
failregex = ^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>
replace <ADDR> with <HOST> for fail2ban versions before v.0.10
WARNING Note that this assumes that first URI in your log-line (site? referrer?) after drupal: never contains a pipe-character (so an intruder is unable to add it to URI somehow to avoid ban). Otherwise it becomes complex (you must anchor it from both sides or write some conditional REs with lookaheads or lookbehinds).
Also note that if your side can make some 404 for legitimate users (because missing some references etc), you have to add to the RE some precise pattern excluding your missing pages to avoid false positives, e. g. something like this (with blacklisting expressions):
_block_uris = wp-admin|(?:wp/)wp-login
failregex = ^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>\|\w+://[^/]+/(?:%(_block_uris)s)
or (with white-listing expressions, here ignoring /my-page/ and my-site/ URIs):
_ignore_uris = my-page/|my-side/
failregex = ^\s*\S+ drupal: [^|]*\|\d+\|(?:page not found)\|<ADDR>\|\w+://[^/]+/(?!%(_ignore_uris)s)

Edit/Match a string n lines below some specific match in Bash?

I have a complex problem. Below is the ndxconfig.ini file I want to Edit
# /etc/ndxconfig.ini will override this file
# if APP_ID is added in service propery, service discovery will be using marathon;
# HOST/PORT specified will override values retrieved from marathon
[MARATHON]
HOSTS = {{ ','.join(groups['marathon'])}}
PORT = 8080
PROTOCOL = http
SECRET = SGpQIcjK2P7RYnrdimhhhGg7i8MdmUqwvA2JlzbyujFS4mR8M88svI7RfNWt5rnKy4WHnAihEZmmIUb940bnlYmnu47HdUHE
[MYSQL]
; APP_ID = /neon/infra/mysql
HOST = {{keepalived_mysql_virtual_ip}}
PORT = 3306
SECRET = tIUFN1rjjDBdEXUsOJjPEtdieg8KhwTzierD48JsgDeYc84DD6Uy5a6kzHfKolq1MNS1DKwlSqxENk33UulJd9DPHPzYCxFm
I want to change specifically marathon protocol conf from http to https. Not other's protocol conf. I have to match PROTOCOL = http 3 lines below the [MARATHON] line. I researched and couldn't find any solution. There's only 1 line below sed solutions.
One idea stuck mine was somehow specially grep [MARATHON] and 3 lines below and tail 1 line. I don't know.
How can fix this? Please Help.
Solution found here
sed '/\[MARATHON\]/{N;N;N;s/http/https/;}' <file>
If you have python available, you can use crudini:
crudini --set ndxconfig.ini MARATHON protocol https
This screams for ed, which treats the file as a whole, not processing it a line at a time like sed (And also doesn't depend on the availability of the non-posix -i extension to sed for in-place editing of files):
ed -s ndxconfig.ini <<'EOF'
/MARATHON/;/^$/ s/^PROTOCOL = http$/PROTOCOL = https/
w
EOF
This will replace the PROTOCOL line in the block that starts with a line matching MARATHON and ending with a blank line. The protocol entry can be the second, third, fourth, etc. line; doesn't matter. If the protocol is already https, it won't do anything (Except print a question mark)

How to set HTTP header difference by URL in Apache

I try to set Apache/2.4.25 (Debian) (via docker) form this post but that's not work for me.
Configuration file like.
Header set Test-1 %{THE_REQUEST}e
<If "%{REQUEST_URI} != '/en'">
Header set Test-2 %{REQUEST_URI}e
</If>
When call GET /en HTTP header is
Test-1: (null)
Test-2: /en
How do I fix it?
You need to use %{REQUEST_URI}e instead of %{THE_REQUEST}e.
You can also simplify this by using:
Header set Test-1 %{REQUEST_URI}e
SetEnvIf Request_URI "^/(?!en/?$)" NO_EN
Header set Test-2 %{REQUEST_URI}e env=NO_EN

Apache 2.4 setenvif dontlog pingdom.com

Many of webmasters use pingdom.com as a monitoring ping service.
But the problem is that /httpd/access_log is full of
208.64.28.194 - - [06/Aug/2015:12:20:22 -0500] "GET / HTTP/1.1" 200 2917 "-" "Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)"
I set
CustomLog "logs/access_log" combined env=!dontlog
and tried to get rid of it using variations like
SetEnvIf Remote_Host "^pingdom\.com$" dontlog
SetEnvIFNoCase Remote_Host "pingdom.com$" dontlog
SetEnvIfNoCase Referer "www\.pingdom\.com" dontlog
SetEnvIFNoCase Host "^pingdom.com$" dontlog
but still no a success with any of them - so thanks for any else hint to try.
I'll put my comment here as answer so anyone will find this easier.
Since one can see from the log file, the host name is not Pingdom.com but a part of the user agent string.
Solutions to try:
First be sure you have enabled the setenvif-module. Write the command
sudo apache2ctl -M | grep setenv
It should return something like "setenvif_module (shared)"
Then you can try setting by remote address
SetEnvIf Remote_Addr "208\.64\.28\.194$" dontlog
The final working solution is this, dont log if the user agent string contains Pingdom string:
SetEnvIfNoCase User-Agent "^Pingdom" dontlog
Edit: enhanced some parts of the answer.

Same regex behaving differently in Apache and Nginx

I'm trying to covert 5G Blacklist to from Apache(.htaccess) to Nginx(.conf). There is a line in .htaccess that is causing problem:
<IfModule mod_alias.c>
RedirectMatch 403 (\,|\)\+|/\,/|\{0\}|\(/\(|\.\.\.|\+\+\+|\||\\\"\\\")
</IfModule>
I have converted it to .conf as follows:
Code included in http block
map $request_uri $bad_uri {
default 0;
"~*(\,|\)\+|/\,/|\{0\}|\(/\(|\.\.\.|\+\+\+|\||\\\"\\\")" 1;
}
Code included in server block
if ($bad_uri) {
return 403;
}
As far as I know both Apache and Nginx use perl regex so no change should be required when converting from former to the latter. However, following URI is giving 403 on Nginx but working fine on Apache:
www.example.com/some,url,with,commas
www.example.com/?q=some,url,with,commas
Finally found the issue.
In Apache RedirectMatch matches only the url without query string whereas $request_uri in nginx maps to url with query string.
So the correct code for Nginx is:
map $uri $bad_uri {
default 0;
"~*(\,|\)\+|/\,/|\{0\}|\(/\(|\.\.\.|\+\+\+|\||\\\"\\\")" 1;
}