Apache 2.4 setenvif dontlog pingdom.com - regex

Many webmasters use pingdom.com as a monitoring ping service.
The problem is that /httpd/access_log fills up with entries like
208.64.28.194 - - [06/Aug/2015:12:20:22 -0500] "GET / HTTP/1.1" 200 2917 "-" "Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)"
I set
CustomLog "logs/access_log" combined env=!dontlog
and tried to get rid of it using variations like
SetEnvIf Remote_Host "^pingdom\.com$" dontlog
SetEnvIFNoCase Remote_Host "pingdom.com$" dontlog
SetEnvIfNoCase Referer "www\.pingdom\.com" dontlog
SetEnvIFNoCase Host "^pingdom.com$" dontlog
but none of them worked. Thanks for any other hints to try.

I'll post my comment here as an answer so that it is easier to find.
As one can see from the log file, Pingdom.com is not the host name; it is part of the user agent string.
Solutions to try:
First make sure the setenvif module is enabled. Run the command
sudo apache2ctl -M | grep setenv
It should return something like "setenvif_module (shared)"
Then you can try matching on the remote address:
SetEnvIf Remote_Addr "208\.64\.28\.194$" dontlog
The final working solution is this: don't log the request if the user agent string begins with Pingdom:
SetEnvIfNoCase User-Agent "^Pingdom" dontlog
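To see why only the last two rules fire, each pattern can be tested against the field it targets. This is a rough sketch that uses Python's re module as a stand-in for Apache's regex engine. The field values come from the log line in the question, except the Host value, which is hypothetical, and the remote host, which is assumed to be the bare IP address because the log shows no resolved name.

```python
import re

# Request fields for the logged Pingdom hit.
request = {
    "remote_host": "208.64.28.194",  # assumption: no reverse DNS name available
    "remote_addr": "208.64.28.194",
    "referer": "",                   # logged as "-"
    "host": "www.example.com",       # hypothetical, not shown in the log line
    "user_agent": "Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)",
}

# (field, pattern, case-insensitive?) for each rule from the question and the answer.
rules = [
    ("remote_host", r"^pingdom\.com$", False),
    ("remote_host", r"pingdom.com$", True),
    ("referer", r"www\.pingdom\.com", True),
    ("host", r"^pingdom.com$", True),
    ("remote_addr", r"208\.64\.28\.194$", False),
    ("user_agent", r"^Pingdom", True),
]


def rule_matches(field, pattern, nocase):
    """Return True if the pattern is found in the given request field."""
    flags = re.IGNORECASE if nocase else 0
    return re.search(pattern, request[field], flags) is not None


results = [rule_matches(*rule) for rule in rules]
for (field, pattern, _), matched in zip(rules, results):
    print(f"{field:12} {pattern:22} -> {'dontlog' if matched else 'no match'}")
```

The four rules from the question look at fields that never contain the Pingdom string, so they cannot match no matter how the regex is written.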

Related

blockinfile Ansible module does not insert at the given regex

Below is the test.conf where I wish to add a block before the closing tag, i.e. before the line which starts with </VirtualHost>
cat test.conf
#
##<VirtualHost _default_:443>
<VirtualHost *:443>
#ProxyPreserveHost On
</VirtualHost>
Below is my playbook to add the block:
cat /tmp/test.yml
---
- name: "Play 1"
  hosts: localhost
  tasks:
    - name: Debug
      blockinfile:
        path: "/tmp/test.conf"
        marker: "#"
        state: present
        block: |
          <FilesMatch "^.*\.(css|html?|js|pdf|txt|xml|xsl|gif|ico|jpe?g|png)$">
          Require all granted
          </FilesMatch>
        insertbefore: '^[^#]*</VirtualHost>'
I checked my test.conf and the regex ^[^#]*<\/VirtualHost> with the online regex tester https://regex101.com (Python flavor) and it matches the correct line.
The file gets changed and the block gets inserted, but in the wrong place, as you can see below:
TASK [Debug] ************************************************************************************************************************************************
changed: [localhost]
PLAY RECAP **************************************************************************************************************************************************
localhost : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
cat /tmp/test.conf
#
<FilesMatch "^.*\.(css|html?|js|pdf|txt|xml|xsl|gif|ico|jpe?g|png)$">
Require all granted
</FilesMatch>
#
##<VirtualHost _default_:443>
<VirtualHost *:443>
#ProxyPreserveHost On
</VirtualHost>
Can you please suggest what is wrong with my playbook and how to get this to work?
It's because ansible specifies in the fine manual that marker: is exactly what it says -- the way it knows where the managed blocks begin and end. Since you chose to use text that is found throughout your file but is unrelated to the managed block sections, ansible just shrugged its shoulders and gave GIGO.
They even have a dedicated warning about leaving out the magic {mark} template parameter from marker:
Using a custom marker without the {mark} variable may result in the block being repeatedly inserted on subsequent playbook runs.
If you change your marker: to even something like marker: "#*#*#*" it will start to work ... or at least will work once.
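The effect of a marker without {mark} can be illustrated with a small model. This is only a sketch of the documented behaviour ({mark} is replaced by BEGIN and END to form the two marker lines), not Ansible's actual source code, and the helper names are made up for the example.

```python
# Simplified model of blockinfile marker expansion; NOT Ansible's real code.
def expand_markers(marker, begin="BEGIN", end="END"):
    """Return the (begin, end) marker lines derived from the marker template."""
    return marker.replace("{mark}", begin), marker.replace("{mark}", end)


# Contents of test.conf from the question.
conf_lines = [
    "#",
    "##<VirtualHost _default_:443>",
    "<VirtualHost *:443>",
    "#ProxyPreserveHost On",
    "</VirtualHost>",
]


def marker_hits(marker):
    """Indexes of existing lines that would be mistaken for marker lines."""
    begin_marker, end_marker = expand_markers(marker)
    return [i for i, line in enumerate(conf_lines)
            if line in (begin_marker, end_marker)]


bad_hits = marker_hits("#")                            # line 0 looks like a marker
good_hits = marker_hits("# {mark} ANSIBLE MANAGED BLOCK")  # nothing collides
print(expand_markers("#"), bad_hits, good_hits)
```

With marker: "#" both marker lines are the same string, and that string already exists as the first line of the file, so the module treats that spot as an existing managed block and insertbefore is never consulted.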

How to set HTTP header difference by URL in Apache

I tried to configure Apache/2.4.25 (Debian, via Docker) following this post, but it does not work for me.
My configuration file looks like this:
Header set Test-1 %{THE_REQUEST}e
<If "%{REQUEST_URI} != '/en'">
    Header set Test-2 %{REQUEST_URI}e
</If>
When I call GET /en, the HTTP headers are:
Test-1: (null)
Test-2: /en
How do I fix it?
You need to use %{REQUEST_URI}e instead of %{THE_REQUEST}e.
You can also simplify this by using:
Header set Test-1 %{REQUEST_URI}e
SetEnvIf Request_URI "^/(?!en/?$)" NO_EN
Header set Test-2 %{REQUEST_URI}e env=NO_EN
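The negative lookahead in the SetEnvIf pattern is the part that excludes /en. A quick sketch in Python (used here only as a stand-in regex engine; the sample URIs are made up) shows which request paths would get NO_EN set:

```python
import re

# Same pattern as in the SetEnvIf line: a leading slash NOT followed by
# "en" or "en/" at the end of the string.
NO_EN = re.compile(r"^/(?!en/?$)")


def sets_no_en(request_uri):
    """Return True if the pattern matches, i.e. NO_EN would be set."""
    return NO_EN.search(request_uri) is not None


samples = ["/en", "/en/", "/english", "/fr", "/"]
flags = {uri: sets_no_en(uri) for uri in samples}
for uri, is_set in flags.items():
    print(f"{uri:10} NO_EN={'set' if is_set else 'unset'}")
```

Only /en and /en/ leave the variable unset, so the Test-2 header is skipped for exactly those two paths.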

HAProxy 1.6+: rewrite host based on path

I'm trying to redirect all requests of the form:
static.domain.com/site1/resource.jpg
static.domain.com/site1/resource2.js
static.domain.com/site2/resource3.gif
static.domain.com/site2/someDir/resource4.txt
to
site1.domain.com/resource.jpg
site1.domain.com/resource2.js
site2.domain.com/resource3.gif
site2.domain.com/someDir/resource4.txt
Basically, if the host is static.domain.com:
The new subdomain is based on the first part of the original path, with the same TLD
The new path is the original path without its first part
I am pretty sure regexps can solve this, I'm just not sure how to modify one header based on another.
At first I thought this might work:
# Detect hosts of the format static.*
acl host_static hdr_beg(host) -i static.
# Style using reqirep
# -------------
# Replace "static.domain.com" with "someFolder.domain.com" if the host is static.* and the path has at least two / symbols
# This causes: static.domain.com ===> whatever3.domain.com
#reqirep ^([^\ :]*\ /)([^/]+)(/.*\n)(^(?:[a-zA-Z0-9()\-=\*\.\?;,+\/&_]+:\ .+\n)+)*Host:\ static\.([^/]+?)$ \1\2\3\4Host:\ \2.\5 if host_static
#
# Replace "/someFolder/" with "/" at the beginning of any request path, if the host is static.*
# This causes: /whatever3/another/long/path ===> /another/long/path
#reqirep ^([^\ :]*)\ /[^/]+/(.*) \1\ /\2 if host_static
#---------------
but it doesn't work as expected. The regexp works properly in controlled tests, but not in haproxy itself. Probably an issue of directive processing and execution order. (perhaps the modification of the request path screws the first regexp?)
I then tried this:
# Style using set-var, set-path etc
#---------------
#http-request set-var(req.first_path_part) path,field(2,/) if host_static
#http-request set-var(req.last_host_part) hdr(host),regsub(^static\.,) if host_static
#http-request replace-header Host .* %[var(req.first_path_part)].%[var(req.last_host_part)] if host_static
#http-request set-path %[path,regsub(^/.*?/,/)] if host_static
#---------------
Once again, it almost works, but for some reason the host doesn't get replaced properly.
Since this was only used by the QA env, and the behaviour is different from Production anyways (static.*, in my case, would point to a CDN), I decided this is a sufficient solution for now:
# New style, using set-var and redirection.
#---------------
http-request set-var(req.first_path_part) path,field(2,/) if host_static
http-request set-var(req.last_host_part) hdr(host),regsub(^static\.,) if host_static
http-request redirect location https://%[var(req.first_path_part)].%[var(req.last_host_part)]%[path,regsub(^/.*?/,/)] code 302 if host_static
#---------------
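For reference, here is my understanding of what the three converters above compute, written as a Python sketch. It is not HAProxy code; the function name is made up, and it assumes field() counts from 1 and regsub without flags replaces only the first match.

```python
import re


def redirect_location(host, path):
    """Mimic the set-var / redirect rules for a static.* host."""
    # path,field(2,/): second "/"-separated field, counting from 1
    first_path_part = path.split("/")[1]
    # hdr(host),regsub(^static\.,): drop the leading "static."
    last_host_part = re.sub(r"^static\.", "", host, count=1)
    # path,regsub(^/.*?/,/): replace the first path segment with "/"
    new_path = re.sub(r"^/.*?/", "/", path, count=1)
    return f"https://{first_path_part}.{last_host_part}{new_path}"


print(redirect_location("static.domain.com", "/site1/resource.jpg"))
print(redirect_location("static.domain.com", "/site2/someDir/resource4.txt"))
```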
I'm not sure how HAProxy works, but I can help you with the regex.
Try: ^static\.([^/]+)/([^/]+)/(.*)$
Your new URL will be \2.\1/\3.
Note that you may need to escape the /s in the regex (which would make it \/).
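To check the regex outside HAProxy, here is a small Python sketch that applies it to the example URLs from the question. It assumes host and path are available as one string, which may not be how HAProxy exposes them.

```python
import re

# Group 1: rest of the host, group 2: first path part, group 3: remaining path.
PATTERN = re.compile(r"^static\.([^/]+)/([^/]+)/(.*)$")


def rewrite(url):
    """Move the first path part into the subdomain; leave other URLs untouched."""
    return PATTERN.sub(r"\2.\1/\3", url)


examples = [
    "static.domain.com/site1/resource.jpg",
    "static.domain.com/site1/resource2.js",
    "static.domain.com/site2/resource3.gif",
    "static.domain.com/site2/someDir/resource4.txt",
]
rewritten = [rewrite(url) for url in examples]
for before, after in zip(examples, rewritten):
    print(f"{before} -> {after}")
```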

Same regex behaving differently in Apache and Nginx

I'm trying to convert the 5G Blacklist from Apache (.htaccess) to Nginx (.conf). There is a line in .htaccess that is causing a problem:
<IfModule mod_alias.c>
RedirectMatch 403 (\,|\)\+|/\,/|\{0\}|\(/\(|\.\.\.|\+\+\+|\||\\\"\\\")
</IfModule>
I have converted it to .conf as follows:
Code included in http block:
map $request_uri $bad_uri {
    default 0;
    "~*(\,|\)\+|/\,/|\{0\}|\(/\(|\.\.\.|\+\+\+|\||\\\"\\\")" 1;
}
Code included in server block:
if ($bad_uri) {
    return 403;
}
As far as I know, both Apache and Nginx use Perl-compatible regular expressions, so no change should be required when converting from the former to the latter. However, the following URIs return 403 on Nginx but work fine on Apache:
www.example.com/some,url,with,commas
www.example.com/?q=some,url,with,commas
Finally found the issue.
In Apache, RedirectMatch matches only the URL path, without the query string, whereas $request_uri in nginx holds the original request URI including the query string.
So the correct code for Nginx is:
map $uri $bad_uri {
    default 0;
    "~*(\,|\)\+|/\,/|\{0\}|\(/\(|\.\.\.|\+\+\+|\||\\\"\\\")" 1;
}
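The difference can be reproduced outside either server. The sketch below uses Python's re and urllib.parse as stand-ins: the full string plays the role of $request_uri and the path component plays the role of $uri (nginx additionally normalizes $uri, which this sketch ignores).

```python
import re
from urllib.parse import urlsplit

# Same alternation as in the map block; re.IGNORECASE mirrors nginx's "~*".
BAD = re.compile(
    r'(\,|\)\+|/\,/|\{0\}|\(/\(|\.\.\.|\+\+\+|\||\\\"\\\")',
    re.IGNORECASE,
)


def is_blocked(value):
    """Return True if the blacklist pattern is found anywhere in value."""
    return BAD.search(value) is not None


request_uri = "/?q=some,url,with,commas"   # path plus query string
path_only = urlsplit(request_uri).path     # just "/"

print(is_blocked(request_uri))  # commas in the query string trigger the rule
print(is_blocked(path_only))    # the bare path does not
```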

Deploy Django with Gunicorn and APACHE

I have a Django project and I want to serve it using gunicorn (with Apache as a proxy).
I can't use Nginx, so that's not an option.
I've set up the Apache proxy and a runner script for gunicorn, but I am getting this weird error:
2012-08-27 14:03:12 [34355] [DEBUG] GET /
2012-08-27 14:03:12 [34355] [ERROR] Error handling request
Traceback (most recent call last):
  File "/home/tileone/venv/lib/python2.6/site-packages/gunicorn/workers/sync.py", line 93, in handle_request
    self.address, self.cfg)
  File "/home/tileone/venv/lib/python2.6/site-packages/gunicorn/http/wsgi.py", line 146, in create
    path_info = path_info.split(script_name, 1)[1]
IndexError: list index out of range
I am running this script
#!/bin/bash
LOGFILE=/var/log/gunicorn/one-project.log
VENV_DIR=/path/to/venv/
LOGDIR=$(dirname $LOGFILE)
NUM_WORKERS=5
# user/group to run as
USER=USER
GROUP=GROUP
BIND=127.0.0.1:9999
cd /path_to_project
echo 'Setup Enviroment'
#some libraries
echo 'Setup Venv'
source $VENV_DIR/bin/activate
export PYTHONPATH=$VENV_DIR/lib/python2.6/site-packages:$PYTHONPATH
#Setup Django Deploy
export DJANGO_DEPLOY_ENV=stage
echo 'Run Server'
test -d $LOGDIR || mkdir -p $LOGDIR
export SCRIPT_NAME='/home/tileone/one-project'
exec $VENV_DIR/bin/gunicorn_django -w $NUM_WORKERS --bind=$BIND \
    --user=$USER --group=$GROUP --log-level=debug \
    --log-file=$LOGFILE 2>>$LOGFILE
and my apache configuration is like this:
Alias /static/ /path_to_static/static/
Alias /media/ /path_to_static/media/
Alias /favicon.ico /path_to/favicon.ico
ProxyPreserveHost On
<Location />
SSLRequireSSL
ProxyPass http://127.0.0.1:9999/
ProxyPassReverse http://127.0.0.1:9999/
RequestHeader set SCRIPT_NAME /home/tileone/one-project/
RequestHeader set X-FORWARDED-PROTOCOL ssl
RequestHeader set X-FORWARDED-SSL on
</Location>
What am I doing wrong?
In case anyone has similar issues, I managed to fix this by removing the equivalent of:
RequestHeader set SCRIPT_NAME /home/tileone/one-project/
And instead adding to settings.py the equivalent of:
FORCE_SCRIPT_NAME = '/one-project'
Of course for this, the apache configuration should be more like:
ProxyPreserveHost On
<Location /one-project/>
SSLRequireSSL
ProxyPass http://127.0.0.1:9999/
ProxyPassReverse http://127.0.0.1:9999/
RequestHeader set X-FORWARDED-PROTOCOL ssl
RequestHeader set X-FORWARDED-SSL on
</Location>
The reason the fix proposed in the accepted answer works is that you need to choose between the following two approaches:
1. Let the HTTP server strip the location sub path BEFORE forwarding the request to the WSGI server. As explained above, this happens when both the Location and the ProxyPass directive end with a forward slash (Nginx behaves the same way). In this case you must not use the SCRIPT_NAME HTTP header or gunicorn environment variable, because gunicorn would try to strip that value from the incoming URL and fail, since the web server (Apache) has already removed it. You are then forced to use Django's FORCE_SCRIPT_NAME setting.
2. Pass the request URL to gunicorn unmodified (an example URL might be /one-project/admin) and use the SCRIPT_NAME HTTP header (or gunicorn environment variable). Gunicorn will then strip the value of SCRIPT_NAME from the URL before Django handles building the response.
I would prefer option 2, because you only need to change one file, the web server configuration, and all changes are neatly together.
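The IndexError from the question can be reproduced with the one expression shown in the traceback. This sketch only mirrors that single line from the old gunicorn version in the question; newer gunicorn releases may handle SCRIPT_NAME differently, and the helper name is made up.

```python
def strip_script_name(path_info, script_name):
    """Same expression as the failing line in gunicorn/http/wsgi.py above."""
    return path_info.split(script_name, 1)[1]


# Approach 1 gone wrong: Apache already stripped the prefix, so the incoming
# path is "/" and SCRIPT_NAME is not found in it.
try:
    strip_script_name("/", "/home/tileone/one-project")
    outcome = "ok"
except IndexError:
    outcome = "IndexError"

# Approach 2: the URL arrives unmodified and SCRIPT_NAME is a real URL prefix.
unmodified = strip_script_name("/one-project/admin", "/one-project")

print(outcome, unmodified)
```

Note that the SCRIPT_NAME in the question was a filesystem path rather than a URL prefix, so it could never have been found in the request path in the first place.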