Can apache be configured to ignore OPTIONS requests? - django

I run a small webapp for a couple of departments at work, which is very low traffic and doesn't have that many users. It's built on top of Django and uses apache as the web server.
I have things configured to email me when any errors occur which until yesterday was a great thing - there aren't many errors, but sometimes the users don't speak up when they encounter problems, so it allows me to stay on top of things.
Yesterday we had a new user, and I started getting tons of error emails. He had no idea that anything was wrong, so I figured it was something behind the scenes. When I looked at the logs, the errors came from HTTP OPTIONS requests using the "Microsoft Data Access Internet Publishing Provider Protocol" and "Microsoft Office Protocol Discovery". I'd never heard of these until that point, but it appears to be some sort of MS web folders/WebDAV thing.
One option is to figure out how he can turn that off and tell him to stop doing that, but I'd rather just cut the head off here and have apache not pass those requests on to Django. Is there a way that this can be handled?

The rewrite option is good, the 'Apache Way' is probably more like:
<LimitExcept GET POST>
deny from all
</LimitExcept>
or...
<Limit OPTIONS>
deny from all
</Limit>

I found a solution used by a different framework and ported it to Django. I place this at the top of any view that generates HTML with links to .XLS or .DOC files:
if request.method == 'OPTIONS':
    response = HttpResponse()
    response['Allow'] = 'GET, HEAD, POST'
    return response
I like the Apache solution better though... assuming it doesn't cause problems on the Windows side of things.
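If the site is served through WSGI (an assumption; the question does not say how Apache hands requests to Django), the same check could be applied once for all views by wrapping the WSGI application instead of repeating it in each view. This is only a sketch: `answer_options` and `ALLOWED_METHODS` are illustrative names, not Django APIs.

```python
# Assumed list of methods; it mirrors the Allow header in the view snippet above.
ALLOWED_METHODS = "GET, HEAD, POST"


def answer_options(app):
    """Wrap a WSGI app so that OPTIONS requests never reach it."""

    def wrapped(environ, start_response):
        if environ.get("REQUEST_METHOD") == "OPTIONS":
            # Reply with an empty 200 and an Allow header, like the view code.
            headers = [("Allow", ALLOWED_METHODS), ("Content-Length", "0")]
            start_response("200 OK", headers)
            return [b""]
        # Every other method is passed through unchanged.
        return app(environ, start_response)

    return wrapped
```

In a typical project this would probably go in wsgi.py, wrapping the object returned by django.core.wsgi.get_wsgi_application(). Because the request never reaches Django, no error email is generated for it.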

How about:
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^OPTION
RewriteRule .* - [F]
(With mod_rewrite enabled.)

Related

Redirect htaccess regex keeping same URL in browser

I am trying to redirect some pages on a Wordpress site. The pages would have this URL pattern:
domain.com/sponsored/something1/.../something2?par_t=param
But should be redirected to this one:
domain.com/sponsored/?par_t=param
So I need to remove some path segments from the address, but without updating the actual URL in the browser.
I have tried adding this rule and some others to the .htaccess but no luck so far:
RewriteRule ^/sponsored/([A-Za-z0-9]+)/?$ domain.com/sponsored/$2 [QSA]
Is this possible? Any idea on how this could be achieved?
Thanks!
Sounds pretty straightforward, this is probably what you are looking for:
RewriteEngine on
RewriteRule ^/?sponsored/(.+)/?$ /sponsored/ [END,QSA]
In case you receive an internal server error (http status 500) using the rule above then chances are that you operate a very old version of the apache http server. You will see a definite hint to an unsupported [END] flag in your http server's error log file in that case. You can either try to upgrade or use the older [L] flag, it probably will work the same in this situation, though that depends a bit on your setup.
This rule will work likewise in the http server's host configuration or inside a dynamic configuration file (".htaccess" file). Obviously the rewriting module needs to be loaded inside the http server and enabled in the http host. In case you use a dynamic configuration file you need to take care that its interpretation is enabled at all in the host configuration and that it is located in the host's DOCUMENT_ROOT folder.
And a general remark: you should always prefer to place such rules in the http server's host configuration instead of using dynamic configuration files (".htaccess"). Those dynamic configuration files add complexity, are often a cause of unexpected behavior, are hard to debug and they really slow down the http server. They are only provided as a last option for situations where you do not have access to the real http server's host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).

How to redirect part of a URL using regex?

So I am having a bit of trouble.
I have many products that share the same part of the URL, which I recently changed:
https://www.website.com/shop/category-sample/product1/
https://www.website.com/shop/category-sample/product2/
https://www.website.com/shop/category-sample/product3/
https://www.website.com/shop/category-sample/product4/
I need the "category-sample" to be "category"
So the new URLS would look like this:
https://www.website.com/shop/category/product1/
And etc.
Thank you!
Assuming that you are using the typical apache http server with loaded rewriting module this should do what you are looking for:
RewriteEngine on
RewriteRule ^/?shop/category-sample/(.*)$ /shop/category/$1 [R=301,QSA,END]
In case "category" actually is a dynamic value, not a fixed literal, this variant should do what you ask for:
RewriteEngine on
RewriteRule ^/?shop/(.+)-sample/(.*)$ /shop/$1/$2 [R=301,QSA,END]
That rule will work likewise in the http server's host configuration or in a dynamic configuration file (".htaccess" style file) if you have to use those.
If you receive an "internal server error" using those rules (http status 500) then chances are that you operate a very old version of the apache http server. Try using the L flag instead of the newer END flag then. You will find a corresponding hint in your http server's error log file in that case.
And a general remark: you should always prefer to place such rules in the http server's host configuration instead of using dynamic configuration files (".htaccess"). Those dynamic configuration files add complexity, are often a cause of unexpected behavior, are hard to debug and they really slow down the http server. They are only provided as a last option for situations where you do not have access to the real http server's host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).
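The two RewriteRule patterns above can be tried out before deploying them. The sketch below uses Python's re module, which handles these simple patterns the same way as the PCRE engine used by mod_rewrite. The helper names `rewrite_fixed` and `rewrite_dynamic` are made up for illustration, and the sample paths follow the ones in the question. Paths are tried with and without the leading slash because mod_rewrite strips it in ".htaccess" context, which is what the ^/? in the rules accounts for.

```python
import re

# Patterns copied from the two RewriteRule lines above.
FIXED = re.compile(r"^/?shop/category-sample/(.*)$")
DYNAMIC = re.compile(r"^/?shop/(.+)-sample/(.*)$")


def rewrite_fixed(path):
    """New path for the fixed-literal rule, or None if the rule does not match."""
    m = FIXED.match(path)
    return f"/shop/category/{m.group(1)}" if m else None


def rewrite_dynamic(path):
    """New path for the dynamic-category rule, or None if the rule does not match."""
    m = DYNAMIC.match(path)
    return f"/shop/{m.group(1)}/{m.group(2)}" if m else None


samples = (
    "/shop/category-sample/product1/",  # as seen in the host configuration
    "shop/category-sample/product2/",   # as seen in .htaccess context
    "/shop/other/product3/",            # should be left alone by both rules
)
for path in samples:
    print(path, "->", rewrite_fixed(path), "|", rewrite_dynamic(path))
```

This only checks the pattern and the substitution; the flags (R=301, QSA, END) still have to be verified against the real server.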

"Not Found: /406.shtml" from django

I'm running django with apache fcgi on a shared host. I've set it up to report 404 errors and keep seeing Not Found: /406.shtml via emails (I'm guessing the s is because it's https only). However I have error documents already set up in .htaccess:
ErrorDocument 406 /error/406.html
I was getting a bunch of similar 404 errors from django before setting up an ErrorDocument for each one, but it's still happening for 406. From a grep 406 through the apache error log I'm seeing an occasional 406 (not 404) error for 406.shtml, such as the following, but not nearly as often as django emails me:
[Fri ...] [error] [client ...]
ModSecurity: Access denied with code 406 (phase 1).
Pattern match "Mozilla ... AhrefsBot ...)" at REQUEST_HEADERS:User-Agent.
[file "/usr/local/apache/conf/mod_sec/mod_sec.hg.conf"] [line "126"]
[id "900165"]
[msg "AhrefsBot BOT Request"]
[hostname "www.myhostname.com"]
[uri "/406.shtml"]
[unique_id "..."]
I'm not even sure if this is apache redirecting internally to 406.shtml and it being forwarded on to django or if some bot is trying to find 406.shtml directly. The former seems to indicate a problem with ErrorDocument. The latter isn't really my problem, but then either I should be seeing a 404 for 406.shtml in the apache logs or nothing at all because django will handle the 404? How can I track it down further?
I haven't been able to reproduce the issue just by visiting my site, but I'd like to know what's going on.
You have ModSecurity installed in your Apache. It is a WAF (web application firewall) which attempts to protect your website from attacks, bots and the like. These, like email spam, are unfortunately part and parcel of running a website nowadays.
ModSecurity is an add-on module for Apache which allows you to define rules. It then runs each request against those rules and decides whether to block the request or not.
In this case a rule (900165, which is defined in the file "/usr/local/apache/conf/mod_sec/mod_sec.hg.conf") has decided to block this request with a 406 status based on the user agent (AhrefsBot).
Ahrefs is a website which crawls the web trying to build up a database of links. It's used by SEO people to see who links to your website (back links are very important to SEO), as Google (which you would think would be a better provider of this type of information) only gives samples of links rather than a full listing.
Is AhrefsBot a danger and should it be blocked? Well, that's a matter of opinion. Assuming it's really AhrefsBot (some nefarious bots might pretend to be it so as to look legitimate, so check the IP address to see the hostname it came from), then it's probably wasting your resources without doing you much good. On the other hand this is the price of an open web. Your website is available to the public and so also to those that write bots and tools (good or bad).
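Checking the hostname behind an IP address, as suggested above, can be done with a reverse DNS lookup. The following is a rough sketch using Python's standard socket module. The function names are illustrative, and the suffix ".crawler.example" in the usage note is a placeholder: the real domain has to come from the crawler operator's own documentation.

```python
import socket


def reverse_hostname(ip, resolver=socket.gethostbyaddr):
    """Return the reverse-DNS hostname for ip, or None if the lookup fails."""
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        return resolver(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


def looks_like(ip, expected_suffix, resolver=socket.gethostbyaddr):
    """True if the reverse-resolved hostname ends with expected_suffix.

    Pass the suffix with a leading dot (for example ".crawler.example")
    so that "evilcrawler.example" does not count as a match.
    """
    host = reverse_hostname(ip, resolver)
    if host is None:
        return False
    return host.lower().rstrip(".").endswith(expected_suffix.lower())
```

A call such as looks_like(client_ip, ".crawler.example") would then tell you whether the address reverse-resolves into the expected domain. Reverse DNS records are controlled by whoever owns the IP range, so a match is a hint rather than proof; resolving the returned hostname forward again and comparing it with the original address makes the check stronger.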
Why does it return a 406? Well, that's how your ModSecurity and/or your rule is defined. Check your Apache config. 406 is a little unusual, as you would normally expect a 403 (access denied) or 500 (internal server error).
What's the 406.shtml file? That I don't get. A .shtml file is an HTML file which also allows server-side includes to embed other files and code into the page. They are not used much any more, to be honest, as PHP and other languages are more common. There are several possibilities. It could be an attack, i.e. someone is attempting to upload a 406.shtml file and then cause it to be called so that it "executes" and includes the contents of other files, potentially giving access to files Apache can see which are not available through the webserver. Or the user has requested that file (for some reason), or Apache is configured to show it for 406 errors, or the ModSecurity rule is redirecting to that file.
Hopefully that gives a good bit of background. The best thing I can suggest is to go through your Apache config file, and any other config files it loads (including the mod_sec.hg.conf file, which it must load), to fully understand your setup and then decide if you need to do anything here.
You could do one of several things:
Leave as is. ModSecurity is doing what it was told to do and blocking this with a 406.
Turn off this rule and allow AhrefsBot through so you don't get alerted by this.
Alter the ModSecurity config/rule to return an error other than 406 so you can ignore it.
Turn off ModSecurity completely. I think it is a good tool and worthwhile, but it does take some time and effort to get the most out of it.
Set up the 406 error page properly. To do that you need to understand why it's attempting to return 406.shtml at the moment.
Also, I'm not sure which of these options are available to you, as you are on a shared host and might not have full access. If so, speak to your hosting provider for advice.

Deploy Django REST API to api.example.com: Apache 2.2, mod_wsgi and mod_rewrite

I have been searching for information on this topic for a couple days and I keep running into road blocks.
I have a Django web site and application running at www.example.com and I'm forcing HTTPS. It's deployed on Apache 2.2 with WSGI. This works fine and works for both example.com and www.example.com.
I also have a REST API (pip install djangorestframework) running at https://www.example.com/api/v1/. This also works fine.
I want to run the API from a subdomain https://api.example.com and keep this URL in the address bar. For example, to fetch JSON objects I might use something like this:
curl -X GET https://api.example.com/objects/ -H 'Authorization: Token xxx'
I can get this now by using this:
curl -X GET https://www.example.com/api/v1/objects/ -H 'Authorization: Token xxx'
I have a separate SSL certificate for this subdomain and this has been correctly configured.
I have tried many things in my Apache configuration to accomplish this but failed at every turn. I thought I could use mod_rewrite to silently fetch the content from https://www.example.com/api/v1/ while leaving https://api.example.com in the address bar. Is this possible? Here is what I've tried (in the sites-available virtual host file):
RewriteEngine on
RewriteCond %{HTTP_HOST} ^api.example.com [NC]
RewriteRule ^(.*)$ https://www.example.com/api/v1/$1 [L]
I have tried several variations of this idea to no avail. I played around with HTTPS on/off as well with no real benefit.
I read a couple places that using mod_proxy could accomplish this but when I went down this road, the API was available (after quite a bit of tweaking) at the desired URL (https://api.example.com) but none of my static content was there and when I clicked on a relative link in the Django REST Framework UI, I'd get 404s because it was looking at:
https://api.example.com/api/v1/
which Django complained about: /api/v1/api/v1/
I guess all I'm trying to do is make https://api.example.com the base URL for the API as if it were https://www.example.com/api/v1/.
Duplicate that led to the discovery of the django-hosts package:
Django subdomain configuration for API endpoints
I have been playing around with this and it shows promise, although I haven't "solved" my problem yet. I plan to edit this answer once I get more information to share. In the meantime, if anyone has used django-hosts to approach my original question, please add your answers here or at least make some comments!

How to abandon a request and not have to respond to it?

There are some bots that just keep hitting my site over and over again.
http://proxy.parser.by/proxy.php (Referrer)
I don't want to even reply to anyone that is requesting a .php, or .htm, or .html file.
What is the best way of not responding to such requests?
Update: (I don't want to incur the cost of responding)
This is probably best done at the server level, before the request even gets to Django. For example, in Apache, you can use mod_rewrite for access control. This rule rejects all requests with paths ending with .php, .htm, or .html:
RewriteRule \.(php|html?)$ - [F]
The Apache documentation explains how to block requests by user agent, by referer, by originating IP address, and so on.
Drop in some middleware which will detect any request you don't want to handle and have it return an HTTP 403 (Forbidden).
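As a rough sketch of the middleware idea, the wrapper below rejects the same paths as the RewriteRule above. Note that it is a plain WSGI wrapper around the application rather than a Django middleware class (a deliberate substitution, so that it runs without importing Django), and `forbid_unwanted` is an illustrative name.

```python
import re

# Same pattern as the RewriteRule above: paths ending in .php, .htm or .html.
UNWANTED = re.compile(r"\.(php|html?)$")


def forbid_unwanted(app):
    """Wrap a WSGI app; answer 403 for unwanted paths without calling the app."""

    def wrapped(environ, start_response):
        if UNWANTED.search(environ.get("PATH_INFO", "")):
            # Empty body keeps the cost of the rejection as low as possible.
            start_response("403 Forbidden", [("Content-Length", "0")])
            return [b""]
        return app(environ, start_response)

    return wrapped
```

The request still reaches Python and still gets a (very small) response, so the server-level rule remains the cheaper of the two approaches.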