Setting a header in apache - django

I'm trying to serve static files for download in a Django application. I figured that I'd put the static files in /media/files and have Apache set the Content-Type header to application/octet-stream (the files to download are going to be Word files, but I'll work out the details later).
To do this I activated mod_headers and then in the apache config did this:
<Location "/media/files">
Header set Content-Type "application/octet-stream"
</Location>
After doing this I restarted Apache and tried a sample file, but it doesn't work: I still get text/plain as the content type and the browser does not prompt me to download anything.
By the way, I know it is recommended to use a different web server for static files, but I don't have much control over the server I'm going to deploy to; it has to be Apache with mod_python only.

There could be any number of problems (it takes a lot more information than you've provided to trace down some apache config problems) but here are some thoughts:
Are you absolutely certain this snippet is being applied to the right files? For example, if there are multiple virtual hosts and you put it in the wrong one, it will never take effect.
Do you have rewriting going on that might prevent this from being seen as a match?
Are you setting the Content-Type header elsewhere?
Do you have content negotiation going on? Depending on how it is configured, that could override anything you do in the headers.
One thing you might try is to add some other header and see if it comes back. Also, try making the request yourself with telnet, or otherwise reducing the number of things between you and the server. Use the log files; they are there to help you. Good luck.
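One way to follow the "add some other header and see if it comes back" suggestion is to look at the raw response headers with as little as possible in between. Below is a minimal Python sketch, assuming the server is reachable from where the script runs. The function name fetch_headers and the X-Debug-Test header mentioned afterwards are made up for illustration.

```python
import http.client
from urllib.parse import urlsplit


def fetch_headers(url, timeout=5.0):
    """Send a HEAD request and return (status, headers).

    Header names are lowercased so lookups are case-insensitive.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        resp.read()  # a HEAD response has no body; this just finishes the exchange
        return resp.status, {name.lower(): value for name, value in resp.getheaders()}
    finally:
        conn.close()
```

For example, add a line such as Header set X-Debug-Test "1" (a hypothetical marker header) next to the Content-Type line in the Location block, restart Apache, and call fetch_headers on a file under /media/files. If x-debug-test is missing from the result, the Location block is probably not being applied to that request at all.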

Related

Can't Get Browser Caching Working on AWS Lightsail Bitnami Joomla System

I have followed several instructions here about editing the htaccess.conf file, as well as other suggestions that come up when searching for how to add Expires headers on this hosting system (Bitnami/Lightsail/AWS), but nothing seems to make a difference. GTMetrix doesn't seem to see the Expires headers in its PageSpeed or YSlow reports.
I'm using current versions of Joomla and RocketTheme's Gantry 5 Myriad theme. I am using RokBoost, have the Page Cache plugin enabled, and have these System Cache settings: Cache Handler: file, Path to Cache Folder: blank, Cache Time: 15, Platform Specific Caching: No, System Cache: Off.
Can anyone tell me how to get the expire headers working?
Thanks for any help you can give.
Bitnami Engineer here,
Depending on the results from the GTMetrix site, you will need to add different "ExpiresByType" lines to the htaccess.conf file. For example, if you want to expire the .jpg files, you will need to add something similar to this:
<Directory "/opt/bitnami/apps/joomla/htdocs">
## EXPIRES CACHING ##
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType image/jpeg "access 1 year"
</IfModule>
## EXPIRES CACHING ##
</Directory>
You will need to restart Apache after that:
sudo /opt/bitnami/ctlscript.sh restart apache
Please note that you can't set expiry for elements that you don't host. You can expire the .jpg images that exist on your server, but if you include images or any other element from another site, you can't do anything to control how they are cached.
We were chasing the wrong result. I was searching for a solution too, but here is what I found after comparing the GTMetrix test results with and without the Expires header code:
The Expires header code works, but only for internal files.
It does not work for external files (from other websites, such as the Google Analytics JS).
The test result will show you only the external files.
If you remove those lines, the result gets worse, and more files will be reported as having no expiry time.
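To check which of the files flagged in a report are actually under your control, the URLs can be split by host. This is only an illustrative Python sketch; the function name split_by_origin is made up, and it compares hostnames exactly, so www.example.com and example.com count as different hosts.

```python
from urllib.parse import urlsplit


def split_by_origin(page_url, resource_urls):
    """Partition resource URLs into (internal, external) relative to the page's host."""
    page_host = urlsplit(page_url).hostname
    internal, external = [], []
    for url in resource_urls:
        host = urlsplit(url).hostname
        # Relative URLs carry no host and are served by the page's own server.
        if host is None or host == page_host:
            internal.append(url)
        else:
            external.append(url)
    return internal, external
```

Only the URLs in the internal list can be affected by ExpiresByType lines on your own server; anything in the external list is cached according to whatever headers the other site sends.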

"Not Found: /406.shtml" from django

I'm running django with apache fcgi on a shared host. I've set it up to report 404 errors and keep seeing Not Found: /406.shtml via emails (I'm guessing the s is because it's https only). However I have error documents already set up in .htaccess:
ErrorDocument 406 /error/406.html
I was getting a bunch of similar 404 errors from django before setting up an ErrorDocument for each one, but it's still happening for 406. From a grep 406 through the apache error log I'm seeing an occasional 406 (not 404) error for 406.shtml, such as the following, but not nearly as often as django emails me:
[Fri ...] [error] [client ...]
ModSecurity: Access denied with code 406 (phase 1).
Pattern match "Mozilla ... AhrefsBot ...)" at REQUEST_HEADERS:User-Agent.
[file "/usr/local/apache/conf/mod_sec/mod_sec.hg.conf"] [line "126"]
[id "900165"]
[msg "AhrefsBot BOT Request"]
[hostname "www.myhostname.com"]
[uri "/406.shtml"]
[unique_id "..."]
I'm not even sure if this is apache redirecting internally to 406.shtml and it being forwarded on to django or if some bot is trying to find 406.shtml directly. The former seems to indicate a problem with ErrorDocument. The latter isn't really my problem, but then either I should be seeing a 404 for 406.shtml in the apache logs or nothing at all because django will handle the 404? How can I track it down further?
I haven't been able to reproduce the issue just by visiting my site, but I'd like to know what's going on.
You have ModSecurity installed in your Apache. It is a WAF (web application firewall) that attempts to protect your website from attacks, bots and the like. These, like email spam, are unfortunately part and parcel of running a website nowadays.
ModSecurity is an add-on module for Apache that lets you define rules; it then runs each request against those rules and decides whether to block the request or not.
In this case a rule (900165, which is defined in the file "/usr/local/apache/conf/mod_sec/mod_sec.hg.conf") has decided to block this request with a 406 status based on the user agent (AhrefsBot).
Ahrefs is a website that crawls the web to build up a database of links. It's used by SEO people to see who links to their websites (backlinks are very important to SEO), as Google (who you would think would be a better provider of this type of information) only gives samples of links rather than a full listing.
Is AhrefsBot a danger and should it be blocked? Well, that's a matter of opinion. Assuming it's really AhrefsBot (some nefarious bots might pretend to be it so as to look legitimate, so check the IP address to see the hostname it came from), then it's probably wasting your resources without doing you much good. On the other hand, this is the price of an open web. Your website is available to the public, and so also to those who write bots and tools (good or bad).
Why does it return a 406? Well, that's how your ModSecurity and/or your rule is defined. Check your Apache config. 406 is a little unusual, as you would normally expect a 403 (access denied) or 500 (internal server error).
What's the 406.shtml file? That I don't get. A .shtml file is an HTML file that also allows server-side includes, which embed other files and code into the HTML. They are not used much any more, to be honest, as PHP and other languages are more common. It could be an attack, i.e. someone attempting to upload a 406.shtml file and then cause it to be called so that it "executes" and includes the contents of other files, potentially giving access to files that Apache can see but that are not normally available through the web server. Or the user has requested it (for some reason), or Apache is configured to show that file for 406 errors, or the ModSecurity rule is redirecting to that file.
Hopefully that gives a good bit of background. The best thing I can suggest is to go through your Apache config file, and any other config files it loads (including the mod_sec.hg.conf file, which it must load), to fully understand your setup, and then decide if you need to do anything here.
You could do one of several things:
Leave as is. ModSecurity is doing what it was told to do and blocking this with a 406.
Turn off this rule and allow AhrefsBot through so you don't get alerted by this.
Alter the ModSecurity config/rule to return an error other than 406 so you can ignore it.
Turn off ModSecurity completely. I think it is a good tool and worthwhile, but it does take some time and effort to get the most out of it.
Set up the 406 error page properly. To do that you need to understand why it's attempting to return 406.shtml at the moment.
I'm also not sure which of these options are available to you, as you are on a shared host and might not have full access. If so, speak to your hosting provider for advice.
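To track it down further, it may help to count how often each rule fires and for which URI. The following Python sketch tallies the bracketed key/value fields from ModSecurity entries like the one quoted in the question. It assumes each entry has already been read as one string and that fields look like [name "value"]; the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches fields such as [id "900165"] or [uri "/406.shtml"].
FIELD_RE = re.compile(r'\[(\w+) "([^"]*)"\]')


def parse_modsec_fields(entry):
    """Return the [name "value"] fields of one log entry as a dict."""
    return dict(FIELD_RE.findall(entry))


def tally_rules(entries):
    """Count entries per (rule id, uri) pair, skipping entries without a rule id."""
    counts = Counter()
    for entry in entries:
        fields = parse_modsec_fields(entry)
        if "id" in fields:
            counts[(fields["id"], fields.get("uri", "?"))] += 1
    return counts
```

Comparing these counts with the number of emails Django sends would show whether the "Not Found: /406.shtml" reports line up with ModSecurity blocks or come from somewhere else.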

Can I force to display the file in browser rather than download it for a particular sub url?

Can I do this through javascript or modifying the HTTP header?
http://www.example.com/downloads/*
Any file served from under this URL should not download automatically; it should be displayed in the browser instead. Can I override the rules set by the browser? Can I also limit this to just this particular sub-URL?
Thank you.
What type of file are you working with?
This is controlled through the HTTP headers. Based on the MIME type, the browser decides whether to download or display the file. You can also force downloading. Knowing the file type will help.
For text files, set the content-type to text/plain. For JPEGs, set it to image/jpeg, and for PNGs set it to image/png. This should override any attachment values Django is setting.
You want to use the Content-Disposition header for this. It should settle any haggling over content-type.
http://www.ietf.org/rfc/rfc2183.txt
The default document type is declared in your server settings, not in how you link to the file. If you are using Apache, try looking in httpd.conf for
DefaultType text/plain
If it says something different, that may be your problem. With text/plain, files of unknown type should be fetched and viewed in the browser as text.
EDIT:
I don't know any way of modifying this behavior through javascript as it has to be in the header of the file being downloaded.
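Putting the answers together, the server needs to send a Content-Type the browser can render and a Content-Disposition of inline rather than attachment. Below is a small Python sketch that builds both header values from a filename; the helper name inline_headers is made up, and the filename quoting is deliberately simple (ASCII names only).

```python
import mimetypes


def inline_headers(filename):
    """Build headers asking the browser to display a file instead of downloading it."""
    content_type, _encoding = mimetypes.guess_type(filename)
    if content_type is None:
        # Unknown type: browsers will most likely still offer a download for this.
        content_type = "application/octet-stream"
    # Escape backslashes and double quotes inside the quoted filename.
    safe_name = filename.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "Content-Type": content_type,
        "Content-Disposition": 'inline; filename="%s"' % safe_name,
    }
```

In a Django view, these values could be copied onto the response with item assignment (response["Content-Disposition"] = ...). Whether the file is then actually displayed still depends on the browser being able to render that type.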

How can I do an HTTP redirect in C++

I'm making an HTTP server in C++. I notice that with Apache, if you request a directory without adding a forward slash at the end, Firefox still somehow knows that you are requesting a directory (which seems impossible for Firefox to know, which is why I'm assuming Apache is doing a redirect).
Is that assumption right? Does Apache check whether you are requesting a directory and then do an HTTP redirect to the same URL with the forward slash? If that is how Apache works, how do I implement that in C++? Thanks to anyone who replies.
Determine whether the resource is a directory; if so, reply with:
HTTP/1.X 301 Moved Permanently
Location: URI-including-trailing-slash
Using 301 allows user agents to cache the redirect.
If you wanted to do this, you would:
call stat on the pathname
determine that it is a directory
send the necessary HTTP response for a redirect
I'm not at all sure that you need to do this. Install the Firefox 'web developer' add-on to see exactly what goes back and forth.
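The three steps above can be sketched compactly. The version below is in Python rather than C++, purely to show the logic; os.path.isdir plays the role of calling stat and checking for a directory. The function name directory_redirect is made up, and the sketch assumes the request path is already decoded and does not guard against ".." in the path.

```python
import os


def directory_redirect(doc_root, request_path):
    """Return a raw 301 response if request_path names a directory but lacks
    a trailing slash; return None if no redirect is needed."""
    if request_path.endswith("/"):
        return None
    fs_path = os.path.join(doc_root, request_path.lstrip("/"))
    # Equivalent of stat() followed by an S_ISDIR check in C or C++.
    if not os.path.isdir(fs_path):
        return None
    location = request_path + "/"
    response = (
        "HTTP/1.1 301 Moved Permanently\r\n"
        "Location: " + location + "\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )
    return response.encode("ascii")
```

A server loop would call this before trying to serve the file and, if it returns bytes, write them to the socket as the whole response.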
Seriously, this should not be a problem. Suggestions for how to proceed:
Get the source code for Apache and look at what it does
Build a debug build of Apache and step through the code in a debugger in such a case; examine which pieces of code get run.
Install Wireshark (network analysis tool), Live HTTP Headers (Firefox extension) etc, and look at what's happening on the network
Read the relevant RFCs for HTTP - which presumably you should be keeping under your pillow anyway if you're writing a server.
Once you've done those things, it should be obvious how to do it. If you can't do those things, you should not be trying to develop a web server in C++.
The assumption is correct. Make sure your response includes a valid 301/302 status line and a Location header pointing to the URL with the trailing slash. This is not really a C++ question but an HTTP protocol question; since you are trying to write an HTTP server, read the RFC, as one of the other posts suggests.
You should install Fiddler and observe the HTTP headers sent by other web servers.
Your question is impossible to answer precisely without more details, but you want to send an HTTP 3xx status code with a Location header.

Differentiate nginx behaviour depending on URL

I have a Django application and I use nginx to serve static content. Unfortunately, all registered MIME types get displayed in the client browser, while I would like to offer the ability to download the same content alongside the usual behaviour. Say I have a JPEG file under /media/images/image01.jpg and I want nginx to serve this file in the usual way, with the standard image/jpeg header, but additionally I want the same image to be served by nginx with content-disposition: attachment (effectively forcing a download) when accessed as, say, /downloads/images/image01.jpg. Can anybody suggest a solution?
Make sure you have the http_headers_module compiled in (it should be by default, if it isn't part of the core).
Use "add_header content-disposition attachment;"
I recommend using a url like "/download?file=/downloads/images/image01.jpg" combined with a rewrite rule to avoid some annoying bug later.
Http Headers Module Documentation