Django - Get raw request path

How do I get the raw request path (everything after the host name and the port) in Django?
I tried request.get_full_path(), but it doesn’t work for some URLs.
For example, when the URL is http://localhost:8000/data/?, the result is /data/ instead of /data/?.
I know that the server receives the full string because it shows "GET /data/? HTTP/1.1" 200 642 in the terminal.

You can use request.build_absolute_uri().
According to the source code:
Builds an absolute URI from the location and the variables available in
this request. If no location is specified, the absolute URI is
built on request.get_full_path(). Anyway, if the location is
absolute, it is simply converted to an RFC 3987 compliant URI and
returned and if location is relative or is scheme-relative (i.e.,
//example.com/), it is urljoined to a base URL constructed from the
request variables.

Getting access to the raw URL depends on your setup. I'm not aware of a way to do this with Django's development server.
HttpRequest.META is described in the Django docs as follows:
A dictionary containing all available HTTP headers. Available headers depend on the client and server.
If you're running gunicorn, request.META["RAW_URI"] might give you what you need.
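A minimal sketch of that lookup, assuming the server copies the unparsed request target into the WSGI environ. RAW_URI is the gunicorn key mentioned above; REQUEST_URI is an assumption here, a key that some other servers set. The helper name raw_request_target is hypothetical and not part of Django.

```python
def raw_request_target(meta):
    """Return the unparsed request target if the server exposed it, else None.

    `meta` is a mapping such as request.META. Which keys exist depends on
    the server: RAW_URI (gunicorn) and REQUEST_URI (an assumption; set by
    some other servers) are tried in that order.
    """
    for key in ("RAW_URI", "REQUEST_URI"):
        value = meta.get(key)
        if value:
            return value
    return None


# Plain dicts stand in for request.META here.
print(raw_request_target({"RAW_URI": "/data/?"}))    # prints /data/?
print(raw_request_target({"PATH_INFO": "/data/"}))   # prints None
```

In a view this would be called as raw_request_target(request.META), falling back to request.get_full_path() when it returns None.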

Related

apache does not pass request to Django (404 not found)

I have a custom 404 page setup for the site, which works fine, like this:
When I hit mysite.com/fdsafsadfdsa, which doesn't exist, the custom 404 page shows up.
However, if I add a URL-encoded '/' ('%2f') at the end of the URL, as in mysite.com/fdsafsadfdsa%2f, I get the Apache 404 Not Found page instead.
It looks like Apache decides to handle this 404 itself instead of passing the request down to Django.
Does anybody have an idea why this is happening?
It turns out this is an issue in Apache/Nginx. Somebody has already reported it to the Django project; see https://code.djangoproject.com/ticket/15718
To quote from the ticket, there is a workaround:
After investigation I've found that the 2nd issue (404 error directly from apache) is not related to django and can be avoided by adding "AllowEncodedSlashes On" into apache config. Unfortunately apache replaces %2f with / itself, so the behavior is exactly the same as in simple http server provided by django. In Apache 2.2.18 (which is not released yet, i guess), AllowEncodedSlashes allows value NoDecode. With the value NoDecode, such URLs are accepted, but encoded slashes are not decoded but left in their encoded state. Meanwhile I'm using the workaround
request_uri = force_unicode(environ.get('REQUEST_URI', u'/'))
if u'?' in request_uri:
    path_info, query = request_uri.split('?', 1)
else:
    path_info, query = request_uri, ''
instead of original
path_info = force_unicode(environ.get('PATH_INFO', u'/'))
in core/handlers/wsgi.py
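The workaround quoted above uses force_unicode and u'' literals from older Django and Python 2 code. As a rough standalone sketch, the same split could be written in Python 3 as follows. The function name split_request_uri is hypothetical, and REQUEST_URI is only present when the server in front of Django sets it.

```python
def split_request_uri(environ):
    """Split the server-provided REQUEST_URI into (path_info, query).

    The path is returned exactly as received, so an encoded slash such as
    %2F stays encoded. Falls back to "/" when REQUEST_URI is missing.
    """
    request_uri = environ.get("REQUEST_URI", "/")
    # partition() splits on the first "?" only, like split('?', 1) above.
    path_info, _, query = request_uri.partition("?")
    return path_info, query


# A plain dict stands in for the WSGI environ here.
print(split_request_uri({"REQUEST_URI": "/fdsafsadfdsa%2f?x=1"}))
```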

HTTP headers list

I am studying Django and have created a page that shows all HTTP headers in a request using the request.META dictionary. I'm running it locally, and the page shows me a weird number of headers, like 'TEMP' containing the path to my Windows temp folder, or 'PATH' with my full path parameters, and much more information that I don't really find necessary to share in my browser requests (like installed applications).
Is it normal? What do I do about it?
So, let's jump quickly into Django's source code:
django/core/handlers/wsgi.py
class WSGIRequest(http.HttpRequest):
    def __init__(self, environ):
        ...
        self.META = environ
        self.META['PATH_INFO'] = path_info
        self.META['SCRIPT_NAME'] = script_name
        ...
This handler is used by default by the runserver command and by every other WSGI server. The environ dictionary comes from the underlying web server and is filled with lots of data. You can read more about the environ dictionary in the official WSGI specification:
https://www.python.org/dev/peps/pep-0333/#environ-variables
Also note that any web server is free to add its own variables to environ. I assume that's why you see things like TEMP. They are probably used internally by the web server.
If you want only the headers: WSGI mandates that header keys start with the HTTP_ prefix, with the exception of the CONTENT_TYPE and CONTENT_LENGTH headers.
So Django's docs are misleading. The META field contains more than just headers. It is neither correct nor incorrect; it's just how it is. Special care has to be taken when dealing with META, because leaking some of this data might be a serious security issue.
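As a small sketch of that rule, the following helper keeps only the entries that follow the header naming convention and drops everything else (TEMP, PATH and so on). The name headers_from_meta is hypothetical; a plain dict stands in for request.META.

```python
def headers_from_meta(meta):
    """Return only the HTTP headers found in a WSGI-style META mapping.

    Keys starting with HTTP_ are headers; CONTENT_TYPE and CONTENT_LENGTH
    are the two headers that carry no prefix. Everything else is ignored.
    """
    headers = {}
    for key, value in meta.items():
        if key.startswith("HTTP_"):
            name = key[len("HTTP_"):]
        elif key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            name = key
        else:
            continue  # server or OS variable, not a request header
        # HTTP_USER_AGENT -> User-Agent
        headers[name.replace("_", "-").title()] = value
    return headers


meta = {
    "HTTP_HOST": "localhost:8000",
    "HTTP_USER_AGENT": "curl/8.0",
    "CONTENT_TYPE": "text/plain",
    "TEMP": "C:\\Temp",
}
print(headers_from_meta(meta))
```

Django 2.2 and later also expose request.headers, which provides a similar filtered view without a custom helper.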

How to implement HTTP methods using fastCGI along with nginx?

I am trying to work with basic HTTP using FastCGI and Nginx in C++. I have found the link for FastCGI here: http://chriswu.me/blog/getting-request-uri-and-content-in-c-plus-plus-fcgi/
But there is no clear distinction for HTTP methods like GET and POST. Also, I am unable to figure out how to perform URL redirection using FastCGI.
I do not have experience with FastCGI and nginx, but since I have used CGI with Apache and have looked at the FastCGI samples, I can suggest the following (at the risk of getting the answer wrong):
GET parameters are part of the URL, so I would parse const char* uri = FCGX_GetParam("REQUEST_URI", request.envp); to check whether parameters are given (i.e. whether there are key/value pairs after the question mark).
If there are none, check whether the media type in the header is application/x-www-form-urlencoded (meaning it's a POST) and parse the HTTP request body to obtain the key/value pairs. More info on that can be found at Wikipedia.
The request method itself is normally also available as the REQUEST_METHOD parameter, provided nginx passes it on (the stock fastcgi_params file does).
To perform a redirection, use the example but modify the response to return an HTTP redirection response, as described at Wikipedia.
Perhaps FastCGI offers more advanced functions, so that all of this can be achieved in a more elegant way.

Do all mainstream browsers use host headers when sending HTTP requests?

My server is mapped to 2 domain names, and I want to return different web pages when a user is visiting the home page, based on which domain name is used.
Django's request object has a get_host() method. From the Django docs:
get_host() Returns the originating host of the request using information from the HTTP_X_FORWARDED_HOST (if USE_X_FORWARDED_HOST is enabled) and HTTP_HOST headers, in that order. If they don’t provide a value, the method uses a combination of SERVER_NAME and SERVER_PORT as detailed in PEP 3333.
I am not sure whether every mainstream browser respects these headers.
Can I rely on this function to tell me which domain name is the user visiting?
Yes, all mainstream browsers send the Host header as it is mandatory for all requests sent via HTTP/1.1. Many HTTP/1.0 clients also support this header.
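A rough sketch of how the value returned by get_host() could be mapped to a page. The domain names, template names and the helper page_for_host are all hypothetical. The function takes the host as a plain string so that it can be tried without a Django project.

```python
# Hypothetical mapping from domain name to home page template.
HOME_PAGES = {
    "example.com": "home_com.html",
    "example.org": "home_org.html",
}


def page_for_host(host, pages=HOME_PAGES, default="home_default.html"):
    """Pick a template name based on the host the client asked for.

    `host` is the value returned by request.get_host(), which may include
    a port (for example "example.org:8000"). The simple split below assumes
    regular domain names, not IPv6 literals.
    """
    hostname = host.lower().split(":", 1)[0]
    return pages.get(hostname, default)


print(page_for_host("Example.org:8000"))  # prints home_org.html
print(page_for_host("unknown.test"))      # prints home_default.html
```

In a view this would be called as page_for_host(request.get_host()). Note that get_host() validates the host against ALLOWED_HOSTS, so both domain names need to be listed there.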

Differentiate nginx behaviour depending on URL

I have a Django application and I use nginx to serve static content. Unfortunately, all registered MIME types get displayed in the client browser, while I would like to offer the ability to download the same content alongside the usual behaviour. Say I have a JPEG file under /media/images/image01.jpg. I want nginx to serve this file in the usual way, with the standard image/jpeg header, but I also want nginx to serve the same image with content-disposition: attachment (effectively forcing a download) when it is accessed as, say, /downloads/images/image01.jpg. Can anybody suggest a solution?
Make sure you have the http_headers_module compiled in (it should be by default, if it isn't in the core).
Use "add_header Content-Disposition attachment;"
I recommend using a URL like "/download?file=/downloads/images/image01.jpg" combined with a rewrite rule to avoid annoying bugs later.
HTTP Headers Module documentation