Persian text in url Django - django

I have some links that include Persian texts, such as:
http://sample.com/fields/طب%20نظامی
And in the view function I want to access to Persian part, so:
url = request.path_info
key = re.findall('/fields/(.+)', url)[0]
But I get the following error:
IndexError at /fields/
list index out of range
Actually, the problem is with the index zero because it can not see anything there! It should be noted that it is a Django project on IIS Server and I have successfully tested it with other servers and the local server. I think it has some thing related to IIS. Moreover I have tried to slugify the url without success. I can encode urls successfully, but I think it is not the actual answer to this question.
Based on the comments:
I checked the request.path too and the same problem. It contains:
/fields/
I implemented a sample django project in local server and here is my views:
def test(request):
t = request.path
return HttpResponse(t)
The results:
http://127.0.0.1:8000/تست/
/تست/
Without any problem.
Based on the #sytech comment, I have created a middlware.py in my app directory:
from django.core.handlers.wsgi import WSGIHandler
class SimpleMiddleware(WSGIHandler):
def __call__(self, environ, start_response):
print(environ['UNENCODED_URL'])
return super().__call__(environ, start_response)
and in settings.py:
MIDDLEWARE = [
...
'apps.middleware.SimpleMiddleware',
]
But I am getting the following error:
__call__() missing 1 required positional argument: 'start_response'

Assuming you don't have another problem in your rewrite configuration, on IIS, depending on your rewrite configuration, you may need to access this through the UNENCODED_URL variable which will contain the unencoded value.
This can be demonstrated in a simple WSGI middleware:
from django.core.handlers.wsgi import WSGIHandler
class MyHandler(WSGIHandler):
def __call__(self, environ, start_response):
print(environ['UNENCODED_URL'])
return super().__call__(environ, start_response)
You would see the unencoded URL and the path part that's in Persian would be passed %D8%B7%D8%A8%2520%D9%86%D8%B8%D8%A7%D9%85%DB%8C. Which you can then decode with urllib.parse.unquote
urllib.parse.unquote('%D8%B7%D8%A8%2520%D9%86%D8%B8%D8%A7%D9%85%DB%8C')
# طب%20نظامی
If you wanted, you could use a middleware to set this as an attribute on the request object or even override the request.path_info.
You must be using URL rewrite v7.1.1980 or higher for this to work.
You could also use the UNENCODED_URL directly in the rewrite rule, but that may result in headaches with routing.
I can encode urls successfully, but I think it is not the actual answer to this question.
Yeah, that is another option, but may result in other issues like this: IIS10 URL Rewrite 2.1 double encoding issue

You can do this by using python split() method
url = "http://sample.com/fields/طب%20نظامی"
url_key = url.split(sep="/", maxsplit=4)
url_key[-1]
output : 'طب%20نظامی'
in this url is splited by / which occurs 4 time in string so it will return a list like this
['http:', '', 'sample.com', 'fields', 'طب%20نظامی']
then extract result like this url_key[-1] from url_key

you can Split the URL by :
string = http://sample.com/fields/طب%20نظامی
last_part = string. Split("/")[-1]
print(last_part)
output :< طب%20نظامی >
slugify(last_part)
or
slugify(last_part, allow_unicode=True)
I guess This Will Help You :)

Related

In Django, How do you write the url pattern for a json file?

My website generates json files in saves them with the user specified name: test.json
I couldn't immediately access it, so I assumed that I would have to write a URL pattern and view to see this file.
I want something akin to this :
url(r'^(?P<Controller>).json$', views.loadjson, name='loadjson')
If you need to serve static JSON files then doing so through Django is not the best way - if that's the case use something like Nginx.
If you wan't Django to generate those files and/or for e.g. an authentication mechanism in front of it then ok for Django.
To your question:
# urls.py
url(r'^(?P<json_file>[\w]+).json$', views.loadjson, name='loadjson')
It seems like you forgotten about [\w]+ which I guess is the right pattern for your needs.
the view function will be called with a WSGIRequest instance and the json_file argument (from your url pattern), from there you should be able to do whatever you like with the file. It's a good idea to return in as an application/json content type as those are supposed to be JSON.
# views.py
from django.http import HttpResponse
def loadjson(request, json_file):
# open, generate, fetch the json file
# for e.g.:
json_content = read_file(json_file)
return HttpResponse(
json_content,
content_type='application/json',
status=200
)

fix_location_header in Django causes wrong redirection to LOGIN_URL

I am trying to send an oauth request from another application to Django. Django is hosted on 8000 port and the application on 8080. Here is the URL called from the application:
http://localhost:8000/o/authorize/?client_id=MM8i27ROetg3OQv60tDcJwp7JKXi28FqkuQgetyG&response_type=token&redirect_url=http://localhost:8080/auth/add_token
This will get redirected to a wrong location:
http://192.168.56.101:8000/o/authorize/accounts/login/?next=/o/authorize/%3Fclient_id%3DMM8i27ROetg3OQv60tDcJwp7JKXi28FqkuQgetyG%26response_type%3Dtoken%26redirect_url%3Dhttp%3A//192.168.56.101%3A8080/auth/add_token
when I expect it to be without the prefix /o/authorize which is is generated by WSGI handler using HTTP header called PATH_INFO:
http://192.168.56.101:8000/accounts/login/?next=/o/authorize/%3Fclient_id%3DMM8i27ROetg3OQv60tDcJwp7JKXi28FqkuQgetyG%26response_type%3Dtoken%26redirect_url%3Dhttp%3A//192.168.56.101%3A8080/auth/add_token
I have following routes in urls.py and they work properly if it is not for the weird redirection I am facing.
url(r'^o/', include('oauth2_provider.urls', namespace='oauth2_provider')),
# OAuth User Info
url(r'^o/api/get_userinfo/$', oauth2.GetuserInfoView.as_view(), name="get-userinfo"),
In settings.py I have:
LOGIN_URL = "/accounts/login"
Now here are the lines of Django code that I find problematic.
django/core/handlers/base.py:220
# Apply response middleware, regardless of the response
for middleware_method in self._response_middleware:
response = middleware_method(request, response)
# Complain if the response middleware returned None (a common error).
if response is None:
raise ValueError(
"%s.process_response didn't return an "
"HttpResponse object. It returned None instead."
% (middleware_method.__self__.__class__.__name__))
response = self.apply_response_fixes(request, response)
# ^^^------------this will lead to fix_location_header being called.
django/http/utils.py
def fix_location_header(request, response):
"""
Ensures that we always use an absolute URI in any location header in the
response. This is required by RFC 2616, section 14.30.
Code constructing response objects is free to insert relative paths, as
this function converts them to absolute paths.
"""
if 'Location' in response:
response['Location'] = request.build_absolute_uri(response['Location'])
return response
django/request.py
def build_absolute_uri(self, location=None):
"""
Builds an absolute URI from the location and the variables available in
this request. If no ``location`` is specified, the absolute URI is
built on ``request.get_full_path()``. Anyway, if the location is
absolute, it is simply converted to an RFC 3987 compliant URI and
returned and if location is relative or is scheme-relative (i.e.,
``//example.com/``), it is urljoined to a base URL constructed from the
request variables.
"""
if location is None:
# Make it an absolute url (but schemeless and domainless) for the
# edge case that the path starts with '//'.
location = '//%s' % self.get_full_path()
bits = urlsplit(location)
if not (bits.scheme and bits.netloc):
current_uri = '{scheme}://{host}{path}'.format(scheme=self.scheme,
host=self.get_host(),
path=self.path)
# ^^^---The location URL forming logic that I find problematic.
# Join the constructed URL with the provided location, which will
# allow the provided ``location`` to apply query strings to the
# base path as well as override the host, if it begins with //
location = urljoin(current_uri, location)
return iri_to_uri(location)
I included the comments from the source file so the reader could know there are use cases for them.
So what is the proper way of getting around apply_reponse_fix? 1) inherit and rewrite the function 2) modify HTTP header PATH_INFO... somehow 3) change the logic for Location header in response so that it is a full URI with http:// prefix so that bits.scheme is not None. None of them seem like a proper fix for me, so I am wondering if I overlooked a convention. Any insight will be very helpful.
This has been already addressed and fixed later version of Django 1.8
fix_location_header is removed: https://code.djangoproject.com/ticket/23960
Just upgrading to the latest version made it go away:
pip install "django>=1.8,<1.9"

Django URLConf Redirect with odd characters

I'm getting ready to move an old Classic ASP site to a new Django system. As part of the move we have to setup some of the old URLs to point to the new ones.
For example,
http://www.domainname.com/category.asp?categoryid=105 should 301 to http://www.domainname.com/some-category/
Perhaps I'm regex stupid or something, but for this example, I've included in my URLconf this:
(r'^asp/category\.asp\?categoryid=105$', redirect_to, {'url': '/some-category/'}),
My thinking is that I have to escape the . and the ? but for some reason when I go to test this, it does not redirect to /some-category/, it just 404s the URL as entered.
Am I doing it wrong? Is there a better way?
To elaborate on Daniel Roseman's answer, the query string is not part of the URL, so you'll probably want to write a view function that will grab the category from the query string and redirect appropriately. You can have a URL like:
(r'^category\.asp', category_redirect),
And a view function like:
def category_redirect(request):
if 'categoryid' not in request.GET:
raise Http404
cat_id = request.GET['category']
try:
cat = Category.objects.get(old_id=cat_id)
except Category.DoesNotExist:
raise Http404
else:
return HttpResponsePermanentRedirect('/%s/' % cat.slug)
(Altered to your own tastes and needs, of course.)
Everything after the ? is not part of the URL. It's part of the GET parameters.

Django: creating/modifying the request object

I'm trying to build an URL-alias app which allows the user create aliases for existing url in his website.
I'm trying to do this via middleware, where the request.META['PATH_INFO'] is checked against database records of aliases:
try:
src: request.META['PATH_INFO']
alias = Alias.objects.get(src=src)
view = get_view_for_this_path(request)
return view(request)
except Alias.DoesNotExist:
pass
return None
However, for this to work correctly it is of life-importance that (at least) the PATH_INFO is changed to the destination path.
Now there are some snippets which allow the developer to create testing request objects (http://djangosnippets.org/snippets/963/, http://djangosnippets.org/snippets/2231/), but these state that they are intended for testing purposes.
Of course, it could be that these snippets are fit for usage in a live enviroment, but my knowledge about Django request processing is too undeveloped to assess this.
Instead of the approach you're taking, have you considered the Redirects app?
It won't invisibly alias the path /foo/ to return the view bar(), but it will redirect /foo/ to /bar/
(posted as answer because comments do not seem to support linebreaks or other markup)
Thank for the advice, I have the same feeling regarding modifying request attributes. There must be a reason that the Django manual states that they should be considered read only.
I came up with this middleware:
def process_request(self, request):
try:
obj = A.objects.get(src=request.path_info.rstrip('/')) #The alias record.
view, args, kwargs = resolve_to_func(obj.dst + '/') #Modified http://djangosnippets.org/snippets/2262/
request.path = request.path.replace(request.path_info, obj.dst)
request.path_info = obj.dst
request.META['PATH_INFO'] = obj.dst
request.META['ROUTED_FROM'] = obj.src
request.is_routed = True
return view(request, *args, **kwargs)
except A.DoesNotExist: #No alias for this path
request.is_routed = False
except TypeError: #View does not exist.
pass
return None
But, considering the objections against modifying the requests' attributes, wouldn't it be a better solution to just skip that part, and only add the is_routed and ROUTED_TO (instead of routed from) parts?
Code that relies on the original path could then use that key from META.
Doing this using URLConfs is not possible, because this aliasing is aimed at enabling the end-user to configure his own URLs, with the assumption that the end-user has no access to the codebase or does not know how to write his own URLConf.
Though it would be possible to write a function that converts a user-readable-editable file (XML for example) to valid Django urls, it feels that using database records allows a more dynamic generation of aliases (other objects defining their own aliases).
Sorry to necro-post, but I just found this thread while searching for answers. My solution seems simpler. Maybe a) I'm depending on newer django features or b) I'm missing a pitfall.
I encountered this because there is a bot named "Mediapartners-Google" which is asking for pages with url parameters still encoded as from a naive scrape (or double-encoded depending on how you look at it.) i.e. I have 404s in my log from it that look like:
1.2.3.4 - - [12/Nov/2012:21:23:11 -0800] "GET /article/my-slug-name%3Fpage%3D2 HTTP/1.1" 1209 404 "-" "Mediapartners-Google
Normally I'd just ignore a broken bot, but this one I want to appease because it ought to better target our ads (It's google adsense's bot) resulting in better revenue - if it can see our content. Rumor is it doesn't follow redirects so I wanted to find a solution similar to the original Q. I do not want regular clients accessing pages by these broken urls, so I detect the user-agent. Other applications probably won't do that.
I agree a redirect would normally be the right answer.
My (complete?) solution:
from django.http import QueryDict
from django.core.urlresolvers import NoReverseMatch, resolve
class MediapartnersPatch(object):
def process_request(self, request):
# short-circuit asap
if request.META['HTTP_USER_AGENT'] != 'Mediapartners-Google':
return None
idx = request.path.find('?')
if idx == -1:
return None
oldpath = request.path
newpath = oldpath[0:idx]
try:
url = resolve(newpath)
except NoReverseMatch:
return None
request.path = newpath
request.GET = QueryDict(oldpath[idx+1:])
response = url.func(request, *url.args, **url.kwargs)
response['Link'] = '<%s>; rel="canonical"' % (oldpath,)
return response

How to generate temporary URLs in Django

Wondering if there is a good way to generate temporary URLs that expire in X days. Would like to email out a URL that the recipient can click to access a part of the site that then is inaccessible via that URL after some time period. No idea how to do this, with Django, or Python, or otherwise.
If you don't expect to get a large response rate, then you should try to store all of the data in the URL itself. This way, you don't need to store anything in the database, and will have data storage proportional to the responses rather than the emails sent.
Updated: Let's say you had two strings that were unique for each user. You can pack them and unpack them with a protecting hash like this:
import hashlib, zlib
import cPickle as pickle
import urllib
my_secret = "michnorts"
def encode_data(data):
"""Turn `data` into a hash and an encoded string, suitable for use with `decode_data`."""
text = zlib.compress(pickle.dumps(data, 0)).encode('base64').replace('\n', '')
m = hashlib.md5(my_secret + text).hexdigest()[:12]
return m, text
def decode_data(hash, enc):
"""The inverse of `encode_data`."""
text = urllib.unquote(enc)
m = hashlib.md5(my_secret + text).hexdigest()[:12]
if m != hash:
raise Exception("Bad hash!")
data = pickle.loads(zlib.decompress(text.decode('base64')))
return data
hash, enc = encode_data(['Hello', 'Goodbye'])
print hash, enc
print decode_data(hash, enc)
This produces:
849e77ae1b3c eJzTyCkw5ApW90jNyclX5yow4koMVnfPz09JqkwFco25EvUAqXwJnA==
['Hello', 'Goodbye']
In your email, include a URL that has both the hash and enc values (properly url-quoted). In your view function, use those two values with decode_data to retrieve the original data.
The zlib.compress may not be that helpful, depending on your data, you can experiment to see what works best for you.
You could set this up with URLs like:
http://yoursite.com/temp/1a5h21j32
Your URLconf would look something like this:
from django.conf.urls.defaults import *
urlpatterns = patterns('',
(r'^temp/(?P<hash>\w+)/$', 'yoursite.views.tempurl'),
)
...where tempurl is a view handler that fetches the appropriate page based on the hash. Or, sends a 404 if the page is expired.
models
class TempUrl(models.Model):
url_hash = models.CharField("Url", blank=False, max_length=32, unique=True)
expires = models.DateTimeField("Expires")
views
def generate_url(request):
# do actions that result creating the object and mailing it
def load_url(request, hash):
url = get_object_or_404(TempUrl, url_hash=hash, expires__gte=datetime.now())
data = get_some_data_or_whatever()
return render_to_response('some_template.html', {'data':data},
context_instance=RequestContext(request))
urls
urlpatterns = patterns('', url(r'^temp/(?P<hash>\w+)/$', 'your.views.load_url', name="url"),)
//of course you need some imports and templates
It depends on what you want to do - one-shot things like account activation or allowing a file to be downloaded could be done with a view which looks up a hash, checks a timestamp and performs an action or provides a file.
More complex stuff such as providing arbitrary data would also require the model containing some reference to that data so that you can decide what to send back. Finally, allowing access to multiple pages would probably involve setting something in the user's session and then using that to determine what they can see, followed by a redirect.
If you could provide more detail about what you're trying to do and how well you know Django, I can make a more specific reply.
I think the solution lies within a combination of all the suggested solutions. I'd suggest using an expiring session so the link will expire within the time period you specify in the model. Combined with a redirect and middleware to check if a session attribute exists and the requested url requires it you can create somewhat secure parts of your site that can have nicer URLs that reference permanent parts of the site. I use this for demonstrating design/features for a limited time. This works to prevent forwarding... I don't do it but you could remove the temp url after first click so only the session attribute will provide access thus more effectively limiting to one user. I personally don't mind if the temp url gets forwarded knowing it will only last for a certain amount of time. Works well in a modified form for tracking invited visits as well.
It might be overkill, but you could use a uuidfield on your model and set up a Celerybeat task to change the uuid at any time interval you choose.
If celery is too much and it might be, you could just store the time the URL is first sent, use the timedelta function whenever it is sent thereafter, and if the elapsed time is greater than what you want just use a redirect. I think the second solution is very straightforward and it would extend easily. It would be a matter of having a model with the URL, time first sent, time most recently sent, a disabled flag, and a Delta that you find acceptable for the URL to live.
A temporary url can also be created by combining the ideas from #ned-batchelder's answer and #matt-howell's answer with Django's signing module.
The signing module provides a convenient way to encode data in the url, if necessary, and to check for link expiration. This way we don't need to touch the database or session/cache.
Here's a minimal example with an index page and a temp page:
The index page has a link to a temporary url, with the specified expiration. If you try to follow the link after expiration, you'll get a status 400 "Bad Request" (or you'll see the SuspiciousOperation error, if DEBUG is True).
urls.py
...
urlpatterns = [
path('', views.index, name='index'),
path('<str:signed_data>/', views.temp, name='temp'),
]
views.py
from django.core import signing
from django.core.exceptions import SuspiciousOperation
from django.http import HttpResponse
from django.urls import reverse
MAX_AGE_SECONDS = 20 # short expiration, for illustrative purposes
def generate_temp_url(data=None):
# signing.dumps() returns a "URL-safe, signed base64 compressed JSON string"
# with a timestamp
return reverse('temp', args=[signing.dumps(data)])
def index(request):
# just a convenient usage example
return HttpResponse(f'temporary link')
def temp(request, signed_data):
try:
# load data and check expiration
data = signing.loads(signed_data, max_age=MAX_AGE_SECONDS)
except signing.BadSignature:
# triggers an HttpResponseBadRequest (status 400) when DEBUG is False
raise SuspiciousOperation('invalid signature')
# success
return HttpResponse(f'Here\'s your data: {data}')
Some notes:
The responses in the example are very rudimentary, and only for illustrative purposes.
Raising a SuspiciousOperation is convenient, but you could e.g. return an HttpResponseNotFound (status 404) instead.
The generate_temp_url() returns a relative path. If you need an absolute url, you can do something like:
temp_url = request.build_absolute_uri(generate_temp_url())
If you're worried about leaking the signed data, have a look at alternatives such as Django's password reset implementation.