Best way to handle legacy urls in django - django

I am working on a big news publishing platform. Basically rebuilding everything from ground zero with django. Now as we are almost ready for the launch I need to handle legacy url redirects. What is the best way to do it having in mind that I have to deal with tenths of thousands of legacy urls?
Logic should work like this: If none of existing urls/views where matched run that url thorough legacy Redirect urls patterns/views to see if it can provide some redirect to the new url before returning 404 error.
How do I do that?

You may want to create a fallback view that will try to handle any url not handled by your patterns. I see two options.
Just create a "default" pattern. It's important to this pattern to
be the last within your urlpatterns!
in your urls.py:
urlpatterns = patterns(
'',
# all your patterns
url(r'^.+/', 'app.views.fallback')
)
in your views.py:
from django.http.response import HttpResponseRedirect, Http404
def fallback(request):
if is_legacy(request.path):
return HttpResponseRedirect(convert(request.path))
raise Http404
Create a custom http 404 handler.
in your urls.py:
handler404 = 'app.views.fallback'
in your views.py
from django.http.response import HttpResponseRedirect
from django.views.defaults import page_not_found
def fallback(request):
if is_legacy(request.path):
return HttpResponseRedirect(convert(request.path))
return page_not_found(request)
it may seem to be a nicer solution but it will only work if you set DEBUG setting to False and provide custom 404 template.

Awesome, achieved that by using custom middleware:
from django.http import Http404
from legacy.urls import urlpatterns
class LegacyURLsMiddleware(object):
def process_response(self, request, response):
if response.status_code != 404:
return response
for resolver in urlpatterns:
try:
match = resolver.resolve(request.path[1:])
if match:
return match.func(request, *match.args, **match.kwargs)
except Http404:
pass
return response
Simply add this middleware as a last middleware in MIDDLEWARE_CLASSES list. Then use urls.py file in your legacy app to declare legacy urls and views which will handle permanent redirects. DO NOT include your legacy urls in to main urls structure. This middleware does it for you, but in a bit different way.

Use the Jacobian's django-multiurl. There is a django ticket to address the issue someday, but for now django-multiurl works very good.
Before:
# urls.py
urlpatterns = patterns('',
url('/app/(\w+)/$', app.views.people),
url('/app/(\w+)/$', app.views.place), # <-- Never matches
)
After:
# urls.py
from multiurl import multiurl, ContinueResolving
from django.http import Http404
urlpatterns = patterns('', multiurl(
url('/app/(\w+)/$', app.views.people), # <-- Tried 1st
url('/app/(\w+)/$', app.views.place), # <-- Tried 2nd (only if 1st raised Http404)
catch=(Http404, ContinueResolving)
))

Related

Django 1.10: login required middleware redirect loop

I'm writing a bit of middleware to effectively make #login_required on all pages. Unfortunately what I've got results in a redirect loop.
The implementation is using "old" style middleware with 1.10 via MiddlewareMixin and process_request() hook in attempt to redirect to login page whenever the user is not authenticated.
First, I'm using the default auth urls django.contrib.auth.urls. The docs say:
This will include the following URL patterns: ^login/$ [name='login']...
# main URLConf urls.py
urlpatterns = [
url(r'^admin/', admin.site.urls),
url(r'^', include('django.contrib.auth.urls')), # https://docs.djangoproject.com/en/1.10/topics/auth/default/#module-django.contrib.auth.views
]
Then here's the middleware (yes it's added to MIDDLEWARE in settings.py):
from django.http import HttpResponseRedirect
from django.utils.deprecation import MiddlewareMixin
class LoginRequiredMiddleware(MiddlewareMixin):
def process_request(self, request):
if not request.user.is_authenticated():
return HttpResponseRedirect('/login/')
The login page/functionality works fine when the my middleware isn't included, while including it results in every url causing ERR_TOO_MANY_REDIRECTS.
What am I missing? Thanks.
Doh! I needed to check for /login/ in process_request and ignore it.
Here's a simplified version of what is implemented. The real version uses settings.py and regexes to define the login exempt urls. Much credit to Ryan Witt's post on this approach.
class LoginRequiredMiddleware(MiddlewareMixin):
def process_request(self, request):
if not request.user.is_authenticated():
path = request.path_info.lstrip('/')
# If path is not root url ('') and path is not exempt from authentication
if not path or not any(path != eu for eu in ["/login", "admin"]):
return HttpResponseRedirect("/login/")

Recursive URL routing in Django

I want to have a (fairly simple) emulation of SELECT queries through URLs.
For example, in a blogging engine, You'd like /tag/sometag/ to refer to the posts having the sometag tag. Also /tag/sometag/or/tag/other/and/year/2013 should be a valid URL, beside other more complex urls. So, having (theoretically) no limit on the size of the url, I would suggest this should be done recursively, but how could it be handled in Django URL Routing model?
I would use a common URL pattern for all those URLs.
url(r'^query/([\w/]*)/$', 'app.views.view_with_query'),
You would receive all the "tag/sometag/or/tag/other/and/year/2013" as a param for the view.
Then, you can parse the param and extract the info (tag, value, tag, value, year, value) to make the query.
update 2021
django.conf.urls.url is deprecated in version 3.1 and above. Here is the solution for recursive url routing in django:
urls.py
from django.urls import path, re_path
from .views import index
urlpatterns = [
re_path(r'^query/([\w/]*)/$', index, name='index'),
]
views.py
from django.shortcuts import render, HttpResponse
# Create your views here.
def index(request, *args, **kwargs):
print('-----')
print(args)
print(kwargs)
return HttpResponse('<h1>hello world</h1>')
If I call python manage.py run server, and go to 'http://127.0.0.1:8000/query/nice/', I can see these in terminal:
-----
('nice',)
{}
I know this is an old question but for those who reach here in future, starting from django 2.0+, you can use path converter with path to match the path in the url:
# urls.py
urlpatterns = [
path('query/<path:p>/', views.query, name='query'),
]
# views.py
def query(request, p):
# here, p = "tag/sometag/or/tag/other/and/year/2013"
...

Django 404 pages return 200 status code

I'm going through the Django tutorial and am on part 5: Testing. I run into the problem where I'm using the DetailView and ListView "shortcut" views to factor out code (as suggested by the tutorial), but when a 404 page is displayed, a 200 status code is returned instead. Am I doing something wrong? The tutorial says the status code should be 404.
Thanks!
You need to define the Http header to have a 404 status.
return HttpResponse(content=template.render(context), content_type='text/html; charset=utf-8', status=404)
It is important to inform the search engines that the current page is a 404. Spammers sometimes creates lots of urls that could seem that would lead you to some place, but then serves you another content. They frequently make lots of different addresses serve you almost the exact same content. And because it is not user friendly, most SEO guide lines penalize that. So if you have lots of addresses showing the same pseudo-404 content, it could not look good to the crawling systems from the search websites. Because of that you want to make sure that the page you are serving as a custom 404 has a 404 status.
If you are trying to make a custom 404 page, here it is a good way to go:
Into your application's urls.py add:
# Imports
from django.conf.urls.static import static
from django.conf.urls import handler404
from django.conf.urls import patterns, include, url
from yourapplication import views
##
# Handles the URLS calls
urlpatterns = patterns('',
# url(r'^$', include('app.homepage.urls')),
)
handler404 = views.error404
Into your application's views.py add:
# Imports
from django.shortcuts import render
from django.http import HttpResponse
from django.template import Context, loader
##
# Handle 404 Errors
# #param request WSGIRequest list with all HTTP Request
def error404(request):
# 1. Load models for this view
#from idgsupply.models import My404Method
# 2. Generate Content for this view
template = loader.get_template('404.htm')
context = Context({
'message': 'All: %s' % request,
})
# 3. Return Template for this view + Data
return HttpResponse(content=template.render(context), content_type='text/html; charset=utf-8', status=404)
The secret is in the last line: status=404
Hope it helped!
I look forward to see the community inputs to this approach. =)
You can
return HttpResponseNotFound(render_to_string('404.html'))
instead.

Custom Django 404 error

I have a 404.html page, but in some cases I want to be able to send a json error message (for 404 and 500, etc.). I read the following page:
https://docs.djangoproject.com/en/dev/topics/http/views/#the-404-page-not-found-view
Is there any sort of example that shows the implementation? I have it in my urls.py but it's not being picked up in the event of an error.
This worked for me:
from django.conf.urls import patterns, include, url
from django.views.static import *
from django.conf import settings
from django.conf.urls.defaults import handler404, handler500
from app.views import error
urlpatterns = patterns('',
# Examples:
# url(r'^$', 'app.views.home', name='home'),
)
handler404 = error.error_handler
handler500 = error.error_handler
You can make it do anything as you wish when going to that controller.
In addition to the previous answer, it is important to say that the views.py should return a HttpResponse with a 404 status in the http header. It is important to inform the search engines that the current page is a 404. Spammers sometimes creates lots of urls that could seem that would lead you to some place, but then serves you another content. They frequently make lots of different addresses serve you almost the exact same content. And because it is not user friendly, most SEO guide lines penalize that. So if you have lots of addresses showing the same pseudo-404 content, it could not look good to the crawling systems from the search websites. Because of that you want to make sure that the page you are serving as a custom 404 has a 404 status. So here it is a good way to go:
Into your application's urls.py add:
# Imports
from django.conf.urls.static import static
from django.conf.urls import handler404
from django.conf.urls import patterns, include, url
from yourapplication import views
##
# Handles the URLS calls
urlpatterns = patterns('',
# url(r'^$', include('app.homepage.urls')),
)
handler404 = views.error404
Into your application's views.py add:
# Imports
from django.shortcuts import render
from django.http import HttpResponse
from django.template import Context, loader
##
# Handle 404 Errors
# #param request WSGIRequest list with all HTTP Request
def error404(request):
# 1. Load models for this view
#from idgsupply.models import My404Method
# 2. Generate Content for this view
template = loader.get_template('404.htm')
context = Context({
'message': 'All: %s' % request,
})
# 3. Return Template for this view + Data
return HttpResponse(content=template.render(context), content_type='text/html; charset=utf-8', status=404)
The secret is in the last line: status=404
Hope it helped!
I look forward to see the community inputs to this approach. =)
Basics:
To define custom view for handling 404 errors, define in the URL config, a view for handler404, like handler404 = 'views.error404'
Apart from the basics, some things to note about (custom 404 views):
It will be enabled only in Debug=False mode.
And more ignored one, across most answers (and this this stuck my brains out).
The 404 view defaults to
django.views.defaults.page_not_found(request, exception, template_name='404.html')
Notice the parameter exception
This was causing a 404 to 500 redirect from within def get_exception_response(self, request, resolver, status_code, exception) function defined in core.handlers.base since it could not find the parameter exception

How can I not use Django's admin login view?

I created my own view for login. However if a user goes directly to /admin it brings them to the admin login page and doesn't use my custom view. How can I make it redirect to the login view used for everything not /admin?
From http://djangosnippets.org/snippets/2127/—wrap the admin login page with login_required. For example, in urls.py:
from django.contrib.auth.decorators import login_required
from django.contrib import admin
admin.autodiscover()
admin.site.login = login_required(admin.site.login)
You probably already have the middle two lines and maybe even the first line; adding that fourth line will cause anything that would have hit the admin.site.login function to redirect to your LOGIN_URL with the appropriate next parameter.
While #Isaac's solution should reject majority of malicious bots, it doesn't provide protection for professional penetrating. As a logged in user gets the following message when trying to login to admin:
We should instead use the admin decorator to reject all non-privileged users:
from django.contrib.admin.views.decorators import staff_member_required
from django.contrib import admin
[ ... ]
admin.site.login = staff_member_required(admin.site.login, login_url=settings.LOGIN_URL)
To the best of my knowledge, the decorator was added in 1.9.
I found that the answer above does not respect the "next" query parameter correctly.
An easy way to solve this problem is to use a simple redirect. In your site's urls file, immediately before including the admin urls, put a line like this:
url(r'^admin/login$', RedirectView.as_view(pattern_name='my_login_page', permanent=True, query_string=True))
Holá
I found a very simple solution.
Just tell django that the url for admin login is handle by your own login view
You just need to modify the urls.py fle of the project (note, not the application one)
In your PROJECT folder locate the file urls.py.
Add this line to the imports section
from your_app_name import views
Locate this line
url(r'^admin/', include(admin.site.urls))
Add above that line the following
url(r'^admin/login/', views.your_login_view),
This is an example
from django.conf.urls import include, url
from django.contrib import admin
from your_app import views
urlpatterns = [
url(r'^your_app_start/', include('your_app.urls',namespace="your_app_name")),
url(r'^admin/login/', views.your_app_login),
url(r'^admin/', include(admin.site.urls)),
]
http://blog.montylounge.com/2009/07/5/customizing-django-admin-branding/
(web archive)
I'm trying to solve exactly this problem and I found the solution at this guys blog. Basically, override the admin template and use your own template. In short, just make a file called login.html in /path-to-project/templates/admin/ and it will replace the admin login page. You can copy the original (django/contrib/admin/templates/login.html) and modify a line or two. If you want to scrap the default login page entirely you can do something like this:
{% extends "my-login-page.html" %}
There it is. One line in one file. Django is amazing.
I had the same issue, tried to use the accepted answer, but has the same issue as pointed in the comment above.
Then I've did something bit different, pasting here if this would be helpful to someone.
def staff_or_404(u):
if u.is_active:
if u.is_staff:
return True
raise Http404()
return False
admin.site.login = user_passes_test(
staff_or_404,
)(admin.site.login)
The idea is that if the user is login, and tried to access the admin, then he gets 404. Otherwise, it will force you to the normal login page (unless you are already logged in)
In your ROOT_URLCONF file (by default, it's urls.py in the project's root folder), is there a line like this:
urlpatterns = patterns('',
...
(r'^admin/', include(admin.site.urls)),
...
)
If so, you'd want to replace include(admin.site.urls) with the custom view you created:
(r'^admin/', 'myapp.views.myloginview'),
or if your app has its own urls.py, you could include it like this:
(r'^admin/', include(myapp.urls)),
This is my solution with custom AdminSite class:
class AdminSite(admin.AdminSite):
def _is_login_redirect(self, response):
if isinstance(response, HttpResponseRedirect):
login_url = reverse('admin:login', current_app=self.name)
response_url = urllib.parse.urlparse(response.url).path
return login_url == response_url
else:
return False
def admin_view(self, view, cacheable=False):
inner = super().admin_view(view, cacheable)
def wrapper(request, *args, **kwargs):
response = inner(request, *args, **kwargs)
if self._is_login_redirect(response):
if request.user.is_authenticated():
return HttpResponseRedirect(settings.LOGIN_REDIRECT_URL)
else:
return redirect_to_login(request.get_full_path(), reverse('accounts_login'))
else:
return response
return wrapper
You can redirect admin login url to the auth login view :
from django.contrib import admin
from django.urls import path, include
urlpatterns = [
path('', include('your_app.urls')),
path('accounts/', include('django.contrib.auth.urls')),
path('admin/login/', RedirectView.as_view(url='/accounts/login/?next=/admin/', permanent=True)),
path('admin/', admin.site.urls),
]
As of August 2020, django.contrib.admin.sites.AdminSite has a login_template attribute. So you can just subclass AdminSite and specify a custom template i.e.,
class MyAdminSite(AdminSite):
login_template = 'my_login_template.html'
my_admin_site = MyAdminSite()
Then just use my_admin_site everywhere instead of admin.site.