Django urlpatterns frustrating problem with trailing slashes - regex

All of the examples I can find of urlpatterns for django sites have a separate entry for incoming urls that have no leading slash, or the root folder. Then they handle subfolders on each individual line. I don't understand why a simple
/?
regular expression doesn't permit these to be on one simple line.
Consider the following: let's call the Django project Baloney, and the app name is Cheese. So in the project urls.py we have something like this to allow the app's urls.py to handle its requests...
urlpatterns = patterns('',
(r'^cheese/', include('Baloney.Cheese.urls')),
)
Then inside the Cheese app's urls.py, I don't understand why this one simple line would not match all incoming URL subpaths, including a blank value...
urlpatterns = patterns('',
(r'^(?P<reqPath>.*)/?$', views.cheeseapp_views),
)
Instead, it matches the blank case, but not the case of a value present. So...
http://baloneysite.com/cheese/ --> MATCHES THE PATTERN
http://baloneysite.com/cheese/swiss --> DOES NOT MATCH
Basically I want to capture the reqPath variable to include whatever is there (even blank or '') but not including any trailing slash if there is one.
The urls are dynamic slugs pulled from the DB so I do all the matching up to content in my views and just need the url patterns to forward the values along. I know that the following works, but don't understand why this can't all be placed on one line with the /? regular expression before the ending $ sign.
(r'^$', views.cheeseapp_views, {'reqPath':''}),
(r'^(?P<reqPath>.*)/$', views.cheeseapp_views),
Appreciate any insights.

I just tried a similar sample and it worked as you wrote it. No need for /?, .* would match that anyway. What is the exact error you are getting? Maybe you have your view without the request parameter? I.e. views.cheeseapp_views should be something like:
def cheeseapp_views(request, reqPath):
...
Edit:
The pattern that you suggested captures the trailing slash into reqPath because the * operator is greedy (take a look at docs.python.org/library/re.html). Try this instead:
(r'^(?P<reqPath>.*?)/?$', views.cheeseapp_views)
Note that it's .*? instead of .* to make it non-greedy.
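The difference between the greedy and non-greedy versions can be checked outside Django with the standard re module. This is only a minimal sketch: the helper name captured and the sample subpaths are made up for illustration, and it assumes the app's urls.py sees the part of the path after cheese/.

```python
import re

# The two candidate patterns from the thread.
greedy = re.compile(r'^(?P<reqPath>.*)/?$')
lazy = re.compile(r'^(?P<reqPath>.*?)/?$')


def captured(pattern, subpath):
    """Return what reqPath captures, or None if the pattern does not match."""
    match = pattern.search(subpath)
    return match.group('reqPath') if match else None


# Both patterns match every sample; they differ in what reqPath contains.
for subpath in ('', 'swiss', 'swiss/'):
    print(repr(subpath), repr(captured(greedy, subpath)), repr(captured(lazy, subpath)))
```

With the greedy pattern, 'swiss/' is captured with its trailing slash, because .* consumes it and /? then matches nothing. The non-greedy pattern leaves the slash for /? and captures 'swiss'.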

Related

In Django, using urlpatterns, how do I match a url-encoded slash character as part of a parameter?

In my Django app I have these urlpatterns:
urlpatterns = [
url(r'^$', schema_view, name='swagger'),
url(r'^(?P<page_title>.+)/(?P<rev_id>[0-9]+)/$',
WhoColorApiView.as_view(), name='wc_page_title_rev_id'),
url(r'^(?P<page_title>.+)/$',
WhoColorApiView.as_view(), name='wc_page_title'),
]
The second entry will match a path like /en/whocolor/v1.0.0-beta/my_book_title/1108730659/?origin=* and the third will match a path like /en/whocolor/v1.0.0-beta/my_book_title/?origin=*.
The idea is that it's matching a page_title parameter (a Wikipedia article title), and an optional integer rev_id (revision id) parameter.
However, it does not work as intended for an article titled "Post-9/11", with path /en/whocolor/v1.0.0-beta/Post-9%2f11/?origin=*. I want this not to match the second pattern (which gets matched as page_title "Post-9" and rev_id 11), and instead match the third pattern.
I understand why the second pattern should match if the path is /en/whocolor/v1.0.0-beta/Post-9/11/?origin=*, but when the title is url-encoded as "Post-9%2f11" it still matches the second pattern instead of continuing on to the third pattern.
How can I make Django treat the url-encoded slash as part of the parameter instead of a path separator?
Django provides path converters.
One of the built-in converters, path, handles values that contain slashes:
path - Matches any non-empty string, including the path separator, '/'. This allows you to match against a complete URL path rather than a segment of a URL path as with str.
path('<path:page_title>/',
views.post_detail,
name='post_detail'),
When one uses path one doesn't need to escape anything, and it'll work. There are various cases similar to OP's where the path converter was used as a solution.
As Ivan Starostin mentions in a comment, OP should actually use the path converter slug.
According to the docs
A slug is a short label for something, containing only letters, numbers, underscores or hyphens.
For example, building-your-1st-django-site.
To do so, it's advisable that OP uses, in the model, a field of type SlugField.
slug = models.SlugField(max_length=250)
Then in the urls.py
path('<slug:post>/',
views.post_detail,
name='post_detail'),
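To see why the encoded title in the question still matched the two-part pattern, here is a minimal sketch using only the standard library. It assumes the URL resolver is handed the percent-decoded path (which is what WSGI servers typically pass along); the variable names are made up for illustration.

```python
import re
from urllib.parse import unquote

# Assumption: the resolver sees the decoded path, not the raw one.
raw = 'Post-9%2f11/'
decoded = unquote(raw)  # 'Post-9/11/'

# The two regexes from the question's urlpatterns.
with_rev = re.compile(r'^(?P<page_title>.+)/(?P<rev_id>[0-9]+)/$')
title_only = re.compile(r'^(?P<page_title>.+)/$')

# Against the raw string only the title pattern matches...
print(with_rev.search(raw))                           # None
print(title_only.search(raw).group('page_title'))     # Post-9%2f11

# ...but after decoding, the revision pattern matches first.
print(with_rev.search(decoded).groupdict())           # {'page_title': 'Post-9', 'rev_id': '11'}
```

Under that assumption, a decoded %2f is indistinguishable from a literal slash by the time the patterns are tried, so the regex alone cannot tell the two apart.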

Django regex pattern matching

I have the following urlpatterns:
url(r'^api/daily-means/$', views.daily_means.as_view(), name='daily_means'),
url(r'^api/daily-means/sites/(?P<url>\w+)/$', views.site_daily_means.as_view()),
url(r'^api/daily-means/pollutant/(?P<poll>\w+)$/', views.pollutant_daily_means.as_view()),
The first two work fine. The last one should work the same as the second one, but it does not. I'm not that great with regex and urlpatterns, but I assume there is something in the second URL pattern which is stopping the last one from running. Can anyone else see a reason for this?
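Django will append the end slash if it is not provided (the default APPEND_SLASH behaviour). In your regex, the $ anchor comes before the final slash, so the pattern asks for a / after the end of the string and can never match.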
url(r'^api/daily-means/pollutant/(?P<poll>\w+)$/', views.pollutant_daily_means.as_view()),
The following URL pattern should work (after including the end slash as part of the URL match).
url(r'^api/daily-means/pollutant/(?P<poll>\w+)/$', views.pollutant_daily_means.as_view()),
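The effect of the misplaced $ can be confirmed with plain re, independent of Django. A small sketch, where the request path and the pollutant name no2 are hypothetical:

```python
import re

# '$/' demands a slash after the end of the string, which is impossible.
broken = re.compile(r'^api/daily-means/pollutant/(?P<poll>\w+)$/')
# '/$' puts the slash inside the match, before the end anchor.
fixed = re.compile(r'^api/daily-means/pollutant/(?P<poll>\w+)/$')

path = 'api/daily-means/pollutant/no2/'  # hypothetical request path

print(broken.search(path))               # None
print(fixed.search(path).group('poll'))  # no2
```

The broken pattern fails with or without the trailing slash in the path, which is why the view is never reached.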

Regular expression in URL for Django slug

I have 2 URL's with a slug field in the URL.
url(r'^genres/(?P<slug>.+)/$', views.genre_view, name='genre_view'),
url(r'^genres/(?P<slug>.+)/monthly/$', views.genre_month, name='genre_month'),
The first one opens fine but the second one gives a DoesNotExist error saying Genres matching query does not exist.
Here is how I'm accessing the 2nd URL in my HTML
<li>Monthly Top Songs</li>
I tried to print the slug in the view. It is passed as genre_name/monthly instead of genre_name.
I think the problem is with the regex in the URLs. Any idea what's wrong here?
Django always uses the first pattern that matches. For urls similar to genres/genre_name/monthly your first pattern matches, so the second one is never used. The truth is the regex is not specific enough, allowing all characters - which doesn't seem to make sense.
You could reverse the order of those patterns, but what you should do is to make them more specific (compare: urls.py example in generic class-based views docs):
url(r'^genres/(?P<slug>[-\w]+)/$', views.genre_view, name='genre_view'),
url(r'^genres/(?P<slug>[-\w]+)/monthly/$', views.genre_month, name='genre_month'),
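A rough way to see the first-match behaviour is to imitate the resolver with a loop over the regexes. This is a simplified stand-in, not Django's actual resolver; the function first_match and the slug rock are invented for the example.

```python
import re


def first_match(patterns, path):
    """Return (name, slug) for the first regex that matches, or None."""
    for name, regex in patterns:
        match = re.search(regex, path)
        if match:
            return name, match.group('slug')
    return None


# Original patterns: '.+' also swallows slashes.
loose = [
    ('genre_view', r'^genres/(?P<slug>.+)/$'),
    ('genre_month', r'^genres/(?P<slug>.+)/monthly/$'),
]
# More specific patterns: the slug cannot contain a slash.
strict = [
    ('genre_view', r'^genres/(?P<slug>[-\w]+)/$'),
    ('genre_month', r'^genres/(?P<slug>[-\w]+)/monthly/$'),
]

path = 'genres/rock/monthly/'  # 'rock' is a made-up slug
print(first_match(loose, path))   # ('genre_view', 'rock/monthly')
print(first_match(strict, path))  # ('genre_month', 'rock')
```

With the loose patterns the first entry wins and the slug includes /monthly, which matches the DoesNotExist symptom described in the question.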
Edit 2020:
These days (since Django 2.0), you can (and should) use path instead of url. It provides built-in path converters, including slug:
path('genres/<slug:slug>/', views.genre_view, name='genre_view'),
path('genres/<slug:slug>/monthly/', views.genre_month, name='genre_month'),
I believe that you can also drop the _ from the pattern that @Ludwik has suggested and revise to this version (which is one character simpler :) ):
url(r'^genres/(?P<slug>[-\w]+)/$', views.genre_view, name='genre_view'),
url(r'^genres/(?P<slug>[-\w]+)/monthly/$', views.genre_month, name='genre_month'),
Note that \w stands for "word character". It always matches the ASCII characters [A-Za-z0-9_]. Notice the inclusion of the underscore and digits.
In Django >= 2.0, a slug can be captured in the URL like this:
from django.urls import path
urlpatterns = [
...
path('articles/<slug:some_title>/', myapp.views.blog_detail, name='blog_detail'),
...
]
Source: https://docs.djangoproject.com/en/2.0/ref/urls/#django.urls.path

How to pass a url as a parameter to a handler in Django?

A project I'm currently working on has a sort of proxy functionality where users can browse to another URL through the site. I was hoping the URL could be something like this:
www.mydomain.com/browser/[URL here]
However I'm having trouble capturing a URL as a parameter like this. I think my inexperience with regex is failing me here :( My URL conf looks like this:
urlpatterns += patterns('',
url(r'^browser/(?P<url>\w+)/$', browser_proxy, name='browser_proxy'),
)
I'm thinking the \w+ isn't sufficient to capture an arbitrary URL. Anyone know what I should be using here to capture a URL as a parameter like this?
Thanks for any help.
\w means a "word" character, i.e. alphanumeric and underscore. It is equivalent to the set [a-zA-Z0-9_]. You can match any character with a period. I.e.:
urlpatterns += patterns('',
url(r'^browser/(?P<url>.+)/$', browser_proxy, name='browser_proxy'),
)
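A quick check with re shows the difference between the two character classes. The request path below is hypothetical and only covers the path portion of a URL.

```python
import re

word_only = re.compile(r'^browser/(?P<url>\w+)/$')  # pattern from the question
any_chars = re.compile(r'^browser/(?P<url>.+)/$')   # pattern from the answer

path = 'browser/www.example.com/some/page/'  # hypothetical request path

print(word_only.search(path))                # None: '.' and '/' are not word characters
print(any_chars.search(path).group('url'))   # www.example.com/some/page
```

Note that the captured value loses the final slash, because the pattern consumes it outside the group.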

Regular expression that uses an "OR" conditional

I could use some help writing a regular expression. In my Django application, users can hit the following URL:
http://www.example.com/A1/B2/C3
I'd like to create a regular expression that accepts any of the following as a valid URL:
http://www.example.com/A1
http://www.example.com/A1/B2
http://www.example.com/A1/B2/C3
I'm guessing I need to use the "OR" conditional, but I'm having trouble getting my regex to validate. Any thoughts?
UPDATE: Here is the regex so far. Note that I have not included the "http://www.example.com" portion -- Django handles that for me. I'm just concerned with validating 1,2, or 3 subdirectories.
^(\w{1,20})|((\w{1,20})/(\w{1,20}))|((\w{1,20})/(\w{1,20})/(\w{1,20}))$
Skip the |, use the ? and ()
http://www\.example\.com/A1(/B2(/C3)?)?
And if you replace the A1-C3 with a pattern:
http://www\.example\.com/[^/]*(/[^/]*(/[^/]*)?)?
Explanation:
it matches every string that starts with http://www.example.com/A1
it can match an additional /B2 and even an additional /C3, but /C3 is only matched, when there is a /B2
[^/]* (as many non slashes as possible)
if you need the A1-C3 in special capture groups, you can use this:
http://www\.example\.com/([^/]*)(/([^/]*)(/([^/]*))?)?
Will give (groupnumber: content):
matches: 0: (http://www.example.com/dir1/dir2/dir3)
1: (dir1)
2: (/dir2/dir3)
3: (dir2)
4: (/dir3)
5: (dir3)
You can check it out online here or get this tool (yes it's free, and it's even written in Lisp...).
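The group numbering above can also be reproduced with Python's re module. A small sketch using the capture-group version of the pattern, with fullmatch so that a fourth level is rejected; the dir1 to dir4 names are placeholders.

```python
import re

pattern = re.compile(r'http://www\.example\.com/([^/]*)(/([^/]*)(/([^/]*))?)?')

one = pattern.fullmatch('http://www.example.com/dir1')
two = pattern.fullmatch('http://www.example.com/dir1/dir2')
three = pattern.fullmatch('http://www.example.com/dir1/dir2/dir3')
four = pattern.fullmatch('http://www.example.com/dir1/dir2/dir3/dir4')

print(one.groups())    # ('dir1', None, None, None, None)
print(two.groups())    # ('dir1', '/dir2', 'dir2', None, None)
print(three.groups())  # ('dir1', '/dir2/dir3', 'dir2', '/dir3', 'dir3')
print(four)            # None
```

Groups 1, 3 and 5 hold the individual segments; optional groups that did not take part in the match come back as None.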
There's a much more Django way to do this:
urlpatterns = patterns('',
url(r'^(?P<object_slug1>\w{2})/(?P<object_slug2>\w{2})/(?P<object_slug3>\w{2})$', direct_to_template, {"template": "two_levels_deep.html"}, name="two_deep"),
url(r'^(?P<object_slug1>\w{2})/(?P<object_slug2>\w{2})$', direct_to_template, {"template": "one_level_deep.html"}, name="one_deep"),
url(r'^(?P<object_slug1>\w{2})$', direct_to_template, {"template": "homepage.html"}, name="home"),
)
The other methods don't take advantage of Django's power to pass variables.
Edit: I switched the order of the urlpattern to be more obvious for the parser (i.e. bottom up is more defined than top down).
http://www\.example\.com/A1(/B2(/C3)?)?
^(\w{1,20})(/\w{1,20})*
This is for as many subdirectories as you like. If you only want up to 2 more:
^(\w{1,20})(/\w{1,20}){0,2}
If I'm understanding, I think you just need another set of parens around the whole OR statement:
^((\w{1,20})|((\w{1,20})/(\w{1,20}))|((\w{1,20})/(\w{1,20})/(\w{1,20})))$
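The reason the extra parentheses matter is that | has lower precedence than ^ and $, so without them ^ applies only to the first alternative and $ only to the last. A short sketch with re.search; the sample paths are invented.

```python
import re

# Original: '^' belongs to the first branch, '$' to the last one.
unwrapped = re.compile(
    r'^(\w{1,20})|((\w{1,20})/(\w{1,20}))|((\w{1,20})/(\w{1,20})/(\w{1,20}))$'
)
# Wrapped: both anchors apply to every branch.
wrapped = re.compile(
    r'^((\w{1,20})|((\w{1,20})/(\w{1,20}))|((\w{1,20})/(\w{1,20})/(\w{1,20})))$'
)

too_deep = 'A1/B2/C3/D4'

print(unwrapped.search(too_deep).group(0))          # A1 (first branch, no end anchor)
print(wrapped.search(too_deep))                     # None
print(wrapped.search('A1/B2/C3').group(0))          # A1/B2/C3
```

The unwrapped pattern accepts a four-level path because its first branch only requires a word at the start of the string.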
Be aware that Django's reverse URL matching (permalinks, reverse() and {% url %}) can handle a limited subset of regular expressions. To be able to use them, it's sometimes necessary to split complex regexes into separate URL dispatcher rules.