Django url reverse: Non-reversible reg-exp portion: '(?='

Django version: 1.5 (trunk)
I'm using a positive look-ahead assertion in URL pattern A, which works fine by itself. But when I try to reverse URL pattern B, which is completely unrelated, I get:
ValueError: Non-reversible reg-exp portion: '(?='
Example urls:
url(r'^foo(?=bar)/', test, name='bla'),
url(r'bar/', test, name='bli'),
Triggering the error:
from django.core.urlresolvers import reverse
reverse('bli')
I found this related ticket, but sadly it didn't make me any wiser:
https://code.djangoproject.com/ticket/17492
Can anyone tell me what's wrong with the code?

Your code is OK. The problem is that Django can't reverse every possible regular expression. Currently Django's regex normalizer can't handle at least two things: disjunction (|) and zero-width assertions (look-ahead, look-behind).
So, to solve your problem, just avoid look-ahead in your URL patterns and you're good to go. That should always be possible: plain regular expressions, without those funky extensions, can already represent any regular language.
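As a rough illustration of that last point, here is a minimal sketch using only Python's re module (not Django itself). The two patterns are simplified stand-ins for the one in the question, and the sample strings are made up. It suggests that a look-ahead right after a literal prefix accepts the same inputs as the plain pattern and differs only in how much text it consumes:

```python
import re

# Simplified stand-ins for the pattern in the question (assumption).
with_lookahead = re.compile(r'^foo(?=bar)')  # asserts "bar" follows, consumes only "foo"
plain = re.compile(r'^foobar')               # same condition, but consumes "foobar"

samples = ['foobar/', 'foobarbaz/', 'foobaz/', 'foo/']

# Both patterns accept exactly the same sample strings.
accepted_by_lookahead = [s for s in samples if with_lookahead.search(s)]
accepted_by_plain = [s for s in samples if plain.search(s)]

# The only difference is the text each match consumes.
consumed_lookahead = with_lookahead.search('foobar/').group(0)
consumed_plain = plain.search('foobar/').group(0)
```

Since the plain version contains nothing but literals, it is the kind of pattern Django's reverse() should be able to handle.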

Related

Django similar url pattern issue

I am working on a products comparison module and have url patterns like below:
path('comparison/<slug:slug1>-vs-<slug:slug2>/', views.compare_two_products, name="compare_two_products"),
path('comparison/<slug:slug1>-vs-<slug:slug2>-vs-<slug:slug3>/', views.compare_three_products, name="compare_three_products"),
The issue is that Django (3.2.6) always matches the first pattern and returns 404 when I try to access the second pattern. However, if I comment out the first pattern, it matches the second pattern just fine. I want to get both patterns working, including the format slug-vs-slug-vs-slug. Any suggestions on what I might be doing wrong?
Thanks in advance.

Just change the order of the URLs, like this:
path('comparison/<slug:slug1>-vs-<slug:slug2>-vs-<slug:slug3>/', views.compare_three_products, name="compare_three_products"),
path('comparison/<slug:slug1>-vs-<slug:slug2>/', views.compare_two_products, name="compare_two_products"),
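That should work: Django uses the first pattern that matches, and the slug converter also accepts hyphens, so the two-product pattern can swallow a three-product URL if it comes first.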
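To see why the order matters, here is a small sketch with plain Python regexes. It assumes the slug converter behaves like the character class [-a-zA-Z0-9_]+, and the compiled patterns are hand-written approximations of what path() builds, not Django's actual output:

```python
import re

# Assumed equivalent of Django's slug converter.
SLUG = r'[-a-zA-Z0-9_]+'

# Hand-written approximations of the two path() routes.
two = re.compile(rf'^comparison/(?P<slug1>{SLUG})-vs-(?P<slug2>{SLUG})/$')
three = re.compile(
    rf'^comparison/(?P<slug1>{SLUG})-vs-(?P<slug2>{SLUG})-vs-(?P<slug3>{SLUG})/$'
)

three_product_url = 'comparison/a-vs-b-vs-c/'
two_product_url = 'comparison/a-vs-b/'

# The two-product pattern also matches the three-product URL, because
# "-vs-" is made of characters a slug may contain.
swallowed = two.match(three_product_url).groupdict()

# The three-product pattern cannot match a two-product URL, so listing
# it first is safe.
three_on_two = three.match(two_product_url)
```

Here the first slug ends up as a-vs-b, which is presumably not a real product, hence the 404 from the view.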

Regular expression in URL for Django slug

I have 2 URL's with a slug field in the URL.
url(r'^genres/(?P<slug>.+)/$', views.genre_view, name='genre_view'),
url(r'^genres/(?P<slug>.+)/monthly/$', views.genre_month, name='genre_month'),
The first one opens fine but the second one gives a DoesNotExist error saying Genres matching query does not exist.
Here is how I'm accessing the 2nd URL in my HTML
<li>Monthly Top Songs</li>
I tried printing the slug in the view. It is passed as genre_name/monthly instead of genre_name.
I think the problem is with the regex in the URLs. Any idea what's wrong here?

Django always uses the first pattern that matches. For URLs like genres/genre_name/monthly/ your first pattern matches, so the second one is never used. The real problem is that the regex is not specific enough: it allows all characters, including slashes, which doesn't make sense for a slug.
You could reverse the order of those patterns, but what you should do is to make them more specific (compare: urls.py example in generic class-based views docs):
url(r'^genres/(?P<slug>[-\w]+)/$', views.genre_view, name='genre_view'),
url(r'^genres/(?P<slug>[-\w]+)/monthly/$', views.genre_month, name='genre_month'),
Edit 2020:
These days (since Django 2.0), you can (and should) use path instead of url. It provides built-in path converters, including slug:
path('genres/<slug:slug>/', views.genre_view, name='genre_view'),
path('genres/<slug:slug>/monthly/', views.genre_month, name='genre_month'),
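The difference between the loose and the specific regex can be checked outside Django with the re module. This is only a sketch: the sample path genres/rock/monthly/ is made up, and Django is assumed to match against the path without its leading slash:

```python
import re

loose = re.compile(r'^genres/(?P<slug>.+)/$')
strict = re.compile(r'^genres/(?P<slug>[-\w]+)/$')
strict_monthly = re.compile(r'^genres/(?P<slug>[-\w]+)/monthly/$')

url_path = 'genres/rock/monthly/'  # hypothetical request path

# ".+" also matches slashes, so the first pattern captures too much.
loose_slug = loose.match(url_path).group('slug')

# "[-\w]+" cannot cross a slash, so the detail pattern no longer matches
# and the monthly pattern gets its turn.
strict_result = strict.match(url_path)
monthly_slug = strict_monthly.match(url_path).group('slug')
```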

I believe that you can also drop the _ from the pattern that @Ludwik suggested, since \w already includes it, and use this version (which is one character simpler :) ):
url(r'^genres/(?P<slug>[-\w]+)/$', views.genre_view, name='genre_view'),
url(r'^genres/(?P<slug>[-\w]+)/monthly/$', views.genre_month, name='genre_month'),
Note that \w stands for "word character". It always matches the ASCII characters [A-Za-z0-9_] (note the inclusion of the underscore and digits). In Python 3 it also matches Unicode word characters in str patterns unless the re.ASCII flag is set.
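A quick check of what [-\w]+ accepts, using only the re module. The sample slugs are made up for illustration:

```python
import re

slug_re = re.compile(r'[-\w]+')
ascii_slug_re = re.compile(r'[-\w]+', re.ASCII)

# Underscores and digits are covered by \w, so no explicit "_" is needed.
ok_plain = slug_re.fullmatch('my_slug-01') is not None

# A slash is not a word character, so it is rejected.
ok_slash = slug_re.fullmatch('rock/monthly') is not None

# In Python 3, \w also matches non-ASCII letters unless re.ASCII is set.
ok_unicode = slug_re.fullmatch('café') is not None
ok_unicode_ascii_flag = ascii_slug_re.fullmatch('café') is not None
```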

In Django >= 2.0, you can capture a slug in a URL like this:
from django.urls import path
urlpatterns = [
...
path('articles/<slug:some_title>/', myapp.views.blog_detail, name='blog_detail'),
...
]
Source: https://docs.djangoproject.com/en/2.0/ref/urls/#django.urls.path

How can I make this regex for a URL more specific?

I have the following regex that attempts to match URLs:
/((http|https):(([A-Za-z0-9$_.+!*(),;/?:#&~=-])|%[A-Fa-f0-9]{2}){2,}(#([a-zA-Z0-9][a-zA-Z0-9$_.+!*(),;/?:#&~=%-]*))?([A-Za-z0-9$_+!*();/?:~-]))/g
How can I modify this regex to only match URLs of a single domain?
For example, I only want to match URLs that begin with http://www.google.com?
This should simplify my regex, but I'm too much of a regex noob to get it working (after all these years...)

Did you write that RegEx? I don't know what it's trying to do, but it certainly doesn't match URLs correctly. Here's something it matches:
http:###9#?~
which I'm pretty sure isn't a valid URL.
You shouldn't be using a regex to match URLs like this. You haven't said what language you're working in, but use whatever its equivalent of urlparse is.
Here's a relevant question: How do you validate a URL with a regular expression in Python?
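For example, assuming Python 3, a single-domain check could be sketched with urllib.parse instead of a regex. The helper name is_google_url is made up for this illustration, and whether other hosts such as google.com without www should count is left open:

```python
from urllib.parse import urlparse


def is_google_url(url):
    """Return True for http(s) URLs whose host is exactly www.google.com."""
    parts = urlparse(url)
    return parts.scheme in ('http', 'https') and parts.hostname == 'www.google.com'


good = is_google_url('http://www.google.com/search?q=regex')
# The bogus string that the regex from the question accepts has no host at all.
bogus = is_google_url('http:###9#?~')
# A look-alike host on another domain is rejected as well.
lookalike = is_google_url('http://www.google.com.evil.example/')
```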

Regex: Match URLs for specific domain EXCEPT when a certain querystring parameter has a certain value

In short, I need to match all URLs in a block of text that are for a certain domain and don't contain a specific querystring parameter and value (refer=twitter)
I have the following regex to match all URLs for the domain.
\b(https?://)?([a-z0-9-]+\.)*example\.com(/[^\s]*)?
I just can't get the last part to work
(?![&?]refer=twitter)\b(https?://)?([a-z0-9-]+\.)*example\.com(/[^\s]*)?
So the following SHOULD match
example.com
http://example.com/
https://www.example.com#link
www.example.com?somevalue=foo
But these should NOT
https://www.anotherexample.com#link
www.example.com?refer=twitter
EDIT:
And if you can get it to match the
http://example.com?foo=foo.bar
out of a sentence like
For examples go to http://example.com?foo=foo.bar.
without picking up the period, that would be great!
EDIT2:
Fixed the trailing period issue with this
\b(https?://)?([a-z0-9-]+\.)*example\.com/?([^\s]*[^.])?
EDIT3:
This seems to work, or at least 99% of the tests I've thrown at it
(?!\b.*[&?]refer=twitter)\b(https?://)?([a-z0-9-]+\.)*example\.com/?([^\s]*[^.])?
EDIT4:
Settled on
\b(?!.*[&?]refer=twitter)(https?://)?([a-z0-9-]+\.)*nygard\.com(?!\.)[^\s]*\b+

(?!\b.*[&?]refer=twitter)
Is what you're looking for.
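Here is a small sketch that tries the EDIT3 pattern from the question with Python's re module. The helper find_url is a made-up name, and the inputs are the examples from the question:

```python
import re

# The EDIT3 pattern from the question, unchanged.
URL_RE = re.compile(
    r'(?!\b.*[&?]refer=twitter)\b(https?://)?([a-z0-9-]+\.)*example\.com/?([^\s]*[^.])?'
)


def find_url(text):
    """Return the first matching URL in text, or None if there is none."""
    match = URL_RE.search(text)
    return match.group(0) if match else None


plain = find_url('example.com')
with_query = find_url('www.example.com?somevalue=foo')
in_sentence = find_url('For examples go to http://example.com?foo=foo.bar.')
other_domain = find_url('https://www.anotherexample.com#link')
from_twitter = find_url('www.example.com?refer=twitter')
```

One thing to keep in mind: the .* inside the look-ahead scans to the end of the line, so a refer=twitter link later on the same line would also suppress earlier URLs on that line.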

To be honest, at first the thought of using a regex didn't even cross my mind (which is a good sign: using a regex must, IMO, always be a secondary option, not the primary one). Here is how I'd do it in my language of choice:
>>> from urllib.parse import urlparse, parse_qs  # Python 3; in Python 2 the module was named urlparse
>>> p = urlparse(r'http://foo.bar.com/baz?refer=twitter&rock=paper')
>>> parse_qs(p.query)
{'refer': ['twitter'], 'rock': ['paper']}
You can do anything from here.
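As one possible continuation, assuming Python 3, the filter from the question could be sketched like this. The helper name keep_url and the way scheme-less URLs are handled (prefixing // so that urlparse sees a host) are choices made for this illustration, and pulling the candidate URLs out of the text block is a separate step:

```python
from urllib.parse import urlparse, parse_qs


def keep_url(url, domain='example.com'):
    """True if url is on domain (or a subdomain) and lacks refer=twitter."""
    # Without a scheme, urlparse treats the host as a path, so add "//".
    parts = urlparse(url if '://' in url else '//' + url)
    host = parts.hostname or ''
    on_domain = host == domain or host.endswith('.' + domain)
    from_twitter = 'twitter' in parse_qs(parts.query).get('refer', [])
    return on_domain and not from_twitter


should_match = [
    'example.com',
    'http://example.com/',
    'https://www.example.com#link',
    'www.example.com?somevalue=foo',
]
should_not_match = [
    'https://www.anotherexample.com#link',
    'www.example.com?refer=twitter',
]

kept = [u for u in should_match + should_not_match if keep_url(u)]
```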

Regular expression that uses an "OR" conditional

I could use some help writing a regular expression. In my Django application, users can hit the following URL:
http://www.example.com/A1/B2/C3
I'd like to create a regular expression that accepts any of the following as a valid URL:
http://www.example.com/A1
http://www.example.com/A1/B2
http://www.example.com/A1/B2/C3
I'm guessing I need to use the "OR" conditional, but I'm having trouble getting my regex to validate. Any thoughts?
UPDATE: Here is the regex so far. Note that I have not included the "http://www.example.com" portion -- Django handles that for me. I'm just concerned with validating 1,2, or 3 subdirectories.
^(\w{1,20})|((\w{1,20})/(\w{1,20}))|((\w{1,20})/(\w{1,20})/(\w{1,20}))$

Skip the |, use the ? and ()
http://www\.example\.com/A1(/B2(/C3)?)?
And if you replace the A1-C3 with a pattern:
http://www\.example\.com/[^/]*(/[^/]*(/[^/]*)?)?
Explanation:
it matches every string that starts with http://www.example.com/A1
it can match an additional /B2 and even an additional /C3, but /C3 is only matched when there is a /B2
[^/]* matches as many non-slash characters as possible
If you need A1-C3 in separate capture groups, you can use this:
http://www\.example\.com/([^/]*)(/([^/]*)(/([^/]*))?)?
Will give (groupnumber: content):
matches: 0: (http://www.example.com/dir1/dir2/dir3)
1: (dir1)
2: (/dir2/dir3)
3: (dir2)
4: (/dir3)
5: (dir3)
You can check it out online here or get this tool (yes it's free, and it's even written in Lisp...).
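The group numbering above can also be reproduced with Python's re module. A minimal sketch, using the same sample URL:

```python
import re

pattern = re.compile(r'http://www\.example\.com/([^/]*)(/([^/]*)(/([^/]*))?)?')

# All three levels present: groups 1-5 as listed above.
full = pattern.fullmatch('http://www.example.com/dir1/dir2/dir3')
full_groups = full.groups()

# Only one level present: the optional groups did not participate.
short = pattern.fullmatch('http://www.example.com/dir1')
short_groups = short.groups()
```

Groups that did not take part in the match come back as None, so the calling code has to allow for that.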

There's a much more Django way to do this:
urlpatterns = patterns('',
url(r'^(?P<object_slug1>\w{2})/(?P<object_slug2>\w{2})/(?P<object_slug3>\w{2})$', direct_to_template, {"template": "two_levels_deep.html"}, name="two_deep"),
url(r'^(?P<object_slug1>\w{2})/(?P<object_slug2>\w{2})$', direct_to_template, {"template": "one_level_deep.html"}, name="one_deep"),
url(r'^(?P<object_slug1>\w{2})$', direct_to_template, {"template": "homepage.html"}, name="home"),
)
The other methods don't take advantage of Django's power to pass variables.
Edit: I switched the order of the urlpattern to be more obvious for the parser (i.e. bottom up is more defined than top down).

http://www\.example\.com/A1(/B2(/C3)?)?

^(\w{1,20})(/\w{1,20})*
This allows as many subdirectories as you like. If you want at most two more levels after the first:
^(\w{1,20})(/\w{1,20}){0,2}
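One caveat, sketched with Python's re module: without a trailing $ the pattern only has to match a prefix, so deeper paths would still get through. The anchored variant below is an addition for illustration, not part of the answer above:

```python
import re

unanchored = re.compile(r'^(\w{1,20})(/\w{1,20}){0,2}')
anchored = re.compile(r'^(\w{1,20})(/\w{1,20}){0,2}$')  # added "$" (assumption)

too_deep = 'A1/B2/C3/D4'

# Without "$" the first three levels match and the rest is ignored.
prefix = unanchored.match(too_deep).group(0)

# With "$" the four-level path is rejected, while 1 to 3 levels pass.
rejected = anchored.match(too_deep)
accepted = [p for p in ('A1', 'A1/B2', 'A1/B2/C3') if anchored.match(p)]
```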

If I'm understanding, I think you just need another set of parens around the whole OR statement:
^((\w{1,20})|((\w{1,20})/(\w{1,20}))|((\w{1,20})/(\w{1,20})/(\w{1,20})))$
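The reason the extra parentheses matter is that | has the lowest precedence, so in the original regex ^ belongs only to the first alternative and $ only to the last. A minimal sketch with Python's re module and a made-up four-level path:

```python
import re

original = re.compile(
    r'^(\w{1,20})|((\w{1,20})/(\w{1,20}))|((\w{1,20})/(\w{1,20})/(\w{1,20}))$'
)
grouped = re.compile(
    r'^((\w{1,20})|((\w{1,20})/(\w{1,20}))|((\w{1,20})/(\w{1,20})/(\w{1,20})))$'
)

too_deep = 'A1/B2/C3/D4'

# The first alternative, "^(\w{1,20})", has no "$", so it happily
# matches just the leading "A1" of a path that is too deep.
original_hit = original.search(too_deep).group(0)

# With the whole alternation wrapped, both anchors apply to every branch.
grouped_hit = grouped.search(too_deep)
valid = [p for p in ('A1', 'A1/B2', 'A1/B2/C3') if grouped.search(p)]
```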

Be aware that Django's reverse URL matching (permalinks, reverse() and {% url %}) can handle only a limited subset of regular expressions. To be able to use them, it's sometimes necessary to split complex regexes into separate URL dispatcher rules.
