How do I match the question mark character in a Django URL? - django

In my Django application, I have a URL I would like to match which looks a little like this:
/mydjangoapp/?parameter1=hello&parameter2=world
The problem here is the '?' character being a reserved regex character.
I have tried a number of ways to match this... This was my first attempt:
(r'^pbanalytics/log/\?parameter1=(?P<parameter1>[\w0-9-]+)&parameter2=(?P<parameter2>[\w0-9-]+)', 'mydjangoapp.myFunction')
This was my second attempt:
(r'^pbanalytics/log/\\?parameter1=(?P<parameter1>[\w0-9-]+)&parameter2=(?P<parameter2>[\w0-9-]+)', 'mydjangoapp.myFunction')
but still no luck!
Does anyone know how I might match a '?' exactly in a Django URL?

Don't. You shouldn't match the query string with the URL dispatcher.
You can access all the values through the request.GET dictionary.
urls
(r'^pbanalytics/log/$', 'mydjangoapp.myFunction')
function
def myFunction(request):
    # The query string is already parsed; a missing key returns None.
    param1 = request.GET.get('parameter1')

Django's URL patterns only match the path component of a URL. You're trying to match on the query string as well, which is why you're having trouble. Your first regex escapes the ? correctly, but a URL pattern should only ever match the path component.
In your view you can access the query string via request.GET.
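A minimal sketch of why the pattern never sees the question mark, using only the standard library rather than Django itself. The URL is split into path and query string first, and only the path is compared against the pattern. The variable names are illustrative, and stripping the leading slash only roughly mimics what Django's resolver does before matching.

```python
import re
from urllib.parse import urlsplit, parse_qs

url = "/pbanalytics/log/?parameter1=hello&parameter2=world"

parts = urlsplit(url)
path = parts.path.lstrip("/")   # the part a URL pattern is matched against
query = parse_qs(parts.query)   # the part a view would read from request.GET

# The pattern from the answer above, applied to the path only.
path_matches = re.match(r"^pbanalytics/log/$", path) is not None

# The asker's first attempt expects a literal "?" in the path,
# but the path no longer contains one, so it cannot match.
first_attempt = r"^pbanalytics/log/\?parameter1=(?P<parameter1>[\w0-9-]+)"
attempt_matches = re.match(first_attempt, path) is not None
```

Here the escaping in the first attempt is fine as a regex; it fails only because the text it is looking for has already been moved out of the path.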

The ? character is a reserved symbol in regex, yes. Your first attempt looks like proper escaping of it.
However, ? in a URL is also the end of the path and the beginning of the query part (like this: protocol://host/path/?query#hash).
Django's URL dispatcher doesn't let you dispatch URLs based on the query part, AFAIK.
My suggestion would be to write a Django view that dispatches to your view functions based on the request.GET parameters.
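One way such a dispatching view could be structured, sketched in plain Python so it runs without Django. The handler names and the HANDLERS mapping are hypothetical. In a real view the parameters would come from request.GET, where request.GET.get('parameter1') returns a single value rather than a list.

```python
from urllib.parse import parse_qs


def handle_hello(params):
    return "hello handler"


def handle_default(params):
    return "default handler"


# Hypothetical mapping from a parameter1 value to the function that handles it.
HANDLERS = {"hello": handle_hello}


def dispatch(query_string):
    params = parse_qs(query_string)
    # parse_qs returns a list of values per key; take the first one, if any.
    value = params.get("parameter1", [None])[0]
    return HANDLERS.get(value, handle_default)(params)
```

For example, dispatch("parameter1=hello&parameter2=world") selects handle_hello, while a query string without parameter1 falls through to handle_default.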

The way to do what the original question asked, i.e. a catch-all in a URL dispatch var...
url(r'^mens/(?P<pl_slug>.+)/$', 'main.views.mens',),
or
url(r'^mens/(?P<pl_slug>\?+)/$', 'main.views.mens',),
As for why this is needed: GET URLs don't make good "permalinks", and in general they don't present well to customers and clients.
Clients often request that the URL be formatted like this, for example:
www.example-clothing-site.com/mens/tops/shirts/t-shirts/Big_Brown_Shirt3XL
This is a far more readable interface for the end user and gives the client a better overall presentation.

Related

Urlpattern regular expression not working

I'm trying to define a URL like this:
re_path(r'^product_list/(?P<category_slug>[\w-]+)/(?:(?P<filters>[\w~#=]+)?)$', views.ProductListView.as_view(), name='filtered_product_list'),
At this point it works with paths like:
/product_list/sdasdadsad231/bruh=1~3~10#nobruh=1~4
where bruh=1~3~10#nobruh=1~4 are the filters.
Later I want to implement search-by-word functionality,
so I want it to recognize things like:
/product_list/sdasdadsad231/?filters=bruh-1~3~10&nobruh-1~4&search_for=athing
/product_list/sdasdadsad231/?filters=bruh-1~3~10&nobruh-1~4
/product_list/sdasdadsad231/?search_for=athing
/product_list/sdasdadsad231/
So depending on the situation it will get filters and/or search_for, or nothing at all.
You might write the pattern as:
^product_list/(?P<category_slug>[\w-]+)/(?:\??(?P<filters>[\w~#=&-]+)?)$
If you want to match the leading / from the example data, you can append that in the pattern after the ^
The part after the question mark is the query string and does not belong to the path. Django will construct a QueryDict for it, which is available through request.GET. For example, if the URL is:
/product_list/sdasdadsad231/?filters=bruh-1~3~10&nobruh-1~4&search_for=athing
Then the ?filters=bruh-1~3~10&nobruh-1~4&search_for=athing is not part of the path, and it will be wrapped in request.GET as a QueryDict that looks like:
>>> QueryDict('filters=bruh-1~3~10&nobruh-1~4&search_for=athing')
<QueryDict: {'filters': ['bruh-1~3~10'], 'nobruh-1~4': [''], 'search_for': ['athing']}>
You thus cannot capture the part after (and including) the question mark: it has already been stripped from the path by the time Django tries to match the re_path(…) and path(…) definitions.
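The split described above can be reproduced with the standard library alone. This is only a sketch: resolve is a hypothetical helper, urllib.parse.parse_qs stands in for Django's QueryDict, and the pattern is the asker's original re_path pattern applied to the path with the leading slash removed.

```python
import re
from urllib.parse import urlsplit, parse_qs

PATTERN = re.compile(
    r"^product_list/(?P<category_slug>[\w-]+)/(?:(?P<filters>[\w~#=]+)?)$"
)


def resolve(url):
    parts = urlsplit(url)
    # Only the path takes part in pattern matching.
    match = PATTERN.match(parts.path.lstrip("/"))
    captured = match.groupdict() if match else None
    # keep_blank_values=True keeps keys that have no "=value", such as "nobruh-1~4".
    params = parse_qs(parts.query, keep_blank_values=True)
    return captured, params


captured, params = resolve(
    "/product_list/sdasdadsad231/?filters=bruh-1~3~10&nobruh-1~4&search_for=athing"
)
```

With this URL the unchanged pattern already matches, with filters left as None, and the filters and search_for values arrive through the parsed query string instead.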

Regular expressions (RegEx) to filter string from URLs in Google Analytics

I want to filter a string from the URLs in Google Analytics. This can be done using the Views > Filter > Exclude using RegEx, but I have been unable to get it to work.
An outline of how these filters are set up can be found here; however, I cannot work out how to isolate the string using RegEx. I believe it will need to be one filter per URL type.
The URLs follow this format:
/software/11F372288FA/pagename
/software/13F412C5FA/pagename/summary
/software/XIL1P0BFXCKM81/pagename2
I need to exclude this part of the URL:
/11F372288FA/
So that the URL data (e.g. Session time) is recorded against:
/software/pagename
/software/pagename/summary
/software/pagename2
I have worked out that I can isolate the string using the following RegEx:
^\/validate\/(..........)\/accounts\/summary$
It is not very elegant and would require a filter for every URL type.
Thanks for the help!
I'm not certain this will work in your exact case, but instead of using a regex it might be easier to build a new string from the start of the URL up to and including "/software/" and append everything from "pagename" to the end. In Java this might look something like:
String newString = oldString.substring(0, 10) + oldString.substring(oldString.indexOf("pagename"));
Note that this only works if the "/software/" prefix is always the same length (10 characters, including both slashes) and you are only excluding what sits between "software" and "pagename".
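If a regex is still preferred, the same rewrite can be sketched as a single search-and-replace. Python is used here only to illustrate the pattern; it is not Google Analytics filter syntax. strip_id is a hypothetical helper, and the sketch assumes the ID is always the single path segment directly after /software/.

```python
import re

# Assumes exactly one ID segment follows /software/ in every URL.
ID_SEGMENT = re.compile(r"^/software/[^/]+/")


def strip_id(path):
    # Replace "/software/<id>/" with "/software/", leaving the rest untouched.
    return ID_SEGMENT.sub("/software/", path, count=1)
```

Because it does not depend on the page name, one pattern covers all three URL types. It should not be applied to URLs that have already been cleaned, since it would then remove the first real segment.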

Django Url pattern regex for tokens

I need to pass tokens like b'//x0eaa#abc.com//x00//xf0//x7f//xff//xff//xfd//x00' in my Django URL pattern. I am not able to find a matching regex, which results in a Page not found error.
My URL will look like /api/users/0/"b'//x0eaa#abc.com//x00//xf0//x7f//xff//xff//xfd//x00'"/
I have tried the following regex:
url(r'^api/users/(?P<username>[\w\-]+)/(?P<paging_state>[\w.%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4})/$', views.getUserPagination),
Please pass the token in request header or body and then use accordingly in your view.
Considering there are some static predictable elements in your url like -
api/users/
/" before b
"/ at the end after '
So the URL can be read in either of the two ways below, with the regex for each:
1. api/users/(set of words, digits or hyphens)/"(any character except newline)"/
REGEX: ^api\/users\/([\w\d\-]+)\/"(.*)"\/$
URL: url(r'^api\/users\/([\w\d\-]+)\/"(.*)"\/$', views.getUserPagination),
2. api/users/(set of words, digits or hyphens)/"(one character-b)'//(any no. of words or digits)#(any no. of words or digits).(any no. of words or digits) (any no. of words, digits, forward slashes)'"/
REGEX: ^api\/users\/([\w\d\-]+)\/"([a-g]'\/\/[\w\d]*#[\w\d]*\.[\w\d]*[\/\w\d]*')"\/$
URL: url(r'''^api\/users\/([\w\d\-]+)\/"([a-g]'\/\/[\w\d]*#[\w\d]*\.[\w\d]*[\/\w\d]*')"\/$''', views.getUserPagination),
You should be able to use either of the above two. There can be multiple ways to match the token part in your url. So unless it is a big security concern, you can do with the simplest approach as mentioned in point 1.
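A quick check of the two patterns against the example path, using Python's re module directly rather than Django. The literal dot in the second pattern is escaped here, and triple-quoted raw strings are used because that pattern contains both kinds of quote. LOOSE and STRICT are illustrative names.

```python
import re

# Example path from the question, without the leading slash.
path = r'''api/users/0/"b'//x0eaa#abc.com//x00//xf0//x7f//xff//xff//xfd//x00'"/'''

# Pattern 1: anything between the double quotes.
LOOSE = re.compile(r'''^api\/users\/([\w\d\-]+)\/"(.*)"\/$''')

# Pattern 2: spells out the structure of the token.
STRICT = re.compile(
    r'''^api\/users\/([\w\d\-]+)\/"([a-g]'\/\/[\w\d]*#[\w\d]*\.[\w\d]*[\/\w\d]*')"\/$'''
)

loose = LOOSE.match(path)
strict = STRICT.match(path)
```

Both matches capture the user id in group 1 and the whole quoted token in group 2. Note that an unencoded # is normally treated by a browser as the start of the fragment and is not sent to the server, which is one more reason to pass the token in a header or the request body.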

Why my Django URL doesn't grab the right way?

I have a problem with a Django URL. When I go to:
/cars/my_town/my_office/
it's OK and the view runs as expected: town="my_town" and office_name="my_office".
When I go to:
/cars/my_town/my_office/my_var1/
I get an error.
When I print the vars in my views I get:
town :"my_town/my_office" and office_name:"my_var1"
My view looks like this:
def ListingCars(request,town,office_name,var1=None) :
My urls:
url(r'^cars/(?P<town>[\w|\W ]+)/(?P<office_name>[\w|\W ]+)/$', web_public_views.ListingCars, name='listingvo_cars'),
url(r'^cars/(?P<town>[\w|\W ]+)/(?P<office_name>[\w|\W ]+)/(?P<var1>[\w|\W ]+)/$', web_public_views.ListingCars, name='listingvo_cars_var1'),
SOLVED: it wasn't a URL resolver problem but my own mistake ;) I hadn't seen a "ç" in my url tag's name, and that caused the problem. Thanks for the help; I've cleaned up my regex. Sorry!
\W means "not a word character", so it will match a slash. You can most likely just omit it and use \w. It's not clear what your URLs should match, but if they are slugs you most likely need [\w-]+:
^cars/(?P<town>[\w-]+)/(?P<office_name>[\w-]+)/$
^cars/(?P<town>[\w-]+)/(?P<office_name>[\w-]+)/(?P<var1>[\w-]+)/$
What is actually happening is that your first regex matches the URL, so the second one is never used. You could most likely also fix this by switching the order of the two URLs so that the second comes first, but I wouldn't recommend that.
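The behaviour described above can be reproduced with Python's re module alone. This is a sketch with illustrative variable names; the path is written without its leading slash.

```python
import re

path = "cars/my_town/my_office/my_var1/"

# Original first pattern: [\w|\W ] matches any character, including "/".
greedy = re.compile(r"^cars/(?P<town>[\w|\W ]+)/(?P<office_name>[\w|\W ]+)/$")

# Suggested patterns: [\w-] cannot cross a slash.
two_slugs = re.compile(r"^cars/(?P<town>[\w-]+)/(?P<office_name>[\w-]+)/$")
three_slugs = re.compile(
    r"^cars/(?P<town>[\w-]+)/(?P<office_name>[\w-]+)/(?P<var1>[\w-]+)/$"
)

greedy_groups = greedy.match(path).groupdict()
two_match = two_slugs.match(path)            # None: the path has three segments
three_groups = three_slugs.match(path).groupdict()
```

The greedy pattern swallows "my_town/my_office" as the town, which is exactly the output the asker printed. With the slug patterns only the three-segment pattern matches, so the order of the two URL entries no longer matters.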

Regex: Match URLs for specific domain EXCEPT when a certain querystring parameter has a certain value

In short, I need to match all URLs in a block of text that are for a certain domain and don't contain a specific querystring parameter and value (refer=twitter)
I have the following regex to match all URLs for the domain.
\b(https?://)?([a-z0-9-]+\.)*example\.com(/[^\s]*)?
I just can't get the last part to work
(?![&?]refer=twitter)\b(https?://)?([a-z0-9-]+\.)*example\.com(/[^\s]*)?
So the following SHOULD match
example.com
http://example.com/
https://www.example.com#link
www.example.com?somevalue=foo
But these should NOT
https://www.anotherexample.com#link
www.example.com?refer=twitter
EDIT:
And if you can get it to match the
http://example.com?foo=foo.bar
out of a sentence like
For examples go to http://example.com?foo=foo.bar.
without picking up the period, that would be great!
EDIT2:
Fixed the trailing period issue with this
\b(https?://)?([a-z0-9-]+\.)*example\.com/?([^\s]*[^.])?
EDIT3:
This seems to work, or at least 99% of the tests I've thrown at it
(?!\b.*[&?]refer=twitter)\b(https?://)?([a-z0-9-]+\.)*example\.com/?([^\s]*[^.])?
EDIT4:
Settled on
\b(?!.*[&?]refer=twitter)(https?://)?([a-z0-9-]+\.)*nygard\.com(?!\.)[^\s]*\b+
(?!\b.*[&?]refer=twitter)
Is what you're looking for.
To be honest, at first the thought of using a regex didn't even cross my mind (which is a good sign - using a regex must, IMO, always be a secondary option, not primary). Here is how I'd do it in my language of choice
>>> from urllib.parse import urlparse, parse_qs
>>> p = urlparse(r'http://foo.bar.com/baz?refer=twitter&rock=paper')
>>> parse_qs(p.query)
{'refer': ['twitter'], 'rock': ['paper']}
You can do anything from here.
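Building on that, one possible way to apply the refer=twitter exclusion after the URLs have been extracted. is_twitter_referral is a hypothetical helper, and the sketch checks only the query string, so matching the domain would still be the job of the regex.

```python
from urllib.parse import urlsplit, parse_qs


def is_twitter_referral(url):
    """Return True if the URL's query string contains refer=twitter."""
    query = urlsplit(url).query
    return "twitter" in parse_qs(query).get("refer", [])


candidates = [
    "example.com",
    "http://example.com/",
    "https://www.example.com#link",
    "www.example.com?somevalue=foo",
    "www.example.com?refer=twitter",
]

# Keep every candidate that does not carry refer=twitter.
kept = [url for url in candidates if not is_twitter_referral(url)]
```

This keeps the extraction regex simple, with no lookahead needed, and leaves the parameter check to a parser that already handles ordering and multiple values.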