Django URL parsing: pass a raw string

I'm trying to pass a 'string' argument to a view with a url.
The urls.py goes
('^add/(?P<string>\w+)', add ),
I'm having problems with strings including punctuation, newlines, spaces and so on.
I think I have to change the \w+ into something else.
Basically the string will be something copied by the user from a text of his choice, and I don't want to change it. I want to accept any character and special character so that the view acts exactly on what the user has copied.
How can I change it?
Thanks!

Note that you can only use strings that form a valid URL; passing an arbitrary string in the URL path is not a good idea.
I use this regex to allow string values in my URLs:
(?P<string>[\w\-]+)
This allows 'slugs' in your URL (like 'this-is-my_slug').

Well, first off, there are a lot of characters that aren't allowed in URLs. Think ? and spaces for starters. Django will probably prevent these from being passed to your view no matter what you do.
Second, you want to read up on the re module. It is what sets the syntax for those URL matches. \w means any upper or lowercase letter, digit, or _ (basically, identifier characters, except it doesn't disallow a leading digit).
The right way to pass a string to a URL is as a form parameter (i.e. after a ?paramName= in the URL, and with special characters escaped, such as spaces changed to +).
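A minimal sketch of both points, using only the Python standard library rather than Django itself. The sample text and variable names are made up for illustration; the assumption is that the view would read the value from the query string instead of the path.

```python
import re
from urllib.parse import urlencode, parse_qs

# \w+ covers only letters, digits and underscore, so it stops at the first space.
word_pattern = re.compile(r"^add/(?P<string>\w+)")
captured = word_pattern.match("add/hello world").group("string")

# Hypothetical user text with a comma, spaces, a newline and a question mark.
user_text = "Hello, world!\nIs this OK?"

# urlencode escapes the special characters (spaces become +).
query = urlencode({"string": user_text})
url = "/add/?" + query

# parse_qs reverses the escaping and gives back the original text unchanged.
recovered = parse_qs(query)["string"][0]

print(captured)
print(url)
print(recovered == user_text)
```

The path pattern only ever sees "hello", while the query-string round trip returns the text exactly as the user copied it.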

Related

Match url with uppercase letters except if it contains a filename like .jpg,.css,.js etc

I need a regular expression that matches URLs containing uppercase letters but does not match if the URL contains a file name such as .jpg, .css or .js.
I want to redirect all uppercase URLs to lowercase, but only when they do not point to a file resource.
Try using a regex visualizer like regexpal.com.
Here's an example of a regular expression that approximates what you're trying to do:
\w+\.(?:com|net)(?:/[A-Z]+){1,}[/]?(?:\.jpg|\.png|\.JPG|\.PNG){0}$
\w+\.(?:com|net) captures a domain of the form word.com or word.net. (You'll need to add other domains or improve this if you want to capture subdomains as well.)
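(?:/[A-Z]+){1,}[/]? matches all-caps directories like /FOO/BAR/ with an optional trailing slash.
(?:\.jpg|\.png|\.JPG|\.PNG){0}$ matches exactly zero of the extensions listed, so the group itself consumes nothing; it is the $ right after the directories that rejects a path ending in a file name. You'll obviously need to add to this list of extensions.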
But perhaps rethink your routing; it's better form to keep all assets in devoted directories on your server, so that you can simply pass any request to mysite.com/assets/ along unchanged while handling other URLs.
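One way to state the exclusion explicitly is a negative lookahead. The following is a sketch in Python, not the answer's regex: the extension list, the pattern name and the redirect_target helper are assumptions for illustration, and it works on the path only rather than the full domain.

```python
import re

# Assumed list of static-file extensions; extend it to suit the site.
STATIC_EXTENSIONS = ("jpg", "png", "css", "js")

# (?!...) rejects paths ending in a listed extension (checked case-insensitively);
# (?=.*[A-Z]) then requires at least one uppercase letter somewhere in the path.
needs_lowercasing = re.compile(
    r"^(?!.*\.(?i:" + "|".join(STATIC_EXTENSIONS) + r")$)(?=.*[A-Z]).*$"
)

def redirect_target(path):
    """Return the lowercase path to redirect to, or None to leave the request alone."""
    if needs_lowercasing.match(path):
        return path.lower()
    return None

for path in ("/Foo/Bar", "/Images/Logo.JPG", "/foo/bar", "/Foo/style.css"):
    print(path, "->", redirect_target(path))
```

Only /Foo/Bar is redirected here; the two file paths and the already-lowercase path are left alone.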

RegEx to cut out URL

I'm trying to get a URL from a string of the following format:
RANDOMRUBBISHhttps://www.my-url.com/randomfirstname_randomlastnameRANDOMRUBBISH
I have already tried a few things, especially lookahead/lookbehind, which I had used successfully on another URL format (starting with https and ending with .html).
But I can't figure out the regex for the kind of string shown above. I just want the URL part, from https up to the end of the random last name. Is this even possible?
Any ideas?
If you can guarantee that randomfirstname_randomlastname is all lowercase and RANDOMRUBBISH is all uppercase, you can use character classes [a-z] and [A-Z]. The language the regex is for will determine how to use these.
This example works in JavaScript:
var str = "RANDOMRUBBISHhttps://www.my-url.com/randomfirstname_randomlastnameRANDOMRUBBISH";
// [a-z_]* also accepts the underscore between the first and last name
var match = /https:\/\/www\.my-url\.com\/[a-z_]*/.exec(str);
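For comparison, a sketch of the same extraction in Python. It relies on the same assumption as the answer (lowercase name, uppercase surrounding text) and additionally assumes the first and last name are joined by an underscore, as in the sample string.

```python
import re

text = "RANDOMRUBBISHhttps://www.my-url.com/randomfirstname_randomlastnameRANDOMRUBBISH"

# Lowercase letters and underscores after the host; matching stops at the
# first uppercase character of the trailing rubbish.
url_pattern = re.compile(r"https://www\.my-url\.com/[a-z_]+")

match = url_pattern.search(text)
url = match.group(0) if match else None
print(url)
```

search() is used instead of match() because the URL does not start at the beginning of the string.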

Django URL matching non-characters and characters

Suppose I have this URL:
url(r'^delete_group/(\w+)/', 'delete_group_view',name='delete_group')
In the template,
{%url 'delete_group' 'mwas'%} works, but
{%url 'delete_group' 'mwas 45'%} does not. Is there any way to modify the URL pattern so that it accepts both mwas and mwas 45?
The issue might be your regex. The URL example you're showing has a space in it, and \w won't match spaces. Try this instead: r'^delete_group/([\w\s]+)/' which allows one or more word characters or whitespace characters.
However, know that spaces are not valid in URLs and will likely get converted to %20 or something similar. A best practice is to use hyphens where you would put a space.
I'd also point you at this answer to a similar question.
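A small check of the two patterns with Python's re module, outside Django. The hyphenated variant and its pattern are assumptions added to illustrate the "use hyphens" advice, not something from the question.

```python
import re
from urllib.parse import quote

original = re.compile(r"^delete_group/(\w+)/")
with_spaces = re.compile(r"^delete_group/([\w\s]+)/")

# \w+ fails once the name contains a space; [\w\s]+ accepts it.
plain_ok = original.match("delete_group/mwas 45/")
spaced_ok = with_spaces.match("delete_group/mwas 45/")

# In an actual URL the space is percent-encoded.
encoded_name = quote("mwas 45")

# Hyphen alternative: replace whitespace before building the URL and
# allow only word characters and hyphens in the pattern.
hyphenated_name = re.sub(r"\s+", "-", "mwas 45")
hyphen_pattern = re.compile(r"^delete_group/([\w-]+)/")
hyphen_ok = hyphen_pattern.match("delete_group/" + hyphenated_name + "/")

print(plain_ok, spaced_ok.group(1), encoded_name, hyphen_ok.group(1))
```

With hyphens the URL stays readable and no encoding is needed, at the cost of mapping the hyphenated name back to the stored one in the view.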

REGEX - How to ignore some query strings in URLS, but not in others

I need to redirect an old blog URL to a new blog URL. The ID field is the key query string, and everything else in the query string should be ignored. The logic at a high level:
If old case insensitive URL matches: /Blog/Post.aspx? + ID=33 anywhere in the query string of the URL then I will redirect to: /newblog/newurl/
Current REGEX Code: (?i:/Blog/Post.aspx)|(\?)|(?i:id=33)
Success: /Blog/Post.aspx?id=33
Fails: /Blog/Post.aspx?ignore=me&id=33
Fails: /Blog/Post.aspx?ignore=me&id=33&ignoreme=too
How would I have it ignore the potential unknown query string ignore=me and ignoreme=too, but still come up with a REGEX match to redirect when the ID=33 is in the query string?
Thank you for the answer m.buettner!
Right now you would even redirect, if you have only ID=33 in your URL, or even if you have only a question mark in there. I suppose that is not what you want. You are probably looking for something like this:
(?i:/Blog/Post.aspx\?.*id=33(?!\w)).*
That will require /Blog/Post.aspx? and then allow arbitrary characters until the id=33 is encountered.
Depending on which language you are using this in, you could also use a lookahead, which makes it easier to check for different parameters, whose order you might not know:
(?i:/Blog/Post.aspx\?(?=.*id=33(?!\w))).*
This could then be easily extended to
(?i:/Blog/Post.aspx\?(?=.*id=33(?!\w))(?=.*another=requirement(?!\w))).*
With the first approach you would have to add two alternatives for both possible orders.
EDIT: A caveat for all three solutions: after the number they require a non-word character (that is anything but letters, digits or underscores). This means that they would give false positives in cases like ...id=33+34... and ...id=33%2F.... But these should not be generated by Wordpress in the first place.
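The lookahead version can be tried out in Python as follows. The has_id_33 helper is a swapped-in alternative based on urllib.parse, not part of the answer; it is included because the regex matches id=33 as a substring, so a parameter such as grid=33 also triggers it.

```python
import re
from urllib.parse import urlsplit, parse_qs

# The lookahead pattern from the answer, with the dot in ".aspx" escaped.
redirect_pattern = re.compile(r"(?i:/Blog/Post\.aspx\?(?=.*id=33(?!\w))).*")

def has_id_33(url):
    """Stricter check (an assumption, not the answer's method): parse the query string."""
    parts = urlsplit(url)
    if parts.path.lower() != "/blog/post.aspx":
        return False
    params = {key.lower(): values for key, values in parse_qs(parts.query).items()}
    return "33" in params.get("id", [])

samples = [
    "/Blog/Post.aspx?id=33",
    "/Blog/Post.aspx?ignore=me&id=33&ignoreme=too",
    "/blog/post.aspx?ID=33",
    "/Blog/Post.aspx?id=334",
    "/Blog/Post.aspx?grid=33",
]
for url in samples:
    print(url, bool(redirect_pattern.match(url)), has_id_33(url))
```

Both checks agree on the first four URLs; only the last one differs, where the regex matches and the parsed check does not.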
Oops, I was going to post a general answer for matching arbitrary parameters in a URL! Well, I'm going to leave it here in case you need it.
(?:(id|noignoreme|dontignoreme)=([^&\n]+)(?:\n|&|$))
With this you can add the parameters you want to accept and it will return it as group1 (the option) and group2 (the text of that option).
After that you can check whether ID = 33 and do one thing, or else do another.
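A sketch of how that pattern behaves with Python's re.findall. The query string is made up for illustration, and the parameter names are the placeholders from the regex above.

```python
import re

param_pattern = re.compile(r"(?:(id|noignoreme|dontignoreme)=([^&\n]+)(?:\n|&|$))")

# Hypothetical query string: one ignored parameter and two accepted ones.
query = "ignore=me&id=33&dontignoreme=yes"

# findall returns one (name, value) tuple per accepted parameter.
pairs = param_pattern.findall(query)
params = dict(pairs)

if params.get("id") == "33":
    action = "redirect"
else:
    action = "pass through"

print(pairs, action)
```

Parameters that are not listed in the first group are simply skipped, so their order and number do not matter.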

How do I write this URL in Django?

(r'^/(?P<the_param>[a-zA-Z0-9_-]+)/$','myproject.myapp.views.myview'),
How can I change this so that "the_param" accepts a URL(encoded) as a parameter?
So, I want to pass a URL to it.
mydomain.com/http%3A//google.com
Edit: Should I remove the regex, like this...and then it would work?
(r'^/(?P[*]?)/?$','myproject.myapp.views.myview'),
Add % and . to the character class:
[a-zA-Z0-9_%.-]
Note: Most special characters lose their meaning inside a character class, so % and . need no escaping there. The exception here is -: unless it specifies a range, it should be escaped with a backslash or placed at the beginning or the end of the class. Something like [a-zA-Z0-9_%-.] does not do what it appears to: %-. is read as a range from % to ., so the class also matches characters such as & and *.
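A quick check of how Python's re module treats a hyphen in different positions of a character class. The variable names are illustrative only.

```python
import re

# Hyphen between % and . forms a range (0x25 to 0x2E), which includes & and *.
range_class = re.compile(r"[a-zA-Z0-9_%-.]")
# Hyphen at the end of the class is a literal hyphen.
literal_class = re.compile(r"[a-zA-Z0-9_%.-]")

range_matches_amp = range_class.match("&") is not None
literal_matches_amp = literal_class.match("&") is not None
literal_matches_hyphen = literal_class.match("-") is not None

# A range written backwards is what actually raises an error.
try:
    re.compile(r"[.-%]")
    reversed_range_error = False
except re.error:
    reversed_range_error = True

print(range_matches_amp, literal_matches_amp, literal_matches_hyphen, reversed_range_error)
```

The misplaced hyphen compiles without complaint and quietly widens the class, which is why putting it last (or escaping it) is the safer habit.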
You'll at least need something like:
(r'^the/uri/is/(?P<uri_encoded>[a-zA-Z0-9~%\._-]+)$', 'project.app.view'),
That expression should match everything described here.
Note that your example should be mydomain.com/http%3A%2F%2Fgoogle.com (slashes should be encoded).
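A sketch that exercises such a character class on a fully encoded URL, using urllib.parse to do the encoding. It only tests the regex against the encoded string; how a given server or framework presents the path to the URL resolver (encoded or already decoded) is not covered here and should be checked separately.

```python
import re
from urllib.parse import quote, unquote

target = "http://google.com"

# safe="" makes quote() encode the slashes as well as the colon.
encoded = quote(target, safe="")

# Character class for percent-encoded text, with + so it accepts more than one character.
encoded_pattern = re.compile(r"^(?P<uri_encoded>[a-zA-Z0-9~%._-]+)$")

match = encoded_pattern.match(encoded)
decoded = unquote(match.group("uri_encoded")) if match else None

# The partially encoded form from the question still contains slashes and fails.
partial = encoded_pattern.match("http%3A//google.com")

print(encoded, decoded, partial)
```

Decoding the captured group gives back the original URL, while the form with raw slashes is rejected by the pattern.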
I think you can do it with:
(r'^(?P<url>[\w\.~_-]+)$', 'project.app.view')