I have a search box.
I search for: <- blank spaces in my search box.
My form validation catches this.
The URL shows: ++++++++++++++++
If I search for: <script>alert(1);</script>
The URL shows: <script>alert%281%29%3B<%2Fscript>
The Question
Where in Django can I alter / change / modify the request that determines the request URL? I'm thinking middleware but I haven't found an example. Would I have to create an entirely new HttpRequest from scratch?
Why do I want to?
I want to encode the URL differently. For example, strip all punctuation from the q= value, replace whitespace, strip, replace single spaces with + to have cleaner URLs.
Really looking for a clear example with CODE.
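One possible approach, sketched in plain Python without any Django specifics: clean the q value yourself and build the tidy URL from it. A view or middleware could then redirect to that URL instead of trying to mutate the incoming request. The helper names clean_query and search_url and the /search/ path are made up for illustration and are not part of Django.

```python
import string
from urllib.parse import urlencode

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def clean_query(raw):
    """Remove punctuation, collapse runs of whitespace, and trim the ends."""
    no_punct = raw.translate(_STRIP_PUNCT)
    return " ".join(no_punct.split())


def search_url(path, raw):
    """Build path?q=... from a raw search string (hypothetical helper).

    urlencode() writes spaces as '+', which gives the cleaner URL asked for.
    An empty query after cleaning yields the bare path.
    """
    q = clean_query(raw)
    return f"{path}?{urlencode({'q': q})}" if q else path
```

If the cleaned URL differs from the one that was requested, redirecting to it keeps a single canonical address per search.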
Related
Supposing that I have the following URL as a String;
String urlSource = 'https://www.wikipedia.org/';
I want to extract the main page name from this url String; 'wikipedia', removing https:// , www , .com , .org part from the url.
What is the best way to extract this? In case of RegExp, what regular expression do I have to use?
You do not need to make use of RegExp in this case.
Dart has a premade class for parsing URLs:
Uri
What you want to achieve is quite simple using that API:
final urlSource = 'https://www.wikipedia.org/';
final uri = Uri.parse(urlSource);
uri.host; // www.wikipedia.org
The Uri.host property will give you www.wikipedia.org. From there, you should easily be able to extract wikipedia.
Uri.host will also remove the whole path, i.e. anything after the / after the host.
Extracting the second-level domain
If you want to get the second-level domain, i.e. wikipedia from the host, you could just do uri.host.split('.')[uri.host.split('.').length - 2].
However, note that this is not fail-safe because you might have subdomains or not (e.g. www) and the top-level domain might also be made up of multiple parts. For example, co.uk uses co as the second-level domain.
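For comparison, here is the same naive idea as a Python sketch (the answer above is Dart; naive_site_name is a made-up helper name). It takes the second-to-last label of the host, so it has exactly the weakness described above for hosts such as co.uk.

```python
from urllib.parse import urlsplit


def naive_site_name(url):
    """Return the second-to-last dot-separated label of the host.

    Naive on purpose: multi-part suffixes such as co.uk give the wrong answer.
    """
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return labels[-2] if len(labels) >= 2 else host
```

A robust version would need a list of public suffixes rather than a fixed index.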
I found this RegEx for extracting youtube ID's:
#^http(?:s?)://(?:www\.)?youtu(?:be\.com/watch\?(?:.*?&(?:amp;)?)?v=|\.be/)([\w\-]+)(?:&(?:amp;)?[\w\?=]*)?#i
Now I'm trying to modify the RegEx to extract the youtube id for a youtube URL in this format:
http://www.youtube.com/watch?v=ESUYMoJVpYo&feature=share&a=rRL4kwOAewcP9KzId6Ks4A
How do I make sure I get the Id extracted from all possible url formats...
URLs aren't normally parsed with regular expressions. If you want to inspect or modify a URL in any way, a regular expression is probably the wrong tool.
URLs use what's called a Query String to pass parameters to a page. The beginning of the query string is marked by a question mark and followed by an ampersand delimited list of name/value pairs.
For example, using your own url: http://www.youtube.com/watch?v=ESUYMoJVpYo&feature=share&a=rRL4kwOAewcP9KzId6Ks4A
Page request: www.youtube.com/watch
Whole query string: ?v=ESUYMoJVpYo&feature=share&a=rRL4kwOAewcP9KzId6Ks4A
Name/Value pairs:
v -> ESUYMoJVpYo
feature -> share
a -> rRL4kwOAewcP9KzId6Ks4A
If you want to parse/modify the URL, do so by breaking down the query string. That'll be much more reliable than trying to write a RegEx for it.
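A minimal sketch of that breakdown in Python, using only the standard library. The function name youtube_id is hypothetical, and the set of accepted hosts (youtube.com, www.youtube.com, youtu.be) is an assumption based on the formats the regex in the question covers.

```python
from urllib.parse import parse_qs, urlsplit


def youtube_id(url):
    """Return the video id from a watch URL or a youtu.be short URL, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Short form: the id is the path itself, e.g. http://youtu.be/<id>
    if host == "youtu.be":
        return parts.path.lstrip("/") or None

    # Long form: the id is the value of the "v" name/value pair.
    if host in ("youtube.com", "www.youtube.com") and parts.path == "/watch":
        values = parse_qs(parts.query).get("v")
        return values[0] if values else None

    return None
```

Because the query string is parsed into name/value pairs, extra parameters such as feature or a do not matter, and neither does their order.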
In my django project, I have a url pattern like
(r'^survey/u2=([^/]+)/u3=([^/]+)/$',SurveyView.as_view()).
When I try to open the below url
http://www.sample.com/survey/u2=rc57S4/jyTJBz+==/u3=/U5pKfrV8X1MjUU2tI0AhqTF5PGR8g=/
[where u2 & u3 are encrypted value using internal keys. ]
I get a page-not-found error. This is because the sample URL does not match the URL pattern on the server: the parameter values contain the '/' (forward slash) character.
Right now I'm not in a position to edit the sample URL by encoding the parameters, since this URL has already been mailed to customers. However, if a customer opens the link, I should not throw an error.
How can I handle these special characters on the server when matching the URL pattern?
Instead of passing the values as arguments in the URL path, pass them as GET parameters in the query string, separated by the ? and & characters.
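A rough sketch of both sides of this, in plain Python. The first half shows the suggestion above: building the query string with urlencode percent-encodes '/', '+' and '=' so the values survive the round trip. The second half is a different technique, not part of the answer: a more permissive path pattern (my assumption of what could work for links that were already mailed) that lets the captured groups contain slashes. It would be ambiguous if a u2 value ever contained the text /u3= itself.

```python
import re
from urllib.parse import parse_qs, urlencode

# The two values from the sample URL in the question.
u2 = "rc57S4/jyTJBz+=="
u3 = "/U5pKfrV8X1MjUU2tI0AhqTF5PGR8g="

# New links: put the values in the query string, properly encoded.
query = urlencode({"u2": u2, "u3": u3})
decoded = parse_qs(query)  # maps each name to a list of decoded values

# Already-mailed links: allow '/' inside each captured group.
LEGACY_PATTERN = re.compile(r"^survey/u2=(.+)/u3=(.+)/$")
match = LEGACY_PATTERN.match(
    "survey/u2=rc57S4/jyTJBz+==/u3=/U5pKfrV8X1MjUU2tI0AhqTF5PGR8g=/"
)
```

Here match.groups() gives back the two original values, and decoded holds the same values recovered from the encoded query string.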
In my Django application, I have a URL I would like to match which looks a little like this:
/mydjangoapp/?parameter1=hello&parameter2=world
The problem here is the '?' character being a reserved regex character.
I have tried a number of ways to match this... This was my first attempt:
(r'^pbanalytics/log/\?parameter1=(?P<parameter1>[\w0-9-]+)&parameter2=(?P<parameter2>[\w0-9-]+)', 'mydjangoapp.myFunction')
This was my second attempt:
(r'^pbanalytics/log/\\?parameter1=(?P<parameter1>[\w0-9-]+)&parameter2=(?P<parameter2>[\w0-9-]+)', 'mydjangoapp.myFunction')
but still no luck!
Does anyone know how I might match a '?' exactly in a Django URL?
Don't. You shouldn't match query string with URL Dispatcher.
You can access all values using request.GET dictionary.
urls
(r'^pbanalytics/log/$', 'mydjangoapp.myFunction')
function
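def myFunction(request):
    # Query-string values live in request.GET; .get() returns None if the key is absent.
    param1 = request.GET.get('parameter1')
    param2 = request.GET.get('parameter2')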
Django's URL patterns only match the path component of a URL. You're trying to match on the querystring as well, which is why you're having trouble. Your first regex escapes the ? correctly, but a URL pattern should only ever match the path component.
In your view you can access the querystring via request.GET
The ? character is a reserved symbol in regex, yes. Your first attempt looks like proper escaping of it.
However, ? in a URL is also the end of the path and the beginning of the query part (like this: protocol://host/path/?query#hash).
Django's URL dispatcher doesn't let you dispatch URLs based on the query part, AFAIK.
My suggestion would be writing a django view that does the dispatching based on the request.GET parameter to your view function.
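Roughly, that dispatching could look like the sketch below. It is framework-free so it runs on its own: parse_qs stands in for request.GET, and the handler names, the HANDLERS table and the choice of parameter1 as the dispatch key are all made up for illustration. Note that parse_qs returns a list per name, whereas request.GET.get() returns a single string.

```python
from urllib.parse import parse_qs


def greet(params):
    # Hypothetical handler for parameter1=hello.
    return "hello, " + params.get("parameter2", ["nobody"])[0]


def unknown(params):
    # Fallback when parameter1 is missing or not recognised.
    return "unknown action"


# Maps a parameter1 value to the function that handles it.
HANDLERS = {"hello": greet}


def dispatch(query_string):
    """Pick a handler based on the parameter1 value in the query string."""
    params = parse_qs(query_string)
    action = params.get("parameter1", [None])[0]
    return HANDLERS.get(action, unknown)(params)
```

In a Django project the URL pattern would match only the path, and a single view would do the lookup that dispatch does here.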
To do what the original question asked, i.e. a catch-all variable in URL dispatch:
url(r'^mens/(?P<pl_slug>.+)/$', 'main.views.mens',),
or
url(r'^mens/(?P<pl_slug>\?+)/$', 'main.views.mens',),
As for why this is needed: GET URLs don't make good "permalinks", and in general they don't present well to customers or to clients.
Clients often request that the URL be formatted like this:
www.example-clothing-site.com/mens/tops/shirts/t-shirts/Big_Brown_Shirt3XL
this is a far more readable interface for the end-user and provides a better overall presentation for the client.
Many web sites support folksonomy tags. You may have heard of rel-tag, where it says that "The last path component of the URL is the text of the tag".
I am looking for a bookmarklet or greasemonkey script (javascript) to get the "last path component" for the URL currently being viewed in the browser, add that tag into another URL, and then open that page in a new tab or window.
For example, if I am looking at a delicious.com page with the tag "foo", I may want to create a new URL with the tag "foo". This should also work for multiple tags in the last path component, such as, foo+bar.
Some regexp suggestions have been offered.
Since you're using JavaScript, there's no need to worry about hostnames, querystrings, etc - just use location.pathname to get at the important bit.
For example:
var NewUrl = 'http://technorati.com/tag/';
var LastPart = location.pathname.match( /[^\/]+\/?$/ );
window.open( NewUrl + LastPart );
That allows for a potential single trailing slash.
You can use /[^\/]+$/ to disallow trailing slashes, or /[^\/]+\/*$/ for any number of them.
If you can assume both your URLs to be valid, you can get the tag from the first URL with this regex:
^[a-z]+://[^/#?]+/[^#?]*?([^#?/]+)(?:[#?]|$)
The first (and only) capturing group will hold the tag. This regex won't match URLs that don't have any tags.
To append the tag to another URL, search for the regex:
^([^#?]*?)/?(?:[#?]|$)
and replace with:
$1/tag
This regex makes sure not to end up with two adjacent slashes in the URL if the path of the original URL ends with a slash.
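To see the two steps together, here is a small Python sketch. The extraction regex is the one above, unchanged. The append regex is a slight variation of mine: it uses a lookahead for the # or ? so that the delimiter is kept in the result, and a function as the replacement so the tag is inserted literally. The names extract_tag and append_tag are hypothetical.

```python
import re

# Same pattern as above: group 1 is the last path component.
TAG_RE = re.compile(r"^[a-z]+://[^/#?]+/[^#?]*?([^#?/]+)(?:[#?]|$)")

# Variation: lookahead instead of consuming the '#' or '?' that ends the path.
APPEND_RE = re.compile(r"^([^#?]*?)/?(?=[#?]|$)")


def extract_tag(url):
    """Return the last path component of url, or None if there is none."""
    m = TAG_RE.search(url)
    return m.group(1) if m else None


def append_tag(base_url, tag):
    """Append tag to the path of base_url without doubling the slash."""
    return APPEND_RE.sub(lambda m: m.group(1) + "/" + tag, base_url, count=1)
```

As noted above, extract_tag returns None when the path has no tag; with this pattern that also includes paths that end in a slash.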