Regex URL in Django

This may be a duplicate, but I can't find an appropriate answer for my specific issue. I have two URLs:
url(r'^dashboard/completar-perfil/(?P<pk>[-_\w]+)/$', CompleteProfileView.as_view()),
url(r'^dashboard/.*$', DashboardView.as_view()),
As you can see, both begin with dashboard. The problem is that the first one never renders CompleteProfileView; it always renders DashboardView. If I remove dashboard/ from the first URL, it works fine. How can I make each URL render its respective view?

The problem is that ^dashboard/.*$ is a greedy regular expression that matches everything that starts with dashboard/, including dashboard/completar-perfil/.
So you may need to make the second regex more specific. Do you really need .* ?
If it is the index of your dashboard, you could use ^dashboard/$. Otherwise, you could put another word between dashboard and your greedy regex, like the following:
r"^dashboard/another-word/.*$"
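As a rough check outside Django, the patterns can be tried directly with Python's re module. This is only a sketch: the helper matching, the labels 'profile' and 'dashboard', and the sample path are made up for illustration, and it only reports which patterns match a path, not how Django picks a view.

```python
import re

# Patterns from the question, plus the narrower index-only pattern suggested above.
PROFILE = r'^dashboard/completar-perfil/(?P<pk>[-_\w]+)/$'
CATCH_ALL = r'^dashboard/.*$'
INDEX_ONLY = r'^dashboard/$'


def matching(patterns, path):
    """Return the labels of every pattern that matches the path (hypothetical helper)."""
    return [name for name, pattern in patterns if re.search(pattern, path)]


# Hypothetical sample path; Django matches against the path without the leading slash.
path = 'dashboard/completar-perfil/juan-perez/'

broad = [('profile', PROFILE), ('dashboard', CATCH_ALL)]
narrow = [('profile', PROFILE), ('dashboard', INDEX_ONLY)]

print(matching(broad, path))           # ['profile', 'dashboard']
print(matching(narrow, path))          # ['profile']
print(matching(narrow, 'dashboard/'))  # ['dashboard']
```

With the catch-all, both patterns claim the profile path; with ^dashboard/$ only the profile pattern does, so the two views can no longer compete for the same URL.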

Related

REGEX: find URL with specific words/pages

I have the current regex exp:
http[s]?://(?:[a-zA-Z]|[0-9]|[$-_#.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+
It retrieves all the URLs from a file, but I need it to return only the URLs that contain a specific page, say page-to-find. I can't seem to do that without adding a second group to the expression, and I want the result in a single group, as directly as possible.
Any tips?
Thanks
If it's a page, what does it end in? .asp? .php? .aspx? .htm? .html? (Something else?)
Try this for a start:
http[s]?://.*page-to-find
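A quick way to see how this suggestion behaves is to run it over some sample text in Python. The sample URLs below are invented, and the second pattern (using \S* instead of .*) is not from the answer; it is an assumed variant that keeps each match inside a single whitespace-delimited URL.

```python
import re

# Pattern suggested in the answer.
SUGGESTED = r'http[s]?://.*page-to-find'
# Assumed variant (not from the answer): \S* cannot cross whitespace,
# so a match never spans two URLs on the same line.
SINGLE_URL = r'https?://\S*page-to-find\S*'

# Hypothetical file contents.
one_per_line = "see http://example.com/a/page-to-find\nand https://example.com/other\n"
two_on_a_line = "http://example.com/x https://example.com/page-to-find?id=1"

print(re.findall(SUGGESTED, one_per_line))
# ['http://example.com/a/page-to-find']
print(re.findall(SUGGESTED, two_on_a_line))
# ['http://example.com/x https://example.com/page-to-find']
print(re.findall(SINGLE_URL, two_on_a_line))
# ['https://example.com/page-to-find?id=1']
```

Neither pattern uses a capture group, so findall returns the whole match each time.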

Find multiple '/' forward slashes in string of URLs for sitemap

We are trying to clean up our site map as our Magento store has created duplicate pages. I want to use a regular expression to select, or invert select, all of the pages which are linked to the top level URL.
For example, we want to find the first line-
/site/product<<
/site/category/product/
/site/category/product
Is there any way to find only two instances of a forward slash in the whole string, which are not next to each other?
Thank you for your help in advance.
I've tried something like this
(.*(?<!\/)$)
Your pattern (.*(?<!\/)$) matches any character except a newline up to the end of the string, and then asserts that the character to the left is not a forward slash. That gives you the first and the third lines as matches.
Instead, you could match from the start of the string ^, then twice match a forward slash followed by one or more characters that are neither a forward slash nor a newline [^/\n]+, and finally assert the end of the string $:
^/[^/\n]+/[^/\n]+$
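Here is a small check of that pattern in Python against the three sample lines from the question (without the << marker). The re.MULTILINE flag is an assumption about how the sitemap is processed: it makes ^ and $ anchor at each line rather than at the whole text.

```python
import re

# Exactly two path segments, no trailing slash.
TOP_LEVEL = r'^/[^/\n]+/[^/\n]+$'

sitemap = "/site/product\n/site/category/product/\n/site/category/product\n"

# MULTILINE lets ^ and $ match at every line boundary.
top_level_urls = re.findall(TOP_LEVEL, sitemap, flags=re.MULTILINE)
print(top_level_urls)  # ['/site/product']
```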
I would like to provide a quick answer to this problem in case it helps anyone else in the future. Our sitemap had too many duplicate URLs due to an incorrect setup on our Magento store. Instead of submitting a sitemap with 20,000+ top-level URLs, we decided to remove the top-level items manually.
Not ideal at all.
We tweaked the sitemap PHP generation code to output top-level URLs as site/category/id/###. Then we used Notepad++ to bookmark and delete these lines accordingly.

How to dismiss the end of the url parameters with regex?

I have a script that is supposed to trigger when a certain page path is open.
The issue: the page path contains multiple parameters including the parameter "returnUrl", returning the previous page visited.
Here is the url I want to check :
/cxsSearchApply?positionId=a0w0X000004IceYQAS&lang=en&returnUrl=https://example.com/cxsrec__cxsSearchDetail?id=a0w0X000004IceYQAS&lang=en&returnUrl=https://example.com/cxsrec__cxsSearch&lang=en
I initially used this regex code to get triggered on this page :
(cxsSearchApply.*)
But I have others regex codes like:
(cxsSearchSearchDetail.*)
And they also trigger because of the page path included in the url...
What regex should I use to match the first part of the URL but nothing after "returnUrl"?
So you want to match cxsSearchApply on the text before &returnUrl. You could use a lookahead:
(cxsSearchApply.*)(?=returnUrl=)
However, what you really want is to match everything before the first &returnUrl. So you need a non-greedy operator:
(cxsSearchApply.*?)(?=returnUrl=)
Likewise, for your other search, it should no longer match because it is also only looking at the first part:
(cxsSearchSearchDetail.*?)(?=returnUrl=)
I believe that will get you what you want.
Nothing after "returnUrl"
If this is literally what you want, you can simply use (.*?)(&returnUrl=.*) and take the first capture group as your result. The lazy .*? makes the split happen at the first &returnUrl= rather than the last.
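The capture-group approach can be sketched in Python as follows. The helper name before_return_url is made up, and the sketch uses a lazy .*? so the split lands on the first &returnUrl=; it also assumes the other page appears in the URL as cxsSearchDetail, as in the sample path from the question.

```python
import re

# Sample path from the question, split across lines for readability.
url = ("/cxsSearchApply?positionId=a0w0X000004IceYQAS&lang=en"
       "&returnUrl=https://example.com/cxsrec__cxsSearchDetail?id=a0w0X000004IceYQAS"
       "&lang=en&returnUrl=https://example.com/cxsrec__cxsSearch&lang=en")


def before_return_url(path):
    """Return the part of the path before the first &returnUrl= (hypothetical helper)."""
    m = re.match(r'(.*?)(&returnUrl=.*)', path)
    # Paths without a returnUrl parameter are returned unchanged.
    return m.group(1) if m else path


head = before_return_url(url)
print(head)  # /cxsSearchApply?positionId=a0w0X000004IceYQAS&lang=en

# Triggers are then tested against the head only.
print(bool(re.search(r'cxsSearchApply', head)))   # True
print(bool(re.search(r'cxsSearchDetail', head)))  # False
```

Testing against the trimmed head means a page name that only appears inside a returnUrl value can no longer fire the wrong trigger.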

Google Analytics Regex excluding a certain url in a sub folder

Currently on my GA Account I have the following URL's from our website tracked:
domain/contact-us/
domain/contact-us/global-contact-list.aspx
domain/contact-us/contactlist.aspx
The first two are from our new website which we want to track, the last one is from our old website (traffic is still being tracked but we do not want to use this)
I tried using a regex filter on this as the following:
(^/contact-us/global-contact-list\.aspx)|(^/contact-us/)
Reading up, I believe this looks for matches of exactly:
/contact-us/global-contact-list or /contact us/ but would disallow /contact-us/contactlist/
For some reason, that last URL is still coming through. Can anyone see why this is happening?
You need to add a negative lookahead or an end-of-string anchor:
(^/contact-us/global-contact-list\.aspx)|(^/contact-us/$)
or
(^/contact-us/global-contact-list\.aspx)|(^/contact-us/(?!contactlist))
This way, you will exclude /contact-us/contactlist/ (and /contact-us/contactlist.aspx) from matching.
BTW, /contact us/ will not pass, since (^/contact-us/) only allows a hyphen. To accept a space as well, use e.g. (^/contact-us/global-contact-list\.aspx)|(^/contact[-\s]us/$).
Also, (^/contact-us/global-contact-list\.aspx) won't match /contact-us/global-contact-list because it needs to match .aspx.
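To see why the original filter lets the old page through, the three tracked paths can be run against both versions in Python. This only emulates the matching with Python's re module, which is an assumption: Google Analytics has its own regex engine, so treat it as a sketch of the logic rather than of GA itself. The helper name included is made up.

```python
import re

# Filter from the question and the end-anchored version from the answer.
ORIGINAL = r'(^/contact-us/global-contact-list\.aspx)|(^/contact-us/)'
ANCHORED = r'(^/contact-us/global-contact-list\.aspx)|(^/contact-us/$)'

paths = [
    '/contact-us/',
    '/contact-us/global-contact-list.aspx',
    '/contact-us/contactlist.aspx',
]


def included(pattern, path):
    """True if the filter pattern matches the request path (hypothetical helper)."""
    return re.search(pattern, path) is not None


for p in paths:
    print(p, included(ORIGINAL, p), included(ANCHORED, p))
# /contact-us/ True True
# /contact-us/global-contact-list.aspx True True
# /contact-us/contactlist.aspx True False
```

Without the $, the second alternative is only a prefix test, so every path under /contact-us/ matches it.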

Perl/lighttpd regex

I'm using regex in lighttpd to rewrite URLs, but I can't write an expression that does what I want (which I thought was pretty basic, apparently not, I'm probably missing something).
Say I have this URL: /page/variable_to_pass/ OR /page/variable_to_pass/
I want to rewrite the URL to this: /page.php?var=variable_to_pass
I've already got rules like ^/login/(.*?)$ to handle specific pages, but I wanted to make one that can match any page without needing one expression per page.
I tried this: ^/([^.?]*) but it matches the whole /page/variable_to_pass/ instead of just page.
Any help is appreciated, thanks!
This regexp should do what you need
/([^\/]+)/(.+)
The first capture group is the page name and the second is the variable value.
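A Python emulation shows what the two groups capture for the sample URL. Note that this is not lighttpd configuration; it only exercises the regex. The TIGHTER pattern and the rewrite helper are assumptions added for illustration: they anchor the match and leave an optional trailing slash out of the second group.

```python
import re

# Pattern from the answer above.
ANSWER_PATTERN = r'/([^\/]+)/(.+)'
# Assumed stricter variant: anchored, and the trailing slash is optional
# and stays outside the second group.
TIGHTER = r'^/([^/]+)/([^/]+)/?$'

m = re.match(ANSWER_PATTERN, '/page/variable_to_pass/')
print(m.groups())  # ('page', 'variable_to_pass/')


def rewrite(path):
    """Map /page/value/ to /page.php?var=value; other paths pass through unchanged."""
    return re.sub(TIGHTER, r'/\1.php?var=\2', path)


print(rewrite('/page/variable_to_pass/'))  # /page.php?var=variable_to_pass
print(rewrite('/page/variable_to_pass'))   # /page.php?var=variable_to_pass
```

With the answer's pattern the second group keeps the trailing slash, which would end up in the var parameter; the stricter variant avoids that.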
Try:
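/([^.?/]+)/([^.?/]+)/
That should give you two capture groups: the page name and the variable value. The + has to sit inside the parentheses, otherwise each group keeps only the last character it matched, and the slash has to be excluded from the character class so a group cannot run across segments.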