Regex match url without file extension - regex

I would like some help matching the following URLs.
/settings => /settings.php
/657_46hallo => /657_46hallo.php
/users/create => /users.php/create
/contact/create/user => /contact.php/create/user
/view/info.php => /view.php/info.php
/view/readme - now.txt => /view.php/readme - now.txt
/ => [NO MATCH]
/filename.php => /unknown.php
/filename.php/users/create => /unknown.php
If the first part after the domain name is a filename ending with ".php" (see the last 2 examples), it should redirect to /unknown.php.
I think I need 2 regular expressions:
1st should be something like: ^/([a-zA-Z0-9_]+)(/)?(.*)?$
2nd should catch a direct filename such as "/filename.php" or "/filename.php/create/user", so I can redirect to unknown.php.
The 1st regular expression that I have almost works for the first part.
==============================================
request url: http://domain.com/user/create
regex: ^/([a-zA-Z0-9_]+)(/)?(.*)?$
replace: http://domain.com/$1.php$2$3
makes: http://domain.com/user.php/create
The problem is that it also matches http://domain.com/user.php/create.
If someone could help me with both regular expressions, that would be great.

If you want to match those .php cases you can try this:
^\/([a-zA-Z0-9_]+)(\/)?(.*)?$
See here on Regexr
If you want to avoid those cases try this:
^/([a-zA-Z0-9_]+)(?!\.php)(?:(/)(.*)|)$
See here on Regexr
The (?!\.php) is a negative lookahead that ensures there is no .php at this position.
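As a rough check of the lookahead pattern above, here is a minimal Python sketch. The function name rewrite and the use of Python's re module are assumptions for illustration only; the original rule lives in whatever rewrite engine the asker uses, which may differ in details.

```python
import re

# Pattern from the answer above: the first segment must not be followed by ".php".
PATTERN = re.compile(r"^/([a-zA-Z0-9_]+)(?!\.php)(?:(/)(.*)|)$")

def rewrite(path):
    """Return the rewritten path, or None when the pattern does not match."""
    m = PATTERN.match(path)
    if m is None:
        return None
    name = m.group(1)
    slash = m.group(2) or ""   # groups 2 and 3 are None when there is no tail
    rest = m.group(3) or ""
    return f"/{name}.php{slash}{rest}"

for p in ["/settings", "/users/create", "/view/info.php", "/user.php/create", "/"]:
    print(p, "=>", rewrite(p))
```

Paths that return None here ("/" and anything whose first segment already ends in .php) would still need a second rule to send the .php cases to /unknown.php.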

When all you have is a hammer...
While this probably could be solved with a regexp, it is probably the wrong tool for the job, unless you have constraints that MANDATE the use of regexps.
Split the string using '/' as the delimiter, see whether the first component ends with '.php'; if so, reject it, otherwise append '.php' to the first component and join the components back using '/'.
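The split-and-join approach described above could look roughly like this in Python. This is only a sketch: the function name rewrite and the choice to return None for "no match" are assumptions, not part of the original answer.

```python
def rewrite(path):
    """Append ".php" to the first path component, or redirect to /unknown.php."""
    parts = path.split("/")          # "/users/create" -> ["", "users", "create"]
    first = parts[1] if len(parts) > 1 else ""
    if not first:
        return None                  # "/" (or an empty path): no match
    if first.endswith(".php"):
        return "/unknown.php"        # direct .php access is rejected
    parts[1] = first + ".php"
    return "/".join(parts)

for p in ["/settings", "/contact/create/user", "/filename.php/users/create", "/"]:
    print(p, "=>", rewrite(p))
```

Unlike the regex version, this handles both the rewrite and the /unknown.php case in one place, with no backtracking to reason about.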

Related

Fluentvalidation 6.4.1.0 support me with Incorrect regex

In my case, I want to validate image URLs, but some valid URLs are rejected.
E.g., the image link "https://fuvitech.online/wpcontent/uploads/2021/02/bta16600brg.jpg" or "https://fuvitech.online/wp-content/uploads/2021/02/bta16-600brg.jpg" gets the response "The image link is not in the correct format".
My code is here:
RuleFor(product => product.Images)
    .Length(1, 3000).WithMessage(Labels.importProduct_ExceedDescription, p => ImportHelpers.GetColumnName(typeof(ProductEntity).GetProperty(nameof(p.Images))))
    .Matches(@"^(http:\/\/|https:\/\/){1}?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$").WithMessage(Labels.importProduct_UrlNotCorrect, p => ImportHelpers.GetColumnName(typeof(ProductEntity).GetProperty(nameof(p.Images))));
Please help me find where the above regex is wrong. Thank you.
Try this:
NOTE the following regex pattern may trigger false positives and also may ignore valid image URLs, because it is very difficult to validate whether a given URL is valid.
^https?:\/\/(?:(?:[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+|[A-Za-z0-9]{2,})\.)+[A-Za-z]{2,}(?::\d+)?\/(?:(?:[A-Za-z0-9]+(?:(?:-[A-Za-z0-9]+)+)?\/)+|)[\w-]+\.(?:jpg|jpeg|png)$
Explanation
^ the start of a line/string.
https?:\/\/ match http with an optional letter s, followed by ://.
(?:(?:[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+|[A-Za-z0-9]{2,})\.)+ This will match things like foo-foo.bar-bar., foo.bar-bar. and foo.
[A-Za-z]{2,} this will match the TLD part, e.g., com, org, this part with the previous part will match things like foo-foo.bar-bar.com, foo.bar-bar.com or foo.com.
(?::\d+)? optional group of (a colon : followed by one or more digits) for port part.
\/(?:(?:[A-Za-z0-9]+(?:(?:-[A-Za-z0-9]+)+)?\/)+|) this checks for two things: the first is a directory path such as /uploads/public-images/ or /uploads/images/, the second is a single /.
[\w-]+ this part for the file name, e.g., bta16-600brg.
\.(?:jpg|jpeg|png) you can add here multiple extensions, you can allow uppercase letters by using for example, [Jj][Pp][Gg] for jpg.
$ the end of the line/string.
See regex demo
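To see how the pattern above behaves on the two URLs from the question, here is a small Python sketch. Python's re module is used only for illustration (the question itself is about .NET), and the helper name is_image_url is an assumption.

```python
import re

# The pattern from the answer, split across lines for readability.
IMAGE_URL = re.compile(
    r"^https?:\/\/"
    r"(?:(?:[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+|[A-Za-z0-9]{2,})\.)+"  # host labels
    r"[A-Za-z]{2,}(?::\d+)?\/"                                    # TLD, optional port
    r"(?:(?:[A-Za-z0-9]+(?:(?:-[A-Za-z0-9]+)+)?\/)+|)"            # directories
    r"[\w-]+\.(?:jpg|jpeg|png)$"                                  # file name
)

def is_image_url(url):
    return IMAGE_URL.match(url) is not None

print(is_image_url("https://fuvitech.online/wp-content/uploads/2021/02/bta16-600brg.jpg"))
print(is_image_url("https://fuvitech.online/readme.txt"))
```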
Thanks @SaSkY for answering my question.
I found my mistake.
The part \.[a-z]{2,5} only allows domain extensions of 2 to 5 characters. For example, .com is valid, but in my case .online (6 characters) was not.
I changed it to \.[a-z]{1,10}.
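The effect of the TLD length bound can be reproduced with a short Python sketch. The pattern below is a simplified form of the one in the question (the redundant {1}? and escapes are dropped, and the lower bound is kept at 2), and url_pattern is a hypothetical helper, so treat it as an illustration rather than the exact validator.

```python
import re

def url_pattern(max_tld_len):
    """Build a URL pattern whose TLD may be 2..max_tld_len letters long."""
    return re.compile(
        r"^https?://[a-z0-9]+(?:[-.][a-z0-9]+)*"
        r"\.[a-z]{2,%d}(?::[0-9]{1,5})?(?:/.*)?$" % max_tld_len
    )

url = "https://fuvitech.online/wp-content/uploads/2021/02/bta16-600brg.jpg"
print(bool(url_pattern(5).match(url)))    # ".online" has 6 letters
print(bool(url_pattern(10).match(url)))
```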

ReWrite Rule in IIS using RegEx

For the last couple of hours I have tried to create a rewrite rule in IIS that meets my requirements, but I just don't get it. So maybe someone can help me out :-)
What I have so far is the following:
Rewrite URL
index.php?page={R:1}&param={R:2}
RegEx Pattern
^(.*)/([0-9]+)
This is how I'd like to write my URLs at the end:
http://url.com/path1/path2/path3/param/ for example http://url.com/news/detail/1/
In this example I should have "news/detail" in "?page" and "1" in "?param".
The rule I have created so far seems to work quite well, as long as there is a number at the end (param).
My only problem is that I want to make the number (param) optional.
Thanks a lot for your support.
I can't come up with something more permissive than this:
^([a-z-/]+)([0-9])?
Based on your comment that paths never contain a number, I went a bit further and allowed only characters from a to z (use the ignore case option if needed).
This rule will match any of the following URLs:
news/detail/1/ => {R:1} = news/detail/ and {R:2} = 1
news/detail/1 => {R:1} = news/detail/ and {R:2} = 1
news/detail/ => {R:1} = news/detail/ and {R:2} is empty
news/detail => {R:1} = news/detail and {R:2} is empty
You probably will have to deal with the trailing / in your code.
The limitation comes from the fact that as far as I know, the regex in the rewrite rule doesn't support negative lookahead/lookbehind pattern.
To make the final capture optional, put a ? after it. Also specify that it must come at the end by anchoring it with the end-of-string character $.
^(.*)/?(\d+)?$
I have also made the final / optional: if there is no digit, you don't want the match to fail just because there is no / at the end.
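One thing to watch with an optional trailing group is greediness. Checked with Python's re module (an assumption for illustration: IIS uses its own engine, which may differ in details), a greedy (.*) in a pattern like ^(.*)/?(\d+)?$ consumes the whole input, so the number group stays empty. The sketch below compares it with a lazy variant, ^(.*?)/?(\d+)?/?$, which is not from the original answers and assumes the target engine supports lazy quantifiers.

```python
import re

GREEDY = re.compile(r"^(.*)/?(\d+)?$")
LAZY = re.compile(r"^(.*?)/?(\d+)?/?$")   # hypothetical variant, not from the answers

def split_url(pattern, url):
    """Return (page, param) as captured by groups 1 and 2."""
    m = pattern.match(url)
    return (m.group(1), m.group(2)) if m else None

for url in ["news/detail/1/", "news/detail/1", "news/detail/", "news/detail"]:
    print(url, "greedy:", split_url(GREEDY, url), "lazy:", split_url(LAZY, url))
```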

Ruby Puppet Regex string matching

I'm somewhat new to ruby and have done a ton of google searching but just can't seem to figure out how to match this particular pattern. I have used rubular.com and can't seem to find a simple way to match. Here is what I'm trying to do:
I have several types of hosts, they take this form:
Sample hostgroups
host-brd0000.localdomain
host-cat0000.localdomain
host-dog0000.localdomain
host-bug0000.localdomain
Next I have a case statement, I want to keep out the bugs (who doesn't right?). I want to do something like this to match the series of characters. However, it starts matching at host-b, host-c, host-d, and matches only a single character as if I did a [brdcatdog].
case $hostgroups { #variable takes the host string up to where the numbers begin
  # animals to keep
  /host-[["brd"],["cat"],["dog"]]/: {
    file {"/usr/bin/petstore-friends.sh":
      owner => petstore,
      group => petstore,
      mode => 755,
      source => "puppet:///modules/petstore-friends.sh.$hostgroups",
    }
  }
I could do something like [bcd][rao][dtg] but it's not very clean looking and will match nonsense like "bad""cot""dat""crt" which I don't want.
Is there a slick way to use \A and [] that I'm missing?
Thanks for your help.
-wootini
How about using negative lookahead?
host-(?!bug).*
Here is the RUBULAR permalink matching everything except those pesky bugs!
Is this what you're looking for?
host-(brd|cat|dog)
(Following gtgaxiola's example, here's the Rubular permalink)
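The difference between a character class, an alternation group, and a negative lookahead can be checked quickly outside Puppet. The sketch below uses Python's re module purely for illustration (Puppet uses Ruby's regex engine), and the names CHAR_CLASS, ALTERNATION, NOT_BUG and keep are made up for the example.

```python
import re

CHAR_CLASS = re.compile(r"host-[brdcatdog]")     # one character from the set
ALTERNATION = re.compile(r"host-(brd|cat|dog)")  # one of three whole words
NOT_BUG = re.compile(r"host-(?!bug).*")          # anything except "bug"

def keep(host):
    """True for the animals we want to keep, using the alternation form."""
    return ALTERNATION.match(host) is not None

hosts = [
    "host-brd0000.localdomain",
    "host-cat0000.localdomain",
    "host-dog0000.localdomain",
    "host-bug0000.localdomain",
]
for h in hosts:
    print(h, keep(h), bool(CHAR_CLASS.match(h)), bool(NOT_BUG.match(h)))
```

The character class accepts host-bug because "b" is in the set, which is the behaviour the question describes. The alternation is a whitelist, while the lookahead is a blacklist that would also let through any future host type.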

Regex to match any character or fullstop?

I'm trying to create a regex that takes a filename like:
/cloud-support/filename.html#pagesection
and redirects it to:
/cloud-platform/filename#pagesection
Could anyone advise how to do this?
Currently I've got part-way there, with:
"^/cloud-support/(.*)$" => "/cloud-platform/$1",
which redirects the directory okay - but still has a superfluous .html.
Could I just match for a literal .html with optional #? How would I do that?
Thanks.
Maybe something like this:
"^/cloud-support/(.*?)(\.html)?(#.+)$" => "/cloud-platform/$1$3"
where the first group is a non-greedy match (.*?)
"^/cloud-support/(\w+)\.html(.*)" => "/cloud-platform/$1$2"
Would something like this work?
"^/cloud-support/([^.]+)[^#]*(.*)$" => "/cloud-platform/$1$2"
Can you try the regex
"^/cloud-support/(.*)\.html(#.*)?$"
The \.html part matches .html while (#.*)? allows an optional # plus something.
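A quick way to try the last pattern is with Python's re.sub, shown below as a sketch. The function name redirect is an assumption, and the real redirect table uses $1-style references rather than Python's \1.

```python
import re

RULE = re.compile(r"^/cloud-support/(.*)\.html(#.*)?$")

def redirect(path):
    """Rewrite /cloud-support/X.html[#frag] to /cloud-platform/X[#frag]."""
    # An unmatched optional group expands to "" in the replacement (Python 3.5+).
    return RULE.sub(r"/cloud-platform/\1\2", path)

print(redirect("/cloud-support/filename.html#pagesection"))
print(redirect("/cloud-support/filename.html"))
```

Paths without a literal .html do not match this pattern and are returned unchanged, so they would still need the original catch-all rule.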

RegEx to exclude a string

I have the following strings in my application.
/admin/stylesheets/11
/admin/javascripts/11
/contactus
What I want to do is write a regular expression that captures anything other than strings starting with 'admin'.
Basically, my regex should capture only
/contactus
while excluding both
/admin/stylesheets/11
/admin/javascripts/11
To capture everything, I wrote
/.+/
and I wrote /(admin).+/, which captures everything that starts with 'admin'. How can I do the reverse, i.e. get everything not starting with 'admin'?
Thanks in advance.
Cheers,
sameera
EDIT - Thanks all for the answers.
I'm using Ruby / Rails 3 and trying to map a route in my routes.rb file.
My original routes file is as follows:
match '/:all' => 'page#index', :constraints => { :all => /.+/ }
and I want the regex to replace /.+/
Thanks
If the language/regular expression implementation you are using supports look-ahead assertions, you can do this:
^/(?!admin/).+
Otherwise, if you only can use basic syntax, you will need to do something like this:
^/([^a].*|a($|[^d].*|d($|[^m].*|m($|[^i].*|i($|[^n].*)))))
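Both approaches can be compared on the paths from the question. The sketch below uses Python's re module for illustration (the question targets Ruby), takes ^/(?!admin/).+ as the lookahead form, and the names LOOKAHEAD, BASIC and not_admin are assumptions.

```python
import re

LOOKAHEAD = re.compile(r"^/(?!admin/).+")
BASIC = re.compile(
    r"^/([^a].*|a($|[^d].*|d($|[^m].*|m($|[^i].*|i($|[^n].*)))))"
)

def not_admin(pattern, path):
    """True when the path is accepted, i.e. is not under admin."""
    return pattern.match(path) is not None

for path in ["/admin/stylesheets/11", "/admin/javascripts/11", "/contactus"]:
    print(path, not_admin(LOOKAHEAD, path), not_admin(BASIC, path))
```

The two are not exactly equivalent: the lookahead form only rejects the prefix "admin/", while the basic form rejects anything beginning with "admin", so a path like /administrator is treated differently.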