I am trying to get this regex dialed in to validate that a URL begins with https and that, if a port is supplied, the only valid values are 443 or 5443. This regex is pretty close but not quite there.
^(https:\/\/)([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:5{0,1}443)?(.)*
How do I solve this problem?
This is a mainstream URL validator that tests whether the URL sits between whitespace boundaries.
It only allows the https scheme and the port numbers 5443 or 443.
(?<!\S)https://(?:\S+(?::\S*)?@)?(?:(?:(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))|localhost)(?::5?443)?(?:/[^\s]*)?(?!\S)
Readable version
(?<! \S )
https ://
(?:
\S+
(?: : \S* )?
@
)?
(?:
(?:
(?:
[1-9] \d?
| 1 \d\d
| 2 [01] \d
| 22 [0-3]
)
(?:
\.
(?: 1? \d{1,2} | 2 [0-4] \d | 25 [0-5] )
){2}
(?:
\.
(?:
[1-9] \d?
| 1 \d\d
| 2 [0-4] \d
| 25 [0-4]
)
)
| (?:
(?: [a-z\u00a1-\uffff0-9]+ -? )*
[a-z\u00a1-\uffff0-9]+
)
(?:
\.
(?: [a-z\u00a1-\uffff0-9]+ -? )*
[a-z\u00a1-\uffff0-9]+
)*
(?:
\.
(?: [a-z\u00a1-\uffff]{2,} )
)
)
| localhost
)
(?: : 5? 443 )?
(?: / [^\s]* )?
(?! \S )
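A minimal Python sketch for trying the pattern above. It assumes Python's re module (the answer does not name an engine), assumes the user-info separator in the pattern is a literal "@", and uses a hypothetical helper name, is_allowed_https_url. The character classes are lowercase only, so an uppercase host would need re.IGNORECASE.

```python
import re

# One-line pattern from above, split across raw strings for readability.
# Assumption: "@" is the user-info separator.
HTTPS_URL = re.compile(
    r"(?<!\S)https://(?:\S+(?::\S*)?@)?"
    r"(?:(?:(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])"
    r"(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}"
    r"(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))"
    r"|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)"
    r"(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*"
    r"(?:\.(?:[a-z\u00a1-\uffff]{2,})))"
    r"|localhost)"
    r"(?::5?443)?(?:/[^\s]*)?(?!\S)"
)


def is_allowed_https_url(text):
    """Return True if text contains an https URL whose port is absent, 443, or 5443."""
    return HTTPS_URL.search(text) is not None


for candidate in (
    "https://example.com",
    "https://example.com:5443/a/b",
    "see https://localhost:443 now",
    "https://example.com:8080",
    "http://example.com",
):
    print(candidate, "->", is_allowed_https_url(candidate))
```

The first three candidates should be accepted and the last two rejected, because the trailing (?!\S) leaves no room for an unlisted port.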
You should append a / after the optional port group so that no further digits are allowed between the port and the /. Try this regex:
^(https:\/\/)([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:5?443)?\/\S*
Notice that I've also changed (:5{0,1}443)? to (:5?443)? and the trailing (.)* to \S*, so the URL doesn't capture spaces, since spaces are not valid in a URL. Besides that, you can also drop many of the groups in your regex unless you need them.
Edit:
As you said in the comments, you want to match the following URLs too:
https://example.com
https:example.com
https:example.com:443
You need to make the \/\S* part optional by placing a ? after it. The modified regex becomes this, which will match the URLs above.
^https:\/\/([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:5?443)?(\/\S*)?
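A short Python sketch for checking the modified regex. It assumes Python's re module and a hypothetical helper name, is_valid. The regex has no $ anchor, so the sketch uses fullmatch to require that the whole string is consumed; a plain match would accept the prefix of a URL with another port.

```python
import re

# The modified regex from this answer, unchanged.
PORT_CHECK = re.compile(
    r"^https:\/\/([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:5?443)?(\/\S*)?"
)


def is_valid(url):
    """True only if the entire string is an https URL with no port, 443, or 5443."""
    # fullmatch rejects leftovers such as ":8080" after the host.
    return PORT_CHECK.fullmatch(url) is not None


for url in (
    "https://example.com",
    "https://example.com:443",
    "https://example.com:5443/path?x=1",
    "https://example.com:8080",
    "http://example.com",
):
    print(url, "->", is_valid(url))
```

With match instead of fullmatch, "https://example.com:8080" would still produce a match object covering "https://example.com", which is why the end of the string has to be checked one way or another.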
Your regex seems to work okay. If you wish, you can add an end boundary, just for safety:
^(https:\/\/)([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:(5443|443))?$
I only added a $ anchor to bound your original expression on the right and wrote the port as an alternation inside the optional group, so the colon is only required when a port is present. If you have more port numbers to allow, simply add them to this capturing group:
(5443|443)
You can also remove unnecessary boundaries, if you wish.
The following regex captures IP addresses as well as DNS hostnames.
What I'd like is to add some IPs to ignore, such as 1.0.0.0 and 0.0.0.0 for example. I tried some negative lookahead without success.
[\w-]+(\.[\w-]+)+
for example :
www.google.com 255.255.255.255 1.0.0.0 stackoverflow.com 0.0.0.0
should match 3 out of 5 in that line
Any tips would be great.
Edit: I tried this, which somewhat works but also filters out other values, such as 1.1.1.1:
(?![1\.0\.0\.0]|[0\.0\.0\.0])[\w-]+(\.[\w-]+)+
To find IPs and domains while ignoring the IPs 1.0.0.0 and 0.0.0.0, with
validation of the IPv4 address, a domain that contains at least one letter, and everything wrapped inside
whitespace boundaries, use this:
(?<!\S)(?!0{0,2}[01](?:\.0{1,3}){3})(?:(?:0{0,2}\d|0?[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])(?:\.(?:0{0,2}\d|0?[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])){3}|(?=\S*[a-zA-Z])[\w-]+(?:\.[\w-]+)+)(?!\S)
https://regex101.com/r/ZPQS5K/1
Expanded
(?<! \S )
(?! # Not 0.0.0.0 or 1.0.0.0
0{0,2} [01]
(?: \. 0{1,3} ){3}
)
(?:
(?: # IP address
0{0,2} \d
| 0? [1-9] \d
| 1 \d{2}
| 2 [0-4] \d
| 25 [0-5]
)
(?:
\.
(?:
0{0,2} \d
| 0? [1-9] \d
| 1 \d{2}
| 2 [0-4] \d
| 25 [0-5]
)
){3}
| # or
(?= \S* [a-zA-Z] ) # At least a letter
[\w-]+ # Domain
(?: \. [\w-]+ )+
)
(?! \S )
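A small Python sketch that runs the pattern above against the sample line from the question. It assumes Python's re module; the names HOSTS and sample are illustrative only. The pattern has no capturing groups, so findall returns the full matches.

```python
import re

# One-line pattern from the answer, split across raw strings for readability.
HOSTS = re.compile(
    r"(?<!\S)(?!0{0,2}[01](?:\.0{1,3}){3})"
    r"(?:(?:0{0,2}\d|0?[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])"
    r"(?:\.(?:0{0,2}\d|0?[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])){3}"
    r"|(?=\S*[a-zA-Z])[\w-]+(?:\.[\w-]+)+)(?!\S)"
)

sample = "www.google.com 255.255.255.255 1.0.0.0 stackoverflow.com 0.0.0.0"

# 1.0.0.0 and 0.0.0.0 are skipped by the leading negative lookahead.
found = HOSTS.findall(sample)
print(found)
# ['www.google.com', '255.255.255.255', 'stackoverflow.com']
```

An address such as 1.1.1.1 is still matched, because the lookahead only rejects a first octet of 0 or 1 followed by three all-zero octets.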
I am trying to automatically grade a student's answer to a question where the correct answer is:
read and execute for owner and read for everyone
The order of their answer doesn't matter so
read for everyone and read and execute for owner
is an acceptable answer. Then all the fluff (and, for) doesn't matter so I am really just looking for either of these
read execute owner read everyone
read everyone read execute owner
I can get a regex to accept either answer
(?=.*read.*execute.*owner)(?=.*read.*everyone)
But obviously that accepts more answers that are clearly wrong, like "read execute owner read execute everyone". So I tried using a negative lookahead for "execute" with everyone, but then it still matches the "execute" for owner and reports no regex match.
Is there a way to accomplish what I am trying to do? Thanks.
Just make the and/for optional.
# (?=.*(\bread(?:\s+and)?\s+execute(?:\s+for)?\s+owner\b))(?=.*(\bread(?:\s+for)?\s+everyone\b))
(?=
.*
( # (1 start)
\b read
(?: \s+ and )?
\s+ execute
(?: \s+ for )?
\s+
owner \b
) # (1 end)
)
(?=
.*
( # (2 start)
\b read
(?: \s+ for )?
\s+ everyone \b
) # (2 end)
)
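A minimal Python sketch of how the two lookaheads above could be used for grading. It assumes Python's re module and a hypothetical helper name, is_correct; the question does not say which language the grader runs in.

```python
import re

# Both phrases must appear somewhere in the answer, in either order.
ANSWER = re.compile(
    r"(?=.*(\bread(?:\s+and)?\s+execute(?:\s+for)?\s+owner\b))"
    r"(?=.*(\bread(?:\s+for)?\s+everyone\b))"
)


def is_correct(answer):
    """True if the answer grants read+execute to owner and read to everyone."""
    return ANSWER.search(answer) is not None


for answer in (
    "read and execute for owner and read for everyone",
    "read for everyone and read and execute for owner",
    "read execute owner read everyone",
    "read execute owner read execute everyone",
):
    print(answer, "->", is_correct(answer))
```

The last answer is rejected because nothing but an optional "for" may sit between "read" and "everyone".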
Edit: You could also optionally allow any words between the
keywords by excluding all of the keywords from the text between them.
Like this -
# (?=.*(\bread(?:\s+(?:(?!\b(?:read|execute|owner|everyone)\b).)+?)?\s+execute(?:\s+(?:(?!\b(?:read|execute|owner|everyone)\b).)+?)?\s+owner\b))(?=.*(\bread(?:\s+(?:(?!\b(?:read|everyone|execute|owner)\b).)+?)?\s+everyone\b))
(?=
.*
( # (1 start)
\b read
(?:
# Optional words Between Keywords -
# not any of this or the other ones keywords
\s+
(?:
(?!
\b
(?:
read # this
| execute # this
| owner # this
| everyone # other
)
\b
)
.
)+?
)?
\s+ execute
(?:
# Optional words Between Keywords
\s+
(?:
(?!
\b
(?:
read
| execute
| owner
| everyone
)
\b
)
.
)+?
)?
\s+
owner \b
) # (1 end)
)
(?=
.*
( # (2 start)
\b read
(?:
# Optional words Between Keywords -
# not any of this or the other ones keywords
\s+
(?:
(?!
\b
(?:
read # this
| everyone # this
| execute # other
| owner # other
)
\b
)
.
)+?
)?
\s+ everyone \b
) # (2 end)
)
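A Python sketch of the keyword-excluding variant, again assuming Python's re module. The repeated "optional filler" group is factored into a hypothetical GAP fragment purely for readability; the assembled pattern is intended to be equivalent to the one-line version above.

```python
import re

# Optional filler words between keywords; the filler may not contain a keyword.
GAP = r"(?:\s+(?:(?!\b(?:read|execute|owner|everyone)\b).)+?)?"

ANSWER_LOOSE = re.compile(
    r"(?=.*(\bread" + GAP + r"\s+execute" + GAP + r"\s+owner\b))"
    r"(?=.*(\bread" + GAP + r"\s+everyone\b))"
)


def is_correct_loose(answer):
    """True if both permission phrases appear, with any non-keyword filler."""
    return ANSWER_LOOSE.search(answer) is not None


print(is_correct_loose("read and execute for the owner and read for everyone"))  # True
print(is_correct_loose("read by everyone, then read plus execute by owner"))     # True
print(is_correct_loose("read execute owner read execute everyone"))              # False
```

The third answer fails because "execute" is itself a keyword and therefore cannot be absorbed as filler between "read" and "everyone".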
I searched a lot but cannot find this regular expression. I made a calculator but cannot fully validate its display. My problem is with the dot.
I need my regular expression to accept: digit dot digit operator digit dot (1.23+1.23+1.). The dot may appear only once per number, not like (1..23+ 1.1.1). I have found similar regular expressions, but they didn't cover the case (1.23 +1.).
Here is my regEx -> /[0-9-+/*]+(\.[0-9][0-9]?)?/g
Could use this
^[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[+-](?:\d+(?:\.\d*)?|\.\d+))*$
Expanded:
^ # BOS
[+-]? # Optional Plus or minus
(?: # Decimal term
\d+
(?: \. \d* )?
| \. \d+
)
(?: # Optionally, many more terms
[+-] # Required Plus or minus
(?: # Decimal term
\d+
(?: \. \d* )?
| \. \d+
)
)*
$ # EOS
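A quick Python sketch for exercising the pattern above, assuming Python's re module and a hypothetical helper name, is_valid_expression. As written, the pattern accepts only + and - between terms and no whitespace, so * and / would have to be added to the [+-] class between terms if the calculator needs them.

```python
import re

# Pattern from the answer: signed decimal terms joined by + or -.
EXPRESSION = re.compile(
    r"^[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[+-](?:\d+(?:\.\d*)?|\.\d+))*$"
)


def is_valid_expression(display):
    """True if every term has at most one dot and terms are joined by + or -."""
    return EXPRESSION.match(display) is not None


print(is_valid_expression("1.23+1.23+1."))  # True: a trailing dot is allowed
print(is_valid_expression("-.5+2"))         # True: a leading dot is allowed
print(is_valid_expression("1..23+1.1.1"))   # False: two dots in one term
print(is_valid_expression("1.23*2"))        # False: * is not in the operator class
```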
Check this out(demo):
/^(([-+*\/ ]+)?(\b(\d+\.\d+)\b|\d))+$/
but it will work only if there is one equation per string, since it matches at the beginning (^) and at the end ($) of a string. However, you can also use it with the /m and/or /g modifiers.
EDIT
If it is only about the '–' character, it is enough to add it to the character class:
/^(([-–+*\/ ]+)?(\b(\d+\.\d+)\b|\d))+$/
I have a .txt file which contains:
"'the url address i checked is: https://www.google.com/ for 2times and it's awesome!."
After parsing, the expected output should be:
['"',"'",'the','url','address','i','checked','is',':','https://www.google.com/','for','2','times','and',"it's",'awesome','!','.','"']
How do I split this to get the output using the re module?
I came up with this pattern:
pattern = re.compile(r"\d+|[a-zA-Z]+[a-zA-Z']*|[^\w\s]")
but this is also splitting my URL.
Can anyone please help?
Just pick a URL regex from somewhere and make it first in the alternation.
An example only -
# (?!mailto:)(?:(?:https?|ftp)://)?(?:\S+(?::\S*)?@)?(?:(?:(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))|localhost)(?::\d{2,5})?(?:/[^\s]*)?|\d+|[a-zA-Z]+[a-zA-Z']*|[^\w\s]
(?! mailto: )
(?:
(?: https? | ftp )
://
)?
(?:
\S+
(?: : \S* )?
@
)?
(?:
(?:
(?:
[1-9] \d?
| 1 \d\d
| 2 [01] \d
| 22 [0-3]
)
(?:
\.
(?: 1? \d{1,2} | 2 [0-4] \d | 25 [0-5] )
){2}
(?:
\.
(?:
[1-9] \d?
| 1 \d\d
| 2 [0-4] \d
| 25 [0-4]
)
)
| (?:
(?: [a-z\u00a1-\uffff0-9]+ -? )*
[a-z\u00a1-\uffff0-9]+
)
(?:
\.
(?: [a-z\u00a1-\uffff0-9]+ -? )*
[a-z\u00a1-\uffff0-9]+
)*
(?:
\.
(?: [a-z\u00a1-\uffff]{2,} )
)
)
| localhost
)
(?: : \d{2,5} )?
(?: / [^\s]* )?
| \d+
| [a-zA-Z]+ [a-zA-Z']*
| [^\w\s]
Outputs:
['"',"'",'the','url','address','i','checked','is',':','https://www.google.com/','for','2','times','and',"it's",'awesome','!','.','"']
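A Python sketch of how the combined pattern could be used with re.findall, as the question asks. It assumes the user-info separator in the URL part is a literal "@"; the names URL, TOKEN, and text are illustrative. The URL alternative comes first so that it wins over the word and punctuation alternatives.

```python
import re

# URL alternative transcribed from the pattern above.
# Assumption: "@" is the user-info separator.
URL = (
    r"(?!mailto:)(?:(?:https?|ftp)://)?(?:\S+(?::\S*)?@)?"
    r"(?:(?:(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])"
    r"(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}"
    r"(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))"
    r"|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)"
    r"(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*"
    r"(?:\.(?:[a-z\u00a1-\uffff]{2,})))"
    r"|localhost)"
    r"(?::\d{2,5})?(?:/[^\s]*)?"
)

# URL first, then the three alternatives from the question's own pattern.
TOKEN = re.compile(URL + r"|\d+|[a-zA-Z]+[a-zA-Z']*|[^\w\s]")

text = "\"'the url address i checked is: https://www.google.com/ for 2times and it's awesome!.\""

tokens = TOKEN.findall(text)
print(tokens)
```

Because the pattern contains no capturing groups, findall returns whole tokens, and the URL stays in one piece.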
I have an input string ("My Email id is abc @ gmail.com"). I need to find the email id in the input string using a regex and replace it with (xxxxxxx).
I am using the pattern below, but it doesn't work if the email id contains whitespace.
\\w+([-+.']\\w+)*@\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*
Thanks.
If all you want to do is allow whitespace alongside word characters and keep the original
regex's integrity, it starts to get ugly:
// (?=\\s*\\w)[\\w\\s]+(?:[-+.'](?=\\s*\\w)[\\w\\s]+)*@(?=\\s*\\w)[\\w\\s]+(?:[-.](?=\\s*\\w)[\\w\\s]+)*\\.(?=\\s*\\w)[\\w\\s]+(?:[-.](?=\\s*\\w)[\\w\\s]+)*
(?= \s* \w )
[\w\s]+
(?:
[-+.']
(?= \s* \w )
[\w\s]+
)*
@
(?= \s* \w )
[\w\s]+
(?:
[-.]
(?= \s* \w )
[\w\s]+
)*
\.
(?= \s* \w )
[\w\s]+
(?:
[-.]
(?= \s* \w )
[\w\s]+
)*
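One caveat with the pattern above: [\w\s]+ also matches the spaces between ordinary words, so on the sample sentence the match starts at "My" and covers the whole string. The Python sketch below is a narrower variant, not the answer's pattern: it assumes the separator is a literal "@" and that whitespace only needs to be tolerated directly around it. The names EMAIL and mask_email are hypothetical.

```python
import re

# The question's original pattern, with optional whitespace allowed around "@".
EMAIL = re.compile(r"\w+(?:[-+.']\w+)*\s*@\s*\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*")


def mask_email(text, mask="xxxxxxx"):
    """Replace every email-like span in text with the mask."""
    return EMAIL.sub(mask, text)


print(mask_email("My Email id is abc @ gmail.com"))
# My Email id is xxxxxxx
```

Strings without an "@" are returned unchanged, and an address written without spaces is masked the same way.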