Regex for URLs shows ".finance" as invalid/error

I have this regex for my Laravel URL validator, and if I enter a ".finance" domain, validation fails against the regex. What is wrong? All other domain endings I have tested work so far.
$regex = '/^(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/';

Replace [a-z]{2,5} with [a-z]{2,} to allow any two or more letters in the TLD. "finance" has seven letters, so it exceeds the five-letter limit of {2,5}.
Remove {1}; a quantifier of exactly one is always redundant.
(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)? is far too repetitive; shorten it with optional groups and characters to (?:https?:\/\/(?:www\.)?)?.
Use
/^(?:https?:\/\/(?:www\.)?)?[a-z0-9]+(?:[-.][a-z0-9]+)*\.[a-z]{2,}(?::[0-9]{1,5})?(\/.*)?$/
See proof.
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
http 'http'
--------------------------------------------------------------------------------
s? 's' (optional (matching the most amount
possible))
--------------------------------------------------------------------------------
: ':'
--------------------------------------------------------------------------------
\/ '/'
--------------------------------------------------------------------------------
\/ '/'
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
www 'www'
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
)? end of grouping
--------------------------------------------------------------------------------
)? end of grouping
--------------------------------------------------------------------------------
[a-z0-9]+ any character of: 'a' to 'z', '0' to '9'
(1 or more times (matching the most amount
possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
[-.] any character of: '-', '.'
--------------------------------------------------------------------------------
[a-z0-9]+ any character of: 'a' to 'z', '0' to '9'
(1 or more times (matching the most
amount possible))
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
[a-z]{2,} any character of: 'a' to 'z' (at least 2
times (matching the most amount possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
: ':'
--------------------------------------------------------------------------------
[0-9]{1,5} any character of: '0' to '9' (between 1
and 5 times (matching the most amount
possible))
--------------------------------------------------------------------------------
)? end of grouping
--------------------------------------------------------------------------------
( group and capture to \1 (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
\/ '/'
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
)? end of \1 (NOTE: because you are using a
quantifier on this capture, only the LAST
repetition of the captured pattern will be
stored in \1)
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
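
As a quick sanity check, the original and the corrected patterns can be compared outside Laravel. The sketch below uses Python's re module instead of PHP, so the surrounding / delimiters are dropped; the sample URLs and the helper name is_valid are made up for illustration.

```python
import re

# Patterns copied from the question (OLD) and the answer (NEW),
# without the PHP "/" delimiters.
OLD = re.compile(
    r"^(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?"
    r"[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$"
)
NEW = re.compile(
    r"^(?:https?:\/\/(?:www\.)?)?"
    r"[a-z0-9]+(?:[-.][a-z0-9]+)*\.[a-z]{2,}(?::[0-9]{1,5})?(\/.*)?$"
)


def is_valid(pattern, url):
    """Hypothetical helper: True when the URL matches the given pattern."""
    return pattern.match(url) is not None


# Illustrative sample URLs, not taken from the question.
for url in ["https://example.finance",
            "https://www.example.com:8080/path",
            "example.c"]:
    print(url, "old:", is_valid(OLD, url), "new:", is_valid(NEW, url))
```

With these samples, only the ".finance" URL is treated differently: the original pattern rejects it and the corrected one accepts it.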

Related

Regex uppercase for 2 character only

I have the regex below for a Git push rule; it works fine with the following example:
^(fix|feature|wip|dev-task|release)\/([a-zA-Z]+)\-\d+\-(\w+\-*)*
feature/ab-1234-fix
How can I enforce uppercase letters ONLY for the AB part? This will only ever be 2 characters.
feature/AB-1234-fix
Use
^(fix|feature|wip|dev-task|release)\/(AB|a[^b]|[^a]b|[a-zA-Z]|[a-zA-Z]{3,})\-\d+\-(\w+\-*)*
See proof.
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
fix 'fix'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
feature 'feature'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
wip 'wip'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
dev-task 'dev-task'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
release 'release'
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
\/ '/'
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
AB 'AB'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
a 'a'
--------------------------------------------------------------------------------
[^b] any character except: 'b'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
[^a] any character except: 'a'
--------------------------------------------------------------------------------
b 'b'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
[a-zA-Z] any character of: 'a' to 'z', 'A' to 'Z'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
[a-zA-Z]{3,} any character of: 'a' to 'z', 'A' to 'Z'
(at least 3 times (matching the most
amount possible))
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
\- '-'
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
\- '-'
--------------------------------------------------------------------------------
( group and capture to \3 (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
\-* '-' (0 or more times (matching the most
amount possible))
--------------------------------------------------------------------------------
)* end of \3 (NOTE: because you are using a
quantifier on this capture, only the LAST
repetition of the captured pattern will be
stored in \3)
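
To see how the pattern above behaves on a few branch names, here is a small Python check. The STRICT variant is an assumption and not part of the answer: it supposes the intent is that the key is always exactly two uppercase letters, in which case [A-Z]{2} may be a simpler fit. The helper name accepts and the mixed-case sample are made up for illustration.

```python
import re

# Pattern from the answer above, unchanged.
ANSWER = re.compile(
    r"^(fix|feature|wip|dev-task|release)\/"
    r"(AB|a[^b]|[^a]b|[a-zA-Z]|[a-zA-Z]{3,})\-\d+\-(\w+\-*)*"
)

# Assumption (not from the answer): the key is any two uppercase letters.
STRICT = re.compile(
    r"^(fix|feature|wip|dev-task|release)\/[A-Z]{2}-\d+-(\w+-*)*"
)


def accepts(pattern, branch):
    """Hypothetical helper: True when the branch name matches from the start."""
    return pattern.match(branch) is not None


for branch in ["feature/AB-1234-fix",
               "feature/ab-1234-fix",
               "feature/Ab-1234-fix"]:
    print(branch, "answer:", accepts(ANSWER, branch),
          "strict:", accepts(STRICT, branch))
```

Both patterns accept feature/AB-1234-fix and reject feature/ab-1234-fix. The mixed-case feature/Ab-1234-fix passes the answer's pattern but not the stricter variant, so which one fits depends on what the push rule is meant to allow.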

Regex help - match one string but not another

I have been using this:
~^\/student-accommodation\/(?:[^\/]+?)\/([^\/]+)\/$
to match for URLs like
/student-accommodation/manchester/ropemaker-court-manchester/
But now I need to edit this regex so it also matches URLs like the one below. All these new URLs follow the same pattern, with an appended string that starts with #utm_source. Importantly, they won't contain another /.
/student-accommodation/manchester/ropemaker-court-manchester/#utm_source=afs&utm_medium=email&utm_campaign=ropemakercourt_afs_dec20
However, I don't want the regex to match URLs like the one below:
/student-accommodation/manchester/ropemaker-court-manchester/en-suite/
Can anyone help? I am a novice at regex! Thanks
Use
^\/student-accommodation\/[^\/]+\/([^\/]+)\/(?:#utm_source.*)?$
See proof
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
\/ '/'
--------------------------------------------------------------------------------
student-accommodation 'student-accommodation'
--------------------------------------------------------------------------------
\/ '/'
--------------------------------------------------------------------------------
[^\/]+ any character except: '\/' (1 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
\/ '/'
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
[^\/]+ any character except: '\/' (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
\/ '/'
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
#utm_source '#utm_source'
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
)? end of grouping
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
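
The three URLs from the question can be checked against this pattern with a short Python sketch. The function name property_slug is hypothetical; the pattern itself is the one given above.

```python
import re

# Pattern from the answer above, unchanged.
PATTERN = re.compile(
    r"^\/student-accommodation\/[^\/]+\/([^\/]+)\/(?:#utm_source.*)?$"
)


def property_slug(path):
    """Hypothetical helper: return capture group 1, or None when there is no match."""
    match = PATTERN.match(path)
    return match.group(1) if match else None


base = "/student-accommodation/manchester/ropemaker-court-manchester/"
tracked = base + ("#utm_source=afs&utm_medium=email"
                  "&utm_campaign=ropemakercourt_afs_dec20")
nested = base + "en-suite/"

for path in (base, tracked, nested):
    print(path, "->", property_slug(path))
```

The plain URL and the #utm_source URL both yield ropemaker-court-manchester, while the en-suite URL yields None because [^\/]+ cannot cross the extra path segment.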

Regex for all sort of name

I am using this regex to handle all sorts of names:
String Regex_Name="^([A-Za-z]*|\\p{L})+([ ]*|[A-Za-z]*|[']*|\\p{L}*)+([\\s]?[A-Za-z]*)+[A-Za-z]$";
While running the code I am getting this error:
Unknown character property name {​​​​​​​L} near index 44
^[A-Za-z][[A-Za-z]*\p{​​​​​​​L}​​​​​​​*[,]?[ ]?[-]?[A-Za-z]+]+([ ]?[.]?[,]?[(]?[A-Za-z]+[)]?[-]?\p{​​​​​​​L}​​​​​​​*)+([,]?|[.]?)$
How can I solve the issue?
Use
String Regex_Name="^\\p{L}+(?:[’'-]\\p{L}+)*(?:\\s+\\p{L}+(?:[’'-]\\p{L}+)*)*$";
See proof.
The expression does not support shortened, abbreviated names, like John G. Smith.
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
\p{L}+ any character of: letters (1 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
[’'-] any character of: '’', ''', '-'
--------------------------------------------------------------------------------
\p{L}+ any character of: letters (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
\p{L}+ any character of: letters (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more
times (matching the most amount
possible)):
--------------------------------------------------------------------------------
[’'-] any character of: '’', ''', '-'
--------------------------------------------------------------------------------
\p{L}+ any character of: letters (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string

Sort Email:Pass List by first occurrence of each email

I want to sort an email:password list by the first occurrence of each email.
Example list:
email#example.com:passsword1
email#example.com:passsword2
email#example.com:passsword3
email1#example.com:passsword1
email1#example.com:passsword2
email1#example.com:passsword2
So only
email#example.com:passsword1
email1#example.com:passsword1
should be kept as result.
With my limited Regex skills I worked out this one but I guess I misunderstand something:
^(.*)(\r?\n\1)+(?=:)
Use
^((.*:).*)(?:\r?\n\2.*)+
See proof; use the g and m flags and replace each match with $1, the first line of each run of duplicates.
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string (or of each line, since the m flag is used)
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
.* any character except \n (0 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
: ':'
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
(?: group, but do not capture (1 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
\r? '\r' (carriage return) (optional
(matching the most amount possible))
--------------------------------------------------------------------------------
\n '\n' (newline)
--------------------------------------------------------------------------------
\2 what was matched by capture \2
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
)+ end of grouping
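
A minimal Python sketch of the replacement step, assuming each match is replaced by capture group 1 and that lines end in \n. re.MULTILINE plays the role of the m flag, and re.sub already replaces every match, which covers the g flag. The function name keep_first is made up for illustration.

```python
import re

# Pattern from the answer above; MULTILINE lets ^ match at each line start.
PATTERN = re.compile(r"^((.*:).*)(?:\r?\n\2.*)+", re.MULTILINE)


def keep_first(text):
    """Hypothetical helper: collapse each run of lines that share the same
    'email:' prefix into the first line of that run."""
    return PATTERN.sub(r"\1", text)


sample = "\n".join([
    "email#example.com:passsword1",
    "email#example.com:passsword2",
    "email#example.com:passsword3",
    "email1#example.com:passsword1",
    "email1#example.com:passsword2",
    "email1#example.com:passsword2",
])

print(keep_first(sample))
```

Only consecutive duplicates are collapsed, so the list has to be grouped by email first, as it is in the example.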

Making some info optional in regex

I am writing a regex for the following pattern: sftp://user:password#host[:port]/path
I wrote sftp://(.+):(.+)#(.+):(\d+)/(.*), which matches the pattern: group 1 matches the user, group 2 the password, group 3 the host name, group 4 the port number, and group 5 the path.
However, the port number is optional, so I tried the regex below, where the port group is followed by a ?.
sftp://(.+):(.+)#(.+)(:(\d+))?\/(.*)
Here group 3 matches host:port, which is not what I expect.
How can I make the port parameter optional?
Use
sftp://([^/#]+):([^/#]+)#([^/]+?)(?::(\d+))?/(.*)
See proof
EXPLANATION
NODE EXPLANATION
--------------------------------------------------------------------------------
sftp:// 'sftp://'
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
[^/#]+ any character except: '/', '#' (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
: ':'
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
[^/#]+ any character except: '/', '#' (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
# '#'
--------------------------------------------------------------------------------
( group and capture to \3:
--------------------------------------------------------------------------------
[^/]+? any character except: '/' (1 or more
times (matching the least amount
possible))
--------------------------------------------------------------------------------
) end of \3
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
: ':'
--------------------------------------------------------------------------------
( group and capture to \4:
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
) end of \4
--------------------------------------------------------------------------------
)? end of grouping
--------------------------------------------------------------------------------
/ '/'
--------------------------------------------------------------------------------
( group and capture to \5:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
) end of \5
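
The effect of the lazy host group and the optional port group can be checked with a short Python sketch. The function name parse_sftp, the returned dictionary keys, and the sample URLs are hypothetical; the pattern is the one given above, with # as the separator exactly as written in the question.

```python
import re

# Pattern from the answer above, unchanged.
PATTERN = re.compile(r"sftp://([^/#]+):([^/#]+)#([^/]+?)(?::(\d+))?/(.*)")


def parse_sftp(url):
    """Hypothetical helper: split the URL into its parts, or return None."""
    match = PATTERN.fullmatch(url)
    if match is None:
        return None
    user, password, host, port, path = match.groups()
    return {
        "user": user,
        "password": password,
        "host": host,
        # Group 4 is None when the optional port part is absent.
        "port": int(port) if port is not None else None,
        "path": path,
    }


print(parse_sftp("sftp://user:password#host:22/path/to/file"))
print(parse_sftp("sftp://user:password#host/path/to/file"))
```

In both cases the host group holds only host: because [^/]+? is lazy, it stops as soon as the optional :port part and the following / can match.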