Email Regular Expression Explanation - regex

I'm trying to understand a regular expression that is currently used to validate email addresses entered on a website. The email address is used to populate a target system, whose own validation rules can be expressed in plain English.
I would like to show, with examples, where the website's validation imposes rules that the target system does not require. To this end, I have obtained the regular expression from the developer and need some help translating it into plain English:
^[_A-Za-z0-9_%+-]+(\\.[_A-Za-z0-9_%+-]+)*#[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,4})$
So far, I have gained some understanding from a previous post.
... which would seem to confirm the following:
^ = The matched string must begin here, and only begin here
[ ] = match any character inside the brackets, but only match one.
I'm not sure of the relevance of "only match one". Can anyone advise?
+ = match the previous expression at least once, with no upper limit on repetitions.
Presumably this means the previous expression refers to the characters contained within the preceding square brackets and it can be repeated unlimited times?
() = make everything inside the parentheses a group (and make them referencable).
I'm not sure what this might mean.
\\. = match a literal full stop (.)
Then we have a repeat of the square bracket content, though I'm unsure what the relevance is here since the initial square brackets character class can be repeated unlimited times?
# = match a literal # symbol
The final parenthesis seems to match the top level domain which must be at least 2 characters but no more than 4 characters.
I think my main issue is in understanding the relevance of the round brackets as I can't understand what they add beyond what the square brackets add.
Any help would be much appreciated.

^[_A-Za-z0-9_%+-]+(\\.[_A-Za-z0-9_%+-]+)*#[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,4})$
Assert position at the beginning of the string «^»
Match a single character present in the list below «[_A-Za-z0-9_%+-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
The character “_” «_»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
A character in the range between “0” and “9” «0-9»
One of the characters “_%” «_%»
The character “+” «+»
The character “-” «-»
Match the regular expression below and capture its match into backreference number 1 «(\\.[_A-Za-z0-9_%+-]+)*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «*»
Match the character “\” literally «\\»
Match any single character that is not a line break character «.»
Match a single character present in the list below «[_A-Za-z0-9_%+-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
The character “_” «_»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
A character in the range between “0” and “9” «0-9»
One of the characters “_%” «_%»
The character “+” «+»
The character “-” «-»
Match the character “#” literally «#»
Match a single character present in the list below «[A-Za-z0-9]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
A character in the range between “0” and “9” «0-9»
Match the regular expression below and capture its match into backreference number 2 «(\\.[A-Za-z0-9]+)*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «*»
Match the character “\” literally «\\»
Match any single character that is not a line break character «.»
Match a single character present in the list below «[A-Za-z0-9]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
A character in the range between “0” and “9” «0-9»
Match the regular expression below and capture its match into backreference number 3 «(\\.[A-Za-z]{2,4})»
Match the character “\” literally «\\»
Match any single character that is not a line break character «.»
Match a single character present in the list below «[A-Za-z]{2,4}»
Between 2 and 4 times, as many times as possible, giving back as needed (greedy) «{2,4}»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “a” and “z” «a-z»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
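To see what the round brackets add beyond the square brackets, here is a minimal Python sketch. It assumes the `\\.` in the posted pattern is a string-escaped `\.` (a literal full stop), as is usual when a regex is written inside a string literal, and it keeps the `#` separator exactly as posted. The test addresses and the `is_valid` helper are made up for illustration.

```python
import re

# The posted pattern with "\\." read as "\." (assumption: the doubled
# backslash is string-literal escaping). "#" is kept exactly as posted.
EMAIL = re.compile(
    r"^[_A-Za-z0-9_%+-]+(\.[_A-Za-z0-9_%+-]+)*"  # local part: runs separated by single dots
    r"#[A-Za-z0-9]+(\.[A-Za-z0-9]+)*"            # domain labels: letters and digits only
    r"(\.[A-Za-z]{2,4})$"                        # final label: 2 to 4 letters
)

def is_valid(address: str) -> bool:
    """Return True if the address matches the pattern."""
    return EMAIL.match(address) is not None

# Hypothetical inputs chosen to exercise each rule.
samples = [
    "john.smith#example.com",    # accepted
    "john#example.co.uk",        # accepted: extra domain label
    "john..smith#example.com",   # rejected: a dot must be followed by 1+ characters
    ".john#example.com",         # rejected: cannot start with a dot
    "john#my-host.com",          # rejected: hyphen not allowed in the domain
    "john#example.museum",       # rejected: last label longer than 4 letters
]
for s in samples:
    print(f"{s!r}: {is_valid(s)}")

# Read literally, "\\." means a backslash followed by any character.
literal = re.compile(r"\\.")
print(bool(literal.fullmatch("\\x")), bool(literal.fullmatch(".")))
```

The group `(\.[...]+)*` is what forces every dot to be followed by at least one allowed character. A single character class with the dot added, such as `[._A-Za-z0-9%+-]+`, would also accept leading, trailing and consecutive dots.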

Related

Password REGEX with minimum eight characters, small and large letters or letters and at least one number or special character

I'm new to React Native, and I need to implement new password requirements.
The password must contain small and large letters, or letters and at least one number or special character.
It must also be at least eight characters long.
Here is my code:
.matches(
/^(?=.*[a-z])(?=.*\d)(?=.*[\W_])[\w\W].+$/,
I think that should work:
((^|, )((?=.*[a-z])|(?=.*\d)|(?=.*[\W_])[\w\W]))+$.
This works. It requires at least one uppercase letter, one lowercase letter, one number, and one special character such as # or $, with a length of at least eight characters.
(?m)^((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[\\W]).{8,})$
The (?m) at the beginning turns on multiline mode, so ^ and $ match at the start and end of each line. The . does not match a newline by default; that would require (?s).
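A short Python check of what the two inline flags do may help here. It is only a sketch using Python's `re` module, where `(?m)` and `(?s)` have the same meaning as in Java for this purpose; the test string is invented.

```python
import re

text = "a\nb"  # two lines: "a" and "b"

# (?m) is multiline mode: ^ and $ also match at line breaks.
print(re.search(r"^b$", text))        # None: without (?m), ^ only matches at the start
print(re.search(r"(?m)^b$", text))    # matches the second line

# The dot skips newlines by default; (?s) is the flag that changes this.
print(re.search(r"a.b", text))        # None
print(re.search(r"(?m)a.b", text))    # still None: (?m) does not affect the dot
print(re.search(r"(?s)a.b", text))    # matches across the newline
```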
From RegexBuddy:
^((?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[\W]).{8,})$
Options: ^ and $ match at line breaks
Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
Match the regular expression below and capture its match into backreference number 1 «((?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[\W]).{8,})»
Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=.*\d)»
Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single digit 0..9 «\d»
Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=.*[a-z])»
Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character in the range between “a” and “z” «[a-z]»
Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=.*[A-Z])»
Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character in the range between “A” and “Z” «[A-Z]»
Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=.*[\W])»
Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character that is a “non-word character” «[\W]»
Match any single character that is not a line break character «.{8,}»
Between eight and unlimited times, as many times as possible, giving back as needed (greedy) «{8,}»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
It matches
abcDefg1$
1zBA^frmb
1#Basdfadsfadsf
It does not match
abcd123
123abc
abcdEFGH
abcdEFG2
abCDeF1E
1a2bc
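The lists of matching and non-matching passwords above can be checked mechanically. This is a minimal Python sketch, assuming the `\\d` and `\\W` in the posted pattern stand for `\d` and `\W` once the string escaping is removed; `is_strong` is a hypothetical helper name.

```python
import re

# Four lookaheads, each scanning from the start, then a length check.
# Written with single backslashes because this is a raw Python string.
PASSWORD = re.compile(r"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*\W).{8,}$")

def is_strong(password: str) -> bool:
    """Return True if the password satisfies every lookahead and the length rule."""
    return PASSWORD.match(password) is not None

should_match = ["abcDefg1$", "1zBA^frmb", "1#Basdfadsfadsf"]
should_fail = ["abcd123", "123abc", "abcdEFGH", "abcdEFG2", "abCDeF1E", "1a2bc"]

for pw in should_match + should_fail:
    print(f"{pw!r}: {is_strong(pw)}")
```

Note that `\W` does not match the underscore, so a password such as `abcDefg1_` is rejected by this pattern; a class like `[\W_]` would be needed to count the underscore as a special character.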

RegExp: Ignore all links starting with a specific set of characters

I'm using the RegExp below to find all links in a string. How can I add a condition that ignores all links starting with one of these characters: ._ -? (e.g. .sub.example.com, -example.com)
AS3:
var str = "hello world .sub.example.com foo bar -example.com lorem http://example.com/test";
var filter:RegExp = /((https?:\/\/|www\.)?[äöüa-z0-9]+[äöüa-z0-9\-\:\/]{1,}+\.[\*\!\'\(\)\;\:\#\&\=\$\,\?\#\%\[\]\~\-\+\_äöüa-z0-9\/\.]{2,}+)/gi
var links = str.match(filter)
if (links !== null) {
    trace("Links: " + links);
}
You can use the following regex:
\b((https?:\/\/|www\.)?(?<![._ -])[äöüa-z0-9]+[äöüa-z0-9\-\:\/]{1,}+\.[\*\!\'\(\)\;\:\#\&\=\$\,\?\#\%\[\]\~\-\+\_äöüa-z0-9\/\.]{2,}+)\b
Edits:
Added word boundaries \b
Added a negative lookbehind for [._ -], i.e. (?<![._ -])
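The lookbehind idea can be tried out in isolation. The following Python sketch uses a deliberately simplified, hypothetical link pattern rather than the one from the question, and treats a dot, underscore or hyphen before the link as disqualifying; the input string is the one from the question.

```python
import re

text = ("hello world .sub.example.com foo bar -example.com "
        "lorem http://example.com/test")

# Simplified, hypothetical link pattern (not the one from the question).
# (?<![\w.\-]) rejects a match that starts right after a word character,
# a dot or a hyphen; \w already covers the underscore.
LINK = re.compile(
    r"(?<![\w.\-])"                          # negative lookbehind on the preceding character
    r"(?:https?://|www\.)?"                  # optional scheme or www. prefix
    r"[a-z0-9äöü]+(?:[.\-][a-z0-9äöü]+)*"    # host labels
    r"\.[a-z]{2,}"                           # top-level domain
    r"(?:/\S*)?",                            # optional path
    re.IGNORECASE,
)

links = [m.group(0) for m in LINK.finditer(text)]
print("Links:", links)
```

Including `\w` in the lookbehind plays the role of the leading `\b`: without it the engine could simply start the match one character later, for example at `ub.example.com`.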
This is the regex I use to find links in full text:
/\b(https?|ftp|file):\/\/[-A-Z0-9+&##\/%?=~_|$!:,.;]*[A-Z0-9+&##\/%=~_|$]/i
Regex explanation:
\b(https?|ftp|file)://[-A-Z0-9+&##/%?=~_|$!:,.;]*[A-Z0-9+&##/%=~_|$]
Assert position at a word boundary «\b»
Match the regex below and capture its match into backreference number 1 «(https?|ftp|file)»
Match this alternative «https?»
Match the character string “http” literally «http»
Match the character “s” literally «s?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Or match this alternative «ftp»
Match the character string “ftp” literally «ftp»
Or match this alternative «file»
Match the character string “file” literally «file»
Match the character string “://” literally «://»
Match a single character present in the list below «[-A-Z0-9+&##/%?=~_|$!:,.;]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
The literal character “-” «-»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
A single character from the list “+&##/%?=~_|$!:,.;” «+&##/%?=~_|$!:,.;»
Match a single character present in the list below «[A-Z0-9+&##/%=~_|$]»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
A single character from the list “+&##/%=~_|$” «+&##/%=~_|$»

Regular Expression for alphabets,numbers,spaces and underscores

How can I create a regex expression that will match only letters and numbers, one space between each word and underscores?
Good Examples:
Vamshi1
vamshi_pendota
vamshi pendota
Bad Examples:
vam shi1
vam_shi pendota
You should use a regex tester site like http://regex101.com/
You can enter in your examples, and use the quick reference to help you construct the correct regular expression.
With this simple regex:
^[a-zA-Z0-9]+(?:[ _][a-zA-Z0-9]+)?$
See demo
Option 2 for capitalization
If only the first letter of each word can be a capital letter, use
^[A-Z]?[a-z0-9]+(?:[ _][A-Z]?[a-z0-9]+)?$
What it means
^[a-zA-Z0-9]+(?:[ _][a-zA-Z0-9]+)?$
Assert position at the beginning of the string ^
Match a single character present in the list below [a-zA-Z0-9]+
Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
A character in the range between “a” and “z” (case sensitive) a-z
A character in the range between “A” and “Z” (case sensitive) A-Z
A character in the range between “0” and “9” 0-9
Match the regular expression below (?:[ _][a-zA-Z0-9]+)?
Between zero and one times, as many times as possible, giving back as needed (greedy) ?
Match a single character from the list “ _” [ _]
Match a single character present in the list below [a-zA-Z0-9]+
Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
A character in the range between “a” and “z” (case sensitive) a-z
A character in the range between “A” and “Z” (case sensitive) A-Z
A character in the range between “0” and “9” 0-9
Assert position at the end of the string, or before the line break at the end of the string, if any (line feed) $
Unless you provide any further information, I suspect that what you are after cannot be achieved through a regular expression.
Regular expressions are used to match patterns of strings. In your case, the Good and Bad cases you want to match look the same from a pattern perspective.
Assuming that Vamshi is a valid name but Vam shi is not (despite both having alpha numeric characters and one white space) in your language, I suspect you need to look at a dictionary implementation and not simply a regular expression one.
EDIT: After seeing your change, something like this should work for you: ^[a-z0-9_]+(\s[a-z0-9_]+)*$. The regular expression expects the string to start with one or more lower-case letters, digits and/or underscores, optionally followed by a whitespace character and more such text, repeated any number of times.
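The two patterns suggested in the answers can be compared against the examples from the question. This is a small Python sketch; the variable names are made up for illustration.

```python
import re

# The two patterns suggested in the answers, copied as posted.
ONE_SEPARATOR = re.compile(r"^[a-zA-Z0-9]+(?:[ _][a-zA-Z0-9]+)?$")
SPACE_SEPARATED = re.compile(r"^[a-z0-9_]+(\s[a-z0-9_]+)*$")

examples = ["Vamshi1", "vamshi_pendota", "vamshi pendota",   # listed as good
            "vam shi1", "vam_shi pendota"]                   # listed as bad

for s in examples:
    first = ONE_SEPARATOR.match(s) is not None
    second = SPACE_SEPARATED.match(s) is not None
    print(f"{s!r}: one-separator={first}, space-separated={second}")
```

Neither pattern rejects `vam shi1`, since it has the same shape as `vamshi pendota`. The second pattern also rejects `Vamshi1` unless it is matched case-insensitively.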

R regular expression repetition ignores upper bound

I'm trying to write a regular expression that filters strings like
blah_blah_suffix
where suffix is any string 2 to 5 characters long. So I want to accept the strings
blah_blah_aa
blah_blah_abcd
but discard
blah_blah_a
blah_aaa
blah_blah_aaaaaaa
I use grepl in the following way:
samples[grepl("blah_blah_.{2,5}", samples)]
but it ignores the upper bound of the repetition (5). It discards the strings blah_blah_a and blah_aaa, but accepts the string blah_blah_aaaaaaa.
I know there is a way to filter the strings without using regular expressions, but I want to understand how to use grepl correctly.
You need to bound the expression to the start and end of the line:
^blah_blah_.{2,5}$
The ^ matches beginning of line and $ matches end of line. See a working example here: Regex101
If you want to bound the expression to the beginning and end of a string (not multi-line), use \A and \Z instead of ^ and $.
Anchors Tutorial
/^[\w]+_[\w]+_[\w]{2,5}$/
DEMO
Options: dot matches newline; case insensitive; ^ and $ match at line breaks
Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
Match a single character that is a “word character” (letters, digits, and underscores) «[\w]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “_” literally «_»
Match a single character that is a “word character” (letters, digits, and underscores) «[\w]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “_” literally «_»
Match a single character that is a “word character” (letters, digits, and underscores) «[\w]{2,5}»
Between 2 and 5 times, as many times as possible, giving back as needed (greedy) «{2,5}»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
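The effect of the anchors can be reproduced outside R. The sketch below uses Python's `re.search`, which, like `grepl`, succeeds when the pattern matches anywhere in the string; the sample list is taken from the question.

```python
import re

samples = ["blah_blah_aa", "blah_blah_abcd", "blah_blah_a",
           "blah_aaa", "blah_blah_aaaaaaa"]

# Unanchored: a match anywhere in the string is enough, so the first
# 2 to 5 characters after the prefix satisfy the pattern.
unanchored = [s for s in samples if re.search(r"blah_blah_.{2,5}", s)]

# Anchored: the pattern must account for the whole string.
anchored = [s for s in samples if re.search(r"^blah_blah_.{2,5}$", s)]

print(unanchored)  # ['blah_blah_aa', 'blah_blah_abcd', 'blah_blah_aaaaaaa']
print(anchored)    # ['blah_blah_aa', 'blah_blah_abcd']
```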

Give a valid statement for this regular expression

https://pastee.org/sg4xy
I'm not too good with regexp. Hopefully someone can give me a valid statement for it, explain how it works, and maybe point me to a good site for learning regexp.
The explanation of the regular expression that you posted (created using RegexBuddy):
^.syn ((([0-9]{1,3}\.){3}[0-9]{1,3})|([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)) [0-9]{1,5} [0-9]{1,15} [0-9]{1,15}
Assert position at the beginning of the string «^»
Match any single character that is not a line break character «.»
Match the characters “syn ” literally «syn »
Match the regular expression below and capture its match into backreference number 1 «((([0-9]{1,3}\.){3}[0-9]{1,3})|([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+))»
Match either the regular expression below (attempting the next alternative only if this one fails) «(([0-9]{1,3}\.){3}[0-9]{1,3})»
Match the regular expression below and capture its match into backreference number 2 «(([0-9]{1,3}\.){3}[0-9]{1,3})»
Match the regular expression below and capture its match into backreference number 3 «([0-9]{1,3}\.){3}»
Exactly 3 times «{3}»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «{3}»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the character “.” literally «\.»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Or match regular expression number 2 below (the entire group fails if this one fails to match) «([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)»
Match the regular expression below and capture its match into backreference number 4 «([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)»
Match a single character present in the list below «[a-zA-Z0-9_-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “a” and “z” «a-z»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
The character “_” «_»
The character “-” «-»
Match the character “.” literally «\.»
Match a single character present in the list below «[a-zA-Z0-9_-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “a” and “z” «a-z»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
The character “_” «_»
The character “-” «-»
Match the character “.” literally «\.»
Match a single character present in the list below «[a-zA-Z0-9_.-]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A character in the range between “a” and “z” «a-z»
A character in the range between “A” and “Z” «A-Z»
A character in the range between “0” and “9” «0-9»
The character “_” «_»
The character “.” «.»
The character “-” «-»
Match the character “ ” literally « »
Match a single character in the range between “0” and “9” «[0-9]{1,5}»
Between one and 5 times, as many times as possible, giving back as needed (greedy) «{1,5}»
Match the character “ ” literally « »
Match a single character in the range between “0” and “9” «[0-9]{1,15}»
Between one and 15 times, as many times as possible, giving back as needed (greedy) «{1,15}»
Match the character “ ” literally « »
Match a single character in the range between “0” and “9” «[0-9]{1,15}»
Between one and 15 times, as many times as possible, giving back as needed (greedy) «{1,15}»
^.syn ((([0-9]{1,3}\.){3}[0-9]{1,3})|([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+)) [0-9]{1,5} [0-9]{1,15} [0-9]{1,15}
Some example text that matches the above regular expression:
jsyn 467.317.98.0 7259 04124798576 90058
xsyn 3.5.0.545 952 7940 261348
Tsyn 9.47.-.u 3 12 5
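The three example lines can be checked, and the capture groups inspected, with a short Python sketch. The pattern is copied from the question; the variable names are made up for illustration.

```python
import re

# Pattern copied from the question; it has no trailing $, so extra text
# after the last number would still be accepted.
SYN = re.compile(
    r"^.syn ((([0-9]{1,3}\.){3}[0-9]{1,3})"
    r"|([a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+\.[a-zA-Z0-9_.-]+))"
    r" [0-9]{1,5} [0-9]{1,15} [0-9]{1,15}"
)

lines = [
    "jsyn 467.317.98.0 7259 04124798576 90058",
    "xsyn 3.5.0.545 952 7940 261348",
    "Tsyn 9.47.-.u 3 12 5",
]

for line in lines:
    m = SYN.match(line)
    # group(1) is the whole address; group(2) is set for the dotted-number
    # alternative and group(4) for the host-name alternative.
    print(m.group(1), "| numeric:", m.group(2), "| host:", m.group(4))
```

The first alternative only checks the shape of the address, so 467.317.98.0 is accepted even though it is not a valid IPv4 address. Group 3 holds only the last repetition of the repeated group, which is "98." for the first line.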
Useful sites
http://www.regex101.com/
http://www.debuggex.com/