Regex Length issue - regex

I'm trying to build a regex where it accepts domain names with the following conditions:
Allows DNS names (only hyphens, periods and alphanumeric characters allowed) up to 255 characters.
Hyphens can only appear in between letters
Should start with a letter and end with a letter. It will have minimum 3 characters (letters and periods mandatory, hyphen is optional.)
Each label (the part between periods) should be at most 63 characters long
Possible Cases:
a.b.c
a-a.b
Cases that should not pass
a-.b
qwertqwertqwertqwertqwertqwertqwertqwertqwertqwertqwertqwertqwerhhg.v
aaaa
aaa-a
What I have built looks like this:
^(([a-zA-z0-9][A-Z0-9a-z-]{1,61}[a-zA-Z0-9][.])+[a-zA-Z0-9]+)$
But this does not accept a.b.c

You may use
^(?=.{1,255}$)(?=[^.]{1,63}(?![^.]))[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*(?:[.](?=[^.]{1,63}(?![^.]))[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)+(?:[.][a-zA-Z0-9-]*[a-zA-Z0-9])?$
See the regex demo here.
Pattern details
^ - start of string
(?=.{1,255}$) - the whole string should have 1 to 255 chars
(?=[^.]{1,63}(?![^.])) - there must be 1 to 63 chars other than . immediately followed by a . or the end of string
[a-zA-Z0-9]+ - 1 or more alphanumeric chars
(?: - start of a non-capturing group:
- - a hyphen
[a-zA-Z0-9]+ - 1+ alphanumeric chars
)* - zero or more repetitions
(?: - start of a non-capturing group...
[.] - a dot
(?=[^.]{1,63}(?![^.])) - there must be 1 to 63 chars other than . immediately followed by a . or the end of string
[a-zA-Z0-9]+ - 1+ alphanumeric chars
(?:-[a-zA-Z0-9]+)* - 0 or more repetitions of a - followed with 1+ alphanumeric chars
)+ -... 1 or more times
(?: - start of a non-capturing group...
[.] - a dot
[a-zA-Z0-9-]* - 0+ alphanumeric or - chars
[a-zA-Z0-9] - an alphanumeric char (no hyphens at the end)
)? -... 1 or 0 times (it is optional)
$ - end of string.
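As a quick check, the pattern above can be run against the cases from the question. This is only a minimal Python sketch: the names DNS_NAME and is_valid are made up for illustration, and the over-long label is built as 64 repeated letters instead of copying the long sample string.

```python
import re

# Pattern copied verbatim from the answer above.
DNS_NAME = re.compile(
    r"^(?=.{1,255}$)(?=[^.]{1,63}(?![^.]))[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*"
    r"(?:[.](?=[^.]{1,63}(?![^.]))[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)+"
    r"(?:[.][a-zA-Z0-9-]*[a-zA-Z0-9])?$"
)

def is_valid(name: str) -> bool:
    """Hypothetical helper: True if the whole name matches the pattern."""
    return DNS_NAME.search(name) is not None

# Cases from the question, plus a 63-char label (allowed) and a 64-char one (too long).
should_pass = ["a.b.c", "a-a.b", "a" * 63 + ".b"]
should_fail = ["a-.b", "a" * 64 + ".v", "aaaa", "aaa-a"]

for name in should_pass + should_fail:
    print(name[:20], is_valid(name))
```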

You can use the following regex:
/^(?=[A-Z])((?:[A-Z\d]|(?<=[A-Z])-(?=[A-Z])){1,63})(?<=[A-Z])(?:\.[A-Z\d]+){1,2}$/im
Details:
^ - Start of the string.
(?=[A-Z]) - Positive lookahead: The whole string must start with a letter.
( - A capturing group - the domain name.
(?: - Start of a non-capturing group, needed due to the following quantifier.
[A-Z\d] - The first alternative: Either a letter or a digit.
| - Or.
(?<=[A-Z])-(?=[A-Z]) - The second alternative: A hyphen, preceded with a letter
and followed with a letter.
) - End of the non-capturing group.
{1,63} - This group (either alternative) must occur 1 to 63 times.
) - End of the capturing group.
(?<=[A-Z]) - Positive lookbehind: The capturing group just matched (domain name)
must end with a letter.
(?: - A non-capturing group, also needed due to the following quantifier.
\.[A-Z\d]+ - A dot and a sequence of letters or digits.
) - End of the non-capturing group.
{1,2} - This group must occur 1 or 2 times.
$ - End of the string.
You should definitely use the i (case-insensitive) option and, if you check
a number of strings, each on a separate line, also the m (multiline) option.
I didn't include any test for the whole length, but you didn't include it either.
I think the main task here was to show how to match the case your regex failed on.
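To see the i and m options at work, here is a small Python sketch that scans several candidates, one per line. The name DOMAIN and the sample text are assumptions for illustration; the pattern itself is the one given above, with the /im flags expressed as re.IGNORECASE | re.MULTILINE.

```python
import re

# Same pattern as above; the /im flags become re.IGNORECASE | re.MULTILINE.
DOMAIN = re.compile(
    r"^(?=[A-Z])((?:[A-Z\d]|(?<=[A-Z])-(?=[A-Z])){1,63})(?<=[A-Z])(?:\.[A-Z\d]+){1,2}$",
    re.IGNORECASE | re.MULTILINE,
)

# One candidate per line, taken from the question's test cases.
text = "a.b.c\na-a.b\na-.b\naaaa\naaa-a"

matches = [m.group(0) for m in DOMAIN.finditer(text)]
print(matches)  # ['a.b.c', 'a-a.b']
```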

Related

Regex exclude whitespaces from a group to select only a number

I need to take only a number (a float number) from a text, but I can't remove the whitespaces...
** Update
I have a problem with this method, I only need to consider numbers and ',' between '- EUR' and 'Fee' as rule.
You can use
- EUR\W*(.*?)\W*Fee
See the regex demo.
Variations of the regex that might work in different regex engines:
- EUR\W*\K.*?(?=\W*Fee)
(?<=- EUR\W*).*?(?=\W*Fee)
Details:
- EUR - literal text
\W* - zero or more non-word chars
(.*?) - Group 1: any zero or more chars other than line break chars as few as possible
\W* - zero or more non-word chars
Fee - a string.
You could also match the number format in capture group 1
- EUR\b\D*(\d+(?:,\d+)?)\s+Fee\b
- EUR\b Match - EUR and a word boundary
\D* Match 0+ times any char except a digit
( Capture group 1
\d+(?:,\d+)? Match 1+ digits with an optional decimal part
) Close group 1
\s+Fee\b Match 1+ whitespace chars, Fee and a word boundary
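Both capture-group patterns above can be compared on a sample line. This is a hedged Python sketch: the input string is invented (the question does not show the real text), and treating the comma as a decimal separator when converting to float is an assumption.

```python
import re

# Hypothetical input; the real text is not shown in the question.
sample = "Payment - EUR   12,50   Fee applied"

lazy = re.compile(r"- EUR\W*(.*?)\W*Fee")                  # first answer
strict = re.compile(r"- EUR\b\D*(\d+(?:,\d+)?)\s+Fee\b")   # second answer

lazy_value = lazy.search(sample).group(1)
strict_value = strict.search(sample).group(1)
print(repr(lazy_value), repr(strict_value))  # '12,50' '12,50'

# Assumption: the comma is a decimal separator.
amount = float(strict_value.replace(",", "."))
print(amount)  # 12.5
```

In both cases the surrounding whitespace is consumed outside the group, so group 1 holds only the number.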
This is working; I removed the , from (.) in the test string.
Regex example - working

Regex to match variable length, spaces and special chars?

I've got some strings like so
2020-03-05 11:23:25: zone 10 type Interior name 'Study PIR'
2020-03-05 11:57:15: zone 13 type Entry/Exit 1 name 'Front Door'
I've got the below regex that works for the first string; however, I'm not sure how to get the product group to match the full group "Entry/Exit 1". The number can range from 1 to 100.
(?<Date>[0-9]{4}-[0-2][1-9]-[0-2][1-9]) (?<Time>2[0-3]|[01][0-9]:[0-5][0-9]:[0-5][0-9]): (?<msgType>\w+) (?<id>[0-9]+) (?<type>\w+) (?<product>\w+) \w+ (?<deviceName>'([^']*)')
Any ideas how I can modify this to match?
Your product group pattern should be
(?<product>\w+(?:\/\w+\s+\d+)?)
See the regex demo
Details
\w+ - 1+ word chars
(?:\/\w+\s+\d+)? - an optional sequence of
\/ - a / char
\w+ - 1+ word chars
\s+ - 1+ whitespaces
\d+ - 1+ digits.
If the format is unknown, or does not fit the above description, just use (?<product>.*?), see demo.
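A minimal Python sketch of the suggested product group on the two sample lines. Note the assumptions: Python writes named groups as (?P<name>...), and the pattern here is a simplified fragment anchored on the literal words type and name, not the full pattern from the question.

```python
import re

# Simplified fragment (an assumption): only the product group, anchored on
# the literal words "type" and "name" around it.
PRODUCT = re.compile(r"type (?P<product>\w+(?:/\w+\s+\d+)?) name")

lines = [
    "2020-03-05 11:23:25: zone 10 type Interior name 'Study PIR'",
    "2020-03-05 11:57:15: zone 13 type Entry/Exit 1 name 'Front Door'",
]

products = [PRODUCT.search(line).group("product") for line in lines]
print(products)  # ['Interior', 'Entry/Exit 1']
```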

Task for matching floating point numbers

Task:
MATCH:
3.45
5,4
.45
3e4
,54
4
4.
4,
DON'T MATCH:
4,5e
2e
.3.
2e,4
,4.
d34
2.45t
2,45.
Currently I came up with the following:
(?<=\s|^)[-+]?(?:(?:[.,]?\d+[.,]?\d*[eE]\d+(?!\w|[.,]))|[.,]?\d+[.,]?\d*(?!\w|[.,]))\b
That works for almost everything except the last 2 numbers (4. and 4,), and I got stuck.
You may use
(?<!\S)[-+]?[0-9]*(?:[.,]?[0-9]+(?:[eE][-+]?[0-9]+)?|(?<=\d)[,.])(?!\S)
See the regex demo
Details
(?<!\S) - start of string or a whitespace must appear immediately to the left
[-+]? - an optional + or -
[0-9]* - 0+ digits
(?:[.,]?[0-9]+(?:[eE][-+]?[0-9]+)?|(?<=\d)[,.]) - either
[.,]?[0-9]+(?:[eE][-+]?[0-9]+)? - an optional . or ,, then 1+ digits, then an optional sequence of e or E, followed with an optional + or - and 1+ digits
| - or
(?<=\d)[,.] - a dot or comma only if preceded with a digit (to avoid matching standalone . or ,)
(?!\S) - end of string or a whitespace must appear immediately to the right.
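The pattern can be checked against both lists from the task. A minimal Python sketch, where NUMBER and the two list names are made up for illustration and each string is tested on its own with fullmatch:

```python
import re

# Pattern copied verbatim from the answer above.
NUMBER = re.compile(
    r"(?<!\S)[-+]?[0-9]*(?:[.,]?[0-9]+(?:[eE][-+]?[0-9]+)?|(?<=\d)[,.])(?!\S)"
)

should_match = ["3.45", "5,4", ".45", "3e4", ",54", "4", "4.", "4,"]
should_not_match = ["4,5e", "2e", ".3.", "2e,4", ",4.", "d34", "2.45t", "2,45."]

for s in should_match + should_not_match:
    print(s, bool(NUMBER.fullmatch(s)))
```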
You could use an alternation to match 1+ digits followed by a dot or comma and 0+ digits or match the Ee part followed by 1+ digits.
Or match starting with a dot or comma followed by 1+ digits.
If this is the only thing to match on the line, you could use anchors ^ and $ or use lookarounds to assert that there are no non whitespace chars on the left and right.
(?<!\S)(?:\d+(?:[.,]\d*|[eE]\d+)?|[.,]\d+)(?!\S)
Pattern parts
(?<!\S) Assert what is directly to the left is not a non-whitespace char
(?: Non capturing group
\d+ Match 1+ digits
(?: Non capturing group
[.,]\d* Match either . or , and 0+ digits
| Or
[eE]\d+ Match e or E and 1+ digits
)? Close group and make it optional
| Or
[.,]\d+ Match . or , and 1+ digits
) Close group
(?!\S) Assert what is directly to the right is not a non-whitespace char
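Because of the whitespace lookarounds, this pattern can also pull the valid numbers out of a longer line. A small Python sketch, assuming all the task's samples are put on one space-separated line (the names NUMBER, text and found are illustrative):

```python
import re

# Pattern copied verbatim from the answer above (it does not handle signs).
NUMBER = re.compile(r"(?<!\S)(?:\d+(?:[.,]\d*|[eE]\d+)?|[.,]\d+)(?!\S)")

# All MATCH and DON'T MATCH samples from the task, separated by spaces.
text = "3.45 5,4 .45 3e4 ,54 4 4. 4, 4,5e 2e .3. 2e,4 ,4. d34 2.45t 2,45."

found = NUMBER.findall(text)
print(found)  # ['3.45', '5,4', '.45', '3e4', ',54', '4', '4.', '4,']
```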

Greedy regex quantifier not matching password criteria

/(^[a-zA-Z]+-?[a-zA-Z0-9]+){5,15}$/g
regex criteria
match length must be between 6 and 16 characters inclusive
must start with a letter only
must contain letters, numbers and one optional hyphen
must not end with a hyphen
The above regular expression doesn't satisfy all 4 conditions. I tried moving the ^ before the group and omitting the + quantifiers, but that doesn't work.
You are setting the limiting quantifier on a group that already has quantified subpatterns, thus, the length restriction won't work.
To set the length restriction, add the (?=.{6,16}$) lookahead after ^ and then feel free to set your consuming pattern.
You may use
/^(?=.{6,16}$)[a-zA-Z][a-zA-Z0-9]*(?:-[a-zA-Z0-9]+)?$/
See the regex demo. Note you should not use g modifier when validating the whole input string against a regex.
Details
^ - start of string
(?=.{6,16}$) - 6 to 16 chars in the string input allowed/required
[a-zA-Z] - a letter as the first char
[a-zA-Z0-9]* - 0+ alphanumeric chars
(?:-[a-zA-Z0-9]+)? - an optional sequence of - and then 1+ alphanumeric chars
$ - end of string.
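The difference between the two patterns can be seen in a short Python sketch. The names BROKEN and FIXED and the sample passwords are assumptions for illustration. In the original pattern the ^ sits inside a group that must repeat at least 5 times, so it can never match; the lookahead version separates the length check from the consuming pattern.

```python
import re

# Original attempt: the {5,15} quantifier repeats the whole group, ^ included.
BROKEN = re.compile(r"(^[a-zA-Z]+-?[a-zA-Z0-9]+){5,15}$")
# Suggested fix: length is checked once by the lookahead.
FIXED = re.compile(r"^(?=.{6,16}$)[a-zA-Z][a-zA-Z0-9]*(?:-[a-zA-Z0-9]+)?$")

# Hypothetical sample inputs.
valid = ["abcdef", "abc-123", "a" * 16]
invalid = ["abcde", "1abcdef", "abcdef-", "ab-cd-ef", "a" * 17]

print(BROKEN.search("abc-123"))  # None
for pw in valid + invalid:
    print(pw, FIXED.search(pw) is not None)
```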
All you need
^(?i)(?=.{6,16}$)(?!.*-.*-)[a-z][a-z\d-]*\d[a-z\d-]*(?<!-)$
Readable
^
(?i)
(?= .{6,16} $ ) # 6 - 16 chars
(?! .* - .* - ) # Not 2 dashes
[a-z] # Start letter
[a-z\d-]* # Optional letters, digits, dashes
\d # Must be digit
[a-z\d-]* # Optional letters, digits, dashes
(?<! - ) # Not end in dash
$
Well, at least my regex forces a digit to be present.

Regular expression number starting with zero,one or more whitespaces followed by plus or minus

I have the following regex:
^(\s)*[+-]?\d+$
It fails if the input contains multiple whitespaces before the first non-whitespace character.
Currently it works on the following examples:
- :false
-1 :true
+1 :true
What I want is the same logic if there are 0, 1 or more whitespaces at the beginning:
: true (empty input string)
: true (one or more spaces)
-: false
-1: true
+1: true
235: true
Here I'm matching numbers, but in a more general scheme I would like the same behaviour if there are decimals, some special words, etc.
So, basically, I want my regex to match if there is any number of whitespaces at the beginning or an empty string, followed by something I want to match (number, email, special words...)
You need to make the whole pattern optional with an optional grouping construct and put the \s* before the group:
^\s*(?:[+-]?\d+)?$
 ^^^           ^^
See the regex demo
Details:
^ - start of a string
\s* - 0+ whitespaces
(?: - start of a non-capturing group (if the engine does not support non-capturing groups, remove ?:) matching....
[+-]? - an optional (1 or 0 occurrences) + or - symbols
\d+ - 1+ digits
)? - .... 1 or 0 times
$ - end of string.
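Since the question asks for the same behaviour with other inner patterns, the construct can be wrapped in a small builder. This is a hedged Python sketch: blank_or is a hypothetical helper name, and it assumes the inner pattern is a valid regex fragment on its own.

```python
import re

def blank_or(inner: str) -> re.Pattern:
    """Hypothetical helper: leading whitespace, then `inner` or nothing at all."""
    return re.compile(rf"^\s*(?:{inner})?$")

SIGNED_INT = blank_or(r"[+-]?\d+")

# Expected results taken from the question, plus one case with leading spaces.
cases = {"": True, "   ": True, "-": False, "-1": True,
         "+1": True, "235": True, "   -1": True}

for text, expected in cases.items():
    print(repr(text), SIGNED_INT.search(text) is not None, expected)
```

Swapping in another fragment, for example a decimal number pattern, keeps the "empty or whitespace-only input is fine" behaviour unchanged.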
I think you want the asterisk in with the \s:
^\s*[-+]?\d+$