Regex match for complete string only (max. characters)

I'm looking for a regex to match the following 'good' values.
100.100.100.10
100.100.100.1
100.100.100
100.100.10
100.100.1
100.100
Following conditions need to be valid:
Minimum of 7 characters (xxx.xxx)
Maximum of 14 characters (xxx.xxx.xxx.xx)
Groups can contain max 3 characters (xxx)
Groups need to be separated by a '.'
In case of 3 groups, the third group can contain 1 to 3 characters (x or xx or xxx)
In case of 4 groups, the fourth group can contain 1 to 2 characters (x or xx)
All previous groups need to contain 3 characters (xxx)
To test the validity of a string value, I created the following regex.
([0-9]{3}(\.[0-9]{3}){2}(\.[0-9]{1,2}))|
([0-9]{3}(\.[0-9]{3})(\.[0-9]{1,3})?)
I had to use the OR operator, but can't find a way to exclude values containing more than 14 characters. I've tested the 'bad' examples below (via http://regexr.com/) and do get a match on PART of the string. However, my rule has to reject these strings, since there's 'noise' at the end of each 'word' (where a word is a string without spaces).
100.100.100.100.100
The last .100 needs to make the full string invalid, no partial match is accepted. Adding \b or ^$ in combination with the OR does not provide me the required result.
100.100.100.100100
100.100.100100100
100.100.100.100
The above need to be invalid as well.

You may use optional groups:
^[0-9]{3}\.[0-9]{3}(?:\.[0-9]{1,3}(?:\.[0-9]{1,2})?)?$
See the regex demo
Details:
^ - start of string
[0-9]{3} - 3 digits (your group 1)
\.[0-9]{3} - a dot and 3 digits (group 2)
(?:\.[0-9]{1,3}(?:\.[0-9]{1,2})?)? - an optional group matching
\.[0-9]{1,3} - a dot and 1 to 3 digits (Group 3)
(?:\.[0-9]{1,2})? - an optional group (group 4):
\. - a dot
[0-9]{1,2} - any 1 to 2 digits
$ - end of string
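To check the optional-group pattern against the values from the question, here is a minimal Python sketch. The helper name is_valid is made up for illustration; the pattern itself is copied from the answer. Note that, as written, the pattern also accepts 100.100.1.10 (a short third group followed by a fourth group), which the question's "all previous groups need to contain 3 characters" rule would reject.

```python
import re

# Pattern from the answer: two full groups, an optional 1-3 digit third
# group, and an optional 1-2 digit fourth group.
PATTERN = re.compile(r"^[0-9]{3}\.[0-9]{3}(?:\.[0-9]{1,3}(?:\.[0-9]{1,2})?)?$")

GOOD = ["100.100.100.10", "100.100.100.1", "100.100.100",
        "100.100.10", "100.100.1", "100.100"]
BAD = ["100.100.100.100.100", "100.100.100.100100",
       "100.100.100100100", "100.100.100.100"]


def is_valid(value: str) -> bool:
    """Return True only if the whole string matches (hypothetical helper)."""
    return PATTERN.fullmatch(value) is not None


for value in GOOD + BAD:
    print(f"{value:22} {is_valid(value)}")
```

fullmatch is used instead of match so that a trailing newline cannot sneak past the $ anchor.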

Logically, there are 1-3 complete groups, followed by a partial group:
^(?=.{7,14}$)(\d{3}\.){1,3}\d{1,3}$
The lookahead enforces the length; the $ inside it is what makes the 14-character upper bound apply to the whole string.
See live demo.
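A quick way to sanity-check the lookahead version in Python. This sketch assumes the lookahead is written as (?=.{7,14}$). Without the $ inside it, the lookahead only requires that at least 7 characters follow, so the 15-character 100.100.100.100 would slip through. The names ANCHORED, UNANCHORED and accepts are made up for illustration.

```python
import re

# Lookahead anchored with $: the whole string must be 7-14 characters long.
ANCHORED = re.compile(r"^(?=.{7,14}$)(\d{3}\.){1,3}\d{1,3}$")
# Same pattern with an unanchored lookahead, kept only for comparison.
UNANCHORED = re.compile(r"^(?=.{7,14})(\d{3}\.){1,3}\d{1,3}$")


def accepts(pattern: re.Pattern, value: str) -> bool:
    """Hypothetical helper: True if the pattern matches the whole value."""
    return pattern.fullmatch(value) is not None


for value in ["100.100", "100.100.100.10", "100.100.100.100", "100.1"]:
    print(f"{value:18} anchored={accepts(ANCHORED, value)} "
          f"unanchored={accepts(UNANCHORED, value)}")
```

Only 100.100.100.100 differs between the two: the anchored lookahead rejects it, the unanchored one accepts it.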

Related

regex in python/ansible

I am a newbie to regex. I have an example string: account-device-v2-2-3-63-21900
and I am using this regular expression: [1-9]-[0-9]-[0-9]*
I am getting 1-2-3 as output,
but my intention is to match/extract the pattern 2-3-63.
That is, I want the hyphen-separated digits after v2 (or v1, etc.); I don't need the last numeric part (21000 or any other number).
Any suggestions, please?

Do you want to get one or more non-zero digits, a dash, one or more digits, a dash, and one or more digits from account-device-v2-2-3-63-21900 or account-device-v1-2-3-63-21900?
Use v[12]-([1-9]+?-[0-9]+?-[0-9]+?)- and get first group.
Demo: https://regex101.com/r/hMLGsK/1

The pattern [1-9]-[0-9]-[0-9]* matches 2-2-3 because your pattern does not match the v and a digit part and this is the first part it can match.
Note that [0-9]* Matches optional digits, so 2-2- could also be a match.
Using a capture group to get the value:
\bv[1-9][0-9]*-([1-9][0-9]*-[0-9]+-[0-9]+)
\bv[1-9][0-9]*- Match v1 or also possibly v20 etc..
( Capture group 1
[1-9][0-9]* Match a digit starting at 1
-[0-9]+-[0-9]+ 2 parts matching - and 1 or more digits starting from 0
) Close group 1
Regex demo
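Since the question mentions Python, here is a small sketch of the capture-group approach using the standard re module. The function name extract_version is hypothetical; the pattern is the one from the answer above.

```python
import re

# Pattern from the answer: "v" plus a version number, then capture the
# next three hyphen-separated numbers in group 1.
VERSION_RE = re.compile(r"\bv[1-9][0-9]*-([1-9][0-9]*-[0-9]+-[0-9]+)")


def extract_version(name: str):
    """Return the captured 'a-b-c' part, or None if nothing matches."""
    match = VERSION_RE.search(name)
    return match.group(1) if match else None


print(extract_version("account-device-v2-2-3-63-21900"))  # 2-3-63

# For comparison, the pattern from the question starts matching at the
# digit of "v2", which is why it returns 2-2-3 for this input.
print(re.search(r"[1-9]-[0-9]-[0-9]*", "account-device-v2-2-3-63-21900").group(0))
```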

Regex to block more than 3 numbers in a string

I am trying to block any strings that contain more than 3 numbers and prevent special characters. I have the special characters part down. I'm just missing the number part.
For example:
"Hello 1234" - Not Allowed
"Hello 123" - Allowed
I've tried the following:
/^[!?., A-Za-z0-9]+$/
/((^[!?., A-Za-z]\d)([0-9]{3}+$))/
/^((\d){2}[a-zA-Z0-9,.!? ])*$/
The last one is the closest I got as it prevents any special characters and any numbers from being entered at all.
I've looked through previous posts, but am coming up short.
Edit for clarification
Essentially I'm trying to find a way to prevent customers from entering PII on a form. No submission should be allowed that contains more than 3 numbers in a string.
Hello1234 - Not allowed
12345 - Not allowed
1111 - not allowed
Nowhere in the comment the user enters should there be more than 3 digits in total.

About the patterns that you tried
^[!?., A-Za-z0-9]+$ The pattern matches 1+ times any of the listed, including 1 or more digits
((^[!?., A-Za-z]\d)([0-9]{3}+$)) If {3}+ is supported, the pattern matches a single char from the character class, 1 digit followed by 3 digits
^((\d){2}[a-zA-Z0-9,.!? ])*$ The pattern repeats 0+ times matching 2 digits and 1 of the listed in the character class
You can use a negative lookahead if that is supported to assert not 4 digits in a row.
^(?!.*\d{4})[a-zA-Z0-9,.!? ]+$
regex demo
If there can not be 4 digits in total, but 0-3 occurrences:
^[a-zA-Z,.!? ]*(?:\d[a-zA-Z,.!? ]*){0,3}$
Explanation
^ Start of string
[a-zA-Z,.!? ]* Match 0+ times any of the listed (without a digit)
(?:\d[a-zA-Z,.!? ]*){0,3} Repeat 0 - 3 times matching a single digit followed by optional listed chars (Again without a digit)
$ End of string
regex demo
If you don't want to match an empty string and a lookahead is supported:
^(?!$)[a-zA-Z,.!? ]*(?:\d[a-zA-Z,.!? ]*){0,3}$
See another regex demo
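The difference between "no 4 digits in a row" and "no more than 3 digits in total" is easy to miss, so here is a minimal Python sketch comparing the two patterns from this answer. The constant and function names are made up for illustration.

```python
import re

# Rejects only 4 consecutive digits.
NO_FOUR_IN_A_ROW = re.compile(r"^(?!.*\d{4})[a-zA-Z0-9,.!? ]+$")
# Allows at most 3 digits in the whole string and rejects the empty string.
AT_MOST_THREE = re.compile(r"^(?!$)[a-zA-Z,.!? ]*(?:\d[a-zA-Z,.!? ]*){0,3}$")


def allowed(pattern: re.Pattern, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


for text in ["Hello 123", "Hello 1234", "1 2 3 4", "a1b2c3", ""]:
    print(repr(text), allowed(NO_FOUR_IN_A_ROW, text), allowed(AT_MOST_THREE, text))
```

"1 2 3 4" is the interesting case: it passes the consecutive-digit check but fails the total-digit check.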

Here is my two cents:
^(?!(.*\d){4})[A-Za-z ,.!?\d]+$
See the online demo
^ - Start string anchor.
(?! - Open a negative lookahead.
( - Open capture group.
.*\d - Match anything other than newline up to a digit.
){4} - Close capture group and match it 4 times.
) - Close negative lookahead.
[A-Za-z ,.!?\d]+ - 1+ Characters from specified class.
$ - End string anchor.
I think it should cover what you described.
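One way to gain confidence in a lookahead like this is to compare it with a plain character count. The sketch below does that in Python; the function names are hypothetical, and re.ASCII is added as an assumption so that \d only means 0-9 (Python's \d otherwise also matches other Unicode digits).

```python
import re
import string

# Regex from the answer; re.ASCII keeps \d limited to 0-9 in Python.
ALLOWED = re.compile(r"^(?!(.*\d){4})[A-Za-z ,.!?\d]+$", re.ASCII)

# The characters that the answer's character class permits.
PERMITTED = set(string.ascii_letters + string.digits + " ,.!?")


def allowed_by_regex(text: str) -> bool:
    return ALLOWED.fullmatch(text) is not None


def allowed_by_counting(text: str) -> bool:
    # Non-empty, only permitted characters, and at most three digits in total.
    digits = sum(ch in string.digits for ch in text)
    return bool(text) and all(ch in PERMITTED for ch in text) and digits <= 3


for text in ["Hello 123", "Hello 1234", "1 2 3 4", "Hi!", "a#1", ""]:
    print(repr(text), allowed_by_regex(text), allowed_by_counting(text))
```

Both functions should agree on every input; a disagreement would point at a mistake in one of them.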

Assuming you mean <= 3 digits, this may be a naive one, but how about
^[ALLOWED_CHARS]*[0-9]?[ALLOWED_CHARS]*[0-9]?[ALLOWED_CHARS]*[0-9]?[ALLOWED_CHARS]*$
Replace [ALLOWED_CHARS] with whatever you define as characters that are neither special characters nor digits.

REGEXP_REPLACE for exact regex pattern, not working

I'm trying to match an exact pattern to do some data cleanup for ISSN's using the code below:
select case when REGEXP_REPLACE('1234-5678 ÿþT(zlsd?k+j''fh{l}x[a]j).,~!##$%^&*()_+{}|:<>?`"\;''/-', '([0-9]{4}[\-]?[Xx0-9]{4})(.*)', '$1') not similar to '[0-9]{4}[\-]?[Xx0-9]{4}' then 'NOT' else 'YES' end
The pattern I want should match any 8-digit group with an optional dash in the middle and a possible X at the end.
The code above works for most cases, but if capture group 1 is the following example: 123456789 then it also returns positive because it matches the first 8 digits, and I don't want it to.
I tried surrounding capture group 1 with ^...$ but that doesn't work either.
So I would like to match exactly these examples and similar ones:
1234-5678
1234-567X
12345678
1234567X
BUT NOT THESE (and similar):
1234567899
1234567899x
What am I missing?

You may use
^([0-9]{4}-?[Xx0-9]{4})([^0-9].*)?$
See the regex demo
Details
^ - start of string
([0-9]{4}-?[Xx0-9]{4}) - Capturing group 1 ($1): four digits, an optional -, and then four x / X or digits
([^0-9].*)? - an optional Capturing group 2: any char other than a digit and then any 0+ chars as many as possible
$ - end of string.
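The question uses SQL, but the pattern can be tried outside the database. Here is a minimal Python sketch that keeps capturing group 1 and drops the trailing noise; the function name clean_issn is hypothetical, and re.DOTALL is an assumption so that noise containing newlines is also consumed by group 2.

```python
import re

# Pattern from the answer: group 1 is the ISSN candidate, group 2 is
# optional trailing noise that must start with a non-digit.
ISSN_RE = re.compile(r"^([0-9]{4}-?[Xx0-9]{4})([^0-9].*)?$", re.DOTALL)


def clean_issn(raw: str):
    """Return the leading ISSN candidate, or None if the input is rejected."""
    match = ISSN_RE.match(raw)
    return match.group(1) if match else None


for raw in ["1234-5678", "1234-567X", "12345678", "1234567X",
            "1234-5678 ~!#$%", "1234567899", "1234567899x"]:
    print(repr(raw), clean_issn(raw))
```

Note that [Xx0-9]{4} accepts an X in any of the last four positions. If the X should only be allowed as the final character, a stricter variant could use [0-9]{3}[0-9Xx] for the second half.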

REGEX Capturing differing sets of repeating groups

this is a two-part question, but I feel the answers will be related.
I have this regex pattern:
(\d+)(aa|bb) which I use to capture this string: 1bb2aa3aa4bb5bb6aa7bb8cc9cc
See demo: example 1
The way it captures the random series of aa and bb (both preceded by a digit) is exactly what I want, and is good as far as it goes.
So we get this match on regex101:
Match 1
Full match 0-3 `1bb`
Group 1. 0-1 `1`
Group 2. 1-3 `bb`
Match 2
Full match 3-6 `2aa`
Group 1. 3-4 `2`
Group 2. 4-6 `aa`
Match 3
Full match 6-9 `3aa`
Group 1. 6-7 `3`
Group 2. 7-9 `aa`
Match 4
Full match 9-12 `4bb`
Group 1. 9-10 `4`
Group 2. 10-12 `bb`
Match 5
Full match 12-15 `5bb`
Group 1. 12-13 `5`
Group 2. 13-15 `bb`
Match 6
Full match 15-18 `6aa`
Group 1. 15-16 `6`
Group 2. 16-18 `aa`
Match 7
Full match 18-21 `7bb`
Group 1. 18-19 `7`
Group 2. 19-21 `bb`
As expected, the 8cc9cc bit at the end is ignored. I would like to capture this as well, in the same way I have captured the first repeating groups, in the same expression. So in the final output, I'd get something like this added to the end. This should work for any number of matches on either side. This text is just one example.
Match 8
Full match 21-24 `8cc`
Group 1. 21-22 `8`
Group 2. 22-24 `cc`
Match 9
Full match 24-27 `9cc`
Group 1. 24-25 `9`
Group 2. 25-27 `cc`
Also, I'd like to do similar but flipping the 'or' group to the end i.e. this:
1cc2cc3cc4cc5cc6cc7ccb8aa9bb
My current regex pattern (\d+)(cc) only matches the repeating 'cc' groups.
See demo: example 2
I would like a similar full capture, with any amount of permissible entries of each group.
Any thoughts?

You may use
(?:\G(?!^)(?(?=\d+(?:aa|bb))(?<!\dcc))|(?=(?:\d+(?:aa|bb))+(?:\d+cc)+))(\d+)(aa|bb|cc)
See the regex demo
The regex only matches in a string that meets the pattern in the (?=(?:\d+(?:aa|bb))+(?:\d+cc)+) lookahead, and then consecutively matches and captures digits followed by aa, bb or cc. Digits + aa or bb are only matched when they are not immediately preceded by digits + cc.
Details
(?:\G(?!^)(?(?=\d+(?:aa|bb))(?<!\dcc))|(?=(?:\d+(?:aa|bb))+(?:\d+cc)+)) - either of the two alternatives:
\G(?!^) - end of the previous successful match
(?(?=\d+(?:aa|bb))(?<!\dcc)) - if-then-else construct: if there is 1+ digits and aa or bb immediately to the right of the current location ((?=\d+(?:aa|bb)), then only continue matching if there is no digit followed with cc immediately to the left of the current location ((?<!\dcc))
| - or
^ - start of string
(?=(?:\d+(?:aa|bb))+(?:\d+cc)+) - a positive lookahead that, immediately to the right of the current location, searches for the following (and returns true if it finds the patterns, or false if it does not):
(?:\d+(?:aa|bb))+ - one or more occurrences of 1+ digits followed with aa or bb
(?:\d+cc)+ - one or more occurrences of 1+ digits followed with cc
(\d+) - Group 1: one or more digits
(aa|bb|cc) - aa, bb or cc.
For the second pattern, replace cc with (?:aa|bb):
(?:\G(?!^)(?(?=\d+cc)(?<!\d(?:aa|bb)))|(?=(?:\d+cc)+(?:\d+(?:aa|bb))+))(\d+)(aa|bb|cc)

I'm no expert with Perl, so I'll give a bit of pseudocode here. Feel free to suggest an edit.
You can start by checking that the whole string is any number of xaa or xbb combos followed by one or more xcc combos, using this pattern: ^(?:\d+(?:aa|bb))+(?:\d+cc)+$
Once you have that, you can use this pattern to capture the appropriate groups: (\d+)(aa|bb|cc)
Demo 1
Demo 2
Something like:
if (ismatch("^(?:\d+(?:aa|bb))+(?:\d+cc)+$", inputString))
{
    match = match("(\d+)(aa|bb|cc)", inputString);
}
From here you can extract the information using the groups.
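The two-step idea above (validate the overall shape first, then collect the pairs) can be sketched in Python, which has no \G or conditional support in its standard re module. The names SHAPE, PAIR and capture_pairs are made up for illustration, and the shape pattern uses \d+ before cc so that multi-digit numbers are accepted.

```python
import re

# Step 1: the whole string must be (digits + aa|bb)+ followed by (digits + cc)+.
SHAPE = re.compile(r"(?:\d+(?:aa|bb))+(?:\d+cc)+")
# Step 2: collect every (digits, suffix) pair.
PAIR = re.compile(r"(\d+)(aa|bb|cc)")


def capture_pairs(text: str):
    """Return a list of (number, suffix) tuples, or [] if the shape is wrong."""
    if SHAPE.fullmatch(text) is None:
        return []
    return PAIR.findall(text)


print(capture_pairs("1bb2aa3aa4bb5bb6aa7bb8cc9cc"))
print(capture_pairs("1cc2aa"))  # wrong order, so nothing is captured
```

For the flipped case in the question (cc groups first, then aa or bb), swapping the two halves of SHAPE should be enough.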

Strange behavior of regex

Background:
I need to identify a pair of numbers separated by a hyphen (-), the numbers can optionally include +/- and can be decimal.
So below are examples of that:
3-4, +3-+4, .3-.4, 0.3-0.4, -0.3--0.4, 0.3--0.4 etc...
I was using below expression:
(-?\+?\d*.?\d*)-(-?\+?\d*.?\d*)
It works well in most cases but fails in below:
-0.3--0.4
The groups it forms are: -0.3- and 0.4
But if I replace it with:
(-?\+?\d*.?\d+)-(-?\+?\d*.?\d+), it works fine.
I am wondering what difference replacing the * with + makes.
We are using this in JavaScript.

The wrong capturing is accounted for by the fact that your patterns inside capturing groups (-?\+?\d*.?\d*) can match an empty string and - more importantly here - . matches any char, not only a dot. You must escape it to match a literal dot. Note how (-?\+?\d*.?\d*)-(-?\+?\d*.?\d*) matches 3-4, (the , is captured with Group 2 pattern .) and note Matches 5 and 6 where . matches a space and a hyphen.
Also, your -?\+? actually allows matching -+ sequence of signs, which does not seem what you need. Just use [-+]? optional character class.
So, you might want to use ([-+]?\d*\.?\d*)-([-+]?\d*\.?\d*) pattern, but I'd advise to make sure at least 1 digit is matched, and you may use ([-+]?\d*\.?\d+)-([-+]?\d*\.?\d+) pattern for it.
Details:
([-+]?\d*\.?\d+) - Group 1: a sequence of
[-+]? - an optional - or +
\d* - 0+ digits
\.? - an optional .
\d+/\d* - 1 or more digits (or 0 or more with *)
- - a hyphen
([-+]?\d*\.?\d+) - see above.
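The question is about JavaScript, but this pattern uses no engine-specific syntax, so it can be tried in Python as well. A minimal sketch follows; the name split_pair is hypothetical. The last lines show the unescaped dot from the question capturing a comma, as described above.

```python
import re

# Suggested pattern: optional sign, optional integer part, optional
# literal dot, and at least one digit, on both sides of the hyphen.
PAIR_RE = re.compile(r"([-+]?\d*\.?\d+)-([-+]?\d*\.?\d+)")


def split_pair(text: str):
    """Return (left, right) if the whole text is a number pair, else None."""
    match = PAIR_RE.fullmatch(text)
    return match.groups() if match else None


for text in ["3-4", "+3-+4", ".3-.4", "0.3-0.4", "-0.3--0.4", "0.3--0.4"]:
    print(text, split_pair(text))

# The pattern from the question: the unescaped . matches any character,
# so the comma after "4" ends up in group 2.
ORIGINAL = re.compile(r"(-?\+?\d*.?\d*)-(-?\+?\d*.?\d*)")
print(ORIGINAL.search("3-4, +3-+4").groups())
```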
