perl regex - catching four numbers separated by space - regex

I know that we can use regex in perl to catch numbers using [\d], but my pattern is like this:
261 193 546 302
or it could be like this:
16 0 98 120
The point is - I just want to catch a line that has any four numbers separated by a space. Each number can be made up of any number of digits, it could be a single-digit number, or a double-digit number, and so on.

^\d+(?:\s+\d+){3}$
Try this. It should do it for you.

You don't explicitly have to wrap the token inside of a character class. And for this you want to assert the start of the string and end of the string positions, so I would use anchors and quantify a non-capturing group "3" times.
^\d+(?: \d+){3}$
Explanation:
^ # the beginning of the string
\d+ # digits (0-9) (1 or more times)
(?: # group, but do not capture (3 times):
# ' '
\d+ # digits (0-9) (1 or more times)
){3} # end of grouping
$ # before an optional \n, and the end of the string
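For readers who want to try the pattern outside Perl, here is a minimal Python sketch of the same idea. The function name `is_four_numbers` and the sample strings are made up for illustration. It uses `fullmatch` instead of `^...$`, and `re.ASCII` so that `\d` means 0-9 only, as in the explanation above.

```python
import re

# Four runs of digits separated by exactly one space each.
# re.ASCII restricts \d to 0-9 (Python's \d otherwise accepts other Unicode digits).
FOUR_NUMBERS = re.compile(r"\d+(?: \d+){3}", re.ASCII)

def is_four_numbers(line: str) -> bool:
    # fullmatch anchors at both ends of the string, so ^ and $ are not needed
    # and a trailing newline is not silently accepted.
    return FOUR_NUMBERS.fullmatch(line) is not None

if __name__ == "__main__":
    for sample in ["261 193 546 302", "16 0 98 120", "1 2 3", "1 2 3 4 5", "1  2 3 4"]:
        print(repr(sample), is_four_numbers(sample))
```

If the lines are read from a file, strip the line ending first (for example with `line.rstrip("\n")`), since this sketch rejects a trailing newline.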

Based on your requirement to "catch a line that has any four numbers separated by a space", I would use the following. It contains a capture group that will hold your number sequence, and it ignores any leading or trailing spaces.
((?:\d+\s){3}\d+)
Usage in Perl
my $re = qr/((?:\d+\s){3}\d+)/;
As you can see, it matches a run of four numbers, each separated by a single whitespace character (\s), and ignores any preceding and trailing characters.
Alternate
If you were being explicit and actually want to capture the whole line, including any other characters, this will be better suited.
(^.*(?:\d+\s){3}\d+.*$)
Usage in Perl
my $re = qr/(^.*(?:\d+\s){3}\d+.*$)/mx;
Note this will match numbers with decimal places due to the way it is structured.
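To see how the unanchored capturing pattern behaves, here is a small Python sketch (the helper name `first_four_numbers` is hypothetical). Because nothing anchors the pattern, it also finds four numbers inside a longer run.

```python
import re

# Same capturing pattern as above: three "digits + one whitespace" pairs, then digits.
CAPTURE = re.compile(r"((?:\d+\s){3}\d+)")

def first_four_numbers(text):
    """Return the first run of four whitespace-separated numbers, or None."""
    m = CAPTURE.search(text)
    return m.group(1) if m else None

if __name__ == "__main__":
    print(first_four_numbers("  261 193 546 302  "))
    print(first_four_numbers("1 2 3 4 5"))  # five numbers: the first four still match
    print(first_four_numbers("1 2 3"))
```

If a line with five numbers must be rejected, the anchored patterns from the other answers are the safer choice.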

Try ^\d+\s\d+\s\d+\s\d+$. That will match 4 numbers with spaces and nothing else.

RegEx - Finding multiple dashes in entire string... and a little more

So I am HORRIBLE with RegEx... can you help me with the following?
only allows 52 total characters
only has "a-z", "A-Z", "0-9", & "-" (letters, numbers, and dashes)
does not start with "-" (dash)
does not end with "-" (dash)
does not have "--" (two consecutive dashes together)
does not have more than 2 "-" (dashes) in the entire string (this is what I'm having problems with)
So here is a helpful list (I guess):
Pass:
abc-123
abc-123-abc
Fail:
-abc-123 (fails due to starting with a dash)
abc-123- (fails due to ending with a dash)
abc-123-abc-123 (fails due to 3 dashes)
abc-12#-abc (failed due to having a character that is not a-z, A-Z, 0-9, or a dash)
This is what I currently have, but feel free to change it however you would like:
(?!.*--)^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,50}[a-zA-Z0-9])?$
I'm sure there's a better way to do this, but as mentioned above, I'm horrible with expressions. My expression works, it just doesn't find more than two dashes.
Thanks for your help.
You could assert a maximum of 52 characters in a positive lookahead.
Then match [a-zA-Z0-9]+ (1 or more times) and repeat 0, 1 or 2 times the same pattern preceded by a -
^(?=[a-zA-Z0-9-]{1,52}$)[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+){0,2}$
Explanation
^ Start of line
(?= Positive lookahead, assert what is on the right is
[a-zA-Z0-9-]{1,52}$ Match any of the listed 1-52 times and assert end of string
) Close lookahead
[a-zA-Z0-9]+ Match 1+ times any of the listed to prevent matching an empty string
(?: Non capture group
-[a-zA-Z0-9]+ Match - and 1+ times any of the listed without the -
){0,2} Close group and repeat 0-2 times
$ End of line
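The pattern can be checked against the pass/fail list from the question with a short Python sketch. The function name `is_valid` is an assumption for illustration; the pattern is the one from this answer, unchanged.

```python
import re

# Lookahead caps the total length at 52; the body allows at most two
# dash-separated continuations, so three or more dashes cannot match.
PATTERN = re.compile(r"^(?=[a-zA-Z0-9-]{1,52}$)[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+){0,2}$")

def is_valid(value: str) -> bool:
    return PATTERN.match(value) is not None

if __name__ == "__main__":
    for value in ["abc-123", "abc-123-abc", "-abc-123", "abc-123-",
                  "abc-123-abc-123", "abc-12#-abc"]:
        print(value, is_valid(value))
```

One caveat when porting: in Python, `$` also matches just before a trailing newline, so strip line endings before validating or use `\Z` in place of `$`.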
You can use the following PCRE-flavoured regex:
/^(?!.*\-\-)(?!.*\-.+\-.*\-)(?!-)[a-z0-9-]{0,52}(?<!-)$/gmi
The regex can be made self-documenting by writing it in free-spacing mode:
/
^ # match beginning of line
(?!.*\-\-.*$) # the line may not contain two consecutive hyphens
(?!.*\-.*\-.*\-.*$) # the line may not contain more than two hyphens
(?!-) # the first char cannot be a hyphen
[a-z0-9-]{0,52} # match 0-52 letters, digits and hyphens
(?<!-) # the last char cannot be a hyphen
$ # match end of the line
/xgmi # free-spacing, global, multiline, case-insensitive modes
(?!.*\-\-.*$), (?!.*\-.*\-.*\-.*$) and (?!-) are negative lookaheads; (?<!-) is a negative lookbehind.
This matches each line of a string (convenient for showing test cases at the demo). If the string contains a single line the regex can be simplified somewhat:
\A(?!.*\-\-)(?!.*\-.+\-.*\-)(?!-)[a-z0-9-]{0,52}(?<!-)\z
Note that \A and \z are beginning and end of string anchors, whereas ^ and $ are beginning and end of line anchors. Compare the negative lookaheads in this regex with those in the earlier one.
Should it matter, this matches empty strings.
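The free-spacing form carries over to Python through `re.VERBOSE`. The following is only a sketch of the single-string variant: it relies on `fullmatch` instead of `\A` and `\z` (older Python versions do not support `\z`), and the name `is_valid` is made up for illustration.

```python
import re

PATTERN = re.compile(
    r"""
    (?!.*--)           # the string may not contain two consecutive hyphens
    (?!.*-.*-.*-)      # the string may not contain more than two hyphens
    (?!-)              # the first char cannot be a hyphen
    [a-z0-9-]{0,52}    # 0-52 letters, digits and hyphens
    (?<!-)             # the last char cannot be a hyphen
    """,
    re.VERBOSE | re.IGNORECASE,
)

def is_valid(value: str) -> bool:
    # fullmatch plays the role of the \A ... \z anchors.
    return PATTERN.fullmatch(value) is not None

if __name__ == "__main__":
    for value in ["abc-123", "ABC-123-abc", "abc-123-abc-123", "abc-123-", ""]:
        print(repr(value), is_valid(value))
```

As noted above, the empty string is accepted. Changing `{0,52}` to `{1,52}` would reject it.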

Regex to match list of numbers

I'm trying to write a regex to match a very long list of numbers separated by commas and an optional space. It can't match a single integer. The list of numbers is approx 7000 bytes long bounded by text on either side.
12345 => don't match
12345,23456,34567,45678 => match
12345, 23456, 34567, 45678 => match
My current regex,
(?<!\.)(([0-9]+,)+[0-9]+)(?!\.)
causes a stack overflow. A few I have tried so far are:
([0-9,]+) => doesn't match with optional spaces
((\d+,[ ]?)+\d+) => worse than the original
[ ]([0-9, ]+)[ ] => can't be certain the numbers will be bounded by spaces
I'm using https://regex101.com/ to test the number of steps each regex takes, the original is approx 3000 steps.
Example (elided) string:
Processing 145363,145386,145395,145422,145463,145486 from batch 59
Any help would be appreciated.
You can use this regex:
^\d+(?:[ \t]*,[ \t]*\d+)+$
\d+ matches 1 or more digits
(?:...)+ matches one or more further numbers, each preceded by a comma that may be surrounded by spaces or tabs.
(\d+,\s*)+\d+
\d+,\s* matches each number followed by a comma and optional whitespace. However, we need to look out for the last number, which has no "," after it and so is not covered by the group above. So end the pattern with \d+ for the last number.
How about
(?:\d+,\s*)+\d+
Breakdown:
(?: # begin group
\d+ # digits
,\s* # ",", optional whitespace
)+ # end group, repeat
\d+ # digits (last item in the list)
Note that \s includes whitespace characters besides space and tab, most notably line breaks (\n). Use [ \t] in place of \s to prevent false positives, if your input requires it.
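As a rough illustration of pulling the list out of surrounding text, here is a Python sketch using the [ \t] variant suggested above. The helper name `find_number_list` is hypothetical, and the sample line is the elided example from the question.

```python
import re

# One or more "digits, comma, optional spaces/tabs" groups, then the last number.
NUMBER_LIST = re.compile(r"(?:\d+,[ \t]*)+\d+")

def find_number_list(text):
    """Return the first comma-separated list of integers in text, or None."""
    m = NUMBER_LIST.search(text)
    if m is None:
        return None
    return [int(n) for n in re.findall(r"\d+", m.group(0))]

if __name__ == "__main__":
    line = "Processing 145363,145386,145395,145422,145463,145486 from batch 59"
    print(find_number_list(line))
    print(find_number_list("12345"))  # a single integer is not a list
```

The trailing 59 is not picked up, because a lone number without a comma never satisfies the group.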

Regex to match total number of interrupted characters

I need to be able to detect when a string has 8 or more numbers, even when they are separated by periods - but that total 8 or more cannot include the periods. In other words, I can ignore periods when looking for a numeral string that includes 8 or more numbers.
I've tried countless combinations of capturing groups and non-capturing groups, regular sets and negated sets, and I just can't figure it out. But I've simplified my examples, below, to show the issue.
For example, the following regex will match, even though there are only 6 numbers total, but there are 8 total characters (obviously):
Expression: [0-9\.]{8,}
Text: 12.34.56
Is there a regex expression that would allow me to ignore those periods?
This will validate that the string contains 8 or more digits: ^\D*(?:\d\D*){8,}$
Basically, the single digit \d is surrounded by optional non-digits \D,
and matched in a group, 8 or more times.
Formatted:
^ # Beginning of string
\D* # 0 to many non-digits
(?: # Cluster group
\d # single digit
\D* # 0 to many non-digits
){8,} # 8 or more times
$ # End of string
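A quick way to convince yourself that the periods are not counted is to run the pattern over a few strings. This Python sketch uses a made-up helper name, `has_8_digits`. Note that the pattern counts digits across the whole string and treats any non-digit, not just a period, as a separator.

```python
import re

# Each \d is followed by any number of non-digits, so only digits are counted.
AT_LEAST_8_DIGITS = re.compile(r"^\D*(?:\d\D*){8,}$")

def has_8_digits(text: str) -> bool:
    return AT_LEAST_8_DIGITS.match(text) is not None

if __name__ == "__main__":
    for text in ["12.34.56", "12.34.56.78", "1234.5678", "1.2.3.4.5.6.7"]:
        print(text, has_8_digits(text))
```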

Regular Expression, with number spaces dashes limited to 8-13 numbers

I am trying to write a regular expression to validate a number of between 9 and 13 digits, but the sequence can have dashes and spaces, and ideally it should not have more than one space or dash consecutively.
This rule lets me control the length, between 9 and 13:
/^[\d]{9,13}$/
now to add dashes and spaces
/^[\d -]{9,13}$/
I think I need something like that, but I need to count the numbers
/^[ -](?:\d){9,13}$/
Any tips?
Notice how my regex starts and ends with a digit. Also, this prevents consecutive spaces and dashes.
/^\d([ \-]?\d){7,12}$/
It appears that you don't want leading or trailing spaces and dashes. This should do it.
/^\d([- ]*\d){8,12}$/
Regular expression:
\d digits (0-9)
( group and capture to \1 (between 8 and 12 times)
[- ]* any character of: '-', ' ' (0 or more times)
\d digits (0-9)
){8,12} end of \1
Another option: a digit followed by any number of spaces or dashes, 8-12 times, followed by a digit.
/^(\d[- ]*){8,12}\d$/
Use look aheads to assert the various constraints:
/^(?!.*( {2}|--))(?=(\D*\d){9,13}\D*$)[\d -]+$/
Assuming a dash following a space or vice versa is ok:
^( -?|- ?)?(\d( -?|- ?)?){9,13}$
Explanation:
( -?|- ?) - this is equivalent to ( | -|-|- ). Note that there can't be 2 consecutive dashes or spaces here, and this can only appear at the start or directly after a digit, so this prevents 2 consecutive dashes or spaces in the string.
And there clearly must be exactly one digit in (\d( -?|- ?)?), thus the {9,13} enforces 9-13 digits.
Assuming a dash following a space or vice versa is NOT ok:
^[ -]?(\d[ -]?){9,13}$
Explanation similar to the above.
Both of the above allow the string to start or end with a digit, dash or space.
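The difference between the two patterns is easiest to see side by side. In this Python sketch the names `MIXED_OK`, `SINGLE_ONLY` and `check` are invented for illustration; the patterns themselves are the two from this answer.

```python
import re

# Separator may be " ", " -", "-" or "- " (dash next to a space is allowed).
MIXED_OK = re.compile(r"^( -?|- ?)?(\d( -?|- ?)?){9,13}$")
# Separator is at most one space or one dash.
SINGLE_ONLY = re.compile(r"^[ -]?(\d[ -]?){9,13}$")

def check(number: str):
    """Return (matches MIXED_OK, matches SINGLE_ONLY)."""
    return (MIXED_OK.match(number) is not None,
            SINGLE_ONLY.match(number) is not None)

if __name__ == "__main__":
    for number in ["123456789", "123-456 789", "123 -456789",
                   "123--456789", "12345678", "-123456789-"]:
        print(repr(number), check(number))
```

Only "123 -456789" separates the two: the space followed by a dash is accepted by the first pattern and rejected by the second.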

Regex to find repeating numbers

Can anyone help me or direct me to build a regex to validate repeating numbers
eg : 11111111, 2222, 99999999999, etc
It should validate for any length.
\b(\d)\1+\b
Explanation:
\b # match word boundary
(\d) # match digit remember it
\1+ # match one or more instances of the previously matched digit
\b # match word boundary
If 1 should also be a valid match (zero repetitions), use a * instead of the +.
If you also want to allow longer repeats (123123123) use
\b(\d+)\1+\b
If the regex should be applied to the entire string (as opposed to finding "repeat-numbers" in a longer string), use start- and end-of-line anchors instead of \b:
^(\d)\1+$
Edit: How to match the exact opposite, i. e. a number where not all digits are the same (except if the entire number is simply a digit):
^(\d)(?!\1+$)\d*$
^ # Start of string
(\d) # Match a digit
(?! # Assert that the following doesn't match:
\1+ # one or more repetitions of the previously matched digit
$ # until the end of the string
) # End of lookahead assertion
\d* # Match zero or more digits
$ # until the end of the string
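The three variants above (whole-string match, its opposite, and the \b form for searching in longer text) can be tried out with a short Python sketch. The function names are hypothetical, chosen only for this example.

```python
import re

ALL_SAME = re.compile(r"^(\d)\1+$")              # e.g. 2222
NOT_ALL_SAME = re.compile(r"^(\d)(?!\1+$)\d*$")  # e.g. 1234, or a single digit
IN_TEXT = re.compile(r"\b(\d)\1+\b")             # repeat-numbers inside longer text

def is_all_same(number: str) -> bool:
    return ALL_SAME.match(number) is not None

def is_not_all_same(number: str) -> bool:
    return NOT_ALL_SAME.match(number) is not None

def find_repeats(text: str):
    # finditer + group(0) returns the whole match; findall would return
    # only the captured first digit.
    return [m.group(0) for m in IN_TEXT.finditer(text)]

if __name__ == "__main__":
    print(is_all_same("11111111"), is_all_same("1234"))
    print(is_not_all_same("1112"), is_not_all_same("1111"))
    print(find_repeats("call 5555 or 1234 or 77"))
```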
To match a number of repetitions of a single digit, you can write ([0-9])\1*.
This matches [0-9] into a group, then matches 0 or more repetitions (\1) of that group.
You can write \1+ to match one or more repetitions.
Use a backreference:
(\d)\1+
Probably you want to use some sort of anchors ^(\d)\1+$ or \b(\d)\1+\b
I used this expression to give me all phone numbers that are all the same digit.
Basically, it matches a digit followed by 9 repetitions of that same digit, which results in 10 of the same digit in a row.
([0-9])\1{9}
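A minimal Python sketch of that phone-number filter might look as follows. The helper name `is_ten_identical_digits` and the sample numbers are made up, and `fullmatch` is used on the assumption that each input is exactly one 10-digit number with no separators.

```python
import re

# One digit, then exactly nine more copies of it: ten identical digits in total.
TEN_SAME = re.compile(r"([0-9])\1{9}")

def is_ten_identical_digits(phone: str) -> bool:
    return TEN_SAME.fullmatch(phone) is not None

if __name__ == "__main__":
    numbers = ["5555555555", "5555555556", "555555555"]
    print([n for n in numbers if is_ten_identical_digits(n)])
```

With `search` instead of `fullmatch`, the same pattern would also fire on longer strings that merely contain ten identical digits in a row.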
(\d)\1+? matches any digit that is repeated; because +? is lazy, an unanchored match stops after the first repetition.
You can find repeated text or numbers easily with a backreference.
The idea is simply this: whatever the pattern inside the parentheses, ([inside pattern]), matches, \1 then looks for that same text again immediately after it.