What pattern should I use in a regex if I want to match a first pattern but reject the match when a second pattern follows?
For example, I want to match the string 'id' followed by a digit, as long as that digit is not 6 or 9.
So it should match id1, id2, id3, etc., but not id6 or id9.
I tried this pattern, but it's not working:
"id(\d|(?!6|9))"
You can use negative lookahead like this.
Regex: \bid(?![69])\d\b
Explanation:
\b ensures the word boundary.
(?![69]) negative lookahead makes sure that number is not 6 or 9.
\d matches a single digit after id.
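A minimal sketch of how the lookahead pattern above behaves, written in Python (the language, the helper name find_ids and the sample string are assumptions for illustration, not part of the original answer):

```python
import re

# Pattern from the answer: "id", then exactly one digit that is not 6 or 9.
ID_NOT_6_OR_9 = re.compile(r"\bid(?![69])\d\b")

def find_ids(text):
    """Return every id token in text whose single digit is not 6 or 9."""
    return ID_NOT_6_OR_9.findall(text)

sample = "id1 id2 id6 id9 id3"  # hypothetical input
print(find_ids(sample))
```

Because the pattern ends in \d\b, multi-digit ids such as id16 are not matched at all by this variant.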
It's not the best solution, but you can also do this using a positive lookahead:
\bid(?=\d)(?:\d\d+|[^69])\b
Regex Breakdown
\b #Word boundary
id #Match id literally
(?=\d) #Assert that the next character is a digit (otherwise fail)
(?: #Non-capturing group
\d\d+ #If there are two or more digits, the match succeeds
| #OR (alternation)
[^69] #If there is a single digit, it must not be 6 or 9
) #End of non-capturing group
\b #Word boundary
If you want to check id is not followed by 6 or 9 and you want to accept cases like id16 but not id61, then you can use
\bid(?=\d)[^69]\d*\b
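To see how the two positive-lookahead variants differ, here is a small Python sketch (the constant names and sample ids are hypothetical; only the two patterns come from the answer):

```python
import re

# Rejects only id6 and id9; any id with two or more digits passes.
ONLY_SINGLE = re.compile(r"\bid(?=\d)(?:\d\d+|[^69])\b")
# Rejects every id whose number starts with 6 or 9.
BY_FIRST_DIGIT = re.compile(r"\bid(?=\d)[^69]\d*\b")

samples = ["id1", "id6", "id16", "id61", "id99"]  # hypothetical inputs
for s in samples:
    # Print whether each pattern accepts the whole sample.
    print(s, bool(ONLY_SINGLE.fullmatch(s)), bool(BY_FIRST_DIGIT.fullmatch(s)))
```

The (?=\d) lookahead matters in both patterns: without it, [^69] would also accept a non-digit character such as the x in idx.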
The id(\d|(?!6|9)) pattern matches id followed by any single digit, or id at a position where the next character is not 6 or 9. That alternation (\d or (?!6|9)) allows id6 and id9 because the first alternative "wins" in an NFA regex engine: once \d has matched, the remaining alternatives are not tried.
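The failure of the original attempt can be reproduced with a short Python sketch (the helper matched_text is a hypothetical name used only for this illustration):

```python
import re

# The asker's pattern: the first alternative \d accepts any digit,
# so the lookahead in the second alternative is never consulted for id6/id9.
BROKEN = re.compile(r"id(\d|(?!6|9))")

def matched_text(s):
    """Return the text matched by the broken pattern, or None."""
    m = BROKEN.search(s)
    return m.group(0) if m else None

for s in ["id6", "id9", "idx"]:
    print(s, matched_text(s))
```

For idx the second alternative succeeds with an empty match, so the pattern matches just id, which is probably not intended either.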
If you only need to exclude id6 and id9 themselves (longer numbers such as id61 still match), use
\bid(?![69]\b)\d+\b
If you want to avoid matching any id whose number starts with 6 or 9, use
\bid(?![69])\d+
Here, \d+ matches one or more digits, \b stands for a word boundary (id must not be preceded by a word character and, where the trailing \b is used, the digits must not be followed by one), and the (?![69]) lookahead fails the match if there is a 6 or 9 right after id (with or without a word boundary check, depending on what you need).
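A quick Python comparison of the two negative-lookahead patterns (the constant names and the sample text are assumptions made for this sketch):

```python
import re

# Drops only the exact tokens id6 and id9.
EXCLUDE_EXACT = re.compile(r"\bid(?![69]\b)\d+\b")
# Drops every id whose number starts with 6 or 9.
EXCLUDE_PREFIX = re.compile(r"\bid(?![69])\d+")

text = "id1 id6 id9 id16 id61 id90"  # hypothetical input
print(EXCLUDE_EXACT.findall(text))
print(EXCLUDE_PREFIX.findall(text))
```

The only difference is the \b inside the lookahead: with it, 6 or 9 is rejected only when it is the whole number.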
UPDATE
If you need to match only ids whose number does not start with 6 or 9, you can use
\bid[0-578]\d*
Based on Shafizadeh's comment.
I'm looking for help with a regex.
My input field should allow only groups of up to 7 digits, and an unlimited number of spaces, whether at the beginning, middle, or end.
Here are a few examples of valid matches
Match1:
478 2635478 14587 9652
Match2 (spaces at the end):
14 2 55586
I tried this regex
^( )*[0-9]{1,7}(( )*[0-9]{1-7})*( )*$
It matches when the group is 8 digits.
Converting my comment to answer so that solution is easy to find for future visitors.
You may use this regex:
^ *[0-9]{1,7}(?: +[0-9]{1,7})* *$
RegEx Breakup:
^: Start
" *": Match 0 or more spaces
[0-9]{1,7}: Match 1 to 7 digits
(?: +[0-9]{1,7})*: Match 1+ spaces followed by a match of 1 to 7 digits. Repeat this group 0 or more times
" *": Match 0 or more spaces
$: End
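The breakdown above can be checked with a short Python sketch (is_valid is a hypothetical helper; the first two inputs are the examples from the question):

```python
import re

# Groups of 1 to 7 digits separated by at least one space,
# with optional spaces at the start and end.
GROUPS = re.compile(r"^ *[0-9]{1,7}(?: +[0-9]{1,7})* *$")

def is_valid(value):
    """True if value consists only of space-separated groups of 1-7 digits."""
    return GROUPS.match(value) is not None

for value in ["478 2635478 14587 9652", "14 2 55586   ", "12345678"]:
    print(repr(value), is_valid(value))
```

Requiring " +" (one or more spaces) between groups is what stops an 8-digit run from being read as a 7-digit group followed by a 1-digit group.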
An idea with one group and use of a word boundary to separate blocks:
^ *(?:\d{1,7}\b *)+$
\b will require a space or the end after each \d{1,7} repetition.
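As a rough check that the word-boundary version accepts the same inputs as the explicit-separator version, here is a Python sketch (the sample list is hypothetical, and re.ASCII is added as an assumption so that \d means [0-9] as in the other pattern):

```python
import re

EXPLICIT = re.compile(r"^ *[0-9]{1,7}(?: +[0-9]{1,7})* *$")
# \b after the digits forbids another digit directly after a 7-digit group.
BOUNDARY = re.compile(r"^ *(?:\d{1,7}\b *)+$", re.ASCII)

samples = ["478 2635478 14587 9652", "14 2 55586   ", "12345678",
           "", "   ", "1234567 1"]  # hypothetical inputs

# For each sample: (accepted by EXPLICIT, accepted by BOUNDARY)
results = [(bool(EXPLICIT.match(s)), bool(BOUNDARY.match(s))) for s in samples]
for s, r in zip(samples, results):
    print(repr(s), r)
```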
In Scala, is it possible to actually insert commas via a regex to separate thousands in numbers where the comma definitely is not there to start with?
For example, I'd like to convert 30000.00 into 30,000.00.
I am not sure this is exactly what you need, but you can use this:
val formatter = java.text.NumberFormat.getNumberInstance(java.util.Locale.US)
formatter.setMinimumFractionDigits(2) // keep the two decimal places
println(formatter.format(30000.00)) // prints 30,000.00
This is not a Scala-specific answer.
You can use the regex \d{1,3}(?=(?:\d{3})+\.) to find the matches and replace each match with itself plus a comma, i.e. $0,.
Explanation:
\d{1,3} matches a digit between 1 and 3 times
(?= Positive lookahead starts
(?: This indicates a Non-capturing group
\d{3} matches a digit exactly 3 times
) end of Non-capturing group.
+ matches the previous group one or more times
\. matches the character . literally
) Positive lookahead ends.
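A minimal sketch of the substitution, written in Python rather than Scala (an assumption made only to keep the example short; add_commas is a hypothetical helper, and Python spells the whole-match reference $0 as \g<0>):

```python
import re

# 1-3 digits that are followed by one or more groups of exactly
# three digits and then a literal decimal point.
THOUSANDS = re.compile(r"\d{1,3}(?=(?:\d{3})+\.)")

def add_commas(number_text):
    """Insert thousands separators into a number written with a decimal point."""
    return THOUSANDS.sub(r"\g<0>,", number_text)

print(add_commas("30000.00"))
```

Note that the lookahead requires a decimal point, so a plain integer such as 30000 is left unchanged by this pattern.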
Does anyone know how to match a substring with a regular expression? I am currently profiling data and I saw different formats such as:
EB0000000
EB00000000PHL00000000F00000000
P0000000A
When I used my expression:
\b(?:[A-Z]{1}\d{7}[A-Z]{1}|[A-Z]{1}\d{7,8}|[A-Z]{2}\d{6}|[A-Z]{2}\d{7,8})\b
I captured the first and last samples. The second looks like improper data, but I still want to capture EB and the 8 digits before PHL. Is that possible with a regex? TIA
Why make it so complicated? Unless there are nearby strings that should not be matched, this is enough:
\b[A-Z\d]{8,}\b
It is possible, but you could change the order of the alternatives to put the most specific one at the beginning and then remove the word boundary at the end.
Note that you can omit {1}
\b(?:[A-Z]{2}\d{7,8}|[A-Z]\d{7}[A-Z]|[A-Z]\d{7,8}|[A-Z]{2}\d{6})
In parts
\b Word boundary
(?: Non capture group
[A-Z]{2}\d{7,8} Match 2 times A-Z and 7-8 digits
| Or
[A-Z]\d{7}[A-Z] Match A-Z, 7 digits and A-Z
| Or
[A-Z]\d{7,8} Match A-Z and 7-8 digits
| Or
[A-Z]{2}\d{6} Match 2 times A-Z and 6 digits
) Close group
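The effect of reordering the alternatives and dropping the trailing word boundary can be seen in this Python sketch (first_match is a hypothetical helper; the samples are the three strings from the question):

```python
import re

# The asker's pattern, with a word boundary at both ends.
WITH_END_BOUNDARY = re.compile(
    r"\b(?:[A-Z]{1}\d{7}[A-Z]{1}|[A-Z]{1}\d{7,8}|[A-Z]{2}\d{6}|[A-Z]{2}\d{7,8})\b"
)
# The reordered pattern without the trailing \b.
REORDERED = re.compile(
    r"\b(?:[A-Z]{2}\d{7,8}|[A-Z]\d{7}[A-Z]|[A-Z]\d{7,8}|[A-Z]{2}\d{6})"
)

samples = ["EB0000000", "EB00000000PHL00000000F00000000", "P0000000A"]

def first_match(pattern, text):
    """Return the first substring matched by pattern, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None

for s in samples:
    print(s, first_match(WITH_END_BOUNDARY, s), first_match(REORDERED, s))
```

With the trailing \b, the second sample cannot match because the digits are followed directly by a letter, which is not a word boundary.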
What is the best way to build a regex for numbers between 10 and 240, and another one for numbers between 10 and 360?
Regexes are not well suited to dealing with numeric ranges. Unless this is your only option, you should probably choose another solution.
10-240: ^(?:2(?:[0-3]\d|40)|1\d\d|[1-9]\d)$
Explanation:
^: Anchor that matches the beginning of the string.
(?: Non-capturing group (more performant than capturing groups). I use those for alternation.
2: Literal character '2'.
[0-3]: A single digit between 0 and 3.
\d: A single digit character (0-9).
|: Or.
2(?:[0-3]\d|40): A number that starts with 2, followed by either 0-3 and any digit, or literally '40'. That matches 200-240.
|1\d\d: Or 1 followed by two digits (0-9). That matches 100-199.
|[1-9]\d: Or a digit between 1 and 9 followed by any digit (0-9). That matches 10-99.
$: Anchor that matches the end of the string.
Test it here: https://regex101.com/r/rO4fZ0/1
10-360: ^(?:3(?:[0-5]\d|60)|[12]\d\d|[1-9]\d)$
3(?:[0-5]\d|60): Literal character 3, followed by either 0-5 and any digit, or literally 60. That matches 300-360.
|[12]\d\d: Or 1 or 2 followed by two digits (0-9). That matches 100-299.
|[1-9]\d: Or a digit between 1 and 9 followed by any digit (0-9). That matches 10-99.
Test it here: https://regex101.com/r/lD8oM4/1
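Both range patterns can be checked exhaustively against a plain numeric comparison. The Python sketch below does that; the helper names mismatches and in_range_numeric are hypothetical, and the numeric version is included only as an illustration of the non-regex alternative mentioned above:

```python
import re

RANGE_10_240 = re.compile(r"^(?:2(?:[0-3]\d|40)|1\d\d|[1-9]\d)$")
RANGE_10_360 = re.compile(r"^(?:3(?:[0-5]\d|60)|[12]\d\d|[1-9]\d)$")

def in_range_numeric(text, low, high):
    """Non-regex alternative: parse the number and compare.

    Unlike the regexes, this accepts leading zeros such as "010".
    """
    return text.isascii() and text.isdigit() and low <= int(text) <= high

def mismatches(pattern, low, high, limit=1000):
    """Numbers below limit where the regex disagrees with low <= n <= high."""
    return [n for n in range(limit)
            if bool(pattern.match(str(n))) != (low <= n <= high)]

print(mismatches(RANGE_10_240, 10, 240))
print(mismatches(RANGE_10_360, 10, 360))
```

An empty list means the regex and the numeric check agree for every integer from 0 to 999.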
The best way to do so is with a tester such as http://regexr.com
Here is the RegEx for the 10 to 240 match.
^(([1-9][0-9])|(1[0-9][0-9])|(2[0-3][0-9])|(240))$
However, I do feel a regex is probably not the right tool for what you want to achieve.
Can anyone help me build a regex to validate numbers made of one repeating digit,
e.g. 11111111, 2222, 99999999999, etc.?
It should validate for any length.
\b(\d)\1+\b
Explanation:
\b # match word boundary
(\d) # match digit remember it
\1+ # match one or more instances of the previously matched digit
\b # match word boundary
If 1 should also be a valid match (zero repetitions), use a * instead of the +.
If you also want to allow longer repeats (123123123) use
\b(\d+)\1+\b
If the regex should be applied to the entire string (as opposed to finding "repeat-numbers" in a longer string), use start- and end-of-line anchors instead of \b:
^(\d)\1+$
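A short Python sketch of the anchored backreference patterns (REPEATED_BLOCK is an anchored adaptation of the \b(\d+)\1+\b variant above; the constant names are assumptions for this example):

```python
import re

SAME_DIGIT = re.compile(r"^(\d)\1+$")       # e.g. 11111111, 2222
REPEATED_BLOCK = re.compile(r"^(\d+)\1+$")  # also accepts 123123123

for s in ["11111111", "2222", "1", "1212", "123123123", "1234"]:
    # Print whether each pattern accepts the sample.
    print(s, bool(SAME_DIGIT.match(s)), bool(REPEATED_BLOCK.match(s)))
```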
Edit: How to match the exact opposite, i.e. a number where not all digits are the same (except if the entire number is a single digit):
^(\d)(?!\1+$)\d*$
^ # Start of string
(\d) # Match a digit
(?! # Assert that the following doesn't match:
\1+ # one or more repetitions of the previously matched digit
$ # until the end of the string
) # End of lookahead assertion
\d* # Match zero or more digits
$ # until the end of the string
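The negative-lookahead version can be exercised the same way; this is a minimal Python sketch with hypothetical sample inputs:

```python
import re

# Accept a number unless it is one digit repeated two or more times.
NOT_ALL_SAME = re.compile(r"^(\d)(?!\1+$)\d*$")

for s in ["1", "11", "112", "121", "999"]:
    print(s, bool(NOT_ALL_SAME.match(s)))
```

A single digit passes because \1+ needs at least one more copy of that digit before the end of the string, so the lookahead has nothing to reject.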
To match a number of repetitions of a single digit, you can write ([0-9])\1*.
This captures [0-9] into a group, then matches 0 or more repetitions (\1) of that group.
You can write \1+ to match one or more repetitions.
Use a backreference:
(\d)\1+
Probably you want to use some sort of anchors ^(\d)\1+$ or \b(\d)\1+\b
I used this expression to find all phone numbers that consist of a single repeated digit.
It matches one digit and then 9 repetitions of that same digit, which gives 10 identical digits in a row.
([0-9])\1{9}
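(\d)\1+? matches a digit followed by at least one repeat of itself; because +? is lazy, it stops after the first repeat when nothing else follows in the pattern.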
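The difference between the fixed-count, greedy and lazy forms shows up when looking at what each one actually consumes. A small Python sketch (the constant names and sample strings are assumptions):

```python
import re

TEN_SAME = re.compile(r"([0-9])\1{9}")  # one digit plus 9 repeats = 10 in a row
GREEDY = re.compile(r"(\d)\1+")         # takes as many repeats as possible
LAZY = re.compile(r"(\d)\1+?")          # takes as few repeats as possible

print(bool(TEN_SAME.fullmatch("5555555555")))
print(GREEDY.search("1111").group(0))
print(LAZY.search("1111").group(0))
```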
You can easily match repeated text or numbers with a backreference. Take a look at the following example:
The pattern simply means: whatever the group ([inside pattern]) matched, \1 then looks for that same text immediately after it.