I'm trying to use regex to find tax numbers with the formats:
nnn-nnn-nnn | nn-nnn-nnn
nnn nnn nnn | nn nnn nnn
nnnnnnnnn | nnnnnnnn
EDIT: some samples are 062-225-505, 62-225-505, 062 225 505, 62 225 505, 062225502, 62225505. The numbers should be no longer than 9 digits in total.
So far I have ([0-9]{2,3}(\s|-|)+[0-9]{3,8}(\s|-|)+[0-9]{3,9})
This works, BUT it also finds 050821862257111, which is too long for what I'm trying to find. How do I limit the total length as well as the length of each part?
Thanks!
Try ^\d{1,9}(?:(?:-| )\d{1,9})*$
Explanation:
^ - match beginning of a string
\d{1,9} - match between 1 and 9 digits
(?:...) - non-capturing group
(?:-| ) - alternation: match either - or a space
* - match zero or more times
$ - match end of a string
With a small change to your regex, you can limit the length to eight or nine numbers, although this would still allow a mix and match of the delimiters:
([0-9]{2,3}[\s-]?[0-9]{3}[\s-]?[0-9]{3})
If the actual number of delimiters is not important, then you could just remove them and check the length of the remaining digits.
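The "remove the delimiters, then check the length" idea could look roughly like this in Python. This is only a sketch: the function name is made up for illustration, and it assumes that spaces and dashes are the only delimiters and that 8 or 9 digits is the only length rule.

```python
import re

def is_tax_number_by_length(text):
    """Hypothetical helper: ignore delimiters, then check the digit count."""
    # Drop every whitespace character and dash.
    digits = re.sub(r"[\s-]", "", text)
    # What is left must be exactly 8 or 9 ASCII digits.
    return re.fullmatch(r"[0-9]{8,9}", digits) is not None

checks = {s: is_tax_number_by_length(s)
          for s in ["062-225-505", "62 225 505", "62225505", "050821862257111"]}
```

Note that this accepts delimiters in any position (for example 06-22-25-505), which is exactly the trade-off described above.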
^\d{2}\d?(?:-|\s)?\d{3}(?:-| )?\d{3}$
This regex will only match if the spaces and dashes are in the right place.
This will match: 062-225-505
This will not match: 062-2255-05 or 062225--505
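To try the anchored pattern above against the samples from the question, a small Python check might look like this (the variable and function names are illustrative, not from the original posts):

```python
import re

# Pattern from the answer above: 2 or 3 digits, an optional delimiter,
# 3 digits, an optional delimiter, 3 digits, and nothing else.
TAX_NUMBER = re.compile(r"^\d{2}\d?(?:-|\s)?\d{3}(?:-| )?\d{3}$")

def is_tax_number(text):
    return TAX_NUMBER.match(text) is not None

samples = ["062-225-505", "62-225-505", "062 225 505", "62225505",
           "062-2255-05", "062225--505", "050821862257111"]
results = {s: is_tax_number(s) for s in samples}
```

Because of the ^ and $ anchors this only works when the whole string is the candidate number, not for searching inside longer text.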
Found with a combination of all of your help! :)
\s\d{2,3}\d?(-|\s)?\d{3}(\1)?\d{3}(?!\d)
Found 62-225-505, 62225505, 062 225 505, and did not find 060821067254101
Thanks all :)
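For searching inside longer text, one possible variant (an assumption on my part, not the exact regex posted above) uses digit lookarounds instead of a leading \s, and a backreference so both delimiters have to be the same:

```python
import re

# (?<!\d) and (?!\d) stop the match from being part of a longer digit run.
# ([\s-]?) captures the first delimiter (possibly empty); \1 requires the
# second delimiter to be identical.
FIND_TAX_NUMBER = re.compile(r"(?<!\d)\d{2,3}([\s-]?)\d{3}\1\d{3}(?!\d)")

text = "ids: 62-225-505, 62225505, 062 225 505 and 060821067254101"
found = [m.group(0) for m in FIND_TAX_NUMBER.finditer(text)]
```

With this sample text, found holds the three short numbers and skips the 15-digit one.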
I have multiple formats of strings from which I have to extract exactly 10 digit number.
I have tried the following regexes for it, but they extract the first 10 digits of a longer number instead of ignoring it.
([0-9]{10}|[0-9\s]{12})
([[:digit:]]{10})
These are the formats
Format 1
KINDLY AUTH FOR FUNDS
ACC 1469007967 (Number needs to be extracted)
AMT R5 000
DD 15/5
FROM:006251
Format 2
KINDLY AUTH FOR FUNDS
ACC 146900796723423 **(Want to ignore this number)**
AMT R5 000
AMT R30 000
DD 15/5
FROM:006251
Format 3
PLEASE AUTH FUNDS
ACC NAME-PREMIER FISHING
ACC NUMBER -1186 057 378 **(the number after - sign needs to be extracted)**
CHQ NOS-7132 ,7133,7134
AMOUNTS-27 000,6500,20 000
THANKS
FROM:190708
Format 4
PLEASE AUTHORISE FOR FUNDS ON AC
**1162792833** CHQ:104-R8856.00 AND (The number in ** needs to be extracted)
CHQ:105-R2772.00
REGARDS,
To match those numbers in either format (10 consecutive digits, or 4 digits, a space, 3 digits, a space, 3 digits), you might use a backreference \1 to a capturing group that matches an optional space.
Surround the pattern by word boundaries \b to prevent the digits being part of a larger word.
\b\d{4}( ?)\d{3}\1\d{3}\b
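A quick way to try this pattern on lines taken from the formats in the question, sketched in Python (extract_account is a hypothetical helper name):

```python
import re

# Backreference \1 repeats whatever ( ?) matched: either a space both
# times or nothing both times. \b keeps longer digit runs from matching.
ACCOUNT = re.compile(r"\b\d{4}( ?)\d{3}\1\d{3}\b")

def extract_account(line):
    """Return the account number found in the line, or None."""
    m = ACCOUNT.search(line)
    return m.group(0) if m else None

format1 = extract_account("ACC 1469007967")
format2 = extract_account("ACC 146900796723423")
format3 = extract_account("ACC NUMBER -1186 057 378")
format4 = extract_account("1162792833 CHQ:104-R8856.00 AND")
```

Here format2 comes back as None because the 15-digit run has no word boundary after its first 10 digits.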
Your expression seems fine; it is just missing a word boundary. You may also want to tighten the second alternative so the spaces must appear in fixed positions:
\b([0-9]{10}|[0-9]{4}\s[0-9]{3}\s[0-9]{3})\b
Adding a word boundary \b helps. The regex becomes: (\b([0-9]{10}|[0-9\s]{12})\b).
Check it here https://regex101.com/r/6Hm8PD/2
I need some help with creating a regex string. I have this long list of numbers:
7001 7002 7003 7004 7005 7006 7007 7008 7009 7010 7011 7012 7013 7014
7015 7016 7017 7018 7019 7020 7021 7022 7023 7024 7025 7026 7027 7028
7029 7030 7031 7032 7033 7034 7035 7036 7037 7038 7039 7040 7041 7042
7043 7044 7045 7046 7047 7048 7049 7050 7051 7052 7053 7054 7055 7056
7057 7058 7059 7060 7061 7062 7063 7064 7065 7066 7067 7068 7069 7070
7071 7072 7073 7074 7075 7076 7077 7078 7079 7080 7081 7082 7083 7084
7085 7086 7087 7088 7089 7090 7091 7092 7093 7094 7095 7096 7097 7098
7099 7100 7101 7102 7103 7104 7105 7106 7107 7108 7109 7110 7111 7112
7113 7114 7115 7116 7117 7118 7119 7120 7121 7122 7123 7124 7125 7126
7127 7128 7129 7130 7131 7132 7133 7134 7135 7136 7137 7138 7139 7140
7141 7142 7143 7144 7145 7146 7147 7148 7149 7150 7151 7152 7153 7154
7155 7156 7157 7158 7159 7160 7161 7162 7163 7164 7165 7166 7167 7168
7169 7170 7171 7172 7173 7174 7175 7176 7177
Basically, I need to find the numbers that contain the digit 8 or 9 so I can remove them from the list.
I tried this regex: ([0-7][0-7][8-9]{2}) but that only matches numbers whose last two digits are both 8 or 9.
How about you just write some simple code rather than trying to cram everything into a regex?
#!/usr/bin/perl -i -p # Process the file in place
@n = split / /; # Split on whitespace into array @n
@n = grep { !/[89]/ } @n; # @n now contains only those numbers NOT containing 8 or 9
$_ = join( ' ', @n ); # Rebuild the line
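The same "plain code instead of regex" approach, sketched in Python on a shortened sample of the list (the sample values are picked for illustration):

```python
# A few numbers from the list in the question.
line = "7001 7008 7009 7010 7018 7019 7020 7089 7177"

# Split on whitespace, keep only numbers with no 8 and no 9, rebuild the line.
numbers = line.split()
kept = [n for n in numbers if "8" not in n and "9" not in n]
rebuilt = " ".join(kept)
```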
Dalorzo's answer would work, but I suggest a different approach:
/\b(?=\d{4}\b)(\d*[89]\d*)\b/g
Assuming you are only looking for 4 digit numbers, then it is using a positive lookahead to ensure you have those (so it won't match, say, 3 or 5 digit numbers) and then checks if at least one of the digits is 8 or 9.
http://regex101.com/r/hW4vQ3
If you need to catch all numbers, not just four digit ones, then
/\b(?=\d+\b)(\d*[89]\d*)\b/g
See it in action:
http://regex101.com/r/bW2gH3
And as a bonus, the regex is also capturing the numbers so you can do a replace afterwards, if you wish
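Here is roughly how the two lookahead patterns behave in Python, including the replace step. The sample text is made up, and the names are only illustrative:

```python
import re

# Only 4-digit numbers containing an 8 or a 9.
FOUR_DIGIT = re.compile(r"\b(?=\d{4}\b)(\d*[89]\d*)\b")
# Numbers of any length containing an 8 or a 9.
ANY_LENGTH = re.compile(r"\b(?=\d+\b)(\d*[89]\d*)\b")

text = "7007 7008 7019 123 70891 7177"

four_digit_hits = FOUR_DIGIT.findall(text)
any_length_hits = ANY_LENGTH.findall(text)

# Remove the 4-digit hits, then collapse the leftover whitespace.
cleaned = " ".join(FOUR_DIGIT.sub("", text).split())
```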
This is a bit long-winded, but easier to decipher:
/\b([89]\d{3}|\d[89]\d{2}|\d{2}[89]\d|\d{3}[89])\b/g
It also restricts the search to 4-digit groups.
How about:
/\b((?:[\d]+)?[89](?:[\d]+)?)\b/g
\b will match the beginning and the end of each number.
(?:[\d]+)? is an optional non-capturing group of digits. It is needed on both sides of [89] so that numbers beginning with, ending with, or containing [89] all match.
?: Making the groups non-capturing is optional in this expression, but there was no need to capture the sub-groups.
You can use this pattern:
[0-7]*(?:8[0-8]*9|9[0-9]*8)[0-9]*
or with a backreference:
(?:[0-9]*(?!\1)([89])){2}[0-9]*
re.findall(r"(\d\d[0-7][89])|(\d\d[89][0-7])|(\d\d[89][89])",x)
Works for the input given.
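One thing to be aware of with that call: because the pattern has three capturing groups, re.findall returns 3-tuples rather than strings. A sketch with the same alternatives folded into one non-capturing group (my rewrite, not the answerer's) returns the whole matches instead. Like the original, it only looks at the last two digits, so it fits this input but not numbers such as 8001.

```python
import re

x = "7001 7008 7019 7089 7090"

# Original pattern: three capturing groups, so findall yields 3-tuples
# with the matching alternative filled in and the others empty.
grouped = re.findall(r"(\d\d[0-7][89])|(\d\d[89][0-7])|(\d\d[89][89])", x)

# Same alternatives in a single non-capturing group: findall yields strings.
flat = re.findall(r"\d\d(?:[0-7][89]|[89][0-7]|[89][89])", x)
```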
Slightly simpler regex with lookahead:
(?=\d*[89])\d+
I have a string in the format A123ABC
First letter cannot contain <I,O,Q,U,Z>
Next 3 digits (0-9) from 21-998
Last 3 letters cannot include <I,Q,Z>
I used the following expression [A-HJ-NPR-TV-Y]{1}[0-9]{2,3}[A-HJ-PR-Y]{3}
But I am not able to restrict the number in the range 21-998.
Your letter part is fine, below is just the numbers portion:
regex = "(?:2[1-9]|[3-9][0-9]|[1-8][0-9][0-9]|9[0-8][0-9]|99[0-8])"
(?:...) group, but do not capture.
2[1-9] covers 21-29
[3-9][0-9] covers 30-99
[1-8][0-9][0-9] covers 100-899
9[0-8][0-9] covers 900-989
99[0-8] covers 990-998
| stands for "or"
Note: [0-9] may be replaced by \d. So, a more concise representation would be:
regex = "(?:2\d|[3-9]\d|[1-8]\d{2}|9[0-8]\d|99[0-8])"
One option would be matching (\d+) and checking if that falls in the range 21 - 998 outside a regex, in the language you're using, if possible.
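A sketch of that first option in Python: let the regex handle only the shape, and do the range check as ordinary integer comparison. The function name is hypothetical, and note one difference from the pure-regex versions: a zero-padded value such as 021 is accepted here because int() ignores the leading zero.

```python
import re

# Shape only: one allowed letter, 2 or 3 digits, three allowed letters.
PLATE_SHAPE = re.compile(r"([A-HJ-NPR-TV-Y])(\d{2,3})([A-HJ-PR-Y]{3})")

def is_valid_plate(text):
    m = PLATE_SHAPE.fullmatch(text)
    if m is None:
        return False
    # Range check done outside the regex.
    return 21 <= int(m.group(2)) <= 998
```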
If that is not feasible, you have to break it up (just showing the middle part):
(2[1-9]|[3-9]\d|[1-8]\d\d|9[0-8]\d|99[0-8])
Breakdown:
2[1-9] matches 21 - 29
[3-9]\d matches 30 - 99
[1-8]\d\d matches 100 - 899
9[0-8]\d matches 900 - 989
99[0-8] matches 990 - 998
Also, the {1} is superfluous and can be omitted, making the complete regex
[A-HJ-NPR-TV-Y](2[1-9]|[3-9]\d|[1-8]\d\d|9[0-8]\d|99[0-8])[A-HJ-PR-Y]{3}
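Range regexes like this are easy to get subtly wrong, so it can be worth checking the number part exhaustively. A small Python sketch (names are illustrative) that tries every integer from 0 to 1999 and also exercises the complete pattern:

```python
import re

# Number part from the answer above, as a non-capturing group.
NUMBER = r"(?:2[1-9]|[3-9]\d|[1-8]\d\d|9[0-8]\d|99[0-8])"
PLATE = re.compile(r"[A-HJ-NPR-TV-Y]" + NUMBER + r"[A-HJ-PR-Y]{3}")

# Every integer whose decimal form the number part accepts in full.
accepted = [n for n in range(0, 2000) if re.fullmatch(NUMBER, str(n))]

def matches_plate(text):
    return PLATE.fullmatch(text) is not None
```

If the breakdown is right, accepted should be exactly the integers 21 through 998.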
Assuming the numbers between 21 and 99 are displayed with three digits (ie. : 021, 055, 099), here's a solution for the number part :
((02[1-9])|(0[3-9][0-9])|([1-8][0-9]{2})|(9(([0-8][0-9])|(9[0-8]))))
Entire regex :
[A-HJ-NPR-TV-Y]{1}((02[1-9])|(0[3-9][0-9])|([1-8][0-9]{2})|(9(([0-8][0-9])|(9[0-8]))))[A-HJ-PR-Y]{3}
There are probably easier ways to do this, but one way would be to use:
^((?=[^IOQUZ])([A-Z]))((02[1-9])|(0[3-9]\d)|([1-8]\d\d)|(9[0-8]\d)|(99[0-8]))((?=[^IQZ])([A-Z])){3}$
To explain:
^ denotes the beginning of the string.
((?=[^IOQUZ])([A-Z])) would give you any capital letter not in <I, O, Q, U, Z>.
((02[1-9])|(0[3-9]\d)|([1-8]\d\d)|(9[0-8]\d)|(99[0-8])) denotes any number between ((021 to 029) or (030 to 099) or (100 to 899) or (900 to 989) or (990 to 998)), with numbers below 100 written with a leading zero.
((?=[^IQZ])([A-Z])){3} would match any three capital letters not in <I, Q, Z>.
$ would denote the end of the string.
Hello, I need to come up with this regular expression:
The telephone number should begin with 087 OR 088 OR 089 and then it should be followed by 7 digits.
This is what I made, but it doesn't work correctly: it only accepts numbers that begin with 089.
(087)|(088)|(089)[0-9]{7}";
/08[789]\d{7}/
that will match 087xxxxxxx, 088xxxxxxx, 089xxxxxxx numbers.
Maybe /08[7-9][0-9]{7}/ is what you're searching for?
Autopsy:
08 - a literal 08
[7-9] - matches the numbers from 7-9 once
[0-9]{7} - matches the numbers from 0-9 repeated exactly 7 times
That said, you might prefer /^08[7-9][0-9]{7}$/ if your string is only the phone number. (^ means "the string MUST start here" and $ means "the string MUST end here").
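The reason the original attempt only accepted 089 numbers is that | has the lowest precedence, so the [0-9]{7} part belongs to the last alternative only. A Python sketch comparing the two patterns (variable names are just for illustration, and re.fullmatch stands in for the ^...$ anchors):

```python
import re

# The pattern from the question: "087" OR "088" OR ("089" + 7 digits).
original = re.compile(r"(087)|(088)|(089)[0-9]{7}")
# The character-class version from the answers.
fixed = re.compile(r"08[7-9][0-9]{7}")

numbers = ["0871234567", "0881234567", "0891234567", "0861234567"]

original_ok = [n for n in numbers if original.fullmatch(n)]
fixed_ok = [n for n in numbers if fixed.fullmatch(n)]
```

With the original pattern a bare "087" is a complete match on its own, which is not what was intended.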
Actually, this would be a far better regex for Bulgarian phone numbers:
/(\+)?(359|0)8[789]\d{1}(|-| )\d{3}(|-| )\d{3}/
It checks:
Phones that start with country code(+359) or 0 instead;
if the phone number use delimiters like - or space.
I tried it in https://regex101.com and it did not work against my test set. So I tweaked it a little bit with the below regex pattern:
^([+]?359)|0?(|-| )8[789]\d{1}(|-| )\d{3}(|-| )\d{3}$
I want to validate a phone number with a regex for the following formats. I have searched but could not find a regex that covers them:
079-26408300 / 8200
(079) 26408300
079 264 083 00
9429527462
Can anyone please guide me on how to validate the phone number field for the above formats?
I want to accept only the above formats. Right now I am using only the following regex: var phone_pattern = /^[a-z0-9]+$/i;
@Ali Shah Ahmed
var phone_pattern = "(\d{10})|(\d{3}-\d{8}\s/\s\d{4})|((\d{3}\s){3}\d{2})|((\d{3})\s\d{8})";
Here is how I am checking whether it is valid:
if (!phone_pattern.test(personal_phone))
{
$("#restErrorpersonalphone").html('Please enter valid phone number');
$("#personal_phone").addClass('borderColor');
flag = false;
} else {
$("#restErrorpersonalphone").html('');
$("#personal_phone").removeClass('borderColor');
}
It's not working. Am I implementing it the wrong way?
Let's start with the simplest phone number, 9429527462.
As this has 10 characters and all are digits, the regex for it could be \d{10}
Now the next phone number, 079 264 083 00. The regex for this pattern could be (\d{3}\s){3}\d{2}
First we expect a group of 3 digits and a space, repeated three times: (\d{3}\s){3}. This covers 079 264 083 (trailing space included), so what is left is the last two characters, which are handled by \d{2}
For the phone number (079) 26408300, the regex \(\d{3}\)\s\d{8} could be used. The regex first looks for an opening parenthesis, then three digits inside it, and then the closing parenthesis. It then looks for a space, and then for 8 digits.
The phone number 079-26408300 / 8200 could be validated using the regex \d{3}-\d{8}\s/\s\d{4}. It first looks for 3 digits, then a -, then 8 digits followed by a space. Then it looks for a /, then a space, and then 4 digits.
If you wish to see a single regex for validating all the above patterns, do let me know.
Final combined regex would be:
/(\d{10})|(\d{3}-\d{8}\s\/\s\d{4})|((\d{3}\s){3}\d{2})|(\(\d{3}\)\s\d{8})/
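For comparison, the combined pattern can be tried in Python as follows. This is a sketch: the function name is made up, and re.fullmatch is used because the pattern has no ^ or $ of its own, so an unanchored search would also accept longer strings that merely contain a valid number.

```python
import re

# The four alternatives from the combined regex above. The / needs no
# escaping in Python because it is not a regex delimiter here.
PHONE = re.compile(
    r"(\d{10})"
    r"|(\d{3}-\d{8}\s/\s\d{4})"
    r"|((\d{3}\s){3}\d{2})"
    r"|(\(\d{3}\)\s\d{8})"
)

def is_valid_phone(text):
    # fullmatch requires the entire string to fit one alternative.
    return PHONE.fullmatch(text) is not None

valid = [p for p in ["079-26408300 / 8200", "(079) 26408300",
                     "079 264 083 00", "9429527462",
                     "079-26408300", "94295274621"]
         if is_valid_phone(p)]
```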
The straightforward solution is simple: use |
String ex = "\\d{3}-\\d{8} / \\d{4}|\\(\\d{3}\\) \\d{8}|...