I have a bunch of numbers which I want to parse.
+79261234567
89261234567
79261234567
9261234567
+7 926 123 45 67
8(926)123-45-67
123-45-67
79261234567
(495)1234567
(495) 123 45 67
89261234567
8-926-123-45-67
8 927 1234 234
8 927 12 12 888
8 927 12 555 12
8 927 123 8 123
What I came up with at first is to cycle through all the variants, like this:
(\+[\d]{11}|[\d]{10,11}|\+\d\ [\d]{3}\ [\d]{3}\ [\d]{2}\ [\d]{2}|\d\([\d]{3}\)[\d\-]{9}|[\d\ ]{14,15}|[\d\-]{14,15}|[\d\-]{9}|\(\d\d\d\)[\d\-]{9,10}|\(\d\d\d\)[\d\ ]{9,10}|\(\d\d\d\)[\d\-]{7})
Is there a more elegant way to match these numbers?
This regex will match all of the examples and not much extra:
[+]?(\b\d{1,2}[ -]?)?([(]?\d{3}[)]?)((?:[ -]?\d){4,7})(?![ -]?\d)
It accepts between 7 and 12 digits.
It would still match something like this, though:
+12 (345) 6-7-8 9-0-1
But that should be within acceptable limits.
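A quick way to check that claim is to run the pattern over the sample numbers from the question. The sketch below uses Python's re module, which is an assumption (the thread does not name a language), and the names PATTERN, SAMPLES and matches_whole are only illustrative.

```python
import re

# Pattern copied verbatim from the answer above.
PATTERN = re.compile(
    r"[+]?(\b\d{1,2}[ -]?)?([(]?\d{3}[)]?)((?:[ -]?\d){4,7})(?![ -]?\d)"
)

# Sample numbers from the question.
SAMPLES = [
    "+79261234567", "89261234567", "79261234567", "9261234567",
    "+7 926 123 45 67", "8(926)123-45-67", "123-45-67",
    "(495)1234567", "(495) 123 45 67", "8-926-123-45-67",
    "8 927 1234 234", "8 927 12 12 888", "8 927 12 555 12",
    "8 927 123 8 123",
]


def matches_whole(text: str) -> bool:
    """True when the pattern covers the entire string."""
    return PATTERN.fullmatch(text) is not None


# Collect any sample the pattern fails to cover completely.
unmatched = [s for s in SAMPLES if not matches_whole(s)]
print(unmatched)
```

Using fullmatch rather than search keeps the check strict: a sample only counts if nothing is left over on either side.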
However, that one could still match part of a longer number.
To avoid that, it needs some negative look-behinds.
(Note that older JavaScript engines have no look-behinds; they were only added in ES2018.)
[+]?(?<!\d)(?<!\d[ -])(?:((\d{1,2}[ -]?)?[(]?\d{3}[)]?[ -]?)(\d(?:[ -]?\d){3,6}))(?![ -]?\d)
Here's a regex101 test for that last one.
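To see what the look-behinds buy you, the two patterns can be compared on a run of digits that is too long to be a phone number. This is a rough Python sketch; the names LOOSE and STRICT and the test strings are made up for illustration.

```python
import re

# First pattern from the answer (no look-behinds).
LOOSE = re.compile(
    r"[+]?(\b\d{1,2}[ -]?)?([(]?\d{3}[)]?)((?:[ -]?\d){4,7})(?![ -]?\d)"
)
# Second pattern from the answer (with negative look-behinds).
STRICT = re.compile(
    r"[+]?(?<!\d)(?<!\d[ -])"
    r"(?:((\d{1,2}[ -]?)?[(]?\d{3}[)]?[ -]?)(\d(?:[ -]?\d){3,6}))"
    r"(?![ -]?\d)"
)

too_long = "1234567890123456"  # 16 digits, hypothetical input

# The loose pattern finds a match in the tail of the long digit run;
# the strict one refuses to start a match right after another digit.
loose_hit = LOOSE.search(too_long)
strict_hit = STRICT.search(too_long)
print(loose_hit is not None, strict_hit is not None)

# A normal number embedded in text is still found by the strict pattern.
embedded = STRICT.search("call 8 927 1234 234 now")
print(embedded.group(0))
```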
To have a more elegant solution, you will have to make the pattern more relaxed. One option is to capture 7, 10, or 11 digits separated by 0 or more delimiters:
\+?(?:[ ()-]*\d){10,11}|(?:[ ()-]*\d){7}
Regex101 Tested
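Because the relaxed pattern only counts digits and tolerates any mix of delimiters, it pairs naturally with a normalisation step that strips everything except the digits. A minimal Python sketch, assuming that is what you want to do with the matches (the helper name digits_if_phone is hypothetical):

```python
import re
from typing import Optional

# Relaxed pattern copied from the answer above.
RELAXED = re.compile(r"\+?(?:[ ()-]*\d){10,11}|(?:[ ()-]*\d){7}")


def digits_if_phone(text: str) -> Optional[str]:
    """Return only the digits if the whole string fits the pattern, else None."""
    if RELAXED.fullmatch(text) is None:
        return None
    return re.sub(r"\D", "", text)  # drop every non-digit character


print(digits_if_phone("+7 926 123 45 67"))
print(digits_if_phone("8(926)123-45-67"))
print(digits_if_phone("123-45-67"))
print(digits_if_phone("12345678"))  # 8 digits: not 7, 10 or 11
```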
There are a number of questions about regex for Australian phone numbers. They cover things like:
0411 123 123
0411123123
+61 411 123 123
+61411123123
(03) 9999 9999
(02)99999999
07 9999 9999
0899999999
The working JS regex for this is below and here https://regex101.com/r/bRbrVZ/1
/^(?:\+?(61))? ?(?:\((?=.*\)))?(0?[2-57-8])\)? ?(\d\d(?:[- ](?=\d{3})|(?!\d\d[- ]?\d[- ]))\d\d[- ]?\d[- ]?\d{3})$/
But I can't work out where to add our free-call and local-call numbers to this:
13 11 22
131122
1300 111 222
1300111222
1800 111 222
1800111222
And these numbers can't be prefixed with +61
Thanks in advance!
I have come up with this. I feel it's a bit crude, but it works. The two alternatives are wrapped in a non-capturing group so that ^ and $ anchor both of them:
^(?:(?:\+?(61))? ?(?:\((?=.*\)))?(0?[2-57-8])\)? ?(\d\d(?:[- ](?=\d{3})|(?!\d\d[- ]?\d[- ]))\d\d[- ]?\d[- ]?\d{3})|(13\s?(\d?\s?\d{3}?|\s?\d{2}\s?\d{2})|1[38]00\s?(\d{2}\s?\d{2}\s?\d{2}|\d{3}\s?\d{3})))$
https://regex101.com/r/ay4R67/2/
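One thing worth checking with a combined pattern like this is that ^ and $ apply to both alternatives. In ^A|B$ the ^ only binds to A and the $ only to B, so a 1300 number with a +61 prefix could still slip through on a search. The Python sketch below puts both parts inside one non-capturing group; that grouping, and the names GEO, SPECIAL and AU_NUMBER, are assumptions of this sketch.

```python
import re

# Landline/mobile part, copied from the question.
GEO = (
    r"(?:\+?(61))? ?(?:\((?=.*\)))?(0?[2-57-8])\)? ?"
    r"(\d\d(?:[- ](?=\d{3})|(?!\d\d[- ]?\d[- ]))\d\d[- ]?\d[- ]?\d{3})"
)
# 13 / 1300 / 1800 part, copied from the answer.
SPECIAL = (
    r"(13\s?(\d?\s?\d{3}?|\s?\d{2}\s?\d{2})"
    r"|1[38]00\s?(\d{2}\s?\d{2}\s?\d{2}|\d{3}\s?\d{3}))"
)

# Both alternatives sit inside one group so the anchors cover each of them.
AU_NUMBER = re.compile(rf"^(?:{GEO}|{SPECIAL})$")

valid = [
    "0411 123 123", "0411123123", "+61 411 123 123", "+61411123123",
    "(03) 9999 9999", "(02)99999999", "07 9999 9999", "0899999999",
    "13 11 22", "131122", "1300 111 222", "1300111222",
    "1800 111 222", "1800111222",
]
rejected = "+61 1300 111 222"  # free-call numbers must not carry +61

print(all(AU_NUMBER.match(n) for n in valid))
print(AU_NUMBER.match(rejected))
```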
I have a problem that my Googling tells me can be solved with Regex, but I'm completely unfamiliar and I tried following some tutorials but I'm entirely lost. I have this sample data set:
59 65 21366 CLEMENTINES 4.89 2.00 9.78
59 61 22384 PORK BACK RIBS 6.50 2.40 15.59
59 65 30669 BANANAS 1.89 1.00 1.89
59 13 391314 KODIAK POWER CAKES 14.69 1.00 14.69
59 65 392373 BAJA CHOPPED SALAD KIT 2.99 1.00 2.99
59 39 429227 FILA MENS ANKLE SOCK 6PK 9.99 1.00 9.99
59 65 1056187 ASIAN CASHEW SALAD KIT 2.99 1.00 2.99
59 28 1159696 SHOPKINS GG/TWOZIES ASST 5.97 1.00 5.97
59 13 1221327 KODIAK POWER CAKES -3.00 -3.00 COUPON
59 14 1270070 KLEENEX ULTRA SOFT 12 PCK 16.49 1.00 16.49
59 21 5221111 10 DRAWER STORAGE CART 29.99 1.00 29.99
59 17 1019 HALF + HALF 1 L 1.99 1.00 1.99
I want to import it into a spreadsheet. Visually I can see what I want: 3 numeric columns at the beginning, then a description that may or may not contain spaces, then usually 3 numeric columns, but sometimes 2 plus a word (see the line that ends in "coupon").
But because of the spaces and lack of quotes, my Excel skills (which are also marginal) don't allow me to import this in a sensible way.
I thought of doing multiple processes: pull off the 3 columns at the left and then 3 columns at the right... but in Excel I see no way to operate "from the right".
Any help appreciated. Thanks.
[edit] I realize from the comments that my ignorance has resulted in a poor question.
I didn't realize "Regex" was specific to language, etc. I am trying to import a csv into Excel, but I was using Notepad++ to perform the regex operations. I don't know what "flavor" that uses but the answer below helped greatly.
You can match this with:
^(\S*) (\S*) (\S*) (.*) (\S*) (\S*) (\S*)$
^ matches the start of a line
\S* matches zero or more non-whitespace characters
.* matches any run of characters, including spaces
the parentheses capture the matches into capture groups
$ matches the end of a line.
You haven't said what tool you intend to use to do this.
One way is with a Perl one-liner:
perl -pe 's/^(\S*) (\S*) (\S*) (.*) (\S*) (\S*) (\S*)$/"$1","$2","$3","$4","$5","$6","$7"/' input.txt
Returning:
"59","65","21366","CLEMENTINES","4.89","2.00","9.78"
...
"59","13","1221327","KODIAK POWER CAKES","-3.00","-3.00","COUPON"
... etc.
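If Perl is not to hand, the same idea can be sketched in Python. This is only an illustration, not part of the original answer; the function name to_csv is made up, and the two sample rows are taken from the question.

```python
import csv
import io
import re

# Same pattern as above: three fields, a greedy description, three fields.
ROW = re.compile(r"^(\S*) (\S*) (\S*) (.*) (\S*) (\S*) (\S*)$")


def to_csv(lines):
    """Turn matching lines into fully quoted CSV text; skip lines that don't match."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for line in lines:
        m = ROW.match(line.strip())
        if m:
            writer.writerow(m.groups())
    return out.getvalue()


sample = [
    "59 65 21366 CLEMENTINES 4.89 2.00 9.78",
    "59 13 1221327 KODIAK POWER CAKES -3.00 -3.00 COUPON",
]
print(to_csv(sample), end="")
```

The greedy .* in the middle first swallows the rest of the line and then backs off just far enough for the last three fields to match, which is why the description keeps its internal spaces.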
Goal:
Match all variations of phone numbers with 8 digits + (optional) country code.
Stop match when "keyword" is found, even if more matches exist after the "keyword".
Need this in a one-liner and have tried a plethora of variations with lookahead/behind and negate [^keyword] but I am unable to understand how to achieve this.
Example of text:
abra 90998855
kadabra 04 94 84 54
cat 132 23 564
oh the nice Hat +41985 32 565
+17 98 56 32 56
Ladida
keyword
I Want It To Stop Matching Here Or Right Before The "keyword"
more nice text with some matches
cat 132 23 564
oh the nice Hat +41985 32 565
+17 98 56 32 56
Example of regex:
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})
-> This matches all numbers also below the keyword
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})[^keyword]
-> This matches all numbers also below the keyword
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})(?!keyword)
-> This matches all numbers also below the keyword
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})(?=keyword)
-> This matches nothing
((\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})(?:(?!keyword))*)
-> This matches all numbers also below the keyword
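The look-arounds tried above only inspect the text right next to each individual match, which would explain why they do not stop the scan at the keyword. One possible workaround, not taken from this thread, is to cut the text at the first occurrence of the keyword and search only the part before it. A rough Python sketch, assuming that two steps in one expression are acceptable (the helper name numbers_before and the shortened sample text are hypothetical):

```python
import re

# Phone pattern copied from the first attempt above.
PHONE = re.compile(
    r"(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})"
)


def numbers_before(text, keyword):
    """Return phone-like matches found before the first occurrence of keyword."""
    head, _, _ = text.partition(keyword)  # everything left of the keyword
    # The pattern may swallow a leading space, so strip each match.
    return [m.group(0).strip() for m in PHONE.finditer(head)]


text = "abra 90998855\nkadabra 04 94 84 54\nkeyword\ncat 132 23 564\n"
print(numbers_before(text, "keyword"))
```

If the keyword does not occur at all, partition returns the whole text as the head, so every number is reported.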
I need to validate that a string follows these rules:
contains numerals
may optionally contain any number of space characters in any position
may not contain any other kind of character
the first two numerals must be one of the set: 02; 03; 07; 08; 13; 18
and the number of numerals must be exactly 10 unless the first two numerals are 1 and 3, in which case the number of numerals may be 10 or 6.
Essentially these are Australian landline (with area code), free-call and 13 numbers.
Ideally the regex should be as implementation-agnostic as possible.
Examples of valid input:
0299998888
02 99998888
02 9999 8888
02 99 998 888
0299 998 888
0299 998888
131999
131 999
13 19 99
1300123456
1300 123456
1300 123 456
1300 12 34 56
1300 12 34 56
PS. I've checked at least 5 other answers and searched for multiple variations of this question, to no avail.
The nearest I have is:
^(?=\d{10}$)(02|03|04|07|08|13|18)\d+
... however this does not account for spacing and won't accept 6 digit numbers beginning with 13.
Note, in theory, the following is acceptable:
1 3 1999
1 3 1 9 9 9
By this I mean that first pair of numerals may have a space between them (as bad as that looks).
Following are examples of random numbers that should fail:
13145 (not enough numerals)
1300-123-456 (hyphens not permitted)
9999 8888 (not enough numerals)
(02) 9999 8888 (parentheses not permitted)
You can make a separate pattern for 13 in alternation:
^(?:(?=(?:\s*\d\s*){10}$)(?:0\s*[2378]|1\s*[38])|(?=(?:\s*\d\s*){6}$)1\s*3).*
Demo: https://regex101.com/r/Hkjus2/2
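The pattern can be checked against the examples from the question with a few lines of Python. This is just a sketch for testing; the names AU_RULES and is_valid are made up, and re.match is used because the pattern carries its own ^ anchor.

```python
import re

# Pattern copied from the answer above: a look-ahead counts the digits
# (10, or 6 for numbers starting with 13), then the prefix is checked.
AU_RULES = re.compile(
    r"^(?:(?=(?:\s*\d\s*){10}$)(?:0\s*[2378]|1\s*[38])"
    r"|(?=(?:\s*\d\s*){6}$)1\s*3).*"
)


def is_valid(number: str) -> bool:
    return AU_RULES.match(number) is not None


valid = [
    "0299998888", "02 99 998 888", "0299 998888",
    "131999", "13 19 99", "1 3 1 9 9 9",
    "1300123456", "1300 12 34 56",
]
invalid = ["13145", "1300-123-456", "9999 8888", "(02) 9999 8888"]

print([is_valid(n) for n in valid])
print([is_valid(n) for n in invalid])
```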
I am trying to write a regular expression to search for a phone number similar to
011 (134) 1234567892.
The country code must be 011, and the area code in parentheses can be 134, 132, 131, 138, 136, or 137. The last 10 digits can be anything. I have this
((\<011[\-\. ])?(\(|\<)\d\d\d[\)\.\-/]?)?\<\d\d\d\d\d\d\d\d\d\d\>
but it is only giving me one result.
If anyone could please give me some help, that would be great! Thanks.
This one should work:
(011 \(13[124678]\) \d{10})
You can see a working DEMO, which shows a couple of correct and incorrect inputs.
^011 \(13[124678]\) \d{10}$
seems to match all of the phone numbers I tried given your constraints
^ matches the start of string
011 matches only 011
\(13[124678]\) matches 134 132 131 138 136 or 137
\d{10} matches exactly 10 digits (the digit class \d repeated with the {n} syntax)
/011 \(13[124678]\) \d{10}/g
Don't forget the g flag to match all the occurrences.
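For comparison, here is roughly how "match all the occurrences" looks in Python, where findall plays the role of the g flag. The sample text is invented for illustration; the pattern is the unanchored one from the answers above.

```python
import re

# Unanchored pattern, so it can be found anywhere in a longer text.
PHONE = re.compile(r"011 \(13[124678]\) \d{10}")

# Hypothetical input: two valid numbers and one with a disallowed area code.
text = (
    "first 011 (134) 1234567892, "
    "wrong area 011 (139) 1234567892, "
    "second 011 (137) 0000000000"
)

# With no capture groups in the pattern, findall returns the full matches.
found = PHONE.findall(text)
print(found)
```

Since the pattern is not anchored, \d{10} will also accept the first 10 digits of a longer run; add a trailing (?!\d) if that matters for your data.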