phone number RegEx not working for some strings

I want to recognize a phone number as 9 consecutive digits that may be separated by whitespace, non-breaking spaces, etc., using the regex "(\s*\d\s*){9}".
I run it from a VBA macro (JS-style RegEx), and here are example strings that work fine with the above regex:
ul. 27 Grudnia 16, tel. 21 287 31 61, fax 61 286 69 60 –
ul. Wrzosowa 110/120/222, kom. 692 601 428
And here is an example where the phone number is not detected in VBA but is detected by online JS regex tools:
al. Mazowieckiego 63, kom. 622 769 694 –
The strings that are detected and those that are not have the same structure, so I have no idea why VBA doesn't detect the phone number in some of them.

It turned out that VBA had changed some of the strings being searched: an ordinary space, chr(32), was replaced with a non-breaking space, chr(160).
Removing chr(160) from the searched string solves the problem.
I will also try to find a regex that allows non-breaking spaces, because \s* does not match them, at least in VBA.
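
The behaviour described above can be reproduced outside VBA. The following Python sketch is only an illustration under assumptions: the `re.ASCII` flag is used to imitate a `\s` that does not cover chr(160), and the sample string (with non-breaking spaces between the digit groups) is made up from the question's example. It shows two possible workarounds: listing the non-breaking space explicitly in the pattern, or normalising the input first.

```python
import re

NBSP = "\u00a0"  # same character as chr(160) in VBA

# Assumed sample: the question's string with NBSPs between the digit groups
text = "al. Mazowieckiego 63, kom. 622" + NBSP + "769" + NBSP + "694"

# re.ASCII restricts \s to ASCII whitespace, imitating the VBA behaviour reported above
narrow = re.compile(r"(\s*\d\s*){9}", re.ASCII)

# Option 1: list the non-breaking space explicitly next to \s
wide = re.compile(r"([\s\u00a0]*\d[\s\u00a0]*){9}", re.ASCII)

# Option 2: normalise the input first, then keep the original pattern
normalised = text.replace(NBSP, " ")


def digits_of(match):
    """Return only the digits of a match, or None if nothing matched."""
    return re.sub(r"\D", "", match.group(0)) if match else None


print(digits_of(narrow.search(text)))        # the raw string is not matched
print(digits_of(wide.search(text)))          # explicit NBSP in the class
print(digits_of(narrow.search(normalised)))  # normalised input
```

In VBA itself, the equivalent of option 2 would be replacing chr(160) with chr(32) before running the regex, which is what the answer above reports as working.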

Related

Regex - Extract string between characters if they exist

I need a regex to extract a string delimited by a character (the colon), if that character is present.
Examples:
SX: 22AA 001 267
2294 0BB 267: 09
2294 0CC 267
In all cases, I want the result.
2294 001 267
Thank you all.

You can use this regex to match them all
(?:^|:)\s?([A-Z\d]+(?: [A-Z\d]+)+)(?:$|:)
NOTE: As you did not mention what language you're using I decided to not use lookarounds. So you have to get the first group from the match.
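
As a sketch of what "get the first group from the match" means in practice, here is the same pattern in Python (the language choice and the helper name `extract` are assumptions, since the question does not name a language):

```python
import re

# Pattern copied from the answer above; the wanted text is capture group 1
PATTERN = re.compile(r"(?:^|:)\s?([A-Z\d]+(?: [A-Z\d]+)+)(?:$|:)")


def extract(line):
    """Return capture group 1, or None when the line does not match."""
    m = PATTERN.search(line)
    return m.group(1) if m else None


samples = ["SX: 22AA 001 267", "2294 0BB 267: 09", "2294 0CC 267"]
results = [extract(s) for s in samples]
print(results)
```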

How do I format a list of phone numbers using regular expression in vim commands?

Given the following list of phone numbers
8144658695
812 673 5748
812 453 6783
812-348-7584
(617) 536 6584
834-674-8595
Write a single regular expression (use vim on loki) to reformat the numbers so they look like this
814 465 8695
812 673 5748
812 453 6783
812 348 7584
617 536 6584
834 674 8595
I am using the search and replace command. My regular expression using back referencing:
:%s/\(\d\d\d\)\(\d\d\d\)\(\d\d\d\d\)/\1 \2 \3\g
only formats the first line.
Any ideas?

Try this:
:%s,.*\(\d\d\d\).*\(\d\d\d\).*\(\d\d\d\d\).*,\1 \2 \3,

First, use a count to match a pattern multiple times; it is a bad habit to repeat the pattern:
\d\{3} "instead of \d\d\d
Then you also have to match the whitespace and other separators:
:%s/.*\(\d\{3}\).*\(\d\{3}\).*\(\d\{4}\).*/\1 \2 \3/g
Or even better, switch the whole regex to "very magic" mode with \v, so that groups and counts no longer need backslashes:
:%s/\v.*(\d{3}).*(\d{3}).*(\d{4}).*/\1 \2 \3/g
This greatly increases readability.
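
To check the substitution outside vim, the same pattern can be tried in Python. This is only a sketch for verification, not a replacement for the vim command; the list name `numbers` is mine, and the data is the question's input.

```python
import re

numbers = [
    "8144658695",
    "812 673 5748",
    "812 453 6783",
    "812-348-7584",
    "(617) 536 6584",
    "834-674-8595",
]

# Same idea as the \v version: three digit groups with anything in between
PATTERN = re.compile(r".*(\d{3}).*(\d{3}).*(\d{4}).*")

formatted = [PATTERN.sub(r"\1 \2 \3", n) for n in numbers]
for line in formatted:
    print(line)
```

Because each line contains exactly ten digits, the greedy `.*` parts can only absorb the separators, so the three groups always capture the 3 + 3 + 4 digits in order.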

Italian phone 10-digit number regex issue

I'm trying to use the regex from this site
/^([+]39)?((38[{8,9}|0])|(34[{7-9}|0])|(36[6|8|0])|(33[{3-9}|0])|(32[{8,9}]))([\d]{7})$/
for Italian mobile phone numbers, but a simple number such as 3491234567 is reported as invalid.
(Don't worry about spaces, as I'll trim them.)
should pass:
349 1234567
+39 349 1234567
TODO: 0039 349 1234567
TODO: (+39) 349 1234567
TODO: (0039) 349 1234567
regex101 and regexr both pass the validation. What's wrong?
UPDATE:
To clarify:
The regex should match any number that starts with either
388/389/380 (38[{8,9}|0])|
or
347/348/349/340 (34[{7-9}|0])|
or
366/368/360 (36[6|8|0])|
or
333/334/335/336/337/338/339/330 (33[{3-9}|0])|
328/329 (32[{8,9}])
plus 7 digits ([\d]{7})
and the +39 at the start optionally ([+]39)?

The following regex appears to fulfill your requirements. I took out the syntax errors and guessed a bit, and added the missing parts to cover your TODO comments.
^(\((00|\+)39\)|(00|\+)39)?(38[890]|34[7-90]|36[680]|33[3-90]|32[89])\d{7}$
Demo: https://regex101.com/r/yF7bZ0/1
Your test cases fail to cover many of the variations captured by the regex; perhaps you'll want to beef up the test set to make sure it does what you want.
The beginning allows for an optional international prefix with or without the parentheses. The basic pattern is (00|\+)39 and it is repeated with or without parentheses around it. (Perhaps a better overall approach would be to trim parentheses and punctuation as well as whitespace before processing begins; you'll want to keep the plus as significant, of course.)
Updated with information from @Edoardo's answer; wrapped for legibility and added comments:
^ # beginning of line
(\((00|\+)39\)|(00|\+)39)? # country code or trunk code, with or without parentheses
( # followed by one of the following
32[89]| # 328 or 329
33[013-9]| # 33x where x != 2
34[04-9]| # 34x where x not in 1,2,3
35[01]| # 350 or 351
36[068]| # 360 or 366 or 368
37[019]| # 370 or 371 or 379
38[089]) # 380 or 388 or 389
\d{6,7} # ... followed by 6 or 7 digits
$ # and end of line
There are obvious accidental gaps which will probably also get filled over time. Generalizing this further is likely to improve resilience toward future changes, but of course may at the same time increase the risk of false positives. Make up your mind about which is worse.
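
The wrapped, commented layout corresponds closely to the "verbose" mode that some regex engines offer. As a sketch, assuming Python and its `re.VERBOSE` flag (the question does not say which engine is used), the pattern could be kept in commented form in the code itself. The helper name `is_italian_mobile` is hypothetical, and whitespace is stripped first because the question says spaces are trimmed before validation.

```python
import re

ITALIAN_MOBILE = re.compile(
    r"""
    ^                              # beginning of line
    (\((00|\+)39\)|(00|\+)39)?     # country prefix, with or without parentheses
    (                              # followed by one of the following
        32[89]    |                # 328 or 329
        33[013-9] |                # 33x where x != 2
        34[04-9]  |                # 34x where x not in 1,2,3
        35[01]    |                # 350 or 351
        36[068]   |                # 360 or 366 or 368
        37[019]   |                # 370 or 371 or 379
        38[089]                    # 380 or 388 or 389
    )
    \d{6,7}                        # ... followed by 6 or 7 digits
    $                              # and end of line
    """,
    re.VERBOSE,
)


def is_italian_mobile(raw):
    """Strip all whitespace, then test against the pattern."""
    compact = re.sub(r"\s+", "", raw)
    return ITALIAN_MOBILE.match(compact) is not None


for candidate in ["349 1234567", "+39 349 1234567", "(0039) 349 1234567", "332 1234567"]:
    print(candidate, is_italian_mobile(candidate))
```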

I found this and updated it with new operators and MVNO prefixes (Iliad, ho.):
^(\((00|\+)39\)|(00|\+)39)?(38[890]|34[4-90]|36[680]|33[13-90]|32[89]|35[01]|37[019])\d{6,7}$

I improved the regex by adding a case that handles spaces between the digit groups, plus an optional space after the international prefix:
^(\((00|\+)39\)|(00|\+)39)?\s?(38[890]|34[4-90]|36[680]|33[13-90]|32[89]|35[01]|37[019])(\s?\d{3}\s?\d{3,4}|\d{6,7})$
So, for example, it matches phone numbers such as (0039) 349 123 4567 or 349 123 4567.

According to this document:
https://it.qaz.wiki/wiki/Telephone_numbers_in_Italy
a simple regex for Italian mobile numbers without special characters is:
/^3[0-9]{8,9}$/
It matches a string starting with the digit '3' followed by 8 or 9 digits, e.g.:
3345678103
You can then add the Italian prefix, such as '+39 ' or '0039 ' (the plus sign must be escaped, while the zeros need no backslash):
/^\+39 3[0-9]{8,9}$/ --- match --> +39 3345678103
/^0039 3[0-9]{8,9}$/ --- match --> 0039 3345678103
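
The bare and prefixed variants can also be folded into a single pattern with an optional prefix group. A minimal sketch in Python (the combined pattern and the name `SIMPLE_MOBILE` are my own, not taken from the linked document); note that the plus sign has to be escaped and that the prefix is followed by exactly one space:

```python
import re

# Optional "+39 " or "0039 " prefix, then '3' followed by 8 or 9 digits
SIMPLE_MOBILE = re.compile(r"^(?:(?:\+|00)39 )?3[0-9]{8,9}$")

candidates = ["3345678103", "+39 3345678103", "0039 3345678103", "0612345678"]
for candidate in candidates:
    print(candidate, bool(SIMPLE_MOBILE.match(candidate)))
```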

Writing a Regular Expression

I am trying to write a regular expression to search for a phone number similar to
011 (134) 1234567892.
The country code must be 011, and the area code in parentheses can be 134, 132, 131, 138, 136 or 137. The last 10 digits can be anything. I have this
((\<011[\-\. ])?(\(|\<)\d\d\d[\)\.\-/]?)?\<\d\d\d\d\d\d\d\d\d\d\>
but it only gives me one result.
If anyone could give me some help, that would be great! Thanks.

This one should work:
(011 \(13[124678]\) \d{10})
You can see a working demo that shows a couple of correct and incorrect inputs.

^011 \(13[124678]\) \d{10}$
seems to match all of the phone numbers I tried given your constraints
^ matches the start of string
011 matches only 011
\(13[124678]\) matches 134 132 131 138 136 or 137
\d{10} matches exactly 10 digits, using the digit character class \d with the {n} repetition syntax

/011 \(13[124678]\) \d{10}/g
Don't forget the g flag to match all the occurrences.
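
In engines without a `g` flag, the same effect comes from a find-all call. As a sketch, assuming Python (the sample text is made up for illustration), `re.findall` returns every occurrence:

```python
import re

PHONE = re.compile(r"011 \(13[124678]\) \d{10}")

# Made-up sample: two valid numbers and one with a disallowed area code (135)
text = (
    "Call 011 (134) 1234567892 or 011 (135) 1234567892, "
    "otherwise 011 (137) 0000000000."
)

matches = PHONE.findall(text)
print(matches)
```

The pattern is left unanchored here on purpose: with `^` and `$` it would only match when the whole string is a single phone number.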

Regular expression for dividing country calling codes

I have a list of calling codes for all countries (the phone number prefixes). I would like to split each entry into the
country name and the actual code so I can put them into an XML file.
I have tried back and forth but can not get a regexp going that takes all cases into account.
I think it is fairly simple for someone with a bit of experience.
The codes have these formats:
Afghanistan 93
Anguilla 1 264
Antarctica 6721
Antigua and Barbuda 1 268
Bosnia and Herzegovina 387
Canada 1
Congo, Republic of the 242
Cote d'Ivoire 225
Ireland (Eire) 353
United States of America 1
There are around 235 of them in total, but these are the regulars and the exceptions.
Something like ^[a-zA-Z\s,'()] for between 1 and X words, and then [0-9\s]{1,5}$ for the numbers:
X
XX
XXX
XXXX
X XXX
So if I should express it as a sentence it would be: "from beginning of a line, take all characters (1) including space,'() until you encounter digits, then take all of these including space(2) until you encounter a line break."
I am using TextMate, and the docs say:
TextMate uses the Oniguruma regular
expression library by K. Kosako.
I would appreciate any help given:)
Thank you.

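This POSIX regex is sufficient for the plain names: ^[a-zA-Z ]+[0-9 ]+$
To also cover names that contain a comma, an apostrophe or parentheses (Congo, Republic of the; Cote d'Ivoire; Ireland (Eire)), widen the first character class: ^[a-zA-Z ,'()]+[0-9 ]+$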
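
To actually separate the name from the code, the two parts need capture groups. The following Python sketch is one possible way to do that and emit XML; the named groups, the character class including `,'()`, and the element and attribute names (`countries`, `country`, `name`, `code`) are all assumptions for illustration, not something the question specifies.

```python
import re
import xml.etree.ElementTree as ET

# Lazy name part, then whitespace, then digits (possibly space-separated)
LINE = re.compile(r"^(?P<name>[A-Za-z ,'()]+?)\s+(?P<code>[0-9][0-9 ]*)$")

lines = [
    "Afghanistan 93",
    "Anguilla 1 264",
    "Antarctica 6721",
    "Congo, Republic of the 242",
    "Cote d'Ivoire 225",
    "Ireland (Eire) 353",
    "United States of America 1",
]

parsed = []
root = ET.Element("countries")
for line in lines:
    m = LINE.match(line.strip())
    if m is None:
        continue  # skip lines that do not fit the expected shape
    parsed.append((m.group("name"), m.group("code")))
    ET.SubElement(root, "country", {"name": m.group("name"), "code": m.group("code")})

print(ET.tostring(root, encoding="unicode"))
```

In an editor such as TextMate, the same idea would be a find-and-replace with two capture groups, using the groups in the replacement string.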
