Writing a Regular Expression - regex

I am trying to write a single regular expression to search for a phone number similar to
011 (134) 1234567892.
The country code must be 011. The area code in parentheses can be 134, 132, 131, 138, 136, or 137. The last 10 digits can be anything. I have this:
((\<011[\-\. ])?(\(|\<)\d\d\d[\)\.\-/]?)?\<\d\d\d\d\d\d\d\d\d\d\>
but it is only giving me one result.
If anyone could please give me some help, that would be great! Thanks.

This one should work:
(011 \(13[124678]\) \d{10})
You can see a working DEMO that shows a couple of correct and incorrect inputs.

^011 \(13[124678]\) \d{10}$
seems to match all of the phone numbers I tried, given your constraints:
^ matches the start of the string
011 matches only the literal 011
\(13[124678]\) matches (131), (132), (134), (136), (137), or (138), parentheses included
\d{10} matches exactly 10 digits, using the digit character class \d with the {n} repetition syntax
$ matches the end of the string

/011 \(13[124678]\) \d{10}/g
Don't forget the g flag to match all the occurrences.
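
For readers working in Python, here is a minimal sketch of the same idea. The `g` flag has no direct equivalent there; `findall` returns every occurrence instead. The names `PHONE_RE`, `find_phone_numbers`, and the sample string are made up for illustration.

```python
import re

# Pattern from the answers above, without ^/$ anchors so that it can
# find several numbers inside a longer string.
PHONE_RE = re.compile(r"011 \(13[124678]\) \d{10}")


def find_phone_numbers(text: str) -> list:
    """Return every substring of `text` that matches PHONE_RE."""
    # The pattern has no capturing groups, so findall yields whole matches.
    return PHONE_RE.findall(text)


# Hypothetical input: area code 135 is not in the allowed set.
sample = "a 011 (134) 1234567892 b 011 (135) 1234567892 c 011 (137) 0000000000"
print(find_phone_numbers(sample))
```

Because the pattern is unanchored, it would also match the first 10 digits of a longer digit run; add `\b` or anchors if that matters for your data.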

Related

Regex to capture and reposition the same pattern

I have a list of numbers that I would like to reformat, but I'm having difficulty with (I think) the substitution -- I'm capturing the groups as I intend to, but they aren't being rendered the way I expect them to be.
Here's some of the text:
Rear seal:
102
111
112
113
137
156
And the expected output is this:
Rear seal:
102 111 112
113 137 156
I'm using this regex to distinguish the first, second, and third lines:
(\d{3}[\n\r])(\d{3}[\n\r])(\d{3}[\n\r]) coupled with \1\t\2\t\3\n for the substitution. But for some reason it comes out as
Rear seal:
102
111
112
113
137
156
I'm using the excellent site regex101.com for testing, but I could use some human input. Specific link is
https://regex101.com/r/R7niEU/1 for this issue.
Thanks in advance.
You are capturing the newline inside each capturing group, so it also becomes part of the replacement.
Instead, capture only the digits and match the newline outside the groups:
(\d{3})[\n\r](\d{3})[\n\r](\d{3})[\n\r]
Then replace with \1\t\2\t\3\n
Regex demo
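
A minimal Python sketch of that substitution, assuming the input uses a single `\n` or `\r` per line break (files with `\r\n` endings would need `\r?\n` instead). `TRIPLE_RE` and `regroup` are illustrative names.

```python
import re

# Digits are captured; the line breaks are matched but left outside the groups.
TRIPLE_RE = re.compile(r"(\d{3})[\n\r](\d{3})[\n\r](\d{3})[\n\r]")


def regroup(text: str) -> str:
    """Join every run of three 3-digit lines into one tab-separated line."""
    return TRIPLE_RE.sub(r"\1\t\2\t\3\n", text)


source = "Rear seal:\n102\n111\n112\n113\n137\n156\n"
print(regroup(source))
```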

Regex match multiple numbers stop at string (word) despite more matches exist

Goal:
Match all variations of phone numbers with 8 digits plus an optional country code.
Stop matching when "keyword" is found, even if more matches exist after the "keyword".
I need this as a one-liner. I have tried a plethora of variations with lookahead/lookbehind and a negated class [^keyword], but I cannot work out how to achieve this.
Example text:
abra 90998855
kadabra 04 94 84 54
cat 132 23 564
oh the nice Hat +41985 32 565
+17 98 56 32 56
Ladida
keyword
I Want It To Stop Matching Here Or Right Before The "keyword"
more nice text with some matches
cat 132 23 564
oh the nice Hat +41985 32 565
+17 98 56 32 56
Example regexes:
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})
-> This matches all numbers also below the keyword
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})[^keyword]
-> This matches all numbers also below the keyword
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})(?!keyword)
-> This matches all numbers also below the keyword
(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})(?=keyword)
-> This matches nothing
((\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})(?:(?!keyword))*)
-> This matches all numbers also below the keyword
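
Some background on why these attempts behave as they do: `[^keyword]` is a character class that matches one character other than k, e, y, w, o, r, or d, and `(?!keyword)` / `(?=keyword)` only inspect the text immediately after the number. The sketch below is not from the thread and is not a single regex: it sidesteps the problem by cutting the text at the first occurrence of the keyword and running the original pattern on what comes before. `numbers_before` and `TEXT` are illustrative names.

```python
import re

# The first pattern from the question, unchanged.
NUMBER_RE = re.compile(
    r"(\+\d{1,2})?[\s]?\(?\d{2,3}\)?[\s]?(\d{2})[\s]?(\d{2})?[\s]?(\d{2,3})"
)

TEXT = """abra 90998855
kadabra 04 94 84 54
cat 132 23 564
oh the nice Hat +41985 32 565
+17 98 56 32 56
Ladida
keyword
I Want It To Stop Matching Here Or Right Before The "keyword"
more nice text with some matches
cat 132 23 564
oh the nice Hat +41985 32 565
+17 98 56 32 56
"""


def numbers_before(text: str, keyword: str) -> list:
    """Return the numbers found before the first occurrence of `keyword`."""
    # partition() returns the whole text as `head` if the keyword is absent.
    head, _, _ = text.partition(keyword)
    # finditer + group(0) gives the full match; findall would return group tuples.
    return [m.group(0).strip() for m in NUMBER_RE.finditer(head)]


print(numbers_before(TEXT, "keyword"))
```

Engines that support backtracking control verbs or variable-length lookbehind may allow a true one-liner, but that depends on the regex flavour in use.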

phone number RegEx not working for some strings

I want to recognize a phone number as 9 consecutive digits, which can be separated by white space, non-breaking spaces, etc., using the regex "(\s*\d\s*){9}"
I run a VBA macro (JS RegEx), and here are example strings that work fine with the above regex:
ul. 27 Grudnia 16, tel. 21 287 31 61, fax 61 286 69 60 –
ul. Wrzosowa 110/120/222, kom. 692 601 428
And here is an example where the phone number is not detected in VBA, but is detected by online JS regex tools:
al. Mazowieckiego 63, kom. 622 769 694 –
The strings that are detected and those that are not have the same structure, so I have no idea why VBA doesn't detect the phone number in some of them.
It turned out that VBA had changed some of the strings being searched: it replaced an ordinary space, Chr(32), with a non-breaking space, Chr(160).
Removing Chr(160) from the searched string solves the problem.
I will also try to find a regex that allows non-breaking spaces, because \s* does not match them, at least in VBA.
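
One way to allow non-breaking spaces is to add the character explicitly to the whitespace class, for example `[\s\xA0]`. The sketch below illustrates this in Python rather than VBA. Python's `\s` already matches U+00A0 by default, so `re.ASCII` is used here purely to imitate an engine whose `\s` does not. Whether the `\xA0` escape is accepted by the VBA regex object is an assumption to verify.

```python
import re

# re.ASCII limits \s to ASCII whitespace, imitating an engine whose \s
# skips U+00A0 (an assumption about the VBA behaviour described above).
NARROW_RE = re.compile(r"(\s*\d\s*){9}", re.ASCII)

# Same pattern with the non-breaking space added to the class explicitly.
WIDE_RE = re.compile(r"([\s\xa0]*\d[\s\xa0]*){9}", re.ASCII)

# The failing example, with non-breaking spaces between the digit groups.
sample = "al. Mazowieckiego 63, kom. 622\u00a0769\u00a0694"

print(NARROW_RE.search(sample))        # no 9-digit run is found
match = WIDE_RE.search(sample)
print(re.sub(r"\D", "", match.group()))
```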

Italian phone 10-digit number regex issue

I'm trying to use the regex from this site
/^([+]39)?((38[{8,9}|0])|(34[{7-9}|0])|(36[6|8|0])|(33[{3-9}|0])|(32[{8,9}]))([\d]{7})$/
for Italian mobile phone numbers, but a simple number such as 3491234567 comes out as invalid.
(I don't care about spaces, as I'll trim them.)
should pass:
349 1234567
+39 349 1234567
TODO: 0039 349 1234567
TODO: (+39) 349 1234567
TODO: (0039) 349 1234567
regex101 and regexr both pass the validation..what's wrong?
UPDATE:
To clarify:
The regex should match any number that starts with either
388/389/380 (38[{8,9}|0])|
or
347/348/349/340 (34[{7-9}|0])|
or
366/368/360 (36[6|8|0])|
or
333/334/335/336/337/338/339/330 (33[{3-9}|0])|
328/329 (32[{8,9}])
plus 7 digits ([\d]{7})
and the +39 at the start optionally ([+]39)?
The following regex appears to fulfill your requirements. I took out the syntax errors and guessed a bit, and added the missing parts to cover your TODO comments.
^(\((00|\+)39\)|(00|\+)39)?(38[890]|34[7-90]|36[680]|33[3-90]|32[89])\d{7}$
Demo: https://regex101.com/r/yF7bZ0/1
Your test cases fail to cover many of the variations captured by the regex; perhaps you'll want to beef up the test set to make sure it does what you want.
The beginning allows for an optional international prefix with or without the parentheses. The basic pattern is (00|\+)39 and it is repeated with or without parentheses around it. (Perhaps a better overall approach would be to trim parentheses and punctuation as well as whitespace before processing begins; you'll want to keep the plus as significant, of course.)
Updated with information from @Edoardo's answer; wrapped for legibility and added comments:
^ # beginning of line
(\((00|\+)39\)|(00|\+)39)? # country code or trunk code, with or without parentheses
( # followed by one of the following
32[89]| # 328 or 329
33[013-9]| # 33x where x != 2
34[04-9]| # 34x where x not in 1,2,3
35[01]| # 350 or 351
36[068]| # 360 or 366 or 368
37[019]| # 370 or 371 or 379
38[089]) # 380 or 388 or 389
\d{6,7} # ... followed by 6 or 7 digits
$ # and end of line
There are obvious accidental gaps which will probably also get filled over time. Generalizing this further is likely to improve resilience toward future changes, but of course may at the same time increase the risk of false positives. Make up your mind about which is worse.
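
The commented layout can be used directly in engines with a free-spacing mode. Below is a minimal Python sketch using `re.VERBOSE`; note the `|` after the `37[019]` branch, which is needed for 37x and 38x to be separate alternatives. Whitespace is stripped first, as the question says it will be. `ITALIAN_MOBILE_RE` and `is_italian_mobile` are illustrative names, and the prefix list is only as current as the answer above.

```python
import re

ITALIAN_MOBILE_RE = re.compile(
    r"""
    ^                           # beginning of line
    (\((00|\+)39\)|(00|\+)39)?  # country code, with or without parentheses
    (                           # followed by one of the following
      32[89]|                   # 328 or 329
      33[013-9]|                # 33x where x != 2
      34[04-9]|                 # 34x where x not in 1,2,3
      35[01]|                   # 350 or 351
      36[068]|                  # 360 or 366 or 368
      37[019]|                  # 370 or 371 or 379
      38[089])                  # 380 or 388 or 389
    \d{6,7}                     # ... followed by 6 or 7 digits
    $                           # and end of line
    """,
    re.VERBOSE,
)


def is_italian_mobile(raw: str) -> bool:
    """Strip all whitespace, then test the number against the pattern."""
    cleaned = re.sub(r"\s+", "", raw)
    return ITALIAN_MOBILE_RE.match(cleaned) is not None


for candidate in ["349 1234567", "(0039) 349 1234567", "312 1234567"]:
    print(candidate, is_italian_mobile(candidate))
```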
I found this and updated it with new operators and MVNO prefixes (Iliad, ho.):
^(\((00|\+)39\)|(00|\+)39)?(38[890]|34[4-90]|36[680]|33[13-90]|32[89]|35[01]|37[019])\d{6,7}$
I improved the regex by adding a case that handles spaces between the digit groups (including an optional space after the country code):
^(\((00|\+)39\)|(00|\+)39)?\s?(38[890]|34[4-90]|36[680]|33[13-90]|32[89]|35[01]|37[019])(\s?\d{3}\s?\d{3,4}|\d{6,7})$
So, for example, I can match a phone number like (0039) 349 123 4567 or 349 123 4567.
Following doc:
https://it.qaz.wiki/wiki/Telephone_numbers_in_Italy
A simple regex for Italian mobile numbers without special characters is:
/^3[0-9]{8,9}$/
It matches a string that starts with the digit '3' followed by 8 or 9 digits, e.g.:
3345678103
You can then add the Italian prefix, such as '+39 ' or '0039 ' (the + must be escaped):
/^\+39 3[0-9]{8,9}$/ --- match --> +39 3345678103
/^0039 3[0-9]{8,9}$/ --- match --> 0039 3345678103

regex - if the first digit is 1 return 1 but if it is 145 return 145 but if its 133 return 133

here is my regex demo
as the question states:
if the first digit is 1 return 1 but if it is 145 return 145 but if its 133 return 133
sample data:
K'8134567
K'81345678
K'6134516789
K'61345678
K'643456
K'646345678
K'1234567890
K'12345678901
K'1454567890 <<<--- want 145 returned and not 1
K'13345678901 <<<--- want 133 returned and not 1
K'3214567890123
K'32134567890123
K'3654567890123
K'8934567890123
K'6554567890123
regex expression:
K'(?|(?P<name1>81)\d+|(61)\d+|(64)\d+|(1)\d+|(44)\d+|(86)\d+|(678)\d+|(41)\d+|(49)\d+|(33)\d+|(685)\d+|(\d{1,3})\d+)
the regex explained:
I am interested in the digits after K'
I am looking to do this using regex but not sure if it can be done.
What I want is:
if the number starts with 81 return 81
if the number starts with 61 return 61
...
if the number starts with something I am not interested in, return "other" (or its first 1-3 digits)
The above criteria work,
but my question is how do I do the following:
if the first digit is 1, then return 1, BUT
if the first digit is 1 and the 2nd and 3rd digits are 45, return 145 and don't return just 1
if the first digit is 1 and the 2nd and 3rd digits are 33, return 133 and don't return just 1
I presume I have to put something inside this part of the regex |(1)\d+|
Some other questions for my own reference:
Does regex sort the data first?
Is the order of the alternatives in the regex important to how it is evaluated? Ideally I do not want this.
You can use this regex:
K'(?P<name1>81|61|64|44|86|678|41|49|33|685|1(?:33|45)?|\d{2,3})\d+
Updated RegEx Demo
Try with:
K'(?|(?P<name1>81)\d+|(61)\d+|(64)\d+|(1(?:45|33)?)\d+|(44)\d+|(86)\d+|(678)\d+|(41)\d+|(49)\d+|(33)\d+|(685)\d+|(\d{1,3})\d+)
DEMO
Regex doesn't sort anything, but the order of the alternatives in your regex is important. The details differ a bit between regex engines, but since most engines use a traditional NFA (backtracking) to parse the string, alternatives are tried from left to right and the order matters.
In this case you can simply use the following regex, or add it to yours:
(?<=K')1(?:45|33)?
See demo https://regex101.com/r/rT2yJ0/1
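
A small Python sketch of both points. Python's `re` module does not support the branch-reset group `(?|...)`, so this uses the single-group pattern suggested first. The last lines show that alternation order matters in a backtracking engine. `PREFIX_RE` and `prefix_of` are illustrative names.

```python
import re

# Single named group; 1(?:33|45)? tries the longer 133/145 before settling for 1.
PREFIX_RE = re.compile(
    r"K'(?P<name1>81|61|64|44|86|678|41|49|33|685|1(?:33|45)?|\d{2,3})\d+"
)


def prefix_of(value: str):
    """Return the captured prefix, or None if the value does not match."""
    m = PREFIX_RE.match(value)
    return m.group("name1") if m else None


for value in ["K'1234567890", "K'1454567890", "K'13345678901", "K'8134567"]:
    print(value, "->", prefix_of(value))

# Alternatives are tried left to right, so the first one that succeeds wins.
print(re.match(r"1|145", "145").group())
print(re.match(r"145|1", "145").group())
```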