I figured out a regular expression for my country's phone numbers, but something is missing.
The rule here is: (Area Code) Prefix - Suffix
Area Code could be 3 to 5 digits
Prefix could be 2 to 4 digits.
Area Code + Prefix is 7 digits long.
Suffix is always 4 digits long.
Total digits are 11.
I figured I could have 3 simple regex chained with an OR "|" like this:
/(\(?\d{3}\)?[- .]?\d{4}[- .]?\d\d\d\d)|(\(?\d{4}\)?[- .]?\d{3}[- .]?\d\d\d\d)|(\(?\d{5}\)?[- .]?\d{2}[- .]?\d\d\d\d)/
What I'm doing wrong is that \d\d\d\d doesn't restrict the suffix to exactly 4 digits. For example, (011) 4740-5000 is a valid phone number and matches correctly, but if I append extra digits it is still reported as valid, e.g. (011) 4740-5000000000
You should use ^ and $ to match the whole string.
For example, ^\d{4}$ will match exactly 4 digits, no more and no less.
Here is the complete regex pattern
^((\(?\d{3}\)? \d{4})|(\(?\d{4}\)? \d{3})|(\(?\d{5}\)? \d{2}))-\d{4}$
Since your pattern allows the delimiter to be -, . or a single space (or to be absent), try:
^((\(?\d{3}\)?[-. ]?\d{4})|(\(?\d{4}\)?[-. ]?\d{3})|(\(?\d{5}\)?[-. ]?\d{2}))[-. ]?\d{4}$
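To see what the anchors change, here is a minimal Python sketch (Python and the variable names are my own choice for illustration, they are not from the question). It runs the original unanchored pattern and the anchored one against both sample numbers:

```python
import re

# Pattern from the question, without anchors.
UNANCHORED = re.compile(
    r"(\(?\d{3}\)?[- .]?\d{4}[- .]?\d\d\d\d)"
    r"|(\(?\d{4}\)?[- .]?\d{3}[- .]?\d\d\d\d)"
    r"|(\(?\d{5}\)?[- .]?\d{2}[- .]?\d\d\d\d)"
)

# Anchored pattern from this answer.
ANCHORED = re.compile(
    r"^((\(?\d{3}\)?[-. ]?\d{4})|(\(?\d{4}\)?[-. ]?\d{3})|(\(?\d{5}\)?[-. ]?\d{2}))"
    r"[-. ]?\d{4}$"
)

valid = "(011) 4740-5000"
too_long = "(011) 4740-5000000000"

# Without anchors the regex only needs to find a match somewhere in the string,
# so the leading part of the over-long number is enough.
unanchored_valid = UNANCHORED.search(valid) is not None
unanchored_too_long = UNANCHORED.search(too_long) is not None

# With ^ and $ the whole string has to fit the pattern.
anchored_valid = ANCHORED.search(valid) is not None
anchored_too_long = ANCHORED.search(too_long) is not None

print(unanchored_valid, unanchored_too_long)  # True True
print(anchored_valid, anchored_too_long)      # True False
```

In Python specifically, re.fullmatch on the unanchored pattern would be another way to get the same effect, but explicit anchors carry over to other regex engines.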
This pattern works fine for me:
/^\(?(\d{3,5})?\)?\s?(15)?[\s|-]?(4)\d{2,3}[\s|-]?\d{4}$/
I've tested this in regex101:
/^((?:\(?\d{3}\)?[- .]?\d{4}|\(?\d{4}\)?[- .]?\d{3}|\(?\d{5}\)?[- .]?\d{2})[- .]?\d{4})$/
^ Matches the beginning of a string
( Beginning of capture group
(?: Beginning of non-capturing group
Your different options for area code & prefix
) End non-capturing group
[- .]?\d{4} The last four digits of the phone number
) End capture group
$ Matches the end of a string
If you're trying to validate such a phone number, then the following one should suit your needs:
^(?=.{15}$)[(]\d{3,5}[)] \d{2,4}-\d{4}$
You need to match the complete expression by indicating the start and end with anchors. You also don't need alternation for the different lengths.
/^(?=(\D*\d){11}$)\(?\d{3,5}\)?[- .]?\d{2,4}[- .]?\d{4}$/
Here's the breakdown:
(?=(\D*\d){11}$) is a positive lookahead ensuring that there are exactly 11 digits in total,
with any number of non-digits amongst them
\(?\d{3,5}\)?[- .]? matches 3-5 digits in parens (area code), followed by a separator
\d{2,4}[- .]? matches 2-4 digits (prefix), followed by a separator
\d{4} matches the suffix
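As a quick check of the breakdown above, here is a small Python sketch. The helper name is_valid_phone and the sample numbers (other than the two from the question) are made up for illustration:

```python
import re

# The lookahead counts exactly 11 digits; the rest of the pattern checks the layout.
PHONE = re.compile(r"^(?=(\D*\d){11}$)\(?\d{3,5}\)?[- .]?\d{2,4}[- .]?\d{4}$")


def is_valid_phone(text: str) -> bool:
    """Return True if text fits the (area code) prefix-suffix rule with 11 digits."""
    return PHONE.match(text) is not None


samples = [
    "(011) 4740-5000",        # 3 + 4 + 4 digits
    "(0114) 740-5000",        # 4 + 3 + 4 digits
    "(01147) 40-5000",        # 5 + 2 + 4 digits
    "(011) 474-5000",         # only 10 digits
    "(011) 4740-5000000000",  # too many digits
]

results = [is_valid_phone(s) for s in samples]
print(results)  # [True, True, True, False, False]
```

The lookahead is what lets a single \d{3,5} and \d{2,4} replace the three alternatives: the ranges allow any split, and the digit count rules out splits that do not add up to 11.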
I am trying to create a basic regular expression to match a phone number which can either use dots [.] or hyphens [-] as the separator.
The format is 123.456.7890 or 123-456-7890.
The expression I am currently using is:
\d\d\d[-.]\d\d\d[-.]\d\d\d\d
The issue here is that it also matches the phone numbers that have both separators in them which I want to be termed as invalid/not a match. For example, with my expression, 123.456-7890 and 123-456.7890 show up as a match, something I do not want happening.
Is there a way to do that?
Use a backreference:
^\d{3}([.-])\d{3}\1\d{4}$
Here is an explanation of the regex:
^ from the start of the number
\d{3} match any 3 digits
([.-]) then match AND capture either a dot or a dash separator
\d{3} match any 3 digits
\1 match the SAME separator seen earlier
\d{4} match any 4 digits
$ end of the number
You can use this regex:
^\d{3}([-.])\d{3}\1\d{4}$
The key point is that you capture the separator with parentheses, ([-.]),
and then reuse it with the backreference \1.
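A short Python sketch of the backreference in action, using the four numbers from the question (the variable names are only for illustration):

```python
import re

# ([.-]) captures the first separator; \1 requires the same character again.
SAME_SEPARATOR = re.compile(r"^\d{3}([.-])\d{3}\1\d{4}$")

numbers = ["123.456.7890", "123-456-7890", "123.456-7890", "123-456.7890"]

results = {n: SAME_SEPARATOR.match(n) is not None for n in numbers}

for number, ok in results.items():
    print(number, ok)
# 123.456.7890 True
# 123-456-7890 True
# 123.456-7890 False
# 123-456.7890 False
```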
I need to extract any number between 4 and 10 digits long that follows directly after 'PO#' or 'PO# ' (with a whitespace). I do not want to include the PO# in the extracted value, but I do need it as the criterion for locating the value within a string. If the number has fewer than 4 or more than 10 digits, I do not want to capture it and would like to ignore it.
A sample string would look like this:
PO#12445 for Vendor Enterprise
or
Invoice# 21412556 for Vendor Enterprise for PO# 12445
My current RegEX expression captures PO# with '#' and I use additional logic after the fact to remove the '#', however my expression is also capturing Invoice# and Inv# which I don't want it to do. I'd like it to only target PO#.
Current Expression: [P][O][#]\s*[0-9]{3,9}\d+\w
Any help would be greatly appreciated!
If you need only the digits, you can use \b(?<=PO#)\s?(\d{4,10})\b, with:
(?<=PO#): positive lookbehind, to be sure that this pattern (PO followed by #) is present before the needed pattern
\s?: 0 or 1 whitespace
(\d{4,10}): between 4 and 10 digits
\b: word boundaries, to avoid matching e.g. the first 10 digits of an 11-digit number, or 'SPO#'
Edit: Alexander Mashin is right about the lookbehind having to be fixed width, so \b(?<=PO#)\s?(\d{4,10})\b is better https://regex101.com/r/1KBQd1/5
Edit: added word boundaries
You can use a capturing group and repeat matching the digits 4-10 times using [0-9]{4,10}.
Note that [P][O][#] is the same as PO#
\bPO#\s*([0-9]{4,10})\b
\bPO#\s* Match PO# preceded by a word boundary and match 0+ whitespace chars
( Capture group 1
[0-9]{4,10} Match 4 - 10 digits
)\b Close group followed by a word boundary to prevent the match being part of a larger word
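Here is a minimal Python sketch of the capturing-group approach. The helper name extract_po is made up for illustration, and the last four test strings are my own additions to the two samples from the question:

```python
import re
from typing import Optional

# Group 1 holds only the digits; "PO#" is matched but not captured.
PO_NUMBER = re.compile(r"\bPO#\s*([0-9]{4,10})\b")


def extract_po(text: str) -> Optional[str]:
    """Return the 4-10 digit PO number following 'PO#', or None if there is none."""
    match = PO_NUMBER.search(text)
    return match.group(1) if match else None


print(extract_po("PO#12445 for Vendor Enterprise"))                         # 12445
print(extract_po("Invoice# 21412556 for Vendor Enterprise for PO# 12445"))  # 12445
print(extract_po("Invoice# 21412556 for Vendor Enterprise"))                # None
print(extract_po("PO# 123"))                                                # None
print(extract_po("PO#12345678901"))                                         # None
print(extract_po("SPO#12445"))                                              # None
```

Reading group 1 removes the need for any post-processing to strip the '#'.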
If PCRE is available, how about:
PO#\s*\K\d{4,10}(?=\D|$)
PO#\s* matches the leading substring "PO#" followed by 0 or more whitespaces.
\K resets the starting position of the match and works as a positive (zero length) lookbehind.
\d{4,10} matches a sequence of digits of 4 <= length <= 10.
(?=\D|$) is the positive lookahead to match a non-digit character or the end of the string.
Example string
fgcfghhfghfgch1234567890fghfghfgh fhghghfgh+916546546165fghfghfghfgh fhfghfghfghfgh+915869327425ghfghfghfgh
I want to match
1234567890
6546546165
5869327425
In essence I would like to do something like this: (?<=\+\d{2})?\d{10}.
Match 10 digits \d{10} which may follow ? a country code in format: \+\d{2}.
What would be a correct regular expression to do this?
Also, what should I do if the country code could be 3 digits long?
e.g.
+917458963214
+0047854123698
match 7854123698 and 7458963214.
Your expected matches all appear to be immediately followed with a char other than a digit.
I suggest making the pattern inside the positive lookbehind optional and adding (?!\d) lookahead at the end to fail the match (and thus triggering backtracking in the lookbehind) if the ten digits are immediately followed with a digit:
(?<=(?:\+\d{2,3})?)\d{10}(?!\d)
See the regex demo. Details:
(?<=(?:\+\d{2,3})?) - a positive lookbehind that requires + and two or three digits or an empty string immediately to the left of the current location
\d{10} - ten digits
(?!\d) - no digit allowed immediately on the right.
However, since in 99% of cases you can access captured substrings, you should just use a capturing group:
(?:\+\d{2,3})?(\d{10})
See this regex demo. Your values are in Group 1.
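A small Python sketch of the capturing-group version, run against the example string from the question. Python's re module does not accept the variable-length lookbehind shown earlier, so this is the variant that carries over directly; the variable names are only for illustration:

```python
import re

# The optional country code is matched but not captured; group 1 is the 10-digit number.
NUMBER = re.compile(r"(?:\+\d{2,3})?(\d{10})")

text = (
    "fgcfghhfghfgch1234567890fghfghfgh "
    "fhghghfgh+916546546165fghfghfghfgh "
    "fhfghfghfghfgh+915869327425ghfghfghfgh"
)

# With a single group in the pattern, findall returns the group 1 values.
matches = NUMBER.findall(text)
print(matches)  # ['1234567890', '6546546165', '5869327425']

# Two- and three-digit country codes from the follow-up question.
with_codes = NUMBER.findall("+917458963214 +0047854123698")
print(with_codes)  # ['7458963214', '7854123698']
```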
I'm trying to create a regex to retrieve the last number that appears before a given word in a string.
Examples:
6 łyżek stopionego masła -> 6
5 łyżek blabla, 6 łyżek masła -> 6
5 łyżek mąki lub masła -> 5
I'm matching only on masła (changing variable) so it has to be included in regex
EDIT:
I cannot explain what I actually need:
Here is regex101 example: https://regex101.com/r/pEeRk3/1
EDIT2:
Emma's solution works great, but I would need to parse decimals and multiple-digit numbers as well, meaning that those would match as well:
https://regex101.com/r/pEeRk3/3 - I added examples with answers in the link
If you want to match the last occurrence of a number with an optional decimal part, and your word has to follow this value, you might use lookarounds:
(?<!\S)\d+(?:\.\d+)?(?!\S)(?!.*\d)(?=.*masła)
(?<!\S)\d+(?:\.\d+)?(?!\S) Match 1+ digits with an optional part to match a dot and 1+ digits
(?!.*\d) assert that there are no more digits following
(?=.*masła) Assert what is on the right is your word
Or you might use a capturing group:
(?<!\S)(\d+(?:\.\d+)?)[^\d\n]* masła(?!\S)[^\d\n]*$
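Since the word is a changing variable, here is a Python sketch that builds the capturing-group pattern around it. The helper name number_before is made up, the use of re.escape is my own assumption about how the word would be inserted, and the last two sample strings are invented for illustration:

```python
import re
from typing import Optional


def number_before(word: str, text: str) -> Optional[str]:
    """Return the last number before `word`, with no other digits after it."""
    # Same structure as the pattern above, with the word escaped and inserted.
    pattern = rf"(?<!\S)(\d+(?:\.\d+)?)[^\d\n]* {re.escape(word)}(?!\S)[^\d\n]*$"
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(number_before("masła", "6 łyżek stopionego masła"))       # 6
print(number_before("masła", "5 łyżek blabla, 6 łyżek masła"))  # 6
print(number_before("masła", "5 łyżek mąki lub masła"))         # 5
print(number_before("masła", "2.5 łyżki masła"))                # 2.5
print(number_before("masła", "12 łyżek mąki"))                  # None
```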
This expression might simply suffice:
.*([0-9])
if we are interested in one digit only, or
.*([0-9]+)
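if multiple digits might be desired. Note that because the leading .* is greedy, the group still captures only the final digit unless a boundary is added, as in .*\b([0-9]+).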
If those strings with masła are desired, we can expand our expression to:
(?=.*masła).*([0-9])
If we don't need to validate the numbers and they may contain commas or dots, then this expression would likely return the desired output:
(?=.*masła)([0-9,.]+)(\D*)$
I currently have the following regular expression:
^GB([0-9]{9}([0-9]{3})?|[A-Z]{2}[0-9]{3})$
This works fine for:
GB999999973
GBGD001
GBHA599
As can be tested here: https://regex101.com/r/jU980W/1
However the problem is that it does not validate with:
GB999 9999 73
I tried adding space indicators to the regular expression but then the other formats aren't supported anymore.
Does anyone know a way to have this regular expression both accept with and without spaces for the GB VAT Number?
Thanks in advance!
^GB(?:\d{3} ?\d{4} ?\d{2}(?:\d{3})?|[A-Z]{2}\d{3})$
^ Assert position at the start of the line
GB Match this literally
(?:\d{3} ?\d{4} ?\d{2}(?:\d{3})?|[A-Z]{2}\d{3}) Match either of the following options
\d{3} ?\d{4} ?\d{2}(?:\d{3})? Option 1:
\d{3} Match exactly 3 digits
? Optionally match a space
\d{4} Match exactly 4 digits
? Optionally match a space
\d{2} Match exactly 2 digits
(?:\d{3})? Optionally match exactly 3 digits
[A-Z]{2}\d{3} Option 2:
[A-Z]{2} Match exactly 2 uppercase ASCII letters
\d{3} Match exactly 3 digits
$ Assert position at the end of the line
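A quick Python sketch to check the pattern against the formats from the question. The helper name is_gb_vat and the last two test values are my own additions for illustration:

```python
import re

# Optional single spaces are allowed after the 3rd and 7th digits of the numeric form.
GB_VAT = re.compile(r"^GB(?:\d{3} ?\d{4} ?\d{2}(?:\d{3})?|[A-Z]{2}\d{3})$")


def is_gb_vat(value: str) -> bool:
    """Return True if value looks like a GB VAT number in one of the accepted formats."""
    return GB_VAT.match(value) is not None


candidates = [
    "GB999999973",    # 9 digits, no spaces
    "GB999 9999 73",  # 9 digits, spaced
    "GBGD001",        # 2 letters + 3 digits
    "GBHA599",        # 2 letters + 3 digits
    "GB99999997",     # only 8 digits
    "GB9999 999 73",  # spaces in the wrong places
]

results = [is_gb_vat(c) for c in candidates]
print(results)  # [True, True, True, True, False, False]
```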