I am working on Siebel CRM. I have issues with spaces in my regex.
I have SSNs in these formats:
123 456 789
123-456-789
123 45 6789
I need to display my SSN like XXX-XX-4567. My regex looks like:
([\s.:])(?!000)(?!666)(?!9[0-9][0-9])\d{3}[- ]?(?!00)\d{2}[- ]?(?!0000)\d{4})([\s.:]) |
([\s.:])(?!000)(?!666)(?!9[0-9][0-9])\d{3}[- ]?(?!00)\d{3}[- ]?(?!00)\d{3})([\s.:]).
How can I remove all the blank spaces matched by the above expression and display the SSN in the format I mentioned above?
It looks like there are syntax errors in your RegEx. There are a couple of unmatched brackets: for example, at (?!0000)\d{4}) in the first section, the last closing bracket has no matching opening bracket.
I think I've managed to write the regex you're looking for, but a bit shorter than the one you were using:
([\s.:])((?!000)(?!666)(?!9[0-9]{2})\d{3})[- ]?((?!00)\d{2,3})[- ]?((?!00)\d{3,4})([\s.:])
This will match the following strings:
123-12-1234
123 456 789
123-456-789
123 45 6789
But will not match the following:
666-45-1234
abc-12-1232
123-00-1233
123-224-0011
123 224 0000
There are several capture groups here:
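1) Matches the delimiter before the number: a whitespace character, period, or colon (you may want to change this).
2) Matches the first three-digit number.
3) Matches the second, two- or three-digit number.
4) Matches the third, three- or four-digit number.
5) Matches the delimiter after the number: a whitespace character, period, or colon (you may want to change this).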
You should be able to reconstruct the SSN in the format you need with the result of this RegEx.
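As a rough sketch of that reconstruction step, here is how it could look in Python's re module (an assumption for illustration only; the question is about Siebel, whose regex flavour may differ). The helper name mask_ssn is hypothetical, and the two delimiter groups ([\s.:]) are swapped for the lookarounds (?<!\d) and (?!\d) so that the surrounding characters are left untouched. The three digit groups are concatenated and only the last four digits are shown.

```python
import re

# Same digit groups as the regex above; the delimiter groups are replaced by
# lookarounds (an assumption) so the characters around the SSN are not consumed.
SSN_RE = re.compile(
    r"(?<!\d)"
    r"((?!000)(?!666)(?!9[0-9]{2})\d{3})[- ]?"
    r"((?!00)\d{2,3})[- ]?"
    r"((?!00)\d{3,4})"
    r"(?!\d)"
)


def mask_ssn(text):
    """Replace every 9-digit SSN in text with XXX-XX-<last four digits>."""

    def repl(match):
        digits = match.group(1) + match.group(2) + match.group(3)
        if len(digits) != 9:
            # 3+2+3 or 3+3+4 digits is not an SSN; leave it unchanged.
            return match.group(0)
        return "XXX-XX-" + digits[-4:]

    return SSN_RE.sub(repl, text)


for sample in ["123 456 789", "123-456-789", "123 45 6789", "666-45-1234"]:
    print(sample, "->", mask_ssn(sample))
```

Because the spaces and dashes sit outside the capture groups, they are dropped automatically when the groups are joined.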
I have a problem that I don't understand.
My Regex:
My Test-String:
I have two issues and one general question :)
As you can see in my test string, the very last (German) phone number (the big yellow one in the test-string attachment) is not matched correctly by my regex pattern. I don't understand what the problem is here: the "0049" ends up in group 5 but should be in group 2. Why is that?
My second problem: how can I get rid of the spaces before and after every match? (The 7 small yellow circles in the test-string attachment.)
For copy/paste purposes, here is the Regex and Test-String again:
Regex:
((\+\d{2}|00\d{2})?([ ])?(\()?(\d{2,4})(\))?([-| |/])?(\d{3,})([ ])?(\d+)?([ ])?(\d+)?)
Test-String:
Vorwahl 089, die E.123 ebenfalls , also (089) 1234567. Die DIN 5008, also +49 89 1234567 respectivly 0049 89 1234567. Die E.123 empfiehlt, also +49 89 123456 0 respectivly 0049 89 123456 0 oder +49 89 123456 789. Also +49 89 123 456 789. Klammern 089/1234567 und 0151 19406041. Test +49 151 123 456 789 respectivly 0049 151 123 456 789
Last but not least, my general question:
Is it a good approach to group each logical part as I did in my example?
One last piece of information: I validate my regex with https://regex101.com/ and use it in Python with the re module.
What makes it unpredictable is the large number of optional groups (..)?.
As a first step I recommend replacing ([ ])?(\d+)? with the coupled expression ([ ]?\d+)?, which avoids spaces at the end of the match (your point #2).
As a second step I recommend coupling the first optional space with the expression for the international dialling prefix: ((\+|00)\d{2}([ ])?)?. Now we are lucky, because this solves both the space at the beginning and the recognition of the whole number, since there are fewer possible ways to match.
The new expression now looks like this:
(((\+|00)\d{2}([ ])?)?(\()?(\d{2,4})(\))?([-| |/])?(\d{3,})([ ]?\d+)?([ ]?\d+)?)
I now recommend simplifying the last part, if you don't need the individual group values:
(((\+|00)\d{2}([ ])?)?(\()?(\d{2,4})(\))?([-| |/])?(\d{3,})([ ]?\d+){0,2})
For better performance I suggest you remove the parentheses/groups where possible, or mark them as non-capturing, if you don't need the specific group values.
In some programming languages you will not need the outermost parentheses, as the whole match is always group 0.
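A minimal sketch of the simplified expression in Python's re module, with every group made non-capturing and the outer parentheses dropped. The names PHONE_RE and TEST are only illustrative. One deliberate change, labelled here as an assumption about intent: the separator class is written [- /] instead of [-| /], because a | inside a character class is a literal pipe, not an alternation.

```python
import re

PHONE_RE = re.compile(
    r"(?:(?:\+|00)\d{2} ?)?"  # optional country code plus its trailing space
    r"\(?\d{2,4}\)?"          # area code, optionally in parentheses
    r"[- /]?"                 # optional separator (literal | dropped, see above)
    r"\d{3,}"                 # first block of the subscriber number
    r"(?: ?\d+){0,2}"         # up to two more blocks, each with its leading space
)

TEST = (
    "Vorwahl 089, die E.123 ebenfalls , also (089) 1234567. Die DIN 5008, "
    "also +49 89 1234567 respectivly 0049 89 1234567. Die E.123 empfiehlt, "
    "also +49 89 123456 0 respectivly 0049 89 123456 0 oder +49 89 123456 789. "
    "Also +49 89 123 456 789. Klammern 089/1234567 und 0151 19406041. "
    "Test +49 151 123 456 789 respectivly 0049 151 123 456 789"
)

# group(0) is the whole match, so no outer capturing group is needed.
matches = [m.group(0) for m in PHONE_RE.finditer(TEST)]
for number in matches:
    print(repr(number))
```

Printing with repr makes it easy to see that no match starts or ends with a space.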
I'm a regex newbie and I've got a valid regex for SSNs:
/^(\d{3}(\s|-)?\d{2}(\s|-)?\d{4})|[\d{9}]*$/
But I now need to expand it to accept either an SSN or another alphanumeric ID of 7 characters, like this:
/^[a-zA-Z0-9]{7}$/
I thought it'd be as simple as grouping the SSN and adding an OR | but my tests are still failing. This is what I've got now:
/^((\d{3}(\s|-)?\d{2}(\s|-)?\d{4})|[\d{9}])|[a-zA-Z0-9]{7}$/
What am I doing wrong? And is there a more elegant way to say either SSN or my other ID?
Thanks for any helpful tips.
Valid SSNs:
123-45-6789
123456789
123 45 6789
Valid ID: aCe8999
I have also modified your first regex a bit; below is a demo program. This is based on my understanding of the problem. Let me know if any modification is needed.
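use strict;
use warnings;
use feature 'say';

# Sample inputs; the ones marked "invalid" should not match.
my @ids = (
'123-45-6789',
'123456789',
'123 45 6789',
'1234567893434', # invalid
'123456789wwsd', # invalid
'aCe8999',
'aCe8999asa' # invalid
);
for (@ids) {
# Under /x, whitespace in the pattern is ignored except inside [ ].
say "match = $&" if $_ =~ /^ (?:\d{3} ([ \-])? \d{2} \1? \d{4})$ | ^[a-zA-Z0-9]{7}$/x ;
}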
Output:
match = 123-45-6789
match = 123456789
match = 123 45 6789
match = aCe8999
Your first regex has some problems. Most importantly, it accepts {{{{}}}}}, which means the character class is wrong: [\d{9}] matches a single character that is a digit, { or }, rather than nine digits. It also matches 123-45 6789 (notice the mix of space and dash).
To express OR in regular expressions you use the pipe |, but remember that each anchor belongs only to the alternative it sits in. For example, ^1|2$ matches strings that begin with 1 or end with 2, not just the two exact strings 1 and 2.
For an exact match you need ^1$|^2$ or ^(1|2)$.
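To make the anchor scoping concrete, here is a small sketch in Python's re module (the variable names loose and strict are just illustrative):

```python
import re

loose = re.compile(r"^1|2$")      # "starts with 1" OR "ends with 2"
strict = re.compile(r"^(1|2)$")   # exactly "1" or exactly "2"

for s in ["1", "2", "1abc", "abc2", "12"]:
    print(f"{s!r}: loose={bool(loose.search(s))} strict={bool(strict.search(s))}")
```

Only the grouped version rejects inputs such as 1abc and abc2.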
With the second regex, ^[a-zA-Z0-9]{7}$, you are not requiring a mixed alphanumeric ID of 7 characters; you are accepting any 7 characters that are each a letter or a digit, so purely numeric or purely alphabetic strings match too. It matches 1234567, for example. If this is not a problem, the following regex fixes the issues mentioned above:
^\d{3}([ -]?)\d\d\1\d{4}$|^[a-zA-Z0-9]{7}$
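A quick way to check this, sketched in Python's re module (the names ID_RE, OLD_RE and is_valid are only illustrative). The backreference \1 forces the second separator to be the same as the first, which is what rejects the mixed 123-45 6789. The original pattern from the question is included for comparison.

```python
import re

# Regex from this answer: SSN with consistent separators, or a 7-character ID.
ID_RE = re.compile(r"^\d{3}([ -]?)\d\d\1\d{4}$|^[a-zA-Z0-9]{7}$")

# Original regex from the question, kept only to show the difference.
OLD_RE = re.compile(r"^(\d{3}(\s|-)?\d{2}(\s|-)?\d{4})|[\d{9}]*$")


def is_valid(value):
    """Return True if value is an SSN or a 7-character ID per ID_RE."""
    return ID_RE.search(value) is not None


for value in ["123-45-6789", "123456789", "123 45 6789", "aCe8999",
              "123-45 6789", "{{{{}}}}}", "aCe8999asa"]:
    print(f"{value!r}: new={is_valid(value)} old={OLD_RE.search(value) is not None}")
```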
I want to allow only some specific types of phone number format.
Ex:
xxx-xxx-xxxx
+91-xxxxxxxxxx.
I don't know what will be the regular expression for this.
I referred to some sites and got this:
/^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$/
which works for the first format but not for the +91 format.
Basically, I want to allow India and US numbers only.
2nd Question:
I want a regular expression which will allow +, -, (, ), and the numbers i.e. 0-9 .
Here is a suggestion that accepts your inputs, with whitespace or - as the separator:
\+91[\s-]\d{10}|\(?\d{3}\)?[\s-]\d{3}[\s-]\d{4}
The following inputs are valid:
123-456-7890
123 456 7890
(123) 123-6547
(999)-999-9999
+91-1234567890
+91 1234567890
If you only want to accept - as a separator, replace every [\s-] in the regex with -.
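The pattern above has no ^ and $ anchors, so for validation it needs to be applied to the whole input. A minimal sketch in Python, assuming re.fullmatch is acceptable for that (the helper name is_allowed is hypothetical):

```python
import re

PHONE_RE = re.compile(r"\+91[\s-]\d{10}|\(?\d{3}\)?[\s-]\d{3}[\s-]\d{4}")


def is_allowed(number):
    """True if the whole string is a +91 number or a US-style number."""
    # fullmatch anchors both alternatives to the start and end of the input.
    return PHONE_RE.fullmatch(number) is not None


for number in ["123-456-7890", "(999)-999-9999", "+91-1234567890",
               "+91 1234567890", "1234567890", "+44 1234567890"]:
    print(number, is_allowed(number))
```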
Are you looking for something like ^(\d{3}-\d{3}-\d{4}|\+91 \d{10})
If you work with groups, both numbers end up in group one this way.
Edit:
Or are you looking for one like this? It allows the sets of numbers to be separated by anything except a letter or a number: (\d{3}[^\d\w]\d{3}[^\d\w]\d{4}|\+91[^\d\w]\d{10})
How do I create a regex that matches telephones with or without spaces in the number?
I have found:
^\+?\d+$
From another post but how do I modify that to allow 0 or more spaces in the number?
The first thing you need to think about is the exact format you want for phone numbers containing spaces. E.g.:
+535 233 4444
Is that one OK? It is divided into groups of 3, 3 and 4 digits. You can adapt the following regex to your needs:
^\+?\d{3}\s?\d{3}\s?\d{4}$
Just change the quantifiers ({3}, {4}, etc.) to change the group lengths.
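A short sketch of both options in Python's re module. GROUPED_RE is the 3-3-4 pattern from this answer, with \d written correctly. LOOSE_RE is a hypothetical variant, not from the original posts, that extends ^\+?\d+$ to allow any number of spaces between digits while still rejecting leading and trailing spaces.

```python
import re

# Fixed 3-3-4 grouping, each gap being an optional single whitespace character.
GROUPED_RE = re.compile(r"^\+?\d{3}\s?\d{3}\s?\d{4}$")

# Hypothetical looser variant: digits with zero or more spaces between them.
LOOSE_RE = re.compile(r"^\+?\d(?: *\d)*$")

for number in ["+535 233 4444", "5352334444", "535  233 4444", "+535 233 444"]:
    print(f"{number!r}: grouped={bool(GROUPED_RE.search(number))} "
          f"loose={bool(LOOSE_RE.search(number))}")
```

The grouped pattern is stricter about layout; the loose one only checks that the input is digits and spaces.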
This is one example:
/^(?:\s*\d{3})?\s*\d{3}\s*\d{4}\s*$/
There's a lot of ways to match telephone numbers (and a lot of valid telephone formats). Here's a simple regex to match "5555555555", "555 555 5555", "(555) 555-5555", "555-555-5555", or "555.555.5555"
^\(?\d{3}\)?( |-|\.)?\d{3}( |-|\.)?\d{4}$
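A sketch in Python's re module that checks the five formats listed above, assuming the intended pattern escapes the literal parentheses and the dot (the name US_PHONE_RE is only illustrative). Without the escapes, ( starts a group and . matches any character.

```python
import re

US_PHONE_RE = re.compile(r"^\(?\d{3}\)?( |-|\.)?\d{3}( |-|\.)?\d{4}$")

samples = [
    "5555555555",
    "555 555 5555",
    "(555) 555-5555",
    "555-555-5555",
    "555.555.5555",
    "555x555x5555",  # rejected because the dot is escaped
]

for sample in samples:
    print(sample, bool(US_PHONE_RE.search(sample)))
```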
I would like to extract portion of a text using a regular expression. So for example, I have an address and want to return just the number and streets and exclude the rest:
2222 Main at King Edward Vancouver BC CA
But the addresses vary in format most of the time. I tried using a lookahead regex and came up with this expression:
.*?(?=\w* \w* \w{2}$)
The above expression handles the above example nicely, but it gets way too messy as soon as commas come into the text, or postal codes, which can be a 6-character string or two 3-character strings with a space in the middle, etc.
Is there a more elegant way of extracting a portion of text other than a lookahead regex?
Any suggestion or a point in another direction is greatly appreciated.
Thanks!
Regular expressions are for data that is REGULAR, that follows a pattern. So if your data is completely random, no, there's no elegant way to do this with regex.
On the other hand, if you know what values you want, you can probably write a few simple regexes, and then just test them all on each string.
Ex.
regex1 = address-number grabber, regex2 = street-type grabber, regex3 = name grabber.
Attempt a match on string1 with regex1, regex2, and finally regex3. Move on to the next string.
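The idea could be sketched in Python along these lines. Everything here is hypothetical: the field names, the three patterns, and the sample postal code are assumptions about what the data looks like, not something taken from the question.

```python
import re

# Hypothetical extractors; each pattern encodes an assumption about the data.
EXTRACTORS = {
    "number": re.compile(r"^\d+"),                             # leading house number
    "region": re.compile(r"\b[A-Z]{2},? [A-Z]{2}\b"),          # e.g. "BC CA" or "BC, CA"
    "postal_code": re.compile(r"\b[A-Z]\d[A-Z] ?\d[A-Z]\d\b"),  # 6 chars or 3 + 3
}


def extract_fields(address):
    """Try every extractor on the address; fields that do not match are None."""
    fields = {}
    for name, pattern in EXTRACTORS.items():
        match = pattern.search(address)
        fields[name] = match.group(0) if match else None
    return fields


print(extract_fields("2222 Main at King Edward Vancouver BC CA"))
# Made-up postal code, for illustration only.
print(extract_fields("2222 Main at King Edward, Vancouver, BC, CA, V5Y 2E9"))
```

Each pattern stays small and can be tested on its own, which is the main point of this approach.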
Well, I thought I'd throw my hat into the ring:
.*(?=,? ([a-zA-Z]+,?\s){3}([\d-]*\s)?)
You might want ^ or \d+ at the front for good measure.
I didn't bother specifying lengths for the postal codes; this one accepts any number of digits and hyphens.
It works for these inputs so far, and for variations of commas within the city/state/country area:
2222 Main at King Edward Vancouver, BC, CA, 333-333
555 road and street place CA US 95000
2222 Main at King Edward Vancouver BC CA 333
555 road and street place CA US
It counts on there being three words at the end for the city, state and country, but other than that it's like ryansstack said: if the data is random it won't work. If the city is two words, like New York, it won't work. So yes, regex isn't the right tool for this one.
By the way, this was tested on regexhero.net.
I can think of two ways you can do this:
1) If you know that "the rest" of your data after the address is exactly 2 fields, i.e. BC and CA, you can split your string using a space as the delimiter and remove the last 2 items.
2) Split on the delimiter /[A-Z][A-Z]/ and store the result in an array, then print out the array (this assumes the address itself doesn't contain 2 or more consecutive capital letters).
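Both ideas can be sketched in a few lines of Python (the function names are hypothetical). Note that with the sample address both approaches keep the city name, since only the two-letter fields are removed.

```python
import re


def drop_trailing_fields(address, n_fields=2):
    """Approach 1: split on whitespace and drop the last n_fields items."""
    parts = address.split()
    return " ".join(parts[:-n_fields]) if n_fields else address


def before_first_code(address):
    """Approach 2: keep the text before the first pair of capital letters."""
    return re.split(r"[A-Z][A-Z]", address)[0].rstrip()


address = "2222 Main at King Edward Vancouver BC CA"
print(drop_trailing_fields(address))
print(before_first_code(address))
```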