Regular expression for Invoice Number - regex

I am new to Stack Overflow and need your help matching a payment invoice number, so that a user can't enter an invalid one. It should follow this pattern: 612 (fixed), then one of 10/20/30/40/50, then 001-064, then 0000 (fixed), then 01-64, then 00 (fixed), and finally 0001-9999.
For example, one invoice number would be 612 30 005 0000 55 00 1234, written without any spaces: 61230005000055001234
I can't figure out how to do this. Please help me if you can.

^612\s?[1-5]0\s?0(?:[0-5]\d|6[0-4])\s?0000\s?(?:[0-5]\d|6[0-4])\s?00\s?\d{4}$
Should do the job for you, assuming that spaces are optional, but in fixed position and only single ones.
^ is an anchor for the beginning of the string
612\s? matches 612 literally, followed by an optional space
[1-5]0\s? matches 1/2/3/4/5 followed by 0 and an optional space
0(?:[0-5]\d|6[0-4])\s? means 0 followed by either 0-5 and any digit or 6 and 0-4, followed by an optional space
0000\s? matches 0000 literally, followed by an optional space
(?:[0-5]\d|6[0-4])\s? is either 0-5 and any digit or 6 and 0-4, followed by an optional space
00\s? matches 00 literally, followed by an optional space
\d{4} means any 4 digits
$ is an anchor for the end of the string
https://regex101.com/r/iU5jY5/3
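A quick way to check the pattern is to run it against a few sample values. The sketch below uses Python's re module (the language is an assumption; the question does not name one). It also includes a stricter variant, STRICT, which is not part of the answer: the question lists 001-064, 01-64 and 0001-9999, whereas the pattern above also accepts 000, 00 and 0000 in those positions, so the variant adds negative lookaheads to reject them.

```python
import re

# Pattern from the answer above, unchanged.
LOOSE = re.compile(
    r"^612\s?[1-5]0\s?0(?:[0-5]\d|6[0-4])\s?0000\s?(?:[0-5]\d|6[0-4])\s?00\s?\d{4}$"
)

# Hypothetical stricter variant (an assumption, not from the answer):
# the lookaheads reject 000, 00 and 0000 in the three variable blocks.
STRICT = re.compile(
    r"^612\s?[1-5]0\s?0(?!00)(?:[0-5]\d|6[0-4])\s?0000\s?"
    r"(?!00)(?:[0-5]\d|6[0-4])\s?00\s?(?!0000)\d{4}$"
)

def is_valid(invoice, pattern=LOOSE):
    """Return True if the invoice number matches the given pattern."""
    return pattern.match(invoice) is not None

# An all-zero edge case, built from its parts to keep the blocks readable.
EDGE = "612" + "30" + "000" + "0000" + "01" + "00" + "0001"

print(is_valid("61230005000055001234"))   # sample from the question
print(is_valid(EDGE), is_valid(EDGE, STRICT))
```

Whether the stricter variant is wanted depends on whether block values such as 000 can really never occur; the question suggests so, but that is a reading of the requirement rather than something the answer states.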

612[1-5]00(?:[0-5][0-9]|6[0-4])0000(?:0[0-9]|[1-5][0-9]|6[0-4])00[0-9]{4}
See a demo here.

Related

regex to extract housenumber plus addition

I'm looking for a regex that matches housenumbers combined with additions for all addresses below:
Breestraat 4
Breestraat 45
Breestraat 456
Dubbele Straat 4a
Dubbele Straat 4-a
5 meistraat 1a
5meistraat 12
5meistraat 12a
Teststraat 22-III
Now the following regex works, except in the first case. This is because the single-digit housenumber is missed because of the first \d in the regex (which prevents a starting digit from being captured).
\d?.(\d+.+)$
I'm scratching my head over how to get the housenumber '4' for the first line. So basically: how do I change "skip a starting digit" into "skip a starting digit, but still let it end up in the capturing group"?
You can use
\d+\D*$
\d+\S*$
See the regex demo #1 and regex demo #2.
The pattern matches
\d+ - one or more digits
\D* - zero or more non-digit chars
\S* - zero or more non-whitespace chars
$ - end of string.
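To see what the two patterns above extract, here is a minimal sketch in Python (the language is an assumption; the question does not name one). The address list is copied from the question, and the helper name house_number is made up for illustration.

```python
import re

ADDRESSES = [
    "Breestraat 4",
    "Breestraat 45",
    "Breestraat 456",
    "Dubbele Straat 4a",
    "Dubbele Straat 4-a",
    "5 meistraat 1a",
    "5meistraat 12",
    "5meistraat 12a",
    "Teststraat 22-III",
]

# The two patterns suggested above.
DIGITS_THEN_NON_DIGITS = re.compile(r"\d+\D*$")
DIGITS_THEN_NON_SPACE = re.compile(r"\d+\S*$")

def house_number(address, pattern=DIGITS_THEN_NON_SPACE):
    """Return the house number plus addition, or None if nothing matches."""
    match = pattern.search(address)
    return match.group(0) if match else None

for address in ADDRESSES:
    print(address, "->", house_number(address))
```

The two patterns differ only on input the question does not show: \D* lets the addition contain spaces but no further digits, while \S* lets it contain digits but no spaces.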
It's not entirely clear what you are asking for.
Anyway this is the pattern matching the house number at the end of the string:
\d+[-\da-zI]*$
https://regexr.com/6l0g7
Anyway I'm aware this is not a valid answer

How would I find values in a file, but only on lines that don't start with #?

I've got a document that looks something like this:
# Document ID 8934
# Last updated 2018-05-06
52 84 12 70 23 2 7 20 1 5
4 2 7 81 32 98 2 0 77 6
(..and so on..)
In other words, it starts off with a few comment lines, then the rest of the document is just a bunch of numbers separated by spaces.
I'm trying to write a regex that gets all digits on all lines that don't start with #, but I can't seem to get it.
I've read over answers such as
Regular Expressions: Is there an AND operator?
Regex: Find a character anywhere in a document but only on lines that begin with a specific word
and pawed through sites such as http://regular-expressions.info, but I still can't get an expression that works. The best I can get is a lengthy version of ^[^#].*
So how can I match digits (or text, or whatever) in a string, but only on lines that don't start with a certain character?
Your regex ^[^#].* uses a negated character class, which matches any single character that is not a # at the start of the string (^), and after that matches any character zero or more times.
For example, it would also match a line such as t test.
What you might do instead is use an alternation that either matches a whole line starting with a # (^#.*$) or captures one or more digits in a group (\d+).
Your digits are then in capture group 1. To match more than only digits, you could change the (\d+) to a character class such as ([\w+.]+).
(?:^#.*$|(\d+))
Details
(?: Non capturing group
^#.*$ Match from the start of the line ^ a # followed by any character zero or more times .* until the end of the line $ (with multiline mode enabled, so that ^ and $ match at line boundaries)
| Or
(\d+) capture one or more digits in a group
) Close non capturing group
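As a minimal sketch of the alternation approach, here it is in Python (the language is an assumption; the question does not name one). The sample text is the one from the question, and the function name is made up for illustration. The re.MULTILINE flag is needed so that ^ and $ match at every line boundary.

```python
import re

TEXT = """\
# Document ID 8934
# Last updated 2018-05-06
52 84 12 70 23 2 7 20 1 5
4 2 7 81 32 98 2 0 77 6
"""

# Either consume a whole comment line, or capture a run of digits in group 1.
PATTERN = re.compile(r"(?:^#.*$|(\d+))", re.MULTILINE)

def numbers_outside_comments(text):
    """Collect group 1 from every match where the digit alternative was used."""
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]

print(numbers_outside_comments(TEXT))
```

Comment lines are still matched, but they leave group 1 unset, so filtering on group 1 drops them along with the digits they contain.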
I think a much simpler method would be to first replace the comment lines with "" using this regex:
^#.*
And then you can just match all the numbers with this:
-?\d+ (the -? allows for negative numbers)
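A minimal sketch of this two-step approach in Python (the language and the function name are assumptions for illustration). As with any ^-anchored per-line pattern, multiline mode has to be enabled for the first step.

```python
import re

SAMPLE = """\
# Document ID 8934
# Last updated 2018-05-06
52 84 12 70 23 2 7 20 1 5
4 2 7 81 32 98 2 0 77 6
"""

def numbers_after_stripping_comments(text):
    # Step 1: blank out every line that starts with '#'.
    without_comments = re.sub(r"^#.*", "", text, flags=re.MULTILINE)
    # Step 2: pick up all (optionally negative) integers that remain.
    return [int(n) for n in re.findall(r"-?\d+", without_comments)]

print(numbers_after_stripping_comments(SAMPLE))
```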

Regex to find week date

We're receiving a file from a customer which we need to read and save some values into our ERP-System.
The customer sends us a date in a week format like 201814, which means the 14th week of the year 2018.
The date is never in the same place in the file, so I think the only way to get it is to search the file contents with a regex.
My Regex should probably consist of the following conditions:
the length of the string is always 6 characters
all characters are numeric values
the string always starts with 20
the last two values have to be between 01 and 53
What would the perfect regex for this be? There are many other "numeric-only" values in the file, which is why I need to be so specific.
I know I can do the length condition like this {1,6} and I know that [0-9] matches all digits from zero to nine, but I can't see how I can restrict 01 to 53.
Can someone help me with my regex? Thanks a lot!
You may try this:
\b20\d{2}(?:0[1-9]|[1-4]\d|5[0-3])(?!\d)
Demo
Explanation:
\b word boundary, so the match cannot start in the middle of a longer number or word
20 literally: the match must start with 20
\d{2} followed by any two digits
(?: non capturing group starts here
0[1-9] means 01 to 09
or
[1-4]\d means 10 to 49
or
5[0-3] means 50-53
) end of non capturing group
(?!\d) negative lookahead to ensure the entire match is not followed by a digit. The regex is formed in such a way that you should not need to count the 6 digits explicitly: if the value is not 6 digits long, the above conditions won't be met.
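The pattern explained above can be tried on a line of mixed numbers. This is a minimal sketch in Python (the language is an assumption); the sample text and both function names are invented for illustration and are not taken from the customer file.

```python
import re

# Pattern from the answer above, unchanged.
WEEK_DATE = re.compile(r"\b20\d{2}(?:0[1-9]|[1-4]\d|5[0-3])(?!\d)")

def find_week_dates(text):
    """Return every 6-digit year/week value found in the text."""
    return WEEK_DATE.findall(text)

def split_week_date(value):
    """Split a matched value such as '201814' into (year, week)."""
    return int(value[:4]), int(value[4:])

sample = "order 12345678 week 201814 ref 2018540 201800 201853"
for value in find_week_dates(sample):
    print(value, split_week_date(value))
```

In this sample, 12345678 and 2018540 are skipped because of \b and (?!\d), and 201800 is skipped because 00 is not a valid week.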
Use this: (20)\d{2}([1-4][0-9]|[0][1-9]|[5][0-3])
Demo

How can I replace this expression in chain regex (notepad++)?

I have this text:
14 two 25 three 12 four 40 five 10
I want to obtain "14 two 14 25 three 14 25 12 four 14 25 12 40 five 14 25 12 40 10"
For example, when I replace (14 two ) with (14 two 14 ), the next search starts after the new 14; I can't make it start after two.
Is there any alternative?
For example, could I use a group that is not included in the match (a group before the match) in the replacement?
Please help me.
This should do the trick for you:
Regex: ((?:\s?\d+\s?)+)((?:[a-zA-Z](?![^a-zA-Z]+\1))+)
Replacement: $1$2 $1
You will need to click the "Replace All" button repeatedly for this to work: it cannot be done in one shot and has to be repeated for as long as it can find a match (online PHP example).
Explanation:
\s: Match a single space character
?: the previous expression must be matched 0 or 1 time.
\s?: Match a space character 0 or 1 time.
\d: Match a digit character (the equivalent of [0-9]).
+: The previous expression must be matched at least one time (up to infinitely many times).
\d+: Match as many digit characters as possible (but at least one).
(): Capture group
(?:): Non-capturing group
((?:\s?\d+\s?)+): Match an optional space character followed by one or more digit characters followed by an optional space character. The expression is surrounded by a non-capturing group followed by a plus. That means that the regex will try to match as many combinations of space and digit characters as it can (so you can end up with something like '14 25 12 40').
The capture group is meant to keep the value so it can be reused in the replacement. You cannot simply put the plus on the capture group itself, without the non-capturing group inside it, because it would then only remember the last digits captured ('12' instead of the whole '14 25 12' used to build '14 25 12 40').
[a-zA-Z]: Match any English letters in any case (lower, upper).
\1: Backreference to what has been captured in the first group.
(?!): Negative lookahead.
[^]: Negated character class, so [^a-zA-Z] means match anything that is not an English letter.
((?:[a-zA-Z](?![^a-zA-Z]+\1))+): The negative lookahead is meant to make sure that we don't always end up matching the first "14 two" in the input text. Without it, we would end up in an infinite loop giving results as "14 two 14 14 14 14 14 14 25 three 12 four 40 five 10" (the "14" before "25" being repeated until you reach the timeout).
Basically, for every English letter we match, we look ahead to assert that the content of the first capture group (for example "14") is not present in our digit sequence.
For the replacement, $1$2 $1 means put the content of the capture group 1 and 2, add a space and put the content of the capture group 1 once more.
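The point above about wrapping a non-capturing group inside the capture group can be checked in isolation. The sketch below uses Python's re module rather than Notepad++ (an assumption made only so the snippet is runnable); a repeated capture group keeping just its last repetition is common behaviour across regex engines, but it was not verified in Notepad++ here.

```python
import re

sample = "14 25 12"

# Quantifier on the capture group itself: the group is overwritten on every
# repetition, so only the last repetition is remembered.
last_only = re.match(r"(\s?\d+\s?)+", sample).group(1)

# Quantifier on an inner non-capturing group: the outer capture group
# spans all repetitions.
whole_run = re.match(r"((?:\s?\d+\s?)+)", sample).group(1)

print(repr(last_only))   # '12'
print(repr(whole_run))   # '14 25 12'
```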

Social Security Number Validation That Accepts Dashes, Spaces or No Spaces

Social Security numbers that I want to accept are:
xxx-xx-xxxx (ex. 123-45-6789)
xxxxxxxxx (ex. 123456789)
xxx xx xxxx (ex. 123 45 6789)
I am not a regex expert, but I wrote this (it's kind of ugly)
^(\d{3}-\d{2}-\d{4})|(\d{3}\d{2}\d{4})|(\d{3}\s{1}\d{2}\s{1}\d{4})$
However this social security number passes, when it should actually fail since there is only one space
12345 6789
So I need an updated regex that rejects things like
12345 6789
123 456789
To make things more complex, it seems that SSNs cannot start with 000 or 666 and can go up to 899. The second and third sets of numbers also cannot be all 0.
I came up with this
^(?!000|666)[0-8][0-9]{2}[ \-](?!00)[0-9]{2}[ \-](?!0000)[0-9]{4}$
Which validates with spaces or dashes, but it fails if the number is like so
123456789
Ideally this set of SSNs should pass
123456789
123 45 6789
123-45-6789
899-45-6789
001-23-4567
And these should fail
12345 6789
123 456789
123x45x6789
ABCDEEEEE
1234567890123
000-45-6789
123-00-6789
123-45-0000
666-45-6789
More complete validation rules are available on CodeProject at http://www.codeproject.com/Articles/651609/Validating-Social-Security-Numbers-through-Regular. Copying the information here in case the link goes away, but also expanding on the codeproject answer a bit.
A Social Security number CANNOT :
Contain all zeroes in any specific group (i.e. 000-##-####, ###-00-####, or ###-##-0000)
Begin with '666'.
Begin with any value from '900-999'
Be '078-05-1120' (due to the Woolworth’s Wallet Fiasco)
Be '219-09-9999' (appeared in an advertisement for the Social Security Administration)
This RegEx taken from the referenced CodeProject article will validate all Social Security numbers according to all the rules - requires dashes as separators.
^(?!219-09-9999|078-05-1120)(?!666|000|9\d{2})\d{3}-(?!00)\d{2}-(?!0{4})\d{4}$
Same with spaces, instead of dashes
^(?!219 09 9999|078 05 1120)(?!666|000|9\d{2})\d{3} (?!00)\d{2} (?!0{4})\d{4}$
Finally, this will validate numbers without spaces or dashes
^(?!219099999|078051120)(?!666|000|9\d{2})\d{3}(?!00)\d{2}(?!0{4})\d{4}$
Combining the three cases above, we get the answer. The alternatives are wrapped in one group so that ^ and $ apply to all three of them:
^(?:((?!219-09-9999|078-05-1120)(?!666|000|9\d{2})\d{3}-(?!00)\d{2}-(?!0{4})\d{4})|((?!219 09 9999|078 05 1120)(?!666|000|9\d{2})\d{3} (?!00)\d{2} (?!0{4})\d{4})|((?!219099999|078051120)(?!666|000|9\d{2})\d{3}(?!00)\d{2}(?!0{4})\d{4}))$
To solve your problem with dashes, spaces, etc. being consistent, you can use a backreference. Make the first separator a group and allow it to be optional - ([ \-]?). You can then reference it with \1 to make sure the second separator is the same as the first one:
^(?!000|666)[0-9]{3}([ -]?)(?!00)[0-9]{2}\1(?!0000)[0-9]{4}$
See it here (thanks @Tushar)
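The backreference pattern above can be run against the pass and fail lists from the question. This is a minimal sketch in Python (the language is an assumption; the question does not name one), and the function name is made up for illustration.

```python
import re

# Pattern from the answer above, unchanged. Group 1 is the first separator;
# \1 forces the second separator to be identical (dash, space, or nothing).
SSN = re.compile(r"^(?!000|666)[0-9]{3}([ -]?)(?!00)[0-9]{2}\1(?!0000)[0-9]{4}$")

SHOULD_PASS = [
    "123456789", "123 45 6789", "123-45-6789", "899-45-6789", "001-23-4567",
]
SHOULD_FAIL = [
    "12345 6789", "123 456789", "123x45x6789", "ABCDEEEEE", "1234567890123",
    "000-45-6789", "123-00-6789", "123-45-0000", "666-45-6789",
]

def is_valid_ssn(value):
    """Return True if the value matches the backreference pattern."""
    return SSN.match(value) is not None

for value in SHOULD_PASS + SHOULD_FAIL:
    print(value, is_valid_ssn(value))
```

Note that this pattern uses [0-9]{3} for the first block, so it does not enforce the "up to 899" limit mentioned in the question; swapping in [0-8][0-9]{2} from the question's own attempt would add that limit.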
I had a requirement to validate SSNs. This regex will validate an SSN against the rules below:
Matches dashes, spaces or no spaces
Numbers, 9 digits, non-alphanumeric
Exclude all zeros
Exclude beginning characters 666,000,900,999,123456789,111111111,222222222,333333333,444444444,555555555,666666666,777777777,888888888,999999999
Exclude ending characters 0000
^(?!123([ -]?)45([ -]?)6789)(?!\b(\d)\3+\b)(?!000|666|900|999)[0-9]{3}([ -]?)(?!00)[0-9]{2}\4(?!0000)[0-9]{4}$
Explanation
^ - Beginning of string
(?!123([ -]?)45([ -]?)6789) - Don't match 123456789, 123-45-6789, 123 45 6789
(?!\b(\d)\3+\b) - Don't match 000000000, 111111111 ... 999999999. The same applies with spaces and dashes. '\3' is a backreference to (\d)
(?!000|666|900|999) - Don't match SSN that begins with 000,666,900 or 999.
([ -]?) - Check for space and dash. '?' is used to make space and dash optional. ? is 0 or 1 occurrence of the previous character.
(?!00) - the 4th and 5th characters cannot be 00.
\4 - Backreference to check for the same space or dash again after the 5th character.
(?!0000) - The last 4 characters cannot be all zeros.
$ - End of string
A backreference is used to repeat what a captured group () matched. The groups are numbered sequentially: 1, 2, 3 and so on.
See here for more explanation and examples
https://regex101.com/r/rA2xA2/3
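For reference, the same pattern can be exercised in Python (the language is an assumption; the answer only links to regex101). The test values below are invented for illustration. Groups 1 to 3 sit inside the negative lookaheads, which is why the separator group is number 4.

```python
import re

# Pattern from the answer above, split over two lines for readability only.
SSN_STRICT = re.compile(
    r"^(?!123([ -]?)45([ -]?)6789)(?!\b(\d)\3+\b)(?!000|666|900|999)"
    r"[0-9]{3}([ -]?)(?!00)[0-9]{2}\4(?!0000)[0-9]{4}$"
)

def is_valid_ssn(value):
    """Return True if the value passes all lookaheads and the format check."""
    return SSN_STRICT.match(value) is not None

for value in ["234-56-7890", "234567890", "234-56 7890",
              "123-45-6789", "111111111", "900-12-3456"]:
    print(value, is_valid_ssn(value))
```

One thing the sketch brings out: because of the \b after \3+, a number whose first block is three identical digits followed by a separator, such as 111-45-6789, is rejected as well. The answer does not say whether that is intended.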