Capture 1-9 after the last occurrence of 0 - regex

I want to capture all numbers between 1 and 9 after the last occurrence of zero, not counting a zero in the last digit. I tried this pattern, but it doesn't work.
Pattern: [1-9].*
DATA
0100179835
3000766774
1500396843
1500028408
1508408637
3105230262
3005228061
3105228407
3105228940
0900000000
2100000000
0800000000
1000000001
2200000001
0800000001
1300000001
1000000002
2200000002
0800000002
1300000002
1000000003
2200000003
0800000003
1300000003
1000000004
2200000004
0800000004
1300000004
1000000005
2200000005
0800000005
1300000005
1000000006
2300000006
0800000006
0900000006
1000000007
2300000007
0900000007
0800000007
1000000008
2300000008
0900000008
0800000008
1100000009
2300000009
0900000009
0800000009
1000005217
2000000429
1100000020
1000005000
3000000070
2000000400
1000020000
3000200000
2906000000
Desired Result
179835
766774
396843
28408
8408637
5230262
5228061
5228407
5228940
0
0
0
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
5
5
5
5
6
6
6
6
7
7
7
7
8
8
8
8
9
9
9
9
5217
429
20
5000
70
400
20000
200000
6000000

You can anchor the end of the string and match non-zero digits with an optional trailing zero. Ensure that there is at least one matching digit with a positive lookahead pattern:
(?=\d)[1-9]*0?$
Demo: https://regex101.com/r/uggV37/2
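A minimal Python sketch of how this pattern behaves on a few of the sample rows. The helper name tail_after_zero is made up for illustration. The pattern returns only the final run of non-zero digits plus an optional trailing zero, so a row whose desired result contains an inner zero (such as 1500028408) comes out shorter than the desired 28408:

```python
import re
from typing import Optional

# Pattern from the answer above: at least one digit must follow, then
# non-zero digits with an optional trailing zero, anchored at the end.
PATTERN = re.compile(r"(?=\d)[1-9]*0?$")

def tail_after_zero(value: str) -> Optional[str]:
    """Return the matched tail, or None when nothing matches (hypothetical helper)."""
    m = PATTERN.search(value)
    return m.group() if m else None

for row in ("0100179835", "0900000000", "1100000020", "1500028408"):
    print(row, "->", tail_after_zero(row))
# 0100179835 -> 179835
# 0900000000 -> 0
# 1100000020 -> 20
# 1500028408 -> 8
```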

To get desired result:
(?:^0*[1-9]+0*\K0|0\K[1-9]+(?:0[1-9]*|0+)?)$
Explanation
(?: Non capture group for the alternatives
^ Start of string
0*[1-9]+0* Match 1+ digits 1-9 between optional zeroes
\K0 Forget what is matched so far and then match a zero
| Or
0\K Match a zero and forget what is matched so far
[1-9]+ Match 1+ digits 1-9
(?: Non capture group for the alternatives
0[1-9]* Match a zero and optional digits 1-9
| Or
0+ Match 1+ zeroes
)? Close the non capture group
) Close the non capture group
$ End of string
See a regex demo.
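Python's built-in re module does not support \K, so the sketch below swaps it for capture groups, which is an assumption on my part and not part of the answer: group 1 holds the trailing zero from the first alternative, group 2 holds the digits from the second. The function name extract is made up for illustration.

```python
import re
from typing import Optional

# Same alternatives as the pattern above, with each "\K..." part
# turned into a capture group because re has no \K.
PATTERN = re.compile(r"(?:^0*[1-9]+0*(0)|0([1-9]+(?:0[1-9]*|0+)?))$")

def extract(value: str) -> Optional[str]:
    """Return the part that \\K would have kept, or None if there is no match."""
    m = PATTERN.search(value)
    if m is None:
        return None
    # Exactly one of the two groups participates in a successful match.
    return m.group(1) if m.group(1) is not None else m.group(2)

for row in ("0100179835", "1500028408", "0900000000", "1000005000", "2906000000"):
    print(row, "->", extract(row))
# 0100179835 -> 179835
# 1500028408 -> 28408
# 0900000000 -> 0
# 1000005000 -> 5000
# 2906000000 -> 6000000
```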

Match one item per line:
'0123056'.match(/(?<=0)[1-9]*0?$/g).filter(m => m != '')
Match multiple items per line:
'0123056 0000210 1205000 1204566 0123456 0012340 0123400'.match(/(?<=0)[1-9]*0?\b/g).filter(m => m != '')

Related

Regex optional group selection doesn't work

I want to extract the numbers from the following text:
Something_Time 10 min (Time in Class T>60°C Something Something )
Something_Time 899 min (Time in Class 35°C<T<=40°C Something Something )
Something_Time 0 min (Time in Class T<=-25°C Something Something )
So what I need is:
|---------------|---------------|---------------|
| Group 1 | Group 2 | Group 3 |
|---------------|---------------|---------------|
| 10 | 60 | |
|---------------|---------------|---------------|
| 899 | 35 | 40 |
|---------------|---------------|---------------|
| 0 | | -25 |
|---------------|---------------|---------------|
Group 2 as lower bound and group 3 as upper bound.
I tried the following regex expression:
^.* (\d{1,6}) min .*(?:[ \>](\-?\d{1,2}))?.*(?:[\=](\-?\d{1,2}))?.*$
This unfortunately does not match groups 2 and 3. It works for the second line as soon as the ? is removed from the end of both groups. Do you have any suggestions?
Try:
^Something_Time (\d{1,6}) min(?:.*?[ >](-?\d{1,2}))?(?:.*?[ =](-?\d{1,2}))?.*$
See Regex Demo
^ Matches start of string.
Something_Time Matches 'Something_Time '
(\d{1,6}) Group 1: 1 - 6 digits
min Matches ' min'
(?:.*?[ >](-?\d{1,2}))? Optional group that matches 0 or more non-newline characters followed by either a space or '>' followed by a number (optional '-' followed by up to 2 digits). The number is placed in Group 2.
(?:.*?[ =](-?\d{1,2}))? Optional group that matches 0 or more non-newline characters followed by either a space or '=' followed by a number (optional '-' followed by up to 2 digits). The number is placed in Group 3.
.* Matches 0 or more non-newline characters.
$ Matches the end of the string or a newline that precedes the end of the string.
In Python:
import re

tests = [
    'Something_Time 10 min (Time in Class T>60°C Something Something )',
    'Something_Time 899 min (Time in Class 35°C<T<=40°C Something Something )',
    'Something_Time 0 min (Time in Class T<=-25°C Something Something )'
]
for test in tests:
    m = re.match(r'^Something_Time (\d{1,6}) min(?:.*?[ >](-?\d{1,2}))?(?:.*?[ =](-?\d{1,2}))?.*$', test)
    if m:
        print(m.groups())
Prints:
('10', '60', None)
('899', '35', '40')
('0', None, '-25')

Valid regex for number(a,b) format

How can I express number(a,b) in regex? Example:
number(5,2) can be 123.45 but also 2.44
The best I got is: ([0-9]{1,5}.[0-9]{1,2}) but it isn't enough because it wrongly accepts 12345.22.
I thought about doing multiple OR (|) but that can be too long in case of a long format such as number(15,5)
You might use
(?<!\S)(?!(?:[0-9.]*[0-9]){6})[0-9]{1,5}(?:\.[0-9]{1,2})?(?!\S)
Explanation
(?<!\S) Negative lookbehind, assert what is on the left is not a non whitespace char
(?! Negative lookahead, assert what is on the right is not
(?:[0-9.]*[0-9]){6} Match 6 repetitions of optional digits or dots followed by a digit, i.e. at least 6 digits in total
) Close lookahead
[0-9]{1,5} Match 1 - 5 times a digit 0-9
(?:\.[0-9]{1,2})? Optionally match a dot and 1 - 2 digits
(?!\S) Negative lookahead, assert what is on the right is not a non whitespace char
Regex demo
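To avoid hand-writing the pattern for longer formats such as number(15,5), the same pattern can be built from the two numbers. This is only a sketch: the function name number_pattern is made up, and it assumes the interpretation used in the pattern above (at most a digits in total, at most b of them after the dot).

```python
import re

def number_pattern(a: int, b: int) -> "re.Pattern[str]":
    """Build the pattern above for number(a, b) (hypothetical helper)."""
    return re.compile(
        rf"(?<!\S)"                          # no non-whitespace char on the left
        rf"(?!(?:[0-9.]*[0-9]){{{a + 1}}})"  # reject a + 1 or more digits in total
        rf"[0-9]{{1,{a}}}"                   # 1 to a digits
        rf"(?:\.[0-9]{{1,{b}}})?"            # optional dot and 1 to b digits
        rf"(?!\S)"                           # no non-whitespace char on the right
    )

pattern = number_pattern(5, 2)
for text in ("123.45", "2.44", "12345.22"):
    print(text, "->", bool(pattern.search(text)))
# 123.45 -> True
# 2.44 -> True
# 12345.22 -> False
```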
I don't know Scala, but you would need to input those numbers when building your regular expression.
val a = 5
val b = 2
val regex = (raw"\((?=\d{1," + a + raw"}(?:\.0+)?|(?:(?=.{1," + (a + 1) + raw"}0*)(?:\d+\.\d{1," + b + raw"}))).+\)").r
For the values above, this checks that either the total number of digits is at most 5, or the length is at most 6 (including the decimal point) with at most 2 digits after the decimal point. Because a and b are variables, the same expression works for other values set in code.

c# Regex mask for textbox decimal precision 1 min value 1 max value 100

Does anyone have a regex mask for a textbox that allows a decimal precision of 1, with a min value of 1 and a max value of 100?
Values that need to pass:
0,5
0,1
10,5
99,5
100
Basically every value between 0,1 and 100.
Give this pattern a try
\d{0,3},?\d*
Pattern breakdown:
\d{0,3} - 0 to 3 digits
,? - 0 to 1 comma
\d* - 0 or more digits
Tested at Regex101
To match every value between 0,1 and 100 with a decimal precision of 1, you could use an alternation that matches one of: 1 - 99 with an optional single decimal digit 0-9; a 0 with a single decimal digit 1-9, so that 0,0 does not match; or 100 with an optional ,0.
^(?:[1-9][0-9]?(?:,[0-9])?|0,[1-9]|(?:100(?:,0)?))$
Explanation
^ Assert start of the line
(?: Non capturing group
[1-9][0-9]?(?:,[0-9])? Match 1 - 99 followed by an optional comma and digit 0-9
| Or
0,[1-9] Match a zero and a comma followed by a digit 1-9 so 0,0 does not match
| Or
(?:100(?:,0)?) Match 100 with an optional comma and 0
) Close non capturing group
$ Assert end of the line
Demo
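A quick way to try the alternation on the values from the question, sketched in Python rather than C# (the name is_valid is made up for illustration). fullmatch is used so that nothing before or after the value is tolerated:

```python
import re

# Alternation from the answer above: 1-99 with optional decimal,
# 0 with a decimal digit 1-9, or 100 with an optional ",0".
MASK = re.compile(r"^(?:[1-9][0-9]?(?:,[0-9])?|0,[1-9]|(?:100(?:,0)?))$")

def is_valid(text: str) -> bool:
    """True when text is a value from 0,1 to 100 with at most one decimal."""
    return MASK.fullmatch(text) is not None

for value in ("0,5", "0,1", "10,5", "99,5", "100", "0,0", "100,5", "101"):
    print(value, "->", is_valid(value))
# 0,5 -> True
# 0,1 -> True
# 10,5 -> True
# 99,5 -> True
# 100 -> True
# 0,0 -> False
# 100,5 -> False
# 101 -> False
```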

A regular expression for a binary string with one pair of consecutive 0s and one pair of consecutive 1s

1*(011*)*00(11*0)* 1* intersect 0*(100*)*11(00*1)* 0*
The first half of the regular expression should match all binary strings with one pair of consecutive 0s and the second half should match all binary strings with one pair of consecutive 1s. As the first contains strings with one pair of consecutive 1s, and the second contains strings with one pair of consecutive 0s, I claim that the entire regular expression would only match binary strings with at most one consecutive pair of 0s and one consecutive pair of 1s. Is this correct?
Yes, but more precisely your expression matches binary strings that contain exactly one pair of 0s and exactly one pair of 1s (rather than "at most").
I can prove it via this method:
Here is another regular expression to encode those semantics, using a union rather than an intersection, which I feel is more straightforward.
(1)?(01)*00(10)*11(01)*(0)?|(0)?(10)*11(01)*00(10)*(1)?
The first half matches binary strings in which the pair of zeros precedes the pair of ones, and the second half matches binary strings in which the pair of ones precedes the pair of zeros. Before, after, and between those pairs alternating values may occur.
A string is accepted if it matches either of those patterns (rather than both as in your expression).
Now, it is possible to construct the state transitions based on either of these regular expressions. I have done so below, first with mine then with yours. Each numbered state contains a list of regular expressions that describe the remaining portion of the string, and the state transitions that occur when either a 0, 1, or end-of-line is encountered. A string matches if it matches any regular expression in the list.
As you can see, the state transitions between your version and mine are completely homologous. Therefore, they represent exactly the same set of strings.
start (1)?(01)*00(10)*11(01)*(0)?
(0)?(10)*11(01)*00(10)*(1)?
0 1
1 2
EOL NO_MATCH
1 1(01)*00(10)*11(01)*(0)?
0(10)*11(01)*(0)?
(10)*11(01)*00(10)*(1)?
0 3
1 2
EOL NO_MATCH
2 (01)*00(10)*11(01)*(0)?
0(10)*11(01)*00(10)*(1)?
1(01)*00(10)*(1)?
0 1
1 4
EOL NO_MATCH
3 (10)*11(01)*(0)?
0 NO_MATCH
1 5
EOL NO_MATCH
4 (01)*00(10)*(1)?
0 6
1 NO_MATCH
EOL NO_MATCH
5 0(10)*11(01)*(0)?
1(01)*(0)?
0 3
1 7
EOL NO_MATCH
6 1(01)*00(10)*(1)?
0(10)*(1)?
0 8
1 4
EOL NO_MATCH
7 (01)*(0)?
0 9
1 NO_MATCH
EOL MATCH
8 (10)*(1)?
0 NO_MATCH
1 10
EOL MATCH
9 1(01)*(0)?
END
0 NO_MATCH
1 7
EOL MATCH
10 0(10)*(1)?
END
0 8
1 NO_MATCH
EOL MATCH
start 1*(011*)*00(11*0)*1* + 0*(100*)*11(00*1)*0*
0 1
1 2
EOL NO_MATCH
1 11*(011*)*00(11*0)*1* + 0*(100*)*11(00*1)*0*
0(11*0)*1* + 0*(100*)*11(00*1)*0*
0 3
1 2
EOL NO_MATCH
2 1*(011*)*00(11*0)*1* + 00*(100*)*11(00*1)*0*
1*(011*)*00(11*0)*1* + 1(00*1)*0*
0 1
1 4
EOL NO_MATCH
3 (11*0)*1* + 0*(100*)*11(00*1)*0*
0 NO_MATCH
1 5
EOL NO_MATCH
4 1*(011*)*00(11*0)*1* + (00*1)*0*
0 6
1 NO_MATCH
EOL NO_MATCH
5 1*0(11*0)*1* + 00*(100*)*11(00*1)*0*
(11*0)*1* + 00*(100*)*11(00*1)*0*
1*0(11*0)*1* + 1(00*1)*0*
(11*0)*1* + 1(00*1)*0*
0 3
1 7
EOL NO_MATCH
6 11*(011*)*00(11*0)*1* + 0*1(00*1)*0*
0(11*0)*1* + 0*1(00*1)*0*
11*(011*)*00(11*0)*1* + 0*
0(11*0)*1* + 0*
0 8
1 4
EOL NO_MATCH
7 1*0(11*0)*1* + (00*1)*0*
1* + (00*1)*0*
0 9
1 NO_MATCH
EOL MATCH
8 (11*0)*1* + 0*1(00*1)*0*
(11*0)*1* + 0*
0 NO_MATCH
1 10
EOL MATCH
9 (11*0)*1* + 0*1(00*1)*0*
(11*0)*1* + 0*
0 NO_MATCH
1 7
EOL MATCH
10 1*0(11*0)*1* + (00*1)*0*
1* + (00*1)*0*
(11*0)*1* + 0*
0 8
1 NO_MATCH
EOL MATCH
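The state tables above can also be sanity-checked by brute force. The sketch below is not a proof, only a check over all binary strings up to a chosen length; the function names are made up for illustration. Python's re has no intersection operator, so the intersection is emulated by requiring both halves to match the whole string, and a direct count of "00" and "11" occurrences is included as a third point of comparison.

```python
import re
from itertools import product

# The two halves of the intersection from the question.
ONE_PAIR_OF_ZEROS = re.compile(r"1*(011*)*00(11*0)*1*")
ONE_PAIR_OF_ONES = re.compile(r"0*(100*)*11(00*1)*0*")
# The union-based expression from the answer.
UNION = re.compile(r"(1)?(01)*00(10)*11(01)*(0)?|(0)?(10)*11(01)*00(10)*(1)?")

def in_intersection(s: str) -> bool:
    return bool(ONE_PAIR_OF_ZEROS.fullmatch(s)) and bool(ONE_PAIR_OF_ONES.fullmatch(s))

def in_union(s: str) -> bool:
    return bool(UNION.fullmatch(s))

def exactly_one_pair_each(s: str) -> bool:
    # Count overlapping occurrences, so "000" counts as two pairs of zeros.
    pairs = [s[i:i + 2] for i in range(len(s) - 1)]
    return pairs.count("00") == 1 and pairs.count("11") == 1

def disagreements(max_len: int) -> list:
    """Strings up to max_len on which the three descriptions differ."""
    found = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if len({in_intersection(s), in_union(s), exactly_one_pair_each(s)}) != 1:
                found.append(s)
    return found

print(disagreements(10))  # [] means no counterexample up to length 10
```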

How do I represent "Any string except for .... "

I'm trying to solve a regex where the given alphabet is Σ={a,b}
The first expression is:
L1 = {a^2n b^(3m+1) | n >= 1, m >= 0}
which means the corresponding regex is: aa(a)*b(bbb)*
What would be a regex for L2, complement of L1?
Is it right to assume L2 = "Any string except for aa(a)*b(bbb)*"?
First, in my opinion, the regex for L1 = {a^2n b^(3m+1) | n >= 1, m >= 0}
is NOT what you gave but is: aa(aa)*b(bbb)*. The reason is that a^2n with n >= 1 means there are at least two a's and the number of a's is even.
Now, the regular expression for "Any string except for aa(aa)*b(bbb)*" is:
^(?!^aa(aa)*b(bbb)*$).*$
more details here: Regex101
Explanations
aa(aa)*b(bbb)* the regex you DON'T want to match
^ represents beginning of line
(?!) negative lookahead: should NOT match what's in this group
$ represents end of line
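As a rough check that the negative lookahead really describes the complement, the sketch below compares it with the set definition of L1 on every string over {a, b} up to a chosen length. The helper names are made up for illustration, and this is a finite check, not a proof.

```python
import re
from itertools import product

# Negative-lookahead regex from the answer above.
L2 = re.compile(r"^(?!^aa(aa)*b(bbb)*$).*$")

def in_l1_by_definition(s: str) -> bool:
    """Membership in L1 = {a^(2n) b^(3m+1) | n >= 1, m >= 0}, without regex."""
    a_count = len(s) - len(s.lstrip("a"))  # length of the leading run of a's
    rest = s[a_count:]
    return (
        a_count >= 2
        and a_count % 2 == 0
        and set(rest) == {"b"}             # non-empty and made of b's only
        and len(rest) % 3 == 1
    )

def mismatches(max_len: int) -> list:
    """Strings where the regex and the definition of the complement disagree."""
    found = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            s = "".join(letters)
            if bool(L2.fullmatch(s)) == in_l1_by_definition(s):
                found.append(s)
    return found

print(mismatches(8))  # [] means the regex agrees with the complement up to length 8
```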
EDIT
Yes, a complement for aa(aa)*b(bbb)* is "Any string but the ones that match aa(aa)*b(bbb)*".
Now you need to find a regex that represents that with the syntax that you can use. I gave you a regex in this answer that is correct and matches "Any string but the ones that match aa(aa)*b(bbb)*", but if you want a mathematical representation following the pattern you gave for L1, you'll need to find something simpler.
Without any negative lookahead, that would be:
L2 = ^((b+.*)|((a(aa)*)?b*)|a*((bbb)*|bb(bbb)*)|(.*a+)|(.*ba.*))$
Test it here at Regex101
Good luck with the mathematical representation translation...
The first expression is:
L1 = {a^2n b^(3m+1) | n >= 1, m >= 0}
Regex for L1 is:
^aa(?:aa)*b(?:bbb)*$
Regex demo
Input
a
b
ab
aab
abb
aaab
aabb
abbb
aaaab
aaabb
aabbb
abbbb
aaaaab
aaaabb
aaabbb
aabbbb
abbbbb
aaaaaab
aaaaabb
aaaabbb
aaabbbb
aabbbbb
abbbbbb
aaaabbbb
Matches
MATCH 1
1. [7-10] `aab`
MATCH 2
1. [30-35] `aaaab`
MATCH 3
1. [75-81] `aabbbb`
MATCH 4
1. [89-96] `aaaaaab`
MATCH 5
1. [137-145] `aaaabbbb`
Regex for L2, complement of L1
^aa(?:aa)*b(?:bbb)*$(*SKIP)(*FAIL)|^.*$
Explanation:
^aa(?:aa)*b(?:bbb)*$ matches L1
^aa(?:aa)*b(?:bbb)*$(*SKIP)(*FAIL) anything that matches L1 is skipped and fails
|^.*$ matches everything else, i.e. the lines that are not in L1
Regex demo
Matches
MATCH 1
1. [0-1] `a`
MATCH 2
1. [2-3] `b`
MATCH 3
1. [4-6] `ab`
MATCH 4
1. [11-14] `abb`
MATCH 5
1. [15-19] `aaab`
MATCH 6
1. [20-24] `aabb`
MATCH 7
1. [25-29] `abbb`
MATCH 8
1. [36-41] `aaabb`
MATCH 9
1. [42-47] `aabbb`
MATCH 10
1. [48-53] `abbbb`
MATCH 11
1. [54-60] `aaaaab`
MATCH 12
1. [61-67] `aaaabb`
MATCH 13
1. [68-74] `aaabbb`
MATCH 14
1. [82-88] `abbbbb`
MATCH 15
1. [97-104] `aaaaabb`
MATCH 16
1. [105-112] `aaaabbb`
MATCH 17
1. [113-120] `aaabbbb`
MATCH 18
1. [121-128] `aabbbbb`
MATCH 19
1. [129-136] `abbbbbb`