I have data that looks like this:
1 ,11/10/2015, 1 3
2 ,01/15/2013
3 ,04/10/2015, 5 5
4 ,04/01/2013, 165
5 ,07/01/2016, 311 312
I need to find every instance that looks like lines 1, 3, and 5 and replace the white space in between the 2 sets of digits with a comma so they become like:
1 ,11/10/2015, 1,3
2 ,01/15/2013
3 ,04/10/2015, 5,5
4 ,04/01/2013, 165
5 ,07/01/2016, 311,312
I'm close with this:
[^(^\d{1,3})][[^(\d{1,3})]\s+(\d{1,3})\r
but it matches the two sets of digits as well as the white space. I need to restrict the match to just the white space between the two sets of digits. The leading numbers (1-5) are not in my data set; I included them only for readability here.
If there is only one whitespace-separated digit pair per line, you may use
(\d+)\h+(\d+)
and replace with $1,$2.
If you need to define some more context and make the regex replacement safer, consider
,\h*\K(\d+)\h+(\d+)$
Details:
, - a comma
\h* - 0+ horizontal whitespaces
\K - omit all the text matched so far
(\d+) - Group 1: one or more digits
\h+ - 1+ horizontal whitespaces
(\d+) - Group 2: one or more digits
$ - end of line.
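Not every engine supports \h or \K (Python's re module has neither). As a rough sketch for that case, the context before the pair can be captured and written back instead of being dropped with \K. The helper name join_pairs is hypothetical, and [ \t] is my stand-in for \h:

```python
import re

# [ \t] stands in for \h (horizontal whitespace). Python's re has no \h or \K,
# so the text before the pair is captured and written back instead.
PAIR_AT_EOL = re.compile(r"(,[ \t]*\d+)[ \t]+(\d+)$", re.MULTILINE)


def join_pairs(text: str) -> str:
    """Replace the whitespace between a trailing digit pair with a comma."""
    return PAIR_AT_EOL.sub(r"\1,\2", text)


sample = "\n".join([
    "11/10/2015, 1 3",
    "01/15/2013",
    "04/10/2015, 5 5",
    "04/01/2013, 165",
    "07/01/2016, 311 312",
])
print(join_pairs(sample))
```

Lines with a single trailing number, or with no trailing number at all, are left untouched because the pattern requires two digit runs separated by whitespace at the end of the line.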
Below is my regex for matching two-digit numbers whose tens digit is 2 or 4, and it works fine.
^(?=[2,4])\d{1,2}$
As soon as I added an alternative for single digits to the regex above, it started matching every single-digit number and every two-digit number as well.
^(?=\d|[2,4])\d{1,2}$
I want below sample input to be matched.
0
1
2
3
24
44
48
29
28
Below not to be matched.
99
11
33
55
77
It would also be a great help to know why my regex is not working.
You get different matches because a positive lookahead asserts that what you specify must be present directly to the right of the current position. In the first pattern that is a 2, a 4 or a comma; in the second pattern it is any single digit (or a comma).
You don't have a comma in your example data, so in that case you can match an optional 2 or 4 using just [24]? followed by a digit without any lookarounds.
^[24]?\d$
See a regex demo.
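A quick way to check this pattern against the sample input from the question is a short Python sketch (Python is my choice here; the question does not name a language):

```python
import re

PATTERN = re.compile(r"^[24]?\d$")

# Sample values copied from the question.
should_match = ["0", "1", "2", "3", "24", "44", "48", "29", "28"]
should_not_match = ["99", "11", "33", "55", "77"]

for s in should_match + should_not_match:
    print(s, bool(PATTERN.match(s)))
```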
Try this: ^(\d|[2,4]\d)$
Test regex here: https://regex101.com/r/aZo7fK/1
^(\d|[2,4]\d)$
^ matches the start of string
(\d|[2,4]\d) matches either a single digit (0-9) or a two-digit number that starts with 2 or 4
$ matches the end of the string
Note that [2,4] is a character class containing 2, a comma and 4, so it also accepts a leading comma; [24] is the stricter form.
I suggest
^[2,4]?[0-9]$
pattern; where
^ - anchor, start of the text
[2,4]? - optional 2 or 4 digit for tens
[0-9] - mandatory digit 0..9 for units
$ - anchor, end of the text
Edit: Now, let's have a look at your current patterns; the first is
^(?=[2,4])\d{1,2}$
Here
(?=[2,4]) - look ahead for 2 or 4
\d{1,2} - one or two digits
as we can see 3 doesn't match: look ahead fails to find 2 or 4. As for your second attempt
^(?=\d|[2,4])\d{1,2}$
pattern, where
(?=\d|[2,4]) - look ahead for ANY digit (note, that |[2,4] is redundant)
\d{1,2} - one or two digits
the pattern matches too many texts; technically it matches any one or two digit numbers, e.g. for:
79
we have
(?=\d|[2,4]) - look ahead - succeeds with 7
\d{1,2} - one or two digits - succeeds with 79
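To see the two lookaheads side by side, here is a small Python sketch (my own illustration, not part of the original patterns' context). The third pattern, plain, is an assumption added only for comparison: it is the second attempt with the lookahead removed.

```python
import re

first = re.compile(r"^(?=[2,4])\d{1,2}$")
second = re.compile(r"^(?=\d|[2,4])\d{1,2}$")
plain = re.compile(r"^\d{1,2}$")  # second attempt without its lookahead

for s in ["3", "24", "79", "99"]:
    print(s, bool(first.match(s)), bool(second.match(s)), bool(plain.match(s)))

# The second attempt and the lookahead-free pattern agree on every number
# from 0 to 999, which is what "the lookahead is redundant" means in practice.
same = all(
    bool(second.match(str(n))) == bool(plain.match(str(n))) for n in range(1000)
)
print(same)
```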
I have a question in regex
I am dealing with numbers 0 and 1 only
I have 10 digit number grouped into 4 as below
([01]{2})([01]{4})([01]{2})([01]{2})
I need to match all those numbers with min 2 1's in the second group which is ([01]{4}) , no matter how many 0's or 1's other groups are having. I am interested only in the second group
For example, these are potential matches:
0000110000
0011000000
0001100000
0000110000
I tried using positive look ahead like :
^(\d{2})((?=\d*1{2,}\d*)(\d{4}))(\d{2})(\d{2})
but this is matching even
0000000011
Any help is deeply appreciated
If the two 1s are not necessarily consecutive in Group 2, you can use
^([01]{2})(?=(?:[01]*1){2}[01]{4,6}$)([01]{4})([01]{2})([01]{2})$
See the regex demo
Details:
^ - start of string
([01]{2}) - Group 1: two occurrences of 1 or 0
(?=(?:[01]*1){2}[01]{4,6}$) - immediately to the right of the current location, there must be two occurrences of any zero or more 0 or 1 chars followed with 1 and then there must be four, five or six 1 or 0 chars till the end of string
([01]{4}) - Group 2: four occurrences of 1 or 0
([01]{2}) - Group 3: two occurrences of 1 or 0
([01]{2}) - Group 4: two occurrences of 1 or 0
$ - end of string.
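If you want to convince yourself that the lookahead really restricts the two 1s to Group 2, one option is to brute-force all 1024 ten-character strings. This is only a sketch in Python; group2_has_two_ones is a hypothetical reference check I wrote for the comparison:

```python
import re
from itertools import product

PATTERN = re.compile(
    r"^([01]{2})(?=(?:[01]*1){2}[01]{4,6}$)([01]{4})([01]{2})([01]{2})$"
)


def group2_has_two_ones(s: str) -> bool:
    """Reference check: characters 3-6 (Group 2) contain at least two 1s."""
    return s[2:6].count("1") >= 2


all_strings = ["".join(bits) for bits in product("01", repeat=10)]
mismatches = [
    s for s in all_strings if bool(PATTERN.match(s)) != group2_has_two_ones(s)
]
print(len(all_strings), "strings checked,", len(mismatches), "mismatches")

# The string from the question that should NOT match:
print(PATTERN.match("0000000011"))
```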
If the ones need to be consecutive (as per your sample data), maybe you can use:
^(?=[01]{2,4}11)[01]{10}$
See the online demo. The idea here is to match 2-4 zeros or ones up to a sequence of two ones. This makes sense once you realise that the only allowed combinations have the "11" sequence after exactly 2-4 other digits.
^ - Start line anchor.
(?=[01]{2,4}11) - Positive lookahead for 2-4 characters from our character class followed by "11".
[01]{10} - Match exactly 10 characters from our character class.
$ - End line anchor.
If need be, you can split the [01]{10} part into capture groups.
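As a sketch of what that could look like with capture groups (the 2-4-2-2 grouping is taken from the question; the test strings other than the question's samples are made up for illustration), in Python:

```python
import re

# Same lookahead, but [01]{10} is split into the four groups from the question.
GROUPED = re.compile(r"^(?=[01]{2,4}11)([01]{2})([01]{4})([01]{2})([01]{2})$")

samples = [
    "0000110000",  # 11 at the end of group 2
    "0011000000",  # 11 at the start of group 2
    "0001100000",  # 11 in the middle of group 2
    "0000000011",  # 11 only in group 4
    "0100011000",  # 11 straddles groups 2 and 3
]
for s in samples:
    m = GROUPED.match(s)
    print(s, m.groups() if m else None)
```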
EDIT:
If they don't have to be consecutive, maybe you can work with:
^[01]{2}(?=[01]{8}$)([01]{0,2}1[01]{0,2}1[01]{0,2})[01]{4}$
See the online demo.
Or less verbose:
^(?=[01]{10}$)(..)(.*1.*1.*)(..)(..)$
See the demo
Not a job for regex but for bitwise operators:
(in PHP):
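$nums = [
    '0000110000',
    '0011000000',
    '0001100000',
    '1000110000',
    '0000000110',
    '0001000000'
];

foreach ($nums as $num) {
    // Shift out the low 4 bits and mask with 15 to isolate the second group.
    // 0, 1, 2, 4 and 8 are the 4-bit values with fewer than two bits set.
    if (!in_array((bindec($num) >> 4) & 15, [0, 1, 2, 4, 8])) {
        echo $num, PHP_EOL;
    }
}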
You can probably do that in any language.
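For instance, a rough Python equivalent of the same idea could look like this (group2_bits is a hypothetical helper name; the input list is the one from the PHP snippet):

```python
nums = [
    "0000110000",
    "0011000000",
    "0001100000",
    "1000110000",
    "0000000110",
    "0001000000",
]

# 4-bit values with fewer than two bits set.
FEWER_THAN_TWO_BITS = {0, 1, 2, 4, 8}


def group2_bits(num: str) -> int:
    """Parse the binary string, drop the low 4 bits, keep the next 4."""
    return (int(num, 2) >> 4) & 0b1111


matches = [n for n in nums if group2_bits(n) not in FEWER_THAN_TWO_BITS]
print(matches)
```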
If lookaheads are supported, you could also use a positive lookahead to assert that group 2 contains at least one 11.
^([01]{2})(?=[01]{0,2}11)([01]{4})([01]{2})([01]{2})$
^ - start of string
([01]{2}) - Group 1: two occurrences of 1 or 0
(?= - positive lookahead
[01]{0,2}11 - match 0-2 times either 0 or 1, then match 11
) - close lookahead
([01]{4}) - Group 2: four occurrences of 1 or 0
([01]{2}) - Group 3: two occurrences of 1 or 0
([01]{2}) - Group 4: two occurrences of 1 or 0
$ - end of string.
Regex demo
Or you can write out all 3 alternatives matching 11
^([01]{2})(11[01][01]|[01]11[01]|[01][01]11)([01]{2})([01]{2})$
Regex demo
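Both variants should accept exactly the same strings. A brute-force comparison over all ten-character strings of 0s and 1s is one way to check that; this Python sketch uses a plain substring test on characters 3-6 as the reference, which is my own assumption about the intended behaviour:

```python
import re
from itertools import product

LOOKAHEAD = re.compile(r"^([01]{2})(?=[01]{0,2}11)([01]{4})([01]{2})([01]{2})$")
ALTERNATION = re.compile(
    r"^([01]{2})(11[01][01]|[01]11[01]|[01][01]11)([01]{2})([01]{2})$"
)

disagreements = []
for bits in product("01", repeat=10):
    s = "".join(bits)
    expected = "11" in s[2:6]  # reference: group 2 contains consecutive 1s
    if bool(LOOKAHEAD.match(s)) != expected or bool(ALTERNATION.match(s)) != expected:
        disagreements.append(s)

print(len(disagreements))
```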
I need a regular expression that finds all permutations of the digits 1, 2 and 3 in which the middle digit occurs one or more times.
For example:
123
133332
21111113
312
13333332
I've tried this expression:
([1][2]+[3])|([1][3]+[2])|([2][1]+[3])|([2][3]+[1])|([3][2]+[1])|([3][1]+[2])
Unfortunately it is slow. Is there any way to make it more efficient?
You may use
([1-3])(?!\1)([1-3])\2*(?!\1|\2)[1-3]
See the regex demo
Details
([1-3]) - Group 1: 1, 2 or 3
(?!\1)([1-3])\2* - Group 2: a digit from 1 to 3 not equal to the Group 1 value, then 0+ repetitions of that same digit
(?!\1|\2)[1-3] - a digit from 1 to 3 not equal to the Group 1 or Group 2 value
In case you need to match the whole string, add ^ at the start and $ at the end of the pattern.
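Here is a small Python sketch that tries the pattern on the question's samples plus a few strings I made up that should be rejected. It uses fullmatch, which has the same effect as adding ^ and $:

```python
import re

PERM = re.compile(r"([1-3])(?!\1)([1-3])\2*(?!\1|\2)[1-3]")

samples = [
    "123", "133332", "21111113", "312", "13333332",  # from the question
    "112",   # first two digits are equal
    "121",   # last digit repeats the first
    "1233",  # last digit is repeated
]
for s in samples:
    print(s, bool(PERM.fullmatch(s)))
```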
How to get
[\d ]{6}
to match:
1 23456
1 2 3456
1 2 3 456
1 2 3 4 56
1 2 3 4 5 6
In other words, I would like the space to not be counted towards the char limit. Something like [\d]{6 + but allow spaces you can eat}
The following will match six digits, with any amount of whitespace between them.
(?:\d\s*){5}\d
?: at the beginning there makes the group non-capturing. It's not necessary if all you wish to do is a simple match.
A live example:
https://regex101.com/r/PZJ8DO/2
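The same check can be sketched in Python against the inputs from the question. fullmatch is used here so that the whole string has to consist of the six digits and their spacing:

```python
import re

SIX_DIGITS = re.compile(r"(?:\d\s*){5}\d")

samples = ["1 23456", "1 2 3456", "1 2 3 456", "1 2 3 4 56", "1 2 3 4 5 6"]
for s in samples:
    print(repr(s), bool(SIX_DIGITS.fullmatch(s)))

# Only five digits, so this one is rejected.
print(bool(SIX_DIGITS.fullmatch("1 2345")))
```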
Just to put my two cents in: you could use the opposite of \d which is \D in most flavors:
^(?:\d\D*){6}$
See a demo on regex101.com.
Note, that this would even allow something like
1a2b3c4d5e6
If this is not what you want (meaning you only want to allow spaces, nothing else), use \s* instead of \D*.
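To make the difference between the two variants concrete, here is a hedged Python sketch; the third test string (seven digits) is my own addition:

```python
import re

ANY_JUNK = re.compile(r"^(?:\d\D*){6}$")     # any non-digits between digits
SPACES_ONLY = re.compile(r"^(?:\d\s*){6}$")  # only whitespace between digits

for s in ["1 2 3 456", "1a2b3c4d5e6", "1 2 3 4 5 6 7"]:
    print(repr(s), bool(ANY_JUNK.match(s)), bool(SPACES_ONLY.match(s)))
```

Both variants also tolerate trailing separators after the sixth digit, since the separator part is inside the repeated group.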
You can try to use
(?<=).*6.*
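This will match any line that contains a '6', even if there are white spaces or other characters in the line; it does not check how many digits the line has.
(?<=) is a positive lookbehind; it is empty here, so it always succeeds and has no effect on the match.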
The . matches any character except line breaks.
The * matches 0 or more of the preceding token.
And 6 matches a "6" Character.
You can test Regular Expression here: RegExr
Note that the positive look behind feature is not supported in all flavors of RegEx.
I have to write a regular expression to detect credit card numbers in text. My current regex fails when extra spaces are inserted into the number. Take this example (not a valid credit card number):
4320 7589 9456 0123
The regex is:
4\d{3}(\s+|-)?\d{4}(\s+|-)?\d{4}(\s+|-)?\d{4}
This regex matches the number above easily, but if someone alters the text by putting spaces between any of the digits, like this:
4 320 7589 9456 0123
it does not match. I need a regex that detects any variation with spaces, special symbols or letters. Some examples:
43 20 75 89 94 56 01 23
4 3 2 0 7 5 8 9 9 4 5 6 0 1 2 3
4320a7589b9456c0123
4320$7589$9456$0123
4320_7589_9456_0123
Alternatively, can I strip the spaces and symbols from the text before analyzing it?
I am posting because you actually asked for a pattern that matches any number of non-digits between the leading 4 and the 15 digits that follow.
The pattern is
^4(?:\D*\d){15}$
See demo
Regex breakdown:
^ - start of string
4 - literal 4
(?:\D*\d){15} - 15 occurrences of sequences of...
\D* - 0 or more non-digit symbols before..
\d - a digit
$ - end of string
If you need to capture, you can capture (like ^4((?:\D*\d){3})((?:\D*\d){4})((?:\D*\d){4})((?:\D*\d){4})$), but the submatches will still contain the "junk" in-between digits.
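If the junk in the submatches is a problem, one option is to match first and strip the non-digits afterwards. This is only a sketch in Python; normalize is a hypothetical helper name and the samples are the ones from the question:

```python
import re

CARD_LIKE = re.compile(r"^4(?:\D*\d){15}$")


def normalize(text: str) -> str:
    """Drop every non-digit character, leaving only the digits."""
    return re.sub(r"\D", "", text)


samples = [
    "4320 7589 9456 0123",
    "4 320 7589 9456 0123",
    "43 20 75 89 94 56 01 23",
    "4 3 2 0 7 5 8 9 9 4 5 6 0 1 2 3",
    "4320a7589b9456c0123",
    "4320$7589$9456$0123",
    "4320_7589_9456_0123",
]
for s in samples:
    print(bool(CARD_LIKE.match(s)), normalize(s))
```

Normalizing first and then testing the result against a plain 4\d{15} would be an equivalent way round, as long as the input is known to be a single candidate number rather than free text.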