Logback: replacing 10 digit number by ******** and last two digit - regex

I use the following in my pattern (logback.xml) to replace 10-digit numbers in my log:
%replace(%msg){'\d{10}','**********'}
One problem with this approach is that it also matches the first 10 digits of an 11-digit number.
Is there a way to match only numbers that are exactly 10 digits long?
The bigger problem is that I also need to display the last two digits of the 10-digit number.

Use this:
%replace(%msg){'\b\d{10}\b','**********'}
\b is a word boundary: it matches a position where one side is a word character (a letter, digit, or underscore) and the other side is not (for instance a space character, or the beginning or end of the string).
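As a quick check of the word-boundary behaviour outside Logback, here is a minimal sketch in Python. Python's re module is an assumption made for illustration (Logback itself uses Java regexes), and the sample messages and function names are made up.

```python
import re

# Hypothetical sample messages, not taken from a real log.
ten = "user 1234567890 logged in"
eleven = "user 12345678901 logged in"

def mask_plain(msg):
    # Original pattern: masks any run of 10 digits, even inside a longer number.
    return re.sub(r"\d{10}", "*" * 10, msg)

def mask_bounded(msg):
    # Word boundaries on both sides: only standalone 10-digit numbers are masked.
    return re.sub(r"\b\d{10}\b", "*" * 10, msg)

print(mask_plain(eleven))    # the first 10 of the 11 digits are masked
print(mask_bounded(eleven))  # left unchanged
print(mask_bounded(ten))     # the 10-digit number is masked
```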

To keep the last two digits visible (by leaving them out of the match), use the following regex:
'\b\d{8}(?=\d{2}\b)'
This finds 8 digits that are followed by two more digits, where the whole 10-digit run is wrapped in word boundaries. Since (?= ) is a positive lookahead assertion, the two digits it checks are not part of the match. The entire match can then be replaced with:
********
No capturing groups necessary.
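The lookahead approach can be tried with a short Python sketch. This assumes Python's re module rather than Logback's Java regex engine; the function name and sample strings are hypothetical.

```python
import re

def mask_first_eight(msg):
    # Match 8 digits only when exactly 2 more digits and a word boundary follow.
    # The lookahead checks those 2 digits without consuming them, so they
    # survive the replacement.
    return re.sub(r"\b\d{8}(?=\d{2}\b)", "********", msg)

print(mask_first_eight("card 1234567890 ok"))
print(mask_first_eight("card 12345678901 ok"))  # 11 digits: no match
```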

To get the last two digits of the corresponding 10-digit number:
'\b\d{8}(\d{2})\b'
First captured group contains the last two digits.

Yes, if you use this (as @zx81 said):
\b\d{8}(\d\d)\b
(Explanation: http://www.regexper.com/#%5Cb%5Cd%7B8%7D(%5Cd%5Cd)%5Cb)
That will find 10 digits and store the last 2 digits in a group. If you replace that with a string like this:
********$1
That will replace the first 8 digits and leave the last two visible.
Example: http://regexr.com/3989s
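For comparison, here is a minimal sketch of the capturing-group replacement in Python. It is only an illustration under the assumption of Python's re module, where the group reference in the replacement is written \1 instead of $1; the function name is hypothetical.

```python
import re

def mask_keep_last_two(msg):
    # Group 1 captures the last two digits; \1 writes them back after the mask.
    # (Replacement strings in the $1 style express the same reference.)
    return re.sub(r"\b\d{8}(\d\d)\b", r"********\1", msg)

print(mask_keep_last_two("phone 9876543210"))
```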

Related

regex to check if string doesn't contain non consecutive numbers on partial string

I am trying to create regex for below case:
Input string consisting of all numbers, max length is 30.
Check that, within the first 10 digits, no digit appears 3 or more times in a row.
e.g.
1234567 --> is good (no consecutive number)
1234456 --> is good (4 appears consecutive but length is less than 3)
1234445 --> is bad (4 appears consecutive and length is equal or greater than 3)
12345678904444 --> is good (4 appears consecutive and length is greater than 3 however it is accepted since it is appearing after cut off of 10 digit)
The regex I came up with is below. Pardon any mistakes in it; I am still learning regexes:
https://regex101.com/r/rv5e6a/1
Currently it is applied across the whole string, and I am not sure how to limit it so that it applies only to the first 10 digits.
You can use
^(?!\d{0,7}(\d)\1{2})\d{1,30}$
Note that \d{0,7} in the lookahead restricts the check for repeated digits to the first ten: a run of three that starts after at most seven digits ends no later than the tenth digit. More details:
^ - start of string
(?!\d{0,7}(\d)\1{2}) - a negative lookahead that fails the match if, immediately to the right of the current location, there are zero to seven digits followed by three identical digits
\d{1,30} - one to thirty digits
$ - end of string.
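A minimal Python sketch to check this pattern against the examples from the question (Python's re module and the names FIRST_TEN and is_good are assumptions for illustration):

```python
import re

# Pattern from the answer above, unchanged.
FIRST_TEN = re.compile(r"^(?!\d{0,7}(\d)\1{2})\d{1,30}$")

def is_good(s):
    # True when no digit repeats 3+ times in a row within the first 10 digits.
    return FIRST_TEN.match(s) is not None

for sample in ["1234567", "1234456", "1234445", "12345678904444"]:
    print(sample, is_good(sample))
```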
^(?:(\d)(?!\1{2})){1,9}$|^(?:(\d)(?!\2{2})){10}\d*$
Explanation:
^ # beginning of the line
(?: # start of a non-capturing group
(\d) # a single digit in a group that we can refer to with \1 later on
(?!\1{2}) # not followed by the digit in the \1 group repeated twice
){1,9} # repeat the non-capturing group 1-9 times
$ # end of the line
| # OR
^ # beginning of the line
(?: # start of a second non-capturing group
(\d) # a single digit in a group that we can refer to with \2 later on
(?!\2{2}) # not followed by the digit in the \2 group repeated twice
){10} # repeat the non-capturing group 10 times
\d* # the rest of the string can be more digits
$ # end of the line
The important part of the regex above, (?:(\d)(?!\1{2})), makes sure that a given digit is not followed by the same digit two more times. But because we only care about the first 10 digits, we need to handle this in two cases.
In the first case, the string of digits is shorter than 10 characters, so we want our pattern to repeat 1-9 times and then hit the end of the line.
In the second case, the string of digits is 10 or more characters long, and whatever follows the tenth digit is something we don't care about.
We need to keep these two cases separate because we don't want to exclude the cases where there are fewer than 10 characters total in the string.
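The two-case pattern can be exercised with a small Python sketch (the re module and the names below are assumptions for illustration). It also compares one boundary input against the lookahead-based pattern ^(?!\d{0,7}(\d)\1{2})\d{1,30}$ given earlier in this thread.

```python
import re

# Two-case pattern from this answer, unchanged.
TWO_CASE = re.compile(r"^(?:(\d)(?!\1{2})){1,9}$|^(?:(\d)(?!\2{2})){10}\d*$")
# Lookahead-based pattern from the earlier answer, for comparison.
LOOKAHEAD = re.compile(r"^(?!\d{0,7}(\d)\1{2})\d{1,30}$")

def accepts(pattern, s):
    return pattern.match(s) is not None

for sample in ["1234567", "1234445", "12345678904444", "12345678999"]:
    print(sample, accepts(TWO_CASE, sample), accepts(LOOKAHEAD, sample))
```

The last sample is a run of three that starts inside the first ten digits but ends after them. The two-case pattern rejects it, because its lookahead on the ninth digit peeks past the tenth, while the lookahead-based pattern accepts it. Which behaviour is wanted depends on how the cut-off is meant to be read.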

Regular Expression that needs to match three exact digits in a four digit number

The regular expression that I am trying to create should match all numbers that contain three '8's in any 4 digit number. The regular expression that I have only matches the first 10 numbers out of the list of 15 numbers. Any suggestions will be greatly appreciated.
\b[0-9]*(?:8[0-9]*[0-9]?8|8[0-9]*[0-9]?8|8[0-9]*[0-9]?8)\b
Test data:
8088 8188 8288 8388 8488 8808 8818 8828 8838 8848 8880 8881 8882 8883 8884
The last five numbers should also match, but don't.
You can use
\b(?=\d{4}\b)(?:[0-79]*8){3}[0-79]*\b
Details:
\b - a word boundary
(?=\d{4}\b) - there must be 4 digits immediately to the right, and they must be followed by a word boundary
(?:[0-79]*8){3} - three occurrences of any 0 or more digits but 8 and then 8
[0-79]* - any 0 or more digits but 8
\b - word boundary.
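A short Python sketch that runs this pattern over the test data from the question (the re module and the variable names are assumptions for illustration):

```python
import re

# Pattern from the answer above, unchanged.
THREE_EIGHTS = re.compile(r"\b(?=\d{4}\b)(?:[0-79]*8){3}[0-79]*\b")

data = ("8088 8188 8288 8388 8488 8808 8818 8828 8838 8848 "
        "8880 8881 8882 8883 8884")

# findall returns the whole match because the pattern has no capturing groups.
matches = THREE_EIGHTS.findall(data)
print(len(matches), "of", len(data.split()), "numbers matched")

# Numbers with four 8s, too few 8s, or the wrong length are skipped.
print(THREE_EIGHTS.findall("8888 1234 18888 1888"))
```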
If it's guaranteed that the number is a four-digit number, then you can try the following:
\b8*[0-79]8*\b
To analyze what each part matches, you can check using,
\b(8*)[0-79](8*)\b
This should do it. This will match any of the 4 patterns.
(\d888|8\d88|88\d8|888\d)
You may want to add a check for the delimiter (in your example the space), since without one this pattern will also match inside longer runs of digits, giving you many more results.
\b(\d?8{3}\d?)\b
This makes the first and last digit inside the word boundaries optional; use either ? or {0,1}.
Add a quantifier to your eight to get exactly the number of eights you need: {3}.
Replace [0-9] with \d (digit) for brevity.
This assumes you only have numbers of length 4. Otherwise use an alternative without optional digits: \b(\d8{3}|8{3}\d)\b

Using regex to match numbers which have 5 increasing consecutive digits somewhere in them

First off, this has sort of been asked before. However I haven't been able to modify this to fit my requirement.
In short: I want a regex that matches an expression if and only if it only contains digits, and there are 5 (or more) increasing consecutive digits somewhere in the expression.
I understand the logic of
^(?=\d{5}$)1*2*3*4*5*6*7*8*9*0*$
however, this limits the expression to 5 digits. I want there to be able to be digits before and after the expression. So 1111345671111 should match, while 11111 shouldn't.
I thought this might work:
^[0-9]*(?=\d{5}0*1*2*3*4*5*6*7*8*9*)[0-9]*$
which I interpret as:
^$: The entire expression must only contain what's between these 2 symbols
[0-9]*: Any digits between 0-9, 0 or more times followed by:
(?=\d{5}0*1*2*3*4*5*6*7*8*9*): A part where at least 5 increasing digits are found followed by:
[0-9]*: Any digits between 0-9, 0 or more times.
However this regex is incorrect, as for example 11111 matches. How can I solve this problem using a regex? So examples of expressions to match:
00001459000
12345
This shouldn't match:
abc12345
9871234444
While this problem can be solved using pure regular expressions (the set of strictly ascending five-digit strings is finite, so you could just enumerate all of them), it's not a good fit for regexes.
That said, here's how I'd do it if I had to:
^\d*(?=\d{5}(\d*)$)0?1?2?3?4?5?6?7?8?9?\1$
Core idea: 0?1?2?3?4?5?6?7?8?9? matches an ascending numeric substring, but it doesn't restrict its length. Every single part is optional, so it can match anything from "" (empty string) to the full "0123456789".
We can force it to match exactly 5 characters by combining a look-ahead of five digits and an arbitrary suffix (which we capture) with a backreference \1 (which must match exactly the suffix captured by the look-ahead, ensuring we've now walked ahead 5 characters in the string).
Live demo: https://regex101.com/r/03rJET/3
(By the way, your explanation of (?=\d{5}0*1*2*3*4*5*6*7*8*9*) is incorrect: It looks ahead to match exactly 5 digits, followed by 0 or more occurrences of 0, followed by 0 or more occurrences of 1, etc.)
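To see the look-ahead plus backreference trick in action, here is a minimal Python sketch using the examples from the question (the re module and the names ASCENDING and has_run are assumptions for illustration):

```python
import re

# Pattern from the answer above, unchanged.
ASCENDING = re.compile(r"^\d*(?=\d{5}(\d*)$)0?1?2?3?4?5?6?7?8?9?\1$")

def has_run(s):
    # True when s is all digits and contains 5 strictly increasing digits in a row.
    return ASCENDING.match(s) is not None

for sample in ["00001459000", "12345", "1111345671111",
               "abc12345", "9871234444", "11111"]:
    print(sample, has_run(sample))
```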
Because the starting position of the increasing digits isn't known in advance, and the consecutive increasing digits don't end at the end of the string, the linked answer's concise pattern won't work here. I don't think this is possible without being repetitive: alternate between all possibilities of increasing digits. A 0 must be followed by [1-9], written 0(?=[1-9]). A 1 must be followed by [2-9]. A 2 must be followed by [3-9], and so on. Alternate between these possibilities in a group, repeat that group four times, and then match any digit after that (the lookahead on the last repeated digit in the group ensures that this 5th digit is in sequence as well).
First lookahead for digits followed by the end of the string, then match the alternations described above, followed by one or more digits:
^(?=\d+$)\d*?(?:0(?=[1-9])|1(?=[2-9])|2(?=[3-9])|3(?=[4-9])|4(?=[5-9])|5(?=[6-9])|6(?=[7-9])|7(?=[89])|8(?=9)){4}\d+
Separated out for better readability:
^(?=\d+$)\d*?
(?:
0(?=[1-9])|
1(?=[2-9])|
2(?=[3-9])|
3(?=[4-9])|
4(?=[5-9])|
5(?=[6-9])|
6(?=[7-9])|
7(?=[89])|
8(?=9)
){4}
\d+
The lazy quantifier \d*? in the first line isn't necessary, but it makes the pattern a bit more efficient (otherwise \d* initially greedily matches the whole string, requiring lots of failing alternations and backtracking until it is at least 5 characters before the end of the string).
https://regex101.com/r/03rJET/2
It's ugly, but it works.
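A minimal Python sketch to run the alternation-based pattern against the question's examples (the re module and the names STEPS and has_run are assumptions for illustration):

```python
import re

# Pattern from this answer, unchanged. re.match anchors at the start, and the
# (?=\d+$) lookahead already guarantees the whole string is digits.
STEPS = re.compile(
    r"^(?=\d+$)\d*?"
    r"(?:0(?=[1-9])|1(?=[2-9])|2(?=[3-9])|3(?=[4-9])|4(?=[5-9])"
    r"|5(?=[6-9])|6(?=[7-9])|7(?=[89])|8(?=9)){4}\d+"
)

def has_run(s):
    return STEPS.match(s) is not None

for sample in ["00001459000", "12345", "1111345671111",
               "abc12345", "9871234444", "11111"]:
    print(sample, has_run(sample))
```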

Regex to check for 4 consecutive numbers

Can I use
\d\d\d\d[^\d]
to check for four consecutive numbers?
For example,
411112 OK
455553 OK
1200003 OK
f44443 OK
g55553 OK
3333 OK
f4442 No
45553 No
f4444g4444 No
f44444444 No
If you want to find any series of 4 digits in a string, /\d\d\d\d/ or /\d{4}/ will do. If you want to find a series of exactly 4 digits surrounded by non-digit characters, use /[^\d]\d{4}[^\d]/. If the string should consist of nothing but 4 consecutive digits, use /^\d{4}$/.
Edit: I think you want to find 4 of the same digits, you need a backreference for that. /(\d)\1{3}/ is probably what you're looking for.
Edit 2: /(^|(.)(?!\2))(\d)\3{3}(?!\3)/ will only match strings with exactly 4 of the same consecutive digits.
The first group matches either the start of the string or any single character (captured as the second group). In the latter case, a negative look-ahead using the second group ensures that the following character doesn't match that captured character. The third group matches any digit, which is then repeated 3 more times with a backreference to group 3. Finally there's a negative look-ahead that ensures that the following character doesn't continue the series of consecutive digits.
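Here is a small Python sketch that runs the "Edit 2" pattern over the examples from the question. The answer is written with JavaScript regex literals; using Python's re module instead, and the names below, are assumptions for illustration.

```python
import re

# "Edit 2" pattern from the answer, without the surrounding slashes.
EXACTLY_FOUR = re.compile(r"(^|(.)(?!\2))(\d)\3{3}(?!\3)")

def has_exact_run(s):
    # True when s contains a run of exactly four identical digits.
    return EXACTLY_FOUR.search(s) is not None

for sample in ["411112", "455553", "1200003", "f44443", "g55553", "3333",
               "f4442", "45553", "f44444444", "f4444g4444"]:
    print(sample, has_exact_run(sample))
```

One thing this makes visible: f4444g4444 is listed as "No" in the question, but it still matches here, because the pattern only asks for one run of exactly four identical digits somewhere in the string.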
This sort of stuff is difficult to do in javascript because you don't have things like forward references and look-behind.
Should the numbers be part of a string, or do you want only the four numbers? In the latter case, the regexp should be ^\d{4}$. The ^ marks the beginning of the string, $ the end. That makes sure that only four digits are valid, with nothing before or after them.
That should match four digits (\d\d\d\d) followed by a non-digit character ([^\d]). If you just want to match any four digits, you should use \d\d\d\d or \d{4}. If you want to make sure that the string contains just four consecutive digits, use ^\d{4}$. The ^ anchors the match at the beginning of the string, while the $ anchors it at the end of the string.

Regex 2 digits separated by commas, not all required

I need a regex for the following input:
[2 digits], comma, [two digits], comma, [two digits]
The 2 digits cannot start with 0. It is allowed to enter only the first 2 digits, or the first 2 digits followed by a comma and the next 2 digits, or the full string as described above.
Valid input would be:
10
99
17,56
15,99
10,57,61
32,44,99
Could anyone please help me with this regex?
At the moment I have this regex, but it doesn't limit the input to maximum 3 groups of 2 digits:
^\d{2}(?:[,]\d{2})*$
^[1-9]\d(?:,[1-9]\d){0,2}$
The first part ([1-9]\d) is simply the first number, which has to be present at all times. It consists of a non-zero digit and an arbitrary second digit (\d).
What follows is a non-capturing group ((?:...)), containing a comma followed by another two-digit number (,[1-9]\d), just like the first one. This group can be repeated between zero and two times ({0,2}), so you get either zero, one, or two sequences of a comma and another number.
You can easily change the upper bound in the curly braces to allow more numbers.
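A minimal Python sketch that checks the pattern against the valid inputs from the question, plus a few made-up invalid ones (the re module, the names, and the invalid samples are assumptions for illustration):

```python
import re

# Pattern from the answer above, unchanged.
GROUPS = re.compile(r"^[1-9]\d(?:,[1-9]\d){0,2}$")

def is_valid(s):
    return GROUPS.match(s) is not None

valid = ["10", "99", "17,56", "15,99", "10,57,61", "32,44,99"]
# Hypothetical invalid inputs: leading zero, trailing comma, four groups, one digit.
invalid = ["01", "10,", "10,05", "10,57,61,22", "5"]

for sample in valid + invalid:
    print(repr(sample), is_valid(sample))
```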
^[1-9]\d([,][1-9]\d){0,2}$