I have the following regex that checks whether the string contains 8 characters (letters or digits) followed by a space, a number, and a comma:
^.*[a-zA-Z0-9]{8,} \d*,.*$
The regex does not match the following:
Hello 23, abc 2, me 5,
But it matches the following:
My8Chara 12, abc 2,
I would like to reverse the regex: it should match if the string does NOT contain 8 characters followed by a space, a number, and a comma.
Does anyone know how to reverse a regex? I cannot use something like !Regex.IsMatch because I use a generic validator. I must write it as a regular expression.
The desired outputs are:
"" -> match
"abc 123, def 234," -> match
"my8chara 123, only5 12" -> does not match -> it contains 8 characters followed by a space a number and a comma
Thanks in advance,
Raphaël
You could maybe use a negative lookahead like this:
^(?!.*[a-zA-Z0-9]{8,} \d*).+$
A negative lookahead has the format (?! ... ). If what's inside it matches, the whole match fails.
So, if .*[a-zA-Z0-9]{8,} \d* matches, the whole match fails.
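As a quick sanity check, the pattern above can be run against the strings from the question. This is only a sketch using Python's re module; the helper name is_allowed is made up for illustration, and the validator in the question may use a different regex engine.

```python
import re

# Pattern from the answer above: the negative lookahead rejects any string
# in which 8 or more alphanumerics are followed by a space.
PATTERN = re.compile(r'^(?!.*[a-zA-Z0-9]{8,} \d*).+$')

def is_allowed(text):
    """Return True when the string passes the negative lookahead (hypothetical helper)."""
    return PATTERN.search(text) is not None

samples = ["abc 123, def 234,", "my8chara 123, only5 12", "Hello 23, abc 2, me 5,", ""]
for sample in samples:
    print(repr(sample), is_allowed(sample))
```

Note that the empty string is rejected by this particular pattern, because the trailing .+ needs at least one character.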
EDIT: If you still want to match sentences with the structure Hello 23, abc 2, me 5,, then I would suggest this:
^(?!.*[a-zA-Z0-9]{8,} \d*).*(?:[a-zA-Z0-9]+ \d*,)?.*$
^(?!.*[a-zA-Z0-9]{8} \d+,)\w.*$
To match empty strings too:
^(?!.*[a-zA-Z0-9]{8} \d+,).*$
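The difference between the two variants can be checked against the desired outputs listed in the question. A minimal sketch in Python, assuming its re module behaves like the target engine for these constructs; the names NON_EMPTY, ALLOW_EMPTY and results are illustrative only.

```python
import re

# Both patterns are copied from the answer; only the part after the lookahead differs.
NON_EMPTY = re.compile(r'^(?!.*[a-zA-Z0-9]{8} \d+,)\w.*$')   # needs a leading word character
ALLOW_EMPTY = re.compile(r'^(?!.*[a-zA-Z0-9]{8} \d+,).*$')   # also accepts ""

cases = ["", "abc 123, def 234,", "my8chara 123, only5 12"]

# Map each input to (matches NON_EMPTY, matches ALLOW_EMPTY).
results = {c: (bool(NON_EMPTY.search(c)), bool(ALLOW_EMPTY.search(c))) for c in cases}

for case, outcome in results.items():
    print(repr(case), outcome)
```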
I would like to use regex to match numbers if a substring is present, but without matching the substring. Hence,
2-4 foo
foo 4-6
bar 8
should match
2, 4
4, 6
I currently have
(\d{0,}\.?\d{1,})
which returns the numbers (int or float). Using
(\d{0,}\.?\d{1,}(?=\sfoo))
only matches 4, rather than 2 and 4. I also tried a lookahead
^(?=.*?\bfoo\b)(\d{0,}\.?\d{1,})
but that matches the 2 only.
With engines that support infinite width lookbehind patterns, you can use
(?<=\bfoo\b.*)\d*\.?\d+|\d*\.?\d+(?=.*?\bfoo\b)
It matches any zero or more digits followed by an optional dot and then one or more digits, either when preceded by the whole word foo (not necessarily immediately) or when followed by the whole word foo somewhere to the right.
When you have access to the code, you can simply check for the word presence in the text and then extract all the match occurrences. In Python, you could use
import re

# `text` holds the input string
if 'foo' in text:
    print(re.findall(r'\d*\.?\d+', text))
# Or, if you need to make sure the foo is a whole word:
if re.search(r'\bfoo\b', text):
    print(re.findall(r'\d*\.?\d+', text))
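Applied line by line to the sample input from the question, the check-then-extract idea could look like the sketch below. It assumes each input line is tested independently; the function name numbers_if_foo is hypothetical.

```python
import re

NUMBER = re.compile(r'\d*\.?\d+')   # int or float, as in the question
WORD_FOO = re.compile(r'\bfoo\b')   # foo as a whole word

def numbers_if_foo(line):
    """Return all numbers in the line, but only when the line contains the word foo."""
    if WORD_FOO.search(line):
        return NUMBER.findall(line)
    return []

lines = ["2-4 foo", "foo 4-6", "bar 8"]
extracted = [numbers_if_foo(line) for line in lines]
print(extracted)
```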
I am trying to detect whether a string contains a number, subject to a few conditions. For example, I don't want to match the number if it's surrounded by parentheses. I'm using lookahead and lookbehind to do this, but I'm running into issues when the number contains multiple digits. Also, the number can sit between text without any space separators.
My regex:
(?https://regex101.com/r/RnTSMJ/1
Sample examples:
{2}: Should NOT Match. //My regex Works
{34: Should NOT Match. //My regex matches 4 in {34
45}: Should NOT Match. //My regex matches 4 in 45}
{123}: Should NOT Match. //My regex matches 2 in {123}
I looked at Regex.Match whole words, but that approach doesn't work for me. If I use word boundaries, the above cases work as expected, but cases like the ones below, where numbers are surrounded by text, do not. I also want to add some additional logic, such as not matching specific strings like 1st, 2nd, etc. or #1, #2, etc.
updated regex:
(?<!\[|\{|\(|#)(\b\d+\b)(?!\]|\}|\|st|nd|rd|th)
See here https://regex101.com/r/DhE3K4/4
123abd //should match 123
abc345 //should match 345
ab2123cd // should match 2123
Is this possible with pure regex or do I need something more comprehensive?
You could match 1 or more digits, asserting that what is on the left is not {, # or a digit, and that what is on the right is not a digit, } or one of the listed suffixes:
(?<![{\d#])\d+(?![\d}]|st|nd|rd|th)
Explanation
(?<![{\d#]) Negative lookbehind, assert what is on the left is not {, # or a digit
\d+ Match 1+ digits
(?! Negative lookahead, assert what is on the right is not
[\d}]|st|nd|rd|th Match a digit, } or any of the alternatives
) Close lookahead
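The breakdown above can be tried on the examples from the question. This is a sketch in Python, whose re module accepts this pattern because the lookbehind is a single character class; the sample list (including 1st and #2) and the name found are chosen here for illustration.

```python
import re

# Pattern from the answer: digits not preceded by {, # or a digit,
# and not followed by a digit, } or an ordinal suffix.
PATTERN = re.compile(r'(?<![{\d#])\d+(?![\d}]|st|nd|rd|th)')

samples = ["{2}", "{34", "45}", "{123}", "123abd", "abc345", "ab2123cd", "1st", "#2"]

# Collect every match per sample; an empty list means "no match".
found = {s: PATTERN.findall(s) for s in samples}

for sample, numbers in found.items():
    print(sample, numbers)
```

The digit in both character classes is what stops the engine from backtracking into a partial number such as the 4 in 45}.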
The following regex is giving the expected result.
(?<![#\d{])\d+(?!\w*(?:}|(?:st|th|rd|nd)\b))
I am trying to implement regex for a JSON Response on sensitive data.
JSON response comes with AccountNumber and AccountName.
Masking details are as below.
accountNumber Before: 7835673653678365
accountNumber Masked: 783567365367****
accountName Before : chris hemsworth
accountName Masked : chri* *********
I can match the above with [0-9]{12} and (?![0-9]{12}), but when I replace the match, it is replaced with a single *, so my regex does not produce the correct output.
How can I produce output as above from regex?
If all you want is to mask every character except the first N, I don't think you really need a complicated regex. To skip the first N characters and replace every character thereafter with *, you can write a generic regex like this:
(?<=.{N}).
where N can be any number like 1,2,3 etc. and replace the match with *
The way this regex works is, it selects every character which has at least N characters before it and hence once it selects a character, all following characters also get selected.
For example, in your AccountNumber case, N = 12, so your regex becomes:
(?<=.{12}).
Java code,
String s = "7835673653678365";
System.out.println(s.replaceAll("(?<=.{12}).", "*"));
Prints,
783567365367****
And for AccountName case, N = 4, hence your regex becomes,
(?<=.{4}).
Java code,
String s = "chris hemsworth";
System.out.println(s.replaceAll("(?<=.{4}).", "*"));
Prints,
chri***********
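The same (?<=.{N}). idea can be wrapped in a small helper. The sketch below uses Python rather than Java; the function mask_after and its keep_spaces flag are hypothetical names, and the flag is an assumption based on the question's expected output chri* *********, where the space survives.

```python
import re

def mask_after(text, keep, keep_spaces=False):
    """Replace every character after the first `keep` characters with '*'.

    Hypothetical helper: with keep_spaces=True, whitespace is left untouched.
    """
    # The lookbehind requires `keep` characters before the one being replaced;
    # \S instead of . skips whitespace.
    target = r'\S' if keep_spaces else '.'
    return re.sub(rf'(?<=.{{{keep}}}){target}', '*', text)

print(mask_after("7835673653678365", 12))
print(mask_after("chris hemsworth", 4))
print(mask_after("chris hemsworth", 4, keep_spaces=True))
```

Python's re module only allows fixed-width lookbehinds, which is sufficient here because .{N} always has width N.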
If you match [0-9]{12} and replace that directly with a single asterisk, you are left with accountNumber Before: *8365
There is no programming language listed, but one option to replace the digits at the end is to use a positive lookbehind to assert what is on the left are 12 digits followed by a positive lookahead to assert what is on the right are 0+ digits followed by the end of the string.
Then in the replacement use *
If the JSON value is exactly chris hemsworth or 7835673653678365, you can omit the positive lookaheads (?=\d*$) and (?=[\w ]*$), which assert the end of the string, from the following 2 expressions.
Use the versions with the positive lookahead if the data to match is at the end of the string and the string contains more data so you don't replace more matches than you would expect.
(?<=[0-9]{12})(?=\d*$)\d
In Java:
(?<=[0-9]{12})(?=\\d*$)\\d
(?<=[0-9]{12}) Positive lookbehind, assert what is on the left are 12 digits
(?=\d*$) Positive lookahead, assert what is on the right are 0+ digits and assert the end of the string
\d Match a single digit
Result:
783567365367****
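To see what the end-of-string lookahead buys you, compare the pattern with and without it on a string that contains more than one long digit run. This is a sketch in Python; the input text is invented for the example, and the names UNANCHORED and ANCHORED are illustrative.

```python
import re

# Without the lookahead, every digit that has 12 digits before it is masked.
UNANCHORED = re.compile(r'(?<=[0-9]{12})\d')
# With the lookahead, only digits followed by nothing but digits up to the end are masked.
ANCHORED = re.compile(r'(?<=[0-9]{12})(?=\d*$)\d')

# Hypothetical input: an unrelated 16-digit id followed by the account number.
text = "id 1234567890123456 acct 7835673653678365"

masked_everywhere = UNANCHORED.sub('*', text)
masked_at_end = ANCHORED.sub('*', text)

print(masked_everywhere)
print(masked_at_end)
```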
For the account name you might do that with 4 word characters \w but this will also replace the whitespace with an asterisk because I believe you can not skip matching that space in one regex.
(?<=[\w ]{4})(?=[\w ]*$)[\w ]
In Java
(?<=[\\w ]{4})(?=[\\w ]*$)[\\w ]
Result
chri***********
My list of strings are,
1. bc // should match
2. abc // should not match
3. bc-bc // should match
4. ab-bc // should match
5. bc-ab // should match
I want to match all bcs. If it is preceded by another character such as a, as in string 2, I don't want to match.
I tried the regex [^a]bc. It failed to match string 2, as intended, but also strings 1 and 5, since [] requires a character. Then I tried [^a]?bc, but it matched string 2 as well. How can I write a regex that matches either nothing or a character that is not in a particular list?
Do you want to match bc only if it's not preceded by a certain set of characters (like for example a, x, or y)? Then that's exactly what a negative lookbehind assertion is for:
(?<![axy])bc
will match bc or bbc, but not abc or ybc.
If you want to match bc as a complete "word", i.e. not adjacent to any letters or digits, use word boundary anchors:
\bbc\b
Note that in MongoDB, in order to be able to use features like lookbehind that are available only to the PCRE engine (and not to JavaScript), you need to follow a certain syntax (using strings instead of regex objects), for example:
{ name: { $regex: '(?<![axy])bc' } }
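The lookbehind version can be checked against the five strings from the question plus the bbc and ybc cases mentioned above. A minimal sketch in Python (not MongoDB); the names PATTERN and matches are illustrative.

```python
import re

# bc, unless the character right before it is a, x or y.
PATTERN = re.compile(r'(?<![axy])bc')

samples = ["bc", "abc", "bc-bc", "ab-bc", "bc-ab", "bbc", "ybc"]

# True when the string contains at least one acceptable bc.
matches = {s: bool(PATTERN.search(s)) for s in samples}

for sample, ok in matches.items():
    print(sample, ok)
```

Unlike [^a]bc, the lookbehind consumes no character, so a bc at the very start of the string still matches.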
I'm looking for a regular expression to catch all digits in the first 7 characters in a string.
This string has 12 characters:
A12B345CD678
I would like to remove A and B only since they are within the first 7 chars (A12B345) and get
12345CD678
So, the CD678 should not be touched. My current solution in R:
paste(paste(str_extract_all(substr("A12B345CD678",1,7), "[0-9]+")[[1]],collapse=""),substr("A12B345CD678",8,nchar("A12B345CD678")),sep="")
It seems too complicated. I split the string at 7 as described, match any digits in the first 7 characters and bind it with the rest of the string.
Looking for a general answer, my current solution is to split the first 7 characters and just match all digits in this sub string.
Any help appreciated.
You can use the known SKIP-FAIL regex trick to match all the rest of the string beginning with the 8th character, and only match non-digit characters within the first 7 with a lookbehind:
s <- "A12B345CD678"
gsub("(?<=.{7}).*$(*SKIP)(*F)|\\D", "", s, perl=T)
## => [1] "12345CD678"
The perl=T is required for this regex to work. The regex breakdown:
(?<=.{7}).*$(*SKIP)(*F) - matches any character but a newline (add (?s) at the beginning if you have newline symbols in the input), as many as possible (.*) up to the end ($, also \\z might be required to remove final newlines), but only if preceded with 7 characters (this is set by the lookbehind (?<=.{7})). The (*SKIP)(*F) verbs make the engine omit the whole matched text and advance the regex index to the position at the end of that text.
| - or...
\\D - a non-digit character.
The regex solution is cool, but I'd use something easier to read for maintainability. E.g.
library(stringr)
str_sub(s, 1, 7) = gsub('[A-Z]', '', str_sub(s, 1, 7))
You can also use a simple negative lookbehind:
s <- "A12B345CD678"
gsub("(?<!.{7})\\D", "", s, perl=T)
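For comparison, the negative lookbehind approach carries over to other engines that support fixed-width lookbehinds. The sketch below is in Python instead of R and also shows the slicing alternative; both function names are hypothetical, and the (*SKIP)(*F) variant is left out because Python's built-in re module does not support those verbs.

```python
import re

def strip_non_digits_in_prefix(text, width=7):
    """Remove a non-digit only when fewer than `width` characters precede it."""
    return re.sub(rf'(?<!.{{{width}}})\D', '', text)

def strip_by_slicing(text, width=7):
    """Same result without lookbehind: clean the prefix, then re-attach the rest."""
    return re.sub(r'\D', '', text[:width]) + text[width:]

s = "A12B345CD678"
print(strip_non_digits_in_prefix(s))
print(strip_by_slicing(s))
```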