Regex substitution does not replace match character for character - regex

I am trying to use Regex to dynamically capture all numbers in a string such as 1234-12-1234 or 1234-123-1234 without knowing the number of characters that will occur in each string segment. I have been able to capture this using positive look ahead via the following expression: [0-9]*(?=-). However, when I try to replace the numbers to Xs such that each number that occurs before the last dash is replaced by an X, the Regex does not return X's for numbers 1:1. Instead, each section returns exactly two X's. How can I get the regex to return the following:
1234-123-1234 -> XXXX-XXX-1234
1234-12-1234 -> XXXX-XX-1234
instead of the current
1234-123-1234 -> XX-XX-1234
?
Link to demo

The problem is that by placing the * directly after the digit match, more than one digit would get replaced with a single X. And then zero digits would get replaced with a single X. Therefore any number of digits would be effectively replaced as two X's.
Use this instead:
[0-9](?=.*-)

Related

Regex - Match n last numbers without the last one

With regex - replace, I am trying to format a number like this:
The leading number should be separated by a +. Moreover, the last number should be separated by a + as well. The more tricky part is, that adjacent 1s to the + to the middle part should be removed, without touching the first and the last number, e.g.,
011023040 -> 0+02304+0
111023920443 -> 1+02392044+3
13242311 -> 1+32423+1
I almost achieved this with the following regex:
'^([0-9]{1})([1]+)?([0-9*)(0-9]{1}$'
And replace this with
'\1+\3+\4'
However, I have a problem with the last example, as this returns:
1+324231+1
However, the one before the second + should be removed.
Can anyone help me with this problem?
You have to use a non-greedy quantifier:
^([0-9])1*([0-9]*?)1*([0-9])$
^^
Live demo
I managed to group the numbers in the following way
^(\d)(1*)(\d+)(\d)$
by using multiline and global flags.
The replacement should look like \1+\3+\4

Need regex expression with multiple conditions

I need regex with following conditions
It should accept maximum of 5 digits then upto 3 decimal places
it can be negative
it can be zero
it can be only numbers (max. upto 5 digit place)
it can be null
I have tried following but its not, its not fulfilling all conditions
#"^([\-\+]?)\d{0,5}(.[0-9]{1,3})?)$"
E.g. maximum value can hold is from -99999.999 to 99999.999
Use this regex:
^[-+]?\d{0,5}(\.[0-9]{1,3})?$
I only made two changes here. First, you don't need to escape any characters inside a character class normally, except for opening and closing brackets, or possibly backslash itself. Hence, we can use [-+] to capture an initial plus or minus. Second, you need to escape the dot in your regex, to tell the engine that you want to match a literal dot.
However, I would probably phrase this regex as follows:
^[-+]?\d{1,5}(\.[0-9]{1,3})?$
This will match one to five digits, followed by an optional decimal point, followed by one to three digits.
Note that we want to capture things like:
0.123
But not
.123
i.e. we don't want to capture a leading decimal point should it not be prefixed by at least one number.
Demo here:
Regex101
I assume you're doing this in C# given the notation. Here's a little code you can use to test your expression, with two corrections:
You have to escape the dot, otherwise it means "any character". So, \. instead of .
There was an extraneous close parenthesis that prevented the expression from compiling
C#:
var expr = #"^([\-\+]?)\d{0,5}(\.[0-9]{1,3})?$";
var re = new Regex(expr);
string[] samples = {
"",
"0",
"1.1",
"1.12",
"1.123",
"12.3",
"12.34",
"12.345",
"123.4",
"12345.123",
".1",
".1234"
};
foreach(var s in samples) {
Console.WriteLine("Testing [{0}]: {1}", s, re.IsMatch(s) ? "PASS" : "FAIL");
}
Results:
Testing []: PASS
Testing [0]: PASS
Testing [1.1]: PASS
Testing [1.12]: PASS
Testing [1.123]: PASS
Testing [12.3]: PASS
Testing [12.34]: PASS
Testing [12.345]: PASS
Testing [123.4]: PASS
Testing [12345.123]: PASS
Testing [.1]: PASS
Testing [.1234]: FAIL
It should accept maximum of 5 digits
[0-9]{1,5}
then upto 3 decimal places
[0-9]{1,5}(\.[0-9]{1,3})?
it can be negative
[-]?[0-9]{1,5}(\.[0-9]{1,3})?
it can be zero
Already covered.
it can be only numbers (max. upto 5 digit place)
Already covered. 'Up to 5 digit place' contradicts your first rule, which allows 5.3.
it can be null
Not covered. I strongly suggest you remove this requirement. Even if you mean 'empty', as I sincerely hope you do, you should detect that case separately and beforehand, as you will certainly have to handle it differently.
Your regular expression contains ^ and $. I don't know why. There is nothing about start of line or end of line in the rules you specified. It also allows a leading +, which again isn't specified in your rules.

blank spaces, number must start with +

I need to get a regex where a phone number must begin with a +. There can be a comma seperated list eg
List:
tel1: +E1234498912345678#fake.com, tel2: +498912345678, tel1: +E123449D1238912345678#fake.com
is a valid list. E is a valid special case
My regex is this:
^(tel1:)|(tel2:)( )(\+.)$
but it accepts numbers without a + as being valid which is not what I want. The number MUST be preceded by a + otherwise it's invalid. Any hints?
You can try the following:
(tel[12]:\s*\+[eE]?\w+(#\w+(\.\w+)+)?(,\s*)?)+
This should match telephone numbers separated by a comma, in a single line.
It also oversees the use of a special character E or e.
Also, domains may not only end in .com. .net or .com.uk should also be valid.

Comma Separated Numbers Regex

I am trying to validate a comma separated list for numbers 1-8.
i.e. 2,4,6,8,1 is valid input.
I tried [0-8,]* but it seems to accept 1234 as valid. It is not requiring a comma and it is letting me type in a number larger than 8. I am not sure why.
[0-8,]* will match zero or more consecutive instances of 0 through 8 or ,, anywhere in your string. You want something more like this:
^[1-8](,[1-8])*$
^ matches the start of the string, and $ matches the end, ensuring that you're examining the entire string. It will match a single digit, plus zero or more instances of a comma followed by a digit after it.
/^\d+(,\d+)*$/
for at least one digit, otherwise you will accept 1,,,,,4
[0-9]+(,[0-9]+)+
This works better for me for comma separated numbers in general, like: 1,234,933
You can try with this Regex:
^[1-8](,[1-8])+$
If you are using python and looking to find out all possible matching strings like
XX,XX,XXX or X,XX,XXX
or 12,000, 1,20,000 using regex
string = "I spent 1,20,000 on new project "
re.findall(r'(\b[1-8]*(,[0-9]*[0-9])+\b)', string, re.IGNORECASE)
Result will be ---> [('1,20,000', ',000')]
You need a number + comma combination that can repeat:
^[1-8](,[1-8])*$
If you don't want remembering parentheses add ?: to the parens, like so:
^[1-8](?:,[1-8])*$

Python: RE only captures first and last match

I'm trying to make a Regular Expression that captures the following:
- XX or XX:XX, up to 6 repetitions (XX:XX:XX:XX:XX:XX), where X is a hexadecimal number.
In other words, I'm trying to capture MAC addresses than can range from 1 to 6 bytes.
regex = re.compile("^([0-9a-fA-F]{2})(?:(?:\:([0-9a-fA-F]{2})){0,5})$")
The problem is that if I enter for example "11:22:33", it only captures the first match and the last, which results in ["11", "22"].
The question: is there any method that {0,5} character will let me catch all repetitions, and not the last one?
Thanks!
Not in Python, no. But you can first check the correct format with your regex, and then simply split the string at ::
result = s.split(':')
Also note that you should always write regular expressions as raw strings (otherwise you get problems with escaping). And your outer non-capturing group does nothing.
Technically there is a way to do it with regex only, but the regex is quite horrible:
r"^([0-9a-fA-F]{2})(?:([0-9a-fA-F]{2}))?(?:([0-9a-fA-F]{2}))?(?:([0-9a-fA-F]{2}))?(?:([0-9a-fA-F]{2}))?(?:([0-9a-fA-F]{2}))?$"
But here you would always get six captures, just that some might be empty.