Why is the regular expression ([£€$¥£]|USD|US\$)\s?(\d*\.?\d+|\d{1,3}(,\d{3})*(\.\d+)?) not matching US$ 150,000.00?
Regular expression 1 :
([£€$¥£]|USD|US\$)\s?
matches US$
Regular expression 2 :
(\d*\.?\d+|\d{1,3}(,\d{3})*(\.\d+)?)
matches 150,000.00
Concatenation of two expressions
([£€$¥£]|USD|US\$)\s?(\d*\.?\d+|\d{1,3}(,\d{3})*(\.\d+)?)
does not match US$ 150,000.00
demo : https://regex101.com/r/fJJWqv/1
EDIT: Regular expression 2 does not match 150,000.00 as a whole, but shouldn't it match the comma too because of (,\d{3})*?
Your second claim is untrue. (\d*\.?\d+|\d{1,3}(,\d{3})*(\.\d+)?) does not match 150,000.00 as a whole. Rather, it matches 150 and 000.00 separately. Since only the former is prefixed with US$ and a space, only it matches the concatenated regex.
The reason is that alternatives are tried from left to right and the first one that succeeds is used, so the order you specified favors the shorter match. To fix it, you can switch the alternation order: change (\d*\.?\d+|\d{1,3}(,\d{3})*(\.\d+)?) to (\d{1,3}(,\d{3})*(\.\d+)?|\d*\.?\d+).
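A minimal JavaScript sketch of the effect of alternation order, using the pattern from the question and the swapped version from this answer. The variable names are only illustrative.

```javascript
// Patterns copied from the thread; only the order of the two number
// alternatives differs. Variable names are illustrative.
const originalOrder = /([£€$¥£]|USD|US\$)\s?(\d*\.?\d+|\d{1,3}(,\d{3})*(\.\d+)?)/;
const swappedOrder = /([£€$¥£]|USD|US\$)\s?(\d{1,3}(,\d{3})*(\.\d+)?|\d*\.?\d+)/;

const priceText = "US$ 150,000.00";

// match() without the g flag returns the first match; index 0 is the full match.
const originalMatch = priceText.match(originalOrder)[0];
const swappedMatch = priceText.match(swappedOrder)[0];

console.log(originalMatch); // "US$ 150"
console.log(swappedMatch);  // "US$ 150,000.00"
```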
In 150,000.00, the pattern (\d*\.?\d+|\d{1,3}(,\d{3})*(\.\d+)?) does not match the comma because 150 is matched by the first alternative, \d*\.?\d+, and neither alternative can start at a comma.
The first alternative can match 150 because \d* means zero or more digits, \.? is an optional dot, and \d+ requires at least one digit.
Due to backtracking, \d* gives up one digit so that \d+ can match it, and 150 remains the match.
The next character is a comma, but neither alternative can start with a comma, so the engine moves on to the next character, and this time \d*\.?\d+ matches 000.00.
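The walkthrough above can be checked with a short JavaScript sketch that runs only the number part of the pattern with the g flag, so every match is listed. The outer capturing group is dropped here because it does not change what is matched.

```javascript
// The number part of the pattern on its own. With the g flag, match()
// returns an array of all full matches instead of only the first one.
const numberPart = /\d*\.?\d+|\d{1,3}(,\d{3})*(\.\d+)?/g;

const numberMatches = "150,000.00".match(numberPart);

console.log(numberMatches); // ["150", "000.00"]
```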
One option to match your value is to remove the \d*\.?\d+ alternative (and if you only want the whole match, you can make the groups non-capturing):
(?:[£€$¥£]|USD|US\$)\s?\d{1,3}(?:,\d{3})*(?:\.\d+)?
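As a rough check of the non-capturing pattern, here is a JavaScript sketch. Only the first sample comes from the question; the other sample strings are made up for illustration.

```javascript
// Pattern copied from the answer above.
const amountPattern = /(?:[£€$¥£]|USD|US\$)\s?\d{1,3}(?:,\d{3})*(?:\.\d+)?/;

// The first sample is from the question; the rest are hypothetical.
const amountSamples = ["US$ 150,000.00", "USD 1,234", "€99.50", "US$ 1500"];

// Collect the full match for each sample, or null when nothing matches.
const amountResults = amountSamples.map((text) => {
  const m = text.match(amountPattern);
  return m ? m[0] : null;
});

console.log(amountResults);
// ["US$ 150,000.00", "USD 1,234", "€99.50", "US$ 150"]
```

Note the last result: with \d*\.?\d+ removed, an amount written without thousands separators is only matched up to its first three digits, so this variant assumes the input always uses comma grouping.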
There are a thousand regular expression questions on SO, so I apologize if this is already covered. I did look first.
I have string:
Name Subname 11X22 88X620 AB33(20) YA5619 77,66
I need to capture this string: YA5619
What I am doing is finding AB33(20) and then capturing everything up to the first whitespace. But AB33(20) can also be AB-33(20), AB33(-20) or AB33(-1).
My preg_match regex is: (?<=\bAB\d{2}\(\d{2}\)\s).+?(?=\s)
Why am I getting an error when I change \d{2} to \d+?
For the final result I thought this regex would work, but it does not:
(?<=\bAB-?\d+\(-?\d+\)\s).+?(?=\s)
Any ideas what I am doing wrong?
With most regex flavors, lookbehind needs to evaluate to a fixed-length sequence, so you can't use variable quantifiers like * or + or even {1,2}.
Instead of using lookaround, you can simply match your marker pattern and then forget it with \K.
AB-?\d+(?:\(-?\d+\))? \K[^ ]+
demo: https://regex101.com/r/8XXngH/1
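JavaScript's RegExp has no \K, so the sketch below swaps in a capturing group: the marker is matched as before, and group 1 holds what \K would have left as the reported match. This is an adaptation for illustration, not the PCRE pattern itself.

```javascript
// Same marker pattern as above, but the part after \K is captured in
// group 1 because \K is not available in JavaScript.
const markerThenCode = /AB-?\d+(?:\(-?\d+\))? ([^ ]+)/;

const inputLine = "Name Subname 11X22 88X620 AB33(20) YA5619 77,66";

const markerMatch = inputLine.match(markerThenCode);
const codeAfterMarker = markerMatch ? markerMatch[1] : null;

console.log(codeAfterMarker); // "YA5619"
```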
It depends on the regex flavor. In .NET, for example, your pattern matches because .NET supports variable-length lookbehind.
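To illustrate the flavor dependence: JavaScript engines that implement ES2018 lookbehind also accept variable-length lookbehind, so the asker's final pattern can be tried there unchanged. A small sketch, assuming such an engine (for example a recent Node.js):

```javascript
// The asker's final pattern, unchanged. It relies on variable-length
// lookbehind, which ES2018-compatible engines accept.
const lookbehindPattern = /(?<=\bAB-?\d+\(-?\d+\)\s).+?(?=\s)/;

const sampleLine = "Name Subname 11X22 88X620 AB33(20) YA5619 77,66";

const lookbehindMatch = sampleLine.match(lookbehindPattern);
const lookbehindResult = lookbehindMatch ? lookbehindMatch[0] : null;

console.log(lookbehindResult); // "YA5619"
```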
Another solution is to use a character class that lists the characters you allow after AB. Then match a whitespace character, reset the match with \K, and match \S+, which matches 1+ times a non-whitespace character.
\bAB[()\d-]+\s\K\S+
Explanation
\bAB Match AB literally, preceded by a word boundary to prevent AB from being part of a larger word
[()\d-]+ Match 1+ times any of the characters listed in the character class
\s Match a whitespace char (or use \s+ to match 1 or more)
\K Reset the starting point of the reported match (forget what was matched so far)
\S+ Match 1+ times a non-whitespace character
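A JavaScript sketch of the character-class variant against the marker spellings listed in the question. Because JavaScript lacks \K, a capturing group around \S+ stands in for it; the last sample string is made up to show how permissive the class is.

```javascript
// Character-class version of the marker; group 1 replaces \K\S+.
const looseMarker = /\bAB[()\d-]+\s(\S+)/;

// Marker spellings from the question, each followed by the wanted value.
const markerVariants = [
  "AB33(20) YA5619",
  "AB-33(20) YA5619",
  "AB33(-20) YA5619",
  "AB33(-1) YA5619",
];

const capturedValues = markerVariants.map((text) => {
  const m = text.match(looseMarker);
  return m ? m[1] : null;
});

console.log(capturedValues); // ["YA5619", "YA5619", "YA5619", "YA5619"]

// Hypothetical malformed marker: the class accepts any mix of its characters.
const malformedAccepted = looseMarker.test("AB)- X");
console.log(malformedAccepted); // true
```

The character class is shorter to write but does not enforce the AB33(20) shape, so it is a reasonable choice only when the input is known to be well formed.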
My regular expression is '(\d+)\1+'.
My aim is to capture repeating patterns such as 2323, 1212, 345345, which consist of different digits. The current regex also captures 11, 22, 11111, which I need to exclude.
Example -
For the input = 44556841335158684945454545
Matches are
44
55
45454545
Matches should be -
45454545
How do I write a regex that excludes 44 and 55 and only gives results made up of different digits?
Here is the regex I believe you want:
(\d)((?!\1)\d)
A bit of explanation:
(\d) captures a single digit in group 1 (\d is equivalent to [0-9]).
((?!\1)\d) captures a second digit in group 2. The negative lookahead (?!\1) asserts that the text matched by group 1 does not occur at this position, so the second digit must differ from the first.
Here is a quick JS demo:
// Lists the non-overlapping pairs of adjacent digits that differ from each other.
var s = "44556841335158684945454545";
console.log(s.match(/(\d)((?!\1)\d)/g));
To say "two different digits, repeated" you can try
((\d)(?!\2)\d)\1
Capturing parentheses are numbered from the left; so \1 matches the entire outer pair of parentheses, and (?!\2) refers to the inner parentheses around the first digit, constraining the second digit so that it cannot be identical to the first.
Demo: https://regex101.com/r/5f2CEf/1
To cover all adjacent repetitions of the pair, add a + at the end: ((\d)(?!\2)\d)\1+
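A quick JavaScript check of the pattern with the trailing + against the input from the question. The second test string, 345345, is taken from the question's list of wanted patterns.

```javascript
// Pair of two different digits (group 1), repeated one or more times.
const repeatedPair = /((\d)(?!\2)\d)\1+/g;

const digitString = "44556841335158684945454545";
const repeatedMatches = digitString.match(repeatedPair);

console.log(repeatedMatches); // ["45454545"]

// The pattern only handles two-digit units, so a three-digit unit is missed.
const threeDigitUnit = "345345".match(repeatedPair);
console.log(threeDigitUnit); // null
```

So this pattern excludes 44 and 55 as requested, but a unit longer than two digits, such as 345345, would need a different pattern.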
I want to match a pattern with regex, the pattern is:
A-Za-z1-9[0-9-0-9]
so for example:
test1[1-50]
Can you help me ?
Solution update:
^[A-Za-z0-9]+\[[0-9]+-[0-9]+]$
Use this regex: [A-Za-z]+[1-9]\[[0-9]+-[0-9]+\]. You might also want to add \b at the start of the regex so that it only matches at the start of a word.
[A-Za-z]+ matches things like test: only letters are accepted, one or more times
[1-9] matches any digit except 0
\[[0-9]+-[0-9]+\] matches two runs of one or more digits separated by -, all enclosed in square brackets. (You need to escape the brackets with \ because they are metacharacters.)
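The two patterns in this thread are not equivalent, which a short JavaScript sketch can show. The closing bracket of the "Solution update" pattern is escaped here for readability, and every sample string except test1[1-50] is made up.

```javascript
// "Solution update" pattern (closing bracket escaped for readability).
const updatedPattern = /^[A-Za-z0-9]+\[[0-9]+-[0-9]+\]$/;
// Pattern from the answer, with the suggested \b at the start.
const answerPattern = /\b[A-Za-z]+[1-9]\[[0-9]+-[0-9]+\]/;

// Only the first sample is from the question; the rest are hypothetical.
const rangeSamples = ["test1[1-50]", "test0[1-50]", "test[1-50]", "test1[150]"];

const updatedResults = rangeSamples.map((text) => updatedPattern.test(text));
const answerResults = rangeSamples.map((text) => answerPattern.test(text));

console.log(updatedResults); // [true, true, true, false]
console.log(answerResults);  // [true, false, false, false]
```

The answer's pattern insists on letters followed by exactly one digit from 1 to 9, while the updated pattern accepts any mix of letters and digits before the bracket.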
I need a regular expression for phone numbers like:
(123)-123-1212 Valid
(123)-123-121 Invalid
(123)-123-12 Invalid
1212-344--- Invalid
(000)-123-1212 Invalid
Only the first format should be valid. Each digit should be 0-9.
I don't have any idea how to write this expression.
You can use the following:
^\((?!000)\d{3}\)-\d{3}-\d{4}$
Explanation:
^ match the start of the string
\( followed by an opening parenthesis ( (escaped because it has a special meaning in regex)
(?!000) negative lookahead (fails if the next three characters are 000)
\d{3} match a digit exactly three times (\d is equivalent to [0-9])
\) closing parenthesis
- match a hyphen literally
\d{3}-\d{4} followed by exactly 3 digits, then a hyphen and exactly 4 digits
$ followed by the end of the string (so that it won't match strings with other characters after the specified pattern)
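The explanation above can be verified against the five examples from the question with a short JavaScript sketch (the variable names are illustrative):

```javascript
// Pattern copied from the answer above.
const phonePattern = /^\((?!000)\d{3}\)-\d{3}-\d{4}$/;

// The five examples from the question, in the same order.
const phoneSamples = [
  "(123)-123-1212",
  "(123)-123-121",
  "(123)-123-12",
  "1212-344---",
  "(000)-123-1212",
];

const phoneResults = phoneSamples.map((text) => phonePattern.test(text));

console.log(phoneResults); // [true, false, false, false, false]
```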
I am using the regex
(.*)\d.txt
on the expression
MyFile23.txt
Now the online tester says that, using the above regex, the mentioned string would be allowed (selected). My understanding is that it should not be allowed, because there are two numeric digits, 2 and 3, while the above regex has only one numeric digit in it, i.e. \d. It should have been \d+. My current expression reads: zero or more of any character, followed by one numeric digit, followed by .txt. My question is: why does the above string pass the regex?
This regex (.*)\d.txt will still match MyFile23.txt because of .* which will match 0 or more of any character (including a digit).
So for the given input MyFile23.txt, here is the breakdown:
.* # matches MyFile2
\d # matches 3
. # matches a dot (though it can match any character here because the dot is unescaped)
txt # matches the literal txt
To make sure it only matches MyFile2.txt you can use:
^\D*\d\.txt$
Where ^ and $ are anchors to match start and end. \D* will match 0 or more non-digit.
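A JavaScript sketch comparing the original pattern with the anchored one. The file name MyFile2Xtxt is a made-up example to show the effect of the unescaped dot.

```javascript
// Original pattern from the question and the anchored fix from this answer.
const loosePattern = /(.*)\d.txt/;
const strictPattern = /^\D*\d\.txt$/;

// Group 1 shows what .* consumed in the original pattern.
const looseGroup = "MyFile23.txt".match(loosePattern)[1];
console.log(looseGroup); // "MyFile2"

// The unescaped dot lets any character stand in for the period.
const looseAcceptsX = loosePattern.test("MyFile2Xtxt");
console.log(looseAcceptsX); // true

// The anchored pattern allows exactly one digit before .txt.
const strictTwoDigits = strictPattern.test("MyFile23.txt");
const strictOneDigit = strictPattern.test("MyFile2.txt");
console.log(strictTwoDigits, strictOneDigit); // false true
```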
The pattern you have has one group, (.*), which in your example matches MyFile2, because . allows any character.
Furthermore, the . after the \d is not escaped, so it allows one more character of any kind.
To avoid this, use:
(\D*)\d+\.txt
The group (\D*) now matches only non-digit characters.
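A short JavaScript sketch of this variant. Unlike a single \d, the \d+ here deliberately accepts several digits, so MyFile23.txt still matches, but the digits no longer end up in the group. MyFile23Xtxt is a made-up name to check the escaped dot.

```javascript
// Pattern copied from the answer above.
const multiDigitPattern = /(\D*)\d+\.txt/;

const multiDigitMatch = "MyFile23.txt".match(multiDigitPattern);
const wholeMatch = multiDigitMatch[0];
const nameGroup = multiDigitMatch[1];

console.log(wholeMatch); // "MyFile23.txt"
console.log(nameGroup);  // "MyFile"

// With the dot escaped, another character can no longer replace the period.
const rejectsX = !multiDigitPattern.test("MyFile23Xtxt");
console.log(rejectsX); // true
```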
Here is the explanation of why "MyFile23.txt" matches the regex pattern:
A literal period . should always be escaped as \. or else it will match "any character".
And finally, (.*) matches the whole string from the beginning up to, but not including, the last digit (MyFile2).
So, I'd suggest the following fix:
^\D*\d\.txt$ = the beginning of a line/string, any number of non-digit characters, a digit, a literal period, a literal txt, and the end of the string/line (whether ^ and $ match per line depends on the m switch, which you would choose based on the input: a list of names on separate lines, or just a single file name).