I would like to match 10 characters after the second pattern:
My String:
www.mysite.de/ep/3423141549/ep/B104RHWZZZ?something
What I want to be matched:
B104RHWZZZ
What the regex currently matches:
B104RHWZZZ?something
Currently, my Regex looks like this:
(?<=\/ep\/)(?:(?!\/ep\/).)*$
Could someone help me change the regex so that it only matches the 10 characters after the second "/ep/" ("B104RHWZZZ")?
It depends on which characters you want to allow. If you want to match 10 non-whitespace characters other than / or ?, you could use:
(?<=\/ep\/)[^\/?\s]{10}(?=[^\/\s]*$)
Explanation
(?<=\/ep\/) Assert /ep/ directly to the left
[^\/?\s]{10} Match 10 times any non whitespace character except for / and ?
(?=[^\/\s]*$) Assert that no / or whitespace character occurs between here and the end of the string
Or matching 1+ chars other than / ? & instead of exactly 10:
(?<=\/ep\/)[^\/?&\s]+(?=[^\/\s]*$)
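As a quick check, both patterns above can be run against the sample URL. This is only a minimal sketch in Python (an assumption, since the question does not name a regex flavor), and the variable names are illustrative. The slashes are left unescaped because Python does not use / as a regex delimiter.

```python
import re

url = "www.mysite.de/ep/3423141549/ep/B104RHWZZZ?something"

# Exactly 10 characters after the last /ep/ (no /, ? or whitespace in the match).
exact_ten = re.compile(r"(?<=/ep/)[^/?\s]{10}(?=[^/\s]*$)")

# One or more characters other than / ? & or whitespace after the last /ep/.
one_or_more = re.compile(r"(?<=/ep/)[^/?&\s]+(?=[^/\s]*$)")

exact_match = exact_ten.search(url)
loose_match = one_or_more.search(url)

print(exact_match.group())  # B104RHWZZZ
print(loose_match.group())  # B104RHWZZZ
```

The first /ep/ is skipped in both cases because the lookahead refuses any later / before the end of the string.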
This would capture the wanted string in group 1:
ep\/\w+\/ep\/(\w+)
https://regex101.com/r/9tUjxG/1
While lookarounds can make the expression more sophisticated, so that you don't need capturing groups, in my experience they make it hard to read, understand, and maintain or extend.
That's why I would always keep regexes as simple as possible.
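For comparison, here is a rough sketch of the capturing-group version in Python (the language choice and the name product_id are assumptions for illustration, not part of the answer):

```python
import re

url = "www.mysite.de/ep/3423141549/ep/B104RHWZZZ?something"

# Consume "ep/<first id>/ep/" and capture the word characters that follow.
match = re.search(r"ep/\w+/ep/(\w+)", url)
product_id = match.group(1) if match else None

print(product_id)  # B104RHWZZZ
```

Since \w does not match ?, the capture stops right before the query string.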
For example, I have these strings
APPLEJUCE1A
APPLETREE2B
APPLECAKE3C
APPLETEA1B
APPLEWINE3B
APPLEWINE1C
I want all of these strings except those that have TEA or WINE1C in them.
APPLEJUCE1A
APPLETREE2B
APPLECAKE3C
APPLEWINE3B
I've already tried the following, but it didn't work:
^APPLE(?!.*(?:TEA|WINE1C)).*$
Any help is appreciated as I'm also kinda new to this.
If you indeed have multiple strings, as you claim, there's no need to jam all of that into one regex pattern.
/^APPLE/ && !/TEA|WINE1C/
If you have a single string, the best approach is probably to split it into lines (split /\n/), but you could also use a single regex match:
/^APPLE(?!.*(?:TEA|WINE1C)).*/mg
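The two-test idea is not tied to any one language. A minimal sketch of the same filter in Python (an assumption, the snippets above look like Perl; the names below are illustrative) could look like this:

```python
import re

names = ["APPLEJUCE1A", "APPLETREE2B", "APPLECAKE3C",
         "APPLETEA1B", "APPLEWINE3B", "APPLEWINE1C"]

starts_with_apple = re.compile(r"^APPLE")
forbidden = re.compile(r"TEA|WINE1C")

# Keep a name only if it starts with APPLE and contains no forbidden part.
kept = [n for n in names
        if starts_with_apple.search(n) and not forbidden.search(n)]

print(kept)  # ['APPLEJUCE1A', 'APPLETREE2B', 'APPLECAKE3C', 'APPLEWINE3B']
```

Two small patterns are easier to extend than one pattern with lookaheads: adding another forbidden substring only touches the second one.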
You can use
^APPLE(?!.*TEA)(?!.*WINE1C).*
See the regex demo.
Details:
^ - start of string
APPLE - a fixed string
(?!.*TEA) - no TEA allowed anywhere to the right of the current location
(?!.*WINE1C) - no WINE1C allowed anywhere to the right of the current location
.* - any zero or more chars other than line break chars as many as possible.
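To see the two lookaheads at work on the sample data, here is a small sketch in Python (an assumed language; re.MULTILINE is added so that ^ matches at the start of every line of a single multi-line string):

```python
import re

text = """APPLEJUCE1A
APPLETREE2B
APPLECAKE3C
APPLETEA1B
APPLEWINE3B
APPLEWINE1C"""

# Each lookahead scans only to the end of the current line,
# because . does not match a line break by default.
pattern = re.compile(r"^APPLE(?!.*TEA)(?!.*WINE1C).*", re.MULTILINE)

matches = pattern.findall(text)
print(matches)  # ['APPLEJUCE1A', 'APPLETREE2B', 'APPLECAKE3C', 'APPLEWINE3B']
```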
If you don't want to match a string that contains both of them (a case that is not in the current example data):
^APPLE(?!.*(WINE1C|TEA).*(?!\1)(?:TEA|WINE1C)).*
Explanation
^ Start of string
APPLE match literally
(?! Negative lookahead
.*(WINE1C|TEA) Capture either one of the values in group 1
.* Match 0+ characters
(?!\1)(?:TEA|WINE1C) Match either one of the values as long as it is not the same as previously matched in group 1
) Close the lookahead
.* Match the rest of the line
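Since the example data has no string containing both values, here is a sketch in Python with a few made-up test strings (APPLETEAWINE1C and APPLEWINE1CTEA are hypothetical inputs, not from the question) to show what the pattern rejects:

```python
import re

pattern = re.compile(r"^APPLE(?!.*(WINE1C|TEA).*(?!\1)(?:TEA|WINE1C)).*")

samples = [
    "APPLECAKE3C",     # neither value
    "APPLETEA1B",      # only TEA
    "APPLEWINE1C",     # only WINE1C
    "APPLETEAWINE1C",  # both, TEA first (hypothetical)
    "APPLEWINE1CTEA",  # both, WINE1C first (hypothetical)
]

# True means the string is accepted by the pattern.
results = {s: bool(pattern.match(s)) for s in samples}

for sample, accepted in results.items():
    print(sample, accepted)
```

Only the last two strings are rejected: the lookahead succeeds (and the match fails) only when the second value found differs from the one captured in group 1.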
I need some help here
Here is an example of what I'm trying to match:
1 ScreenMail Enable friendly none Internal any 5
I need to match everything except the trailing digits (5), that is, the first digit (1), the spaces, letters, special characters, and so on. I tried using /^(\d), but it stopped after matching the first digit. Your assistance would be appreciated.
The simplest way is probably to remove the trailing digits by replacing a match of
\d+$
or, if whitespace may follow the digits,
\d+\s*$
with an empty string.
See the regex demo.
You may want to use a matching regex like
^.*[^\d\s]
that matches any zero or more chars other than line break chars (.*) as many as possible and then a char other than a digit and whitespace. See this regex demo.
However, if the digits are followed by optional whitespace, or if you allow any text after the last digits, it will fail. You can then use
^.*[^\d\s](?=\s*\d)
See this regex demo. The (?=\s*\d) positive lookahead requires zero or more whitespaces and then a digit immediately to the right of the current location.
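Both approaches can be tried on the sample line. The following is a minimal sketch in Python (an assumption, the question does not say which language is used; the variable names are illustrative):

```python
import re

line = "1 ScreenMail Enable friendly none Internal any 5"

# Option 1: delete the trailing digits (and any whitespace after them).
without_digits = re.sub(r"\d+\s*$", "", line)

# Option 2: match everything up to the last non-digit, non-whitespace
# character that is followed by optional whitespace and a digit.
m = re.match(r"^.*[^\d\s](?=\s*\d)", line)
prefix = m.group() if m else None

print(repr(without_digits))  # '1 ScreenMail Enable friendly none Internal any '
print(repr(prefix))          # '1 ScreenMail Enable friendly none Internal any'
```

Note the difference: the removal approach keeps the space before the digits, while the matching approach stops at the last non-whitespace character.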
I need to process numbers that may have optional thousand-separators, such as 1234567 and 1,234,567
I naively assumed I could achieve this with
(\d{1,3}([,]?(\d{3}))*)
This, however, matches only 123456 (not the 7) and 1,234,567 (correctly)
However, if I specify an explicit number of matches (2 in this case)
(\d{1,3}([,]?(\d{3})){2})
or a bound (such as \b)
(\d{1,3}([,]?(\d{3}))*)\b
the full match is performed.
Why does the “greedy” * quantifier stop after the first match in the first regex?
If you want to match both numbers with, and without, proper comma thousands separators, then I would use an alternation:
^(\d{1,3}(?:,\d{3})*|\d+)$
The reason is that \d{1,3} is greedy, so it matches 123 at the beginning of the number. Then the rest of the regexp will only match groups of exactly 3 digits because it uses \d{3}. A regular expression doesn't try to match the longest possible string, so it won't backtrack and shorten the match for \d{1,3} to make the rest of the regexp go further.
But if you add a word boundary \b at the end, it no longer matches with that 3-digit prefix. That causes it to backtrack until it's able to match groups of 3 digits ending with a word boundary.
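The backtracking behavior described above is easy to reproduce. Here is a short sketch in Python (an assumed language; the same behavior is expected in other backtracking engines, and the names are illustrative):

```python
import re

number = "1234567"

# \d{1,3} greedily takes "123", the group then takes "456", and nothing
# forces the engine to backtrack, so the trailing "7" is left out.
unanchored = re.match(r"(\d{1,3}([,]?(\d{3}))*)", number).group(1)

# \b cannot match between "6" and "7", so the engine backtracks until
# \d{1,3} takes only "1" and the group takes "234" and then "567".
bounded = re.match(r"(\d{1,3}([,]?(\d{3}))*)\b", number).group(1)

# Alternation from the answer: well-formed separators, or plain digits.
strict = re.compile(r"^(\d{1,3}(?:,\d{3})*|\d+)$")

print(unanchored)  # 123456
print(bounded)     # 1234567
print(bool(strict.match("1234567")), bool(strict.match("1,234,567")))  # True True
print(bool(strict.match("12,34")))  # False
```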
I have a problem with my regular expression: I am trying to extract a string, number, or whatever else follows a particular string.
I have this string:
TEST      3098
There are 6 spaces between TEST and its value, but I am not sure that it is always 6 spaces.
I am trying this regular expression (PCRE)
(?<=TEST\s\s\s\s\s\s).*?(?=\s)
The result should be 3098. With my regular expression I get the right result, but it is not robust enough: if the number of spaces changes, I won't be able to extract the value.
The lookbehind has to be of limited size.
Any suggestions?
You may use
TEST\s*\K\S+
If the number of whitespaces should be limited to some min/max range, use a limiting quantifier: \s{2,} will match two or more, and \s{1,10} will allow 1 to 10 whitespaces.
Details
TEST - the literal string TEST
\s* - 0 or more whitespaces
\K - match reset operator that omits the text matched so far from the overall match memory buffer
\S+ - 1+ non-whitespaces
See the regex demo
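If you ever need the same extraction outside PCRE, note that \K is not available everywhere. For instance, Python's built-in re module does not support it, so the sketch below swaps in a capturing group instead (a different technique with the same result; the variable names are illustrative):

```python
import re

text = "TEST      3098"

# No \K in Python's re: match the prefix normally and capture the value.
match = re.search(r"TEST\s*(\S+)", text)
value = match.group(1) if match else None

print(value)  # 3098
```

As with the PCRE pattern, the number of whitespace characters between TEST and the value no longer matters.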
I need to find all the words in an input text that contain (?i:val) and are no longer than 5 characters.
So far I got: \b([a-zA-Z]*(?i:val)[a-zA-Z]*){1,4}\b
If we take this sample text to look in: In computer science, a value is an expression which cannot be evaluated any further (a normal form). Val is also a match
I get 3 matches (value, evaluated and Val), however evaluated should not match the pattern, as it is too long. What is the right way to get this straight?
Your pattern does not account for the length of the words matched.
Use word boundaries and a lookahead like this:
(?i)\b(?=\w*val)\w{1,5}\b
See regex demo
The regex matches:
\b - a leading word boundary since the next pattern is \w
(?=\w*val) - a lookahead making sure there is a val substring after zero or more word characters
\w{1,5} - matches 1 to 5 word characters
\b - trailing word boundary that stops words more than 5 characters long from matching
You may use an ASCII JS version of the regex:
/\b(?=[a-z]*val)[a-z]{1,5}\b/i
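Here is the lookahead pattern applied to the sample text from the question, as a minimal sketch in Python (an assumed language; the case-insensitive flag is passed as re.IGNORECASE rather than inline):

```python
import re

text = ("In computer science, a value is an expression which cannot be "
        "evaluated any further (a normal form). Val is also a match")

# The lookahead checks for "val" inside the word; \w{1,5}\b caps the length.
pattern = re.compile(r"\b(?=\w*val)\w{1,5}\b", re.IGNORECASE)

words = pattern.findall(text)
print(words)  # ['value', 'Val']
```

"evaluated" is skipped because no word boundary follows any of its first five characters.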
It's important to understand why "evaluated" was matched. Note:
[a-zA-Z]* matches the "e"
(?i:val) matches "val"
[a-zA-Z]* matches "uated"
Actually, there's no repetition here! The pattern matched in a single iteration.
You can achieve what you want using lookarounds, but I think regex is not the best tool for this task. I highly recommend using other functions, depending on what you have available.
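As one possible reading of "other functions", here is a sketch in Python that avoids regex entirely (the language is an assumption, and splitting on whitespace plus stripping punctuation is a crude tokenizer chosen only for illustration):

```python
import string

text = ("In computer science, a value is an expression which cannot be "
        "evaluated any further (a normal form). Val is also a match")

# Split on whitespace, then trim punctuation such as "," "(" or ")." from
# both ends of each token.
candidates = [token.strip(string.punctuation) for token in text.split()]

# Keep words of at most 5 characters that contain "val" in any letter case.
short_val_words = [w for w in candidates
                   if len(w) <= 5 and "val" in w.lower()]

print(short_val_words)  # ['value', 'Val']
```

The length limit and the substring test are two separate, readable conditions here, which is the main appeal over a single pattern.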