Apply negative look-ahead to preceding group - regex

I've been trying for hours to get this negative look-ahead to work. It should match my string only if it's NOT followed by '/CCC'.
http://refiddle.com/1xb
/(^[\w]+)(?!./CCC$)/mg
Test string:
BBB/CCC
AAA/DDD/CCC
Could someone point out why my pattern still matches the 'BBB' of the first line?

Firstly, you have to escape the / inside the regular expression.
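You also have a dot that shouldn't be there, and you are missing a word boundary. Without \b, the engine can backtrack \w+ to 'BB', a position where '/CCC' does not follow, so the look-ahead passes: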
/(^\w+)\b(?!\/CCC$)/mg
refiddle
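To see the effect of each fix, here is a small Python sketch that runs the three variants against the test string from the question. Python is only an assumption here (the question uses a slash-delimited literal); in a Python raw string the / needs no escaping, and re.M stands in for the m flag.

```python
import re

text = "BBB/CCC\nAAA/DDD/CCC"

# Original pattern: the stray dot makes the look-ahead test "./CCC",
# which never follows "BBB", so the negative look-ahead always passes.
original = re.findall(r"(^[\w]+)(?!./CCC$)", text, re.M)

# Dot removed but no \b: \w+ backtracks to "BB", where "/CCC" is not next.
no_boundary = re.findall(r"(^\w+)(?!/CCC$)", text, re.M)

# Corrected pattern: \b pins the end of the word, so backtracking cannot help.
fixed = re.findall(r"(^\w+)\b(?!/CCC$)", text, re.M)

print(original)     # ['BBB', 'AAA']
print(no_boundary)  # ['BB', 'AAA']
print(fixed)        # ['AAA']
```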

Related

How to create proper regular expression to find last character which I want to?

I need a regex that finds the last underscore in strings like 012344_2.0224.71_3 or 012354_5.00123.AR_3.335_8
I tried to find the last part with the expression [^.]+$ and then find the underscore inside that match, but I can't get it to work.
I hope you can help me :)
Just use a negated character class [^_], which matches anything except an underscore (this ensures no other underscores follow), together with the end-of-string anchor $.
Pattern would look as such:
(_)[^_]*$
The final underscore _ is in a capturing group, so the submatch (group 1) is what you want to return or replace.
See it live: Regex101
Notice the green highlighted portion on Regex101: this is your submatch, and it is what would be replaced.
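As a rough illustration of working with the submatch, the following Python sketch locates group 1 and replaces only that character. The variable names and the "-" replacement are made up for the example.

```python
import re

samples = ["012344_2.0224.71_3", "012354_5.00123.AR_3.335_8"]

results = []
for s in samples:
    m = re.search(r"(_)[^_]*$", s)
    # start(1)/end(1) give the span of the captured underscore only,
    # not of the whole match (which also covers the trailing characters).
    results.append(s[:m.start(1)] + "-" + s[m.end(1):])

print(results)  # ['012344_2.0224.71-3', '012354_5.00123.AR_3.335-8']
```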
The simplest solution I can imagine is using .*\K_, however not all regex flavours support \K.
If not, another idea would be to use _(?=[^_]*$)
You have a demo of the first and second option.
Explanation:
.*\K_: Matches any characters up to an underscore. Since the * quantifier is greedy, it matches up to the last underscore. Then \K discards what has been matched so far, so only the underscore is reported as the match.
_(?=[^_]*$): Matches an underscore followed only by non-underscore characters until the end of the line.
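Python's standard re module is, as far as I know, one of the flavours without \K. The same greedy idea can be approximated there by capturing the prefix and putting it back. This is only a sketch: replace_last_underscore is a hypothetical helper name and "-" an arbitrary replacement.

```python
import re

def replace_last_underscore(s: str, repl: str = "-") -> str:
    # (.*) is greedy, so it extends to the last underscore on the line.
    # The function replacement re-emits the captured prefix, which plays
    # the role that \K plays in flavours that support it.
    return re.sub(r"(.*)_", lambda m: m.group(1) + repl, s, count=1)

print(replace_last_underscore("012344_2.0224.71_3"))  # 012344_2.0224.71-3
print(replace_last_underscore("no underscore here"))  # no underscore here
```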
If you want nothing but the "net" (i.e., nothing matched except the last underscore), use positive lookahead to check that no more underscores are in the string:
/_(?=[^_]*$)/gm
Demo
The pattern [^.]+$ matches any character except a dot one or more times and then asserts the end of the string. That will give you the matches 71_3 and 335_8
What you want to match is an underscore when there are no more underscores following.
One way to do that is a negative lookahead (?!.*_), if that is supported, which asserts that what is to the right is not any run of characters followed by an underscore:
_(?!.*_)
Pattern demo
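The positive and negative lookahead variants should pick out the same single underscore. A quick Python check, assuming single-line input strings like the ones in the question:

```python
import re

positive = re.compile(r"_(?=[^_]*$)")  # only non-underscores until the end
negative = re.compile(r"_(?!.*_)")     # no further underscore to the right

samples = ["012344_2.0224.71_3", "012354_5.00123.AR_3.335_8"]

for s in samples:
    pos_hits = [m.start() for m in positive.finditer(s)]
    neg_hits = [m.start() for m in negative.finditer(s)]
    # Each variant finds exactly one match: the last underscore.
    print(s, pos_hits, neg_hits, pos_hits == neg_hits == [s.rfind("_")])
```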

Regex - returning a match without a period

I'm using the below regex string to match the word "kohls" which is located in a group of other words.
\W*((?i)kohls(?-i))\W*
It works great when the word is alone, but if the word is in a url, the match includes a period on both sides.
See the below examples:
Thank you for shopping at Kohls - returns a match for kohls.
https://www.kohls.com - returns a match for .kohls.
Edit. https://www.KohlsAndMichaels.com - doesn't return any match for kohls.
I want it to only extract the exact match for kohls without periods or any other symbols/text in front or behind it. Can you tell me what I'm doing wrong?
In cases like that you can always use a site like regex101.com, which explains the regular expression and highlights the matches in color.
The problem with the dots is the \W*, which matches any run of non-word characters, including periods. To fix this, you can use the following regular expression:
\b((?i)kohls(?-i))\b
The \b (before and after the word you want to match) asserts the position at a word boundary.
If you still have questions, the explanation of the regular expression provided by that website is worth a look.
The \W metacharacter matches non-word characters, so adding a star operator matches zero or more of them (periods, for example). Did you mean to use a word boundary instead?
\b(?i)kohls(?-i)\b
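Here is a small Python comparison of the two approaches on the examples from the question. One assumption to flag: Python's re does not accept the mid-pattern (?i)...(?-i) toggles, so this sketch passes re.IGNORECASE instead; the names loose and strict are made up.

```python
import re

loose = re.compile(r"\W*(kohls)\W*", re.IGNORECASE)   # pattern from the question
strict = re.compile(r"\bkohls\b", re.IGNORECASE)      # word-boundary version

url = "https://www.kohls.com"
print(loose.search(url).group(0))   # .kohls.
print(strict.search(url).group(0))  # kohls
print(strict.search("Thank you for shopping at Kohls").group(0))  # Kohls

# \b fails between "s" and "A", so the embedded word is still not matched.
print(strict.search("https://www.KohlsAndMichaels.com"))  # None
```

Note that the word-boundary version still returns nothing for the KohlsAndMichaels URL; matching that case would need a different rule than "whole word".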
Replace both \W* with [\W,\.\-]* etc.
Should be enough.

Regex match every dot

I'd like to match every dot or comma but not in href attribute. So I have this regular expression:
^(?!.*?href=)(.*?)([.,])(\S+)
But it matches only the first occurrence. I think it's because of the non-greedy .*?, but I can't come up with anything else. Can you help me, please?
To match every dot or comma outside the href, assuming the attribute value is enclosed in single or double quotes, you can match what you don't want and capture what you do want in a group.
To skip the dots inside the href, match the whole attribute with href= followed by "[^"]*" or '[^']*'. Then use an alternation | to capture a dot or a comma in a group with ([.,]):
href=(?:"[^"]*"|'[^']*')|([.,])
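The "match what you don't want, capture what you do" idea can be sketched in Python as follows. The sample HTML string and the "*" replacement are invented for illustration; the key point is that group 1 is only set when the dot-or-comma alternative matched.

```python
import re

pattern = re.compile(r"""href=(?:"[^"]*"|'[^']*')|([.,])""")

html = 'See <a href="http://a.b.com/x,y">a.b, c</a>.'

# Keep only matches where group 1 participated; href matches are skipped.
hits = [(m.start(1), m.group(1)) for m in pattern.finditer(html) if m.group(1)]

# Replace captured punctuation; put skipped href attributes back unchanged.
masked = pattern.sub(lambda m: "*" if m.group(1) else m.group(0), html)

print([ch for _, ch in hits])  # ['.', ',', '.']
print(masked)  # See <a href="http://a.b.com/x,y">a*b* c</a>*
```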
If you want to match every occurrence, you will need to run the regex with the global (g) flag:
e.g.
/^(?!.*?href=)(.*?)([.,])(\S+)/g
I suggest you use a tool such as https://regex101.com/ to test and debug your regular expressions, it's super handy!

trying to find the correct regular expression

I have the following cases that should match a regular expression. I've tried several combinations and read a lot of answers, but I still have no clue how to solve it.
The rule is: find any combination of . inside a quoted string. At the moment I have the following regexp
\"\w*((..)|(.))\w*\"
that covers most of the cases:
mmmas"A.F"asdaAA
196.34.45.."asd."#
".add"
sss"a.aa"sss
".."
"a.."
"a..a"
"..A"
but still having problems with this one:
"WERA.HJJ..J"
I've been testing the regexp on the http://regexr.com/ site
I would really appreciate any help with this
Change your regex to
\"\w*(\.+\w*)+\"
Update: escape . to match the dot and not any character
demo
From the question, it seems that you need to find every occurrence of one or more dots (along with optional word characters) inside a pair of quotes. The following regex would do this:
\"\w*(\.+\w*)+\"
In "WERA.HJJ..J", you have some word characters followed by a dot, then another sequence of word characters, then more dots and word characters. Your regex only allows one or two dots with an optional block of word characters on either side.
The dots in the regex are escaped so that they match a literal dot rather than any character, since . is a metacharacter.
Check here.
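For a quick check outside the online tester, this Python sketch runs the suggested pattern over every case listed in the question. In a Python raw string the quotes need no backslash, so the pattern is written without them; the variable names are assumptions.

```python
import re

suggested = re.compile(r'"\w*(\.+\w*)+"')
question = re.compile(r'"\w*((..)|(.))\w*"')  # pattern from the question

cases = [
    'mmmas"A.F"asdaAA', '196.34.45.."asd."#', '".add"', 'sss"a.aa"sss',
    '".."', '"a.."', '"a..a"', '"..A"', '"WERA.HJJ..J"',
]

for case in cases:
    m = suggested.search(case)
    print(case, "->", m.group(0) if m else None)

# The question's pattern allows at most two non-word characters in a row,
# so it cannot span the three dots in the last case.
print(question.search('"WERA.HJJ..J"'))  # None
```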

How to match digits and dots. It has to start with digits first

For this example hello.1.2.3.4.world I want to match a result which gives me 1.2.3.4. The number of digits between dots doesn't matter, as long as it follows the digit.digit pattern.
My partial solution was the following regular expression: [\d.]+.[^.a-z], which gives me .1.2.3.4 as a result. I then strip the first dot by using trim or a similar method.
Is there a regexp master who can tell me how to get rid of the first dot with a single regular expression?
How about this: \.(\d(?:\.\d)*)\.\D
EDIT:
(\d+(?:\.\d+)*)
Demo
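A short Python comparison of the two patterns in this answer. The second input string is a made-up example to show the difference once a segment has more than one digit.

```python
import re

text = "hello.1.2.3.4.world"

# First suggestion: single digits between dots, captured in group 1.
first = re.search(r"\.(\d(?:\.\d)*)\.\D", text).group(1)

# Edited suggestion: one or more digits per segment, no surrounding context.
second = re.search(r"\d+(?:\.\d+)*", text).group(0)

print(first, second)  # 1.2.3.4 1.2.3.4

# With multi-digit segments only the edited pattern returns the full run.
multi = "v.10.20.3.end"
print(re.search(r"\.(\d(?:\.\d)*)\.\D", multi).group(1))  # 3
print(re.search(r"\d+(?:\.\d+)*", multi).group(0))        # 10.20.3
```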
If you want to keep your current regex, you can put a lookahead at the start and escape the literal dot where it is not inside a character class: (?=\d)[\d.]+\.[^.a-z]
The lookahead (?=\d) will make sure the first character matched is a digit.
Demo here
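The effect of the lookahead can be checked with a few lines of Python; this is just a sketch using the sample string from the question.

```python
import re

text = "hello.1.2.3.4.world"

# Pattern from the question: the match may start on the leading dot.
without_lookahead = re.search(r"[\d.]+.[^.a-z]", text).group(0)

# With (?=\d) the match can only start where the next character is a digit.
with_lookahead = re.search(r"(?=\d)[\d.]+\.[^.a-z]", text).group(0)

print(without_lookahead)  # .1.2.3.4
print(with_lookahead)     # 1.2.3.4
```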