regex negative lookahead with wildcard - regex

I am not sure if the title makes sense but I was not sure how to word it.
I have strings (filenames) that look like this:
/aa/john/doe/xx/yy/xxTRUEyy.jar
/bb/ee/john/doe/xx/yy/aaTRUE.jar
/cc/john/doe/xx/yy/aaFALSE.jar
/dd/john/deere/xx/yy/aaTRUE.jar
I need a regex that does NOT match strings that HAVE /john/doe/ in them, AND have a jar file with TRUE as part of the name. (In the above examples, that should match only string 3 and 4).

You can use a (?!) group to perform negative lookahead:
^(?!.*\/john\/doe\/).*TRUE[^\/]*\.jar$
should be sufficient. regex101 demo.
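To check the pattern against the four sample paths, here is a minimal Python sketch. The paths are copied from the question and answer_re is the pattern above (the slashes need no escaping in Python). literal_re is an assumed variant, not part of the answer: it moves the TRUE-jar test inside the lookahead, which is one way to read the question's wording literally.

```python
import re

paths = [
    "/aa/john/doe/xx/yy/xxTRUEyy.jar",
    "/bb/ee/john/doe/xx/yy/aaTRUE.jar",
    "/cc/john/doe/xx/yy/aaFALSE.jar",
    "/dd/john/deere/xx/yy/aaTRUE.jar",
]

# Pattern from the answer: no /john/doe/ anywhere, and a TRUE jar at the end.
answer_re = re.compile(r"^(?!.*/john/doe/).*TRUE[^/]*\.jar$")

# Assumed variant: reject only paths that have BOTH /john/doe/ and a TRUE jar.
literal_re = re.compile(r"^(?!.*/john/doe/.*TRUE[^/]*\.jar$).*$")

answer_hits = [p for p in paths if answer_re.search(p)]
literal_hits = [p for p in paths if literal_re.search(p)]
print(answer_hits)
print(literal_hits)
```

answer_hits holds only the fourth path, while literal_hits holds the third and fourth. Which of the two is wanted depends on how the requirement is read.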

Related

Regex not ending with, without lookahead or lookbehind

It's sad but my regex library does not support lookahead and lookbehind assertions.
Is it possible to create a regex which matches strings not ending with my pattern, foobar\d\d for example?
It is possible, although a bit verbose. Say you want your string not to end with abc, here's a regex for it:
([^c]|[^b]c|[^a]bc)$
Note that this won't match strings like c or bc. To match those, the regex becomes more complicated:
([^c]|[^b]c|[^a]bc|^|^c|^bc)$
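As a rough check, both patterns can be compared with a plain endswith test in Python. The helper name not_ending_with_abc and the sample strings are made up for illustration, and the inputs are assumed to contain no newlines.

```python
import re

# Both patterns from the answer; neither uses lookahead or lookbehind.
SHORT = re.compile(r"([^c]|[^b]c|[^a]bc)$")
FULL = re.compile(r"([^c]|[^b]c|[^a]bc|^|^c|^bc)$")


def not_ending_with_abc(s: str) -> bool:
    """True when s does not end with 'abc' (hypothetical helper name)."""
    return FULL.search(s) is not None


samples = ["", "c", "bc", "abc", "xabc", "abcd", "xbc", "abcc"]
for s in samples:
    # Columns: input, result of the full pattern, result of the short pattern.
    print(repr(s), not_ending_with_abc(s), SHORT.search(s) is not None)
```

The full pattern agrees with "not s.endswith('abc')" on every sample; the short one differs only on the empty string, c and bc, which it fails to match.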

How to match ".js" and ".css" but exclude ".min.js" and ".min.css"?

I tried the following without success:
\.(?!min)(js|css)$
I'm not very familiar with negative lookaheads, so I'm probably doing something wrong.
How can my regex be modified to match .js and .css but exclude .min.js and .min.css?
You've got it quite right, except that you need to place the assertion before the dot and use lookbehind instead of lookahead:
(?<!\.min)\.(js|css)$
With lookahead this is more complicated, although you might manage it if you matched the complete filename:
^(.{0,3}|.*(?!\.min).{4})\.(js|css)$
(The part before the extension is either at most 3 characters long, or its last 4 characters are not .min. Isn't this horrible?)
You need to use negative lookbehind:
(?<!\.min)\.(js|css)$
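A small Python sketch that runs both the lookbehind pattern and the lookahead-only pattern over a few file names. The names are invented for illustration; Python's re accepts this lookbehind because it has a fixed width of four characters.

```python
import re

# Lookbehind version from the answers: the 4 characters before the
# extension dot must not be ".min".
lookbehind_re = re.compile(r"(?<!\.min)\.(js|css)$")

# Lookahead-only version from the first answer, matching the whole name.
lookahead_re = re.compile(r"^(.{0,3}|.*(?!\.min).{4})\.(js|css)$")

names = ["app.js", "app.min.js", "style.css", "style.min.css", "a.js", "notes.txt"]
kept_behind = [n for n in names if lookbehind_re.search(n)]
kept_ahead = [n for n in names if lookahead_re.search(n)]
print(kept_behind)
print(kept_ahead)
```

Both lists come out as app.js, style.css and a.js, so on these samples the two patterns agree.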

Regex to match certain word but not a particular combination

I have 15 titles as follows:
fruits-and-flowers-themeA
fruits-and-flowers-themeB
fruits-and-flowers-just-test-themeA
themeAfruitsandflowers
nice-fruits-and-flowers-themeA
botanical-names-themeA
I want a regex to help me get only those titles with "themeA" in them, but it should not include "nice" and not include "just-test" or "just-tests".
I tried
^(?!.*just-test|*just-tests|nice).*?(?:themeA).*,
but I still get fruits-and-flowers-just-test-themeA in the output.
How to fix this?
Thanks
You can use this regex with negative lookahead:
^(?!.*?(?:just-tests?|nice)).*?themeA.*$
Option 1
You can use a single regex with lookaheads (see online demo):
^(?!.*nice)(?!.*just-tests?).*themeA.*
The ^ asserts that the match starts at the beginning of the string (so we don't match a subset of the string).
The (?!.*nice) is a negative lookahead that asserts that at this position in the string, we cannot find any characters followed by nice.
The (?!.*just-tests?) is a negative lookahead that asserts that at this position in the string, we cannot find any characters followed by just-test and an optional s.
As a further tweak, you can compress the lookaheads into one using an | alternation as in anubhava's answer.
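The sketch below, in Python, runs the stacked-lookahead form and the single-lookahead form over the six sample titles from the question. It writes the first lookahead as (?!.*nice). A variant with a trailing ? after nice is included for comparison, because the ? makes the e optional and so also rejects titles that merely contain nic, such as botanical-names-themeA.

```python
import re

titles = [
    "fruits-and-flowers-themeA",
    "fruits-and-flowers-themeB",
    "fruits-and-flowers-just-test-themeA",
    "themeAfruitsandflowers",
    "nice-fruits-and-flowers-themeA",
    "botanical-names-themeA",
]

# Two stacked negative lookaheads: both must hold at the start of the string.
stacked = re.compile(r"^(?!.*nice)(?!.*just-tests?).*themeA.*")
# The same conditions folded into one lookahead with alternation.
combined = re.compile(r"^(?!.*?(?:just-tests?|nice)).*?themeA.*$")
# With "nice?" the "e" is optional, so "nic" alone is enough to reject.
loose = re.compile(r"^(?!.*nice?)(?!.*just-tests?).*themeA.*")

for name, rx in [("stacked", stacked), ("combined", combined), ("loose", loose)]:
    print(name, [t for t in titles if rx.search(t)])
```

The stacked and combined patterns keep the first, fourth and sixth titles; the loose one drops the sixth as well.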
Option 2 without lookaheads (Perl, PHP/PCRE)
^(?:.*(?:nice|just-tests?).*)(*SKIP)(?!)|.*themeA.*
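This one does not filter with lookaheads. The first alternative consumes any unwanted title, and (*SKIP)(?!) then forces a failure at that point (the empty (?!) is only an always-fail token, like (*FAIL)), so only the second alternative can produce a match. See demo.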
Use two different regular expressions for clarity and simplicity.
Match your string against one regex that matches themeA:
/themeA/
and then check that the string does NOT match the one you don't want:
/nice|just-tests?/
Doing it in two different regexes makes it far easier to understand and maintain.
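In code, the two-regex approach could look like this minimal Python sketch. The function name keep is hypothetical; the titles are the samples from the question.

```python
import re

WANTED = re.compile(r"themeA")
UNWANTED = re.compile(r"nice|just-tests?")


def keep(title: str) -> bool:
    """Keep a title that contains themeA and none of the unwanted words."""
    return bool(WANTED.search(title)) and not UNWANTED.search(title)


titles = [
    "fruits-and-flowers-themeA",
    "fruits-and-flowers-themeB",
    "fruits-and-flowers-just-test-themeA",
    "themeAfruitsandflowers",
    "nice-fruits-and-flowers-themeA",
    "botanical-names-themeA",
]
kept = [t for t in titles if keep(t)]
print(kept)
```

Each condition stays readable on its own, and adding another unwanted word only touches the UNWANTED pattern.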

Regular Expression: match only non-repeated occurrence of a character

I need to find and replace all occurrences of apostrophe character in a string, but only if this apostrophe is not followed by another apostrophe.
That is
abc'def
is a match but
abc''def
is NOT a match.
I've already composed a working pattern - (^|[^'])'($|[^']) but I believe it may be shorter and simpler.
Thanks,
Valery
It depends on your environment: if it supports lookahead and lookbehind, you can do this: (?<!')'(?!')
Ref: http://www.regular-expressions.info/lookaround.html
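For comparison, here is a Python sketch of the lookaround pattern next to the asker's original pattern. The function names and the replacement character X are made up for illustration. One practical difference shows up when lone apostrophes sit one character apart: the original pattern consumes the neighbouring characters, so the second apostrophe can be skipped.

```python
import re

# Lookaround version: the apostrophe itself is the whole match.
lone = re.compile(r"(?<!')'(?!')")
# The asker's version consumes the neighbouring characters as groups.
consuming = re.compile(r"(^|[^'])'($|[^'])")


def replace_lone(text: str, repl: str = "X") -> str:
    return lone.sub(repl, text)


def replace_consuming(text: str, repl: str = "X") -> str:
    # Put the captured neighbours back around the replacement.
    return consuming.sub(lambda m: m.group(1) + repl + m.group(2), text)


for s in ["abc'def", "abc''def", "a'b'c"]:
    print(s, replace_lone(s), replace_consuming(s))
```

On a'b'c the lookaround version replaces both apostrophes, while the consuming version replaces only the first, because the b it consumed is no longer available as the left neighbour of the second one.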
I think your pattern is short and precise. You could be using negative lookahead/lookbehind, but they would make it a lot more complex. Maintainability is important.
You'll have to be careful with an odd number of apostrophes:
abc'''def
where you probably do want to replace the 3rd one and leave the 1st and 2nd in place.
You can do that like this (assuming you have already matched string literals and only want to replace the odd trailing apostrophe):
Search for the pattern:
(('')*)'
and replace it with
$1
which is group 1: the even numbered apostrophes (or no apostrophes at all).
I'm not sure what actual problem you're solving, but in case you're parsing/reading a CSV file, or a string that has the likes of CSV input, I highly recommend using a decent CSV parser. Almost all languages have them in some form or another.
See negative lookahead, for example q(?!u).
(?=pattern) is a positive look-ahead assertion
(?!pattern) is a negative look-ahead assertion
(?<=pattern) is a positive look-behind assertion
(?<!pattern) is a negative look-behind assertion
http://www.regular-expressions.info/lookaround.html
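A short Python sketch of the four assertion types, each applied to a single q. The sample text "Iraq quit" is chosen only for illustration; the printed positions are the start indexes of the matches.

```python
import re

text = "Iraq quit"

# Each pattern matches a single "q"; only the surrounding condition differs.
patterns = {
    "q(?=u)": "positive lookahead: q followed by u",
    "q(?!u)": "negative lookahead: q not followed by u",
    "(?<=a)q": "positive lookbehind: q preceded by a",
    "(?<!a)q": "negative lookbehind: q not preceded by a",
}

positions = {p: [m.start() for m in re.finditer(p, text)] for p in patterns}
for p, note in patterns.items():
    print(f"{p:10} {positions[p]}  {note}")
```

The q in Iraq sits at index 3 and the q in quit at index 5, so the lookahead pair and the lookbehind pair each split the two occurrences between them.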

How to negate the whole regex?

I have a regex, for example (ma|(t){1}). It matches ma and t and doesn't match bla.
I want to negate the regex, thus it must match bla and not ma and t, by adding something to this regex. I know I can write bla, the actual regex is however more complex.
Use negative lookaround: (?!pattern)
Positive lookarounds can be used to assert that a pattern matches. Negative lookarounds are the opposite: they are used to assert that a pattern DOES NOT match. Not every flavor supports these assertions, and some put limitations on lookbehind.
Links to regular-expressions.info
Lookahead and Lookbehind Zero-Width Assertions
Flavor comparison
See also
How do I convert CamelCase into human-readable names in Java?
Regex for all strings not containing a string?
A regex to match a substring that isn’t followed by a certain other substring.
More examples
These are attempts to come up with regex solutions to toy problems as exercises; they should be educational if you're trying to learn the various ways you can use lookarounds (nesting them, using them to capture, etc):
codingBat plusOut using regex
codingBat repeatEnd using regex
codingbat wordEnds using regex
Assuming you only want to disallow strings that match the regex completely (i.e., mmbla is okay, but mm isn't), this is what you want:
^(?!(?:m{2}|t)$).*$
(?!(?:m{2}|t)$) is a negative lookahead; it says "starting from the current position, the next few characters are not mm or t, followed by the end of the string." The start anchor (^) at the beginning ensures that the lookahead is applied at the beginning of the string. If that succeeds, the .* goes ahead and consumes the string.
FYI, if you're using Java's matches() method, you don't really need the ^ and the final $, but they don't do any harm. The $ inside the lookahead is required, though.
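The same template can be wrapped in a small helper. This is a Python sketch, and negate is a hypothetical name: it assumes single-line input and a pattern that is safe to place inside a non-capturing group.

```python
import re


def negate(pattern: str) -> re.Pattern:
    """Build a regex matching every string that `pattern` does not match in full.

    Hypothetical helper: wraps the pattern in the ^(?!(?:...)$).*$ template.
    """
    return re.compile(rf"^(?!(?:{pattern})$).*$")


not_mm_or_t = negate(r"m{2}|t")
for s in ["mm", "t", "mmbla", "bla", ""]:
    print(repr(s), bool(not_mm_or_t.search(s)))

# The regex from the question works the same way.
not_ma_or_t = negate(r"(ma|(t){1})")
print(bool(not_ma_or_t.search("ma")), bool(not_ma_or_t.search("bla")))
```

Only mm and t are rejected by the first pattern; mmbla, bla and the empty string all pass, because the $ inside the lookahead requires the whole string to match before it is refused.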
\b(?=\w)(?!(ma|(t){1}))\b(\w*)
This is for the given regex.
The \b finds a word boundary.
The positive lookahead (?=\w) is there to avoid spaces.
The negative lookahead over the original regex prevents matches of it.
Finally, the (\w*) catches all the words that are left.
The group that will hold the words is group 3.
The simple (?!pattern) will not work, as any substring will match.
The simple ^(?!(?:m{2}|t)$).*$ will not work, as its granularity is full lines.
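Here is a Python sketch of the word-level pattern that pulls out group 3. The helper name other_words and the sample sentences are invented. Note that the lookahead only inspects the start of each word, so words that merely begin with ma or t are skipped too.

```python
import re

word_re = re.compile(r"\b(?=\w)(?!(ma|(t){1}))\b(\w*)")


def other_words(text: str) -> list[str]:
    """Return the words captured in group 3 (hypothetical helper name)."""
    return [m.group(3) for m in word_re.finditer(text)]


print(other_words("ma t bla foo"))
print(other_words("ma mars tea bla"))
```

In the second call mars and tea are dropped along with ma, since each starts with text the original regex matches.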
This regexp matches your condition:
^.*(?<!ma|t)$
Look at how it works:
https://regex101.com/r/Ryg2FX/1
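To try this in Python, a sketch under one assumption: Python's re module only accepts fixed-width lookbehind, so the alternation is split into two separate lookbehinds here, which should be equivalent. The sample strings are made up. Note that the pattern rejects every string that ends in ma or t, not only the strings ma and t themselves.

```python
import re

# Python's re needs fixed-width lookbehind, so (?<!ma|t) is split in two.
not_ending = re.compile(r"^.*(?<!ma)(?<!t)$")

results = {s: bool(not_ending.search(s)) for s in ["bla", "ma", "t", "llama", "test", "mat!"]}
for s, ok in results.items():
    print(repr(s), ok)
```

bla and mat! pass, while ma, t, llama and test are all rejected because of how they end.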
Apply this if you use Laravel.
Laravel has a not_regex rule: the field under validation must not match the given regular expression. It uses the PHP preg_match function internally.
'email' => 'not_regex:/^.+$/i'