Regular expressions, negative lookahead anywhere in the string - regex

I'm sorry if this has already been asked and answered, but I can't find it.
I know about regex lookarounds and negative lookahead.
The thing is that a negative lookahead only examines what comes right after the current position in the string.
What I need is to discard the match if the string contains words like "career(s)" or "specials", wherever they appear in the string.
What would be an efficient way of doing that?
At the moment I'm using the PCRE flavor, but the more general the regex is, the better.
Thank you.

You can use this regex:
^(?!.*(?:career\(s\)|specials)).*
Or, if the trailing s is optional, use:
^(?!.*(?:career|special)s?).*
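A minimal sketch of the second pattern in Python, whose re module uses the same lookahead syntax as PCRE here. The function name is_allowed and the sample strings are made up for illustration.

```python
import re

# The lookahead is evaluated once, at the start of the string; the `.*`
# inside it lets it reach a forbidden word at any later position.
BLOCKED = re.compile(r"^(?!.*(?:career|special)s?).*")

def is_allowed(text: str) -> bool:
    """Return True when none of the blocked words occurs anywhere in text."""
    return BLOCKED.match(text) is not None

samples = [
    "about us",
    "careers at acme",
    "our weekly specials",
    "contact",
]
for s in samples:
    print(f"{s!r}: {is_allowed(s)}")
```

Since `.` does not match a newline by default, multi-line input would probably need the DOTALL flag (or its PCRE equivalent) for the lookahead to see past the first line.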

Related

regex look ahead behind (look around) negative problems

I am having trouble understanding negative regex lookahead / lookbehind. I got the impression from reading tutorials that when you set a criterion to look for, the criterion doesn't form part of the search match.
That seems to hold for the positive lookahead examples I tried, but when I tried these negative ones, they matched the entire test string. First, they shouldn't have matched anything, and second, even if they did, the match wasn't supposed to include the lookaround criterion.
(?<!^And).*\.txt$
with input
And.txt
See: https://regex101.com/r/vW0aXS/1
and
^A.*(?!\.txt$)
with input:
A.txt
See: https://regex101.com/r/70yeED/1
PS: if you're going to ask me which language, I don't know. We've been told to use regex without reference to any specific language. I tried clicking the various flavor options on regex101.com and they all gave the same result.
Lookarounds only try to match at their current position.
You are using a lookbehind at the beginning of the string (?<!^And).*\.txt$, and a lookahead at the end of the string ^A.*(?!\.txt$), which won't work. (.* will always consume the whole string on its first attempt.)
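To disallow "And", for example, you can put a lookahead at the beginning of the string with a greedy .* inside it, so that it scans the whole string. Anchor it with ^; otherwise the engine retries from the second character, where "And" is no longer ahead, and still finds a match:
^(?!.*And).*\.txt$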
https://regex101.com/r/1vF50O/1
Your understanding is correct and the issue is not with the lookbehind/lookahead. The issue is with .*, which matches the entire string in both cases. The period . matches any character, and following it with * lets it match a string of any length. Remove it and both of your regexes will reject these test inputs:
(?<!^And)\.txt$
^A(?!\.txt$)
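The behaviour described in these answers can be checked with a short Python sketch. The helper first_match and the input "notes.txt" are made up for illustration; the patterns are the ones discussed above, plus variants of the lookahead with and without a leading ^.

```python
import re

def first_match(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = re.search(pattern, text)
    return m.group() if m else None

# The two patterns from the question: both match the whole input.
# At position 0 nothing precedes the cursor, so (?<!^And) succeeds there.
print(first_match(r"(?<!^And).*\.txt$", "And.txt"))    # And.txt
# .* runs to the end of the string, where no ".txt" lies ahead any more.
print(first_match(r"^A.*(?!\.txt$)", "A.txt"))          # A.txt

# Lookahead with .* inside it but without ^: the engine retries at
# position 1, where "And" is no longer ahead.
print(first_match(r"(?!.*And).*\.txt$", "And.txt"))     # nd.txt
# Anchored with ^, the check runs only at the start of the string.
print(first_match(r"^(?!.*And).*\.txt$", "And.txt"))    # None
print(first_match(r"^(?!.*And).*\.txt$", "notes.txt"))  # notes.txt

# With .* removed, the lookarounds sit right next to the text they test.
print(first_match(r"(?<!^And)\.txt$", "And.txt"))       # None
print(first_match(r"^A(?!\.txt$)", "A.txt"))            # None
```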

c++11 regex lookahead exclude word

I have a string list:
ReferencePrice
ReferenceCurrentPrice
CostPrice
AverageCostPrice
...
I want to pick out all strings that:
contain 'Price',
but do not contain 'CostPrice'.
My regex is '(?!Cost)Price', but it still matches the 3rd string, 'CostPrice'. Why? And what is the correct regex?
After some investigation, I know what 'lookahead' means. It means look right, so similarly lookbehind means look left.
The correct regex should be a negative lookbehind regex:
(?<!Cost)Price
Try it: https://regex101.com/r/m3238r/1
Unfortunately, C++11's std::regex doesn't support lookbehind. Boost regex does.
http://www.boost.org/doc/libs/1_50_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html
Having to take off soon, I'll answer with an (in my opinion) silly solution (there must be much better ones :P).
((?!Cost)....|^.{0,3})Price
If Price is preceded by at least 4 characters, make sure those 4 characters aren't Cost. Alternatively, make sure there are no more than 3 characters preceding Price.
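To compare the three patterns side by side, here is a rough sketch in Python, used only because its re module supports both lookahead and lookbehind; it says nothing about what std::regex accepts. The helper keep is made up for illustration.

```python
import re

names = ["ReferencePrice", "ReferenceCurrentPrice", "CostPrice", "AverageCostPrice"]

# The question's attempt: the lookahead inspects the text starting at
# "Price" itself, which is never "Cost", so it never rejects anything.
lookahead = re.compile(r"(?!Cost)Price")
# Lookbehind inspects the four characters to the left of "Price".
lookbehind = re.compile(r"(?<!Cost)Price")
# Workaround without lookbehind, as given in the answer above.
workaround = re.compile(r"((?!Cost)....|^.{0,3})Price")

def keep(pattern, items):
    """Return the items in which the pattern matches somewhere."""
    return [s for s in items if pattern.search(s)]

print(keep(lookahead, names))   # all four names
print(keep(lookbehind, names))  # ['ReferencePrice', 'ReferenceCurrentPrice']
print(keep(workaround, names))  # ['ReferencePrice', 'ReferenceCurrentPrice']
```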

How to use lookahead in regex to match a word that only appear in certain context?

I'm learning regular expressions and I'm now on the chapter about lookahead. In the class example, if you want to match "sea" only in "seashore", you do:
/(?=seashore)sea/
or
/sea(?=shore)/
But what if I want to match "shore" only in "seashore"? I tried:
/(?=seashore)shore/
and
/(?=sea)shore/
but neither of them works. Did I misunderstand something? As far as I understand, a lookahead is like a precondition for matching a string. So why can't I match "shore" only in the context of "seashore"? Can anyone give me a hint? Many thanks!
FYI: this is the regex tester I'm using to test my regular expressions: http://www.regextester.com/
You should use lookbehind if it is supported by your regex engine. Like so:
/(?<=sea)shore/
Otherwise (e.g. in JavaScript engines that predate ES2018, where lookbehinds are not supported), you'll have to match the whole thing and use capturing groups to separate the part that you want from the rest.
If you write /(?=seashore)shore/, the lookahead requires "seashore" to begin at the current position, and then "shore" has to begin at that same position, which is impossible. A lookahead does not consume or skip anything, so it cannot exclude the leading "sea" from the match for you.
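Both approaches can be sketched in Python, assuming an engine that supports lookbehind. The sample text and variable names are made up for illustration.

```python
import re

text = "seashore, offshore, shoreline"

# Lookbehind: the match itself is just "shore", and only right after "sea".
with_lookbehind = [m.span() for m in re.finditer(r"(?<=sea)shore", text)]

# Capturing-group fallback: match all of "seashore", then read the
# position of "shore" from group 1.
with_group = [m.span(1) for m in re.finditer(r"sea(shore)", text)]

print(with_lookbehind)  # [(3, 8)]
print(with_group)       # [(3, 8)]
```

The two differ in what the overall match covers: with the capturing group, the full match still includes "sea", which matters if the match is used for replacement.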

Non Greedy Regex from Left

I have a string like this:
\24s904dS\24sr4d2\24x\\y\\12z:234F\\3dRl\24o980\24
I want to match only this part:
x\\y\\12z:234F\\3dRl
I can handle the non-greedy matching on the right side with this regex:
\\24(.*:.*?)\\24
But I still can't figure out how to make the left side non-greedy.
Modify your pattern as follows:
.*\\24(.*:.*?)\\24
You can use this negative lookahead based regex:
\\24((?:.(?!\\24))*:.*?)\\24
The important part is the lookahead-based sub-pattern (?:.(?!\\24))*, which matches a character only if it is not followed by \24. That makes sure the match starts at the \24 nearest to the left of the colon.
Output Match:
x\\y\\12z:234F\\3dRl
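As a quick check of the two suggestions above, here is a small Python sketch. It assumes the sample string is taken literally, with each backslash as written, which is why raw strings are used; the helper group1 is made up for illustration.

```python
import re

# Raw strings keep every backslash exactly as it appears in the question.
text = r"\24s904dS\24sr4d2\24x\\y\\12z:234F\\3dRl\24o980\24"

def group1(pattern):
    """Return capture group 1 of the first match in text, or None."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

# Original attempt: the search starts at the first \24, so the group
# swallows the two earlier segments as well.
print(group1(r"\\24(.*:.*?)\\24"))

# Greedy prefix: the leading .* pushes the start as far right as possible.
print(group1(r".*\\24(.*:.*?)\\24"))

# Lookahead version: no character before the colon may be followed by \24.
print(group1(r"\\24((?:.(?!\\24))*:.*?)\\24"))
```

The last two calls should both print x\\y\\12z:234F\\3dRl, while the first prints the longer span that begins right after the first \24.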
Rather than modifying the greediness, it's better to just write a more-precise regex:
\\24([a-zA-Z0-9]+:[a-zA-Z0-9]+)\\24
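(It's relatively rare that non-greedy modifiers are really the best approach to a problem. Note that the pattern above assumes both parts consist only of letters and digits; the sample string also contains backslashes, which these character classes do not match.)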

Oracle regex string not beginning with '40821'

I am trying to define a regex that matches a string of digits that does not begin with 40821, so '40822433598347597' matches and '408211' does not. So, I've tried
^(?!40821)\d+
It works perfectly in my regex editor, but it doesn't work in Oracle. I know it would be very easy to use WHERE NOT, but my goal is to do it using only regex. Please advise: what am I doing wrong?
According to this question, negative lookahead and lookbehind are not supported in Oracle.
One way would be to explicitly enumerate the possibilities using alternation. In your case it would be something like:
^([012356789]|4[123456789]|40[012345679]|408[013456789]|4082[023456789])
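The alternation can be sanity-checked against the lookahead version in an engine that supports both. The sketch below uses Python purely for that comparison and does not run anything in Oracle; the sample values beyond the two from the question are made up. One caveat: the alternation only tests the prefix, so it does not check that the rest of the string is digits, and it rejects inputs shorter than the prefix it needs (for example "4082"), which the lookahead version accepts.

```python
import re

with_lookahead = re.compile(r"^(?!40821)\d+")
# Each branch accepts a prefix that first differs from "40821" at one position.
without_lookaround = re.compile(
    r"^([012356789]|4[123456789]|40[012345679]|408[013456789]|4082[023456789])"
)

samples = ["40822433598347597", "408211", "40821", "50821", "41000"]
for s in samples:
    print(s, bool(with_lookahead.search(s)), bool(without_lookaround.search(s)))
```

For digit strings of five or more characters the two patterns should agree; only "408211" and "40821" are rejected in the sample list.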
I think you are trying to use a negative lookaround. A negative lookbehind, for example, works like this:
(?<!a)b matches a "b" that is not preceded by an "a"
Source: http://www.regular-expressions.info/lookaround.html
That kind of Perl syntax is not supported by Oracle.