How to suppress "import" statements in IntelliJ regex search? - regex

I would like to search in IntelliJ for all occurrences of lecture* and event in source code files. This works with the regex (lecture*|event), as shown in the screenshot.
Now I would like to filter out all import statements that also contain one or more of these terms, so that I can focus on the remaining code locations. How do I have to change the regular expression to get the desired result?

You could exclude lines that start with import by using a negative lookahead at the start of the string.
^(?![^\S\r\n]*import ).*\b(?:lecture\*|event)
Explanation
^ Start of string
(?! Negative lookahead, assert what is on the right is not
[^\S\r\n]*import Match 0+ times a whitespace char except newlines, match import and space
) Close lookahead
.* Match any char except a newline 0+ times
\b(?:lecture\*|event) Match either lecture* or event preceded by a word boundary
Regex demo
Note that you have to escape the asterisk as \*, or else lecture* will match lectur followed by 0+ repetitions of the e char.
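The same pattern can be tried outside IntelliJ. The following is a minimal Python sketch, assuming the pattern is applied to one line at a time; the sample lines are made up for illustration and are not from the question.

```python
import re

# Pattern from the answer above, applied line by line so ^ is the line start.
PATTERN = re.compile(r"^(?![^\S\r\n]*import ).*\b(?:lecture\*|event)")

# Hypothetical sample lines, invented for this illustration.
SAMPLE_LINES = [
    "import com.example.event.EventBus;",
    "    import com.example.lecture.Lecture;",
    "EventBus bus = new EventBus(event);",
    "// matches lecture* in a comment",
]

# Keep only the lines that contain a term and do not start with "import ".
matching_lines = [line for line in SAMPLE_LINES if PATTERN.search(line)]

for line in matching_lines:
    print(line)
```

Both import lines are skipped, including the indented one, because the lookahead allows optional leading whitespace before import.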

Related

Regex: Match pattern unless preceded by pattern containing element from the matching character class

I am having a hard time coming up with a regex to match a specific case:
This can be matched:
any-dashed-strings
this-can-be-matched-even-though-its-big
This cannot be matched:
strings starting with elem- or asdf- or a single -
elem-this-cannot-be-matched
asdf-this-cannot-be-matched
-
So far what I came up with is:
/\b(?!elem-|asdf-)([\w\-]+)\b/
But I keep matching a single - and the whole -this-cannot-be-matched suffix. I cannot figure out how to conditionally ignore a character that is part of the matching character class, and how to avoid matching anything else if such a suffix is found.
I am currently working with the Oniguruma engine (Ruby 1.9+/PHP multi-byte string module).
If possible, please elaborate on the solution. Thanks a lot!
If a lookbehind is supported, you can assert a whitespace boundary to the left, and make the alternation for both words without the hyphen optional.
(?<!\S)(?!(?:elem|asdf)?-)[\w-]+\b
Explanation
(?<!\S) Assert a whitespace boundary to the left
(?! Negative lookahead, assert that what is directly to the right is not
(?:elem|asdf)?- Optionally match elem or asdf followed by -
) Close the lookahead
[\w-]+ Match 1+ word chars or -
\b A word boundary
See a regex demo.
Or a version with a capture group and without a lookbehind:
(?:\s|^)(?!(?:elem|asdf)?-)([\w-]+)\b
See another regex demo.
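As a quick check of the lookbehind version, here is a small Python sketch. The question targets Oniguruma, so using Python's re module is an assumption made for illustration; this pattern uses only constructs that re also supports. The sample string combines the strings from the question on one line.

```python
import re

# Lookbehind version from the answer above.
PATTERN = re.compile(r"(?<!\S)(?!(?:elem|asdf)?-)[\w-]+\b")

# Strings from the question, joined by spaces for this illustration.
SAMPLE = (
    "any-dashed-strings elem-this-cannot-be-matched "
    "asdf-this-cannot-be-matched - this-can-be-matched-even-though-its-big"
)

# findall returns whole matches because the pattern has no capture groups.
matches = PATTERN.findall(SAMPLE)
print(matches)
```

The whitespace boundary (?<!\S) is what stops the engine from starting a match in the middle of a rejected string such as elem-this-cannot-be-matched.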

Regex catch multiple match look behind look ahead

This is an extension of this question: ruby, using regex to find something in between two strings
I'd like to catch multiple occurrences of this username.
Text (1 line)
starting-middle+31313131313#mysite.com starting-middle+4141414141#mysite.com
Result
["31313131313", "4141414141"]
I tried (?<=\+).*(?=#), but the greedy .* runs up to the last # that the lookahead can match, so the result is like 31313131313#mysite.com starting-middle+41414141
https://rubular.com/r/3mn5t1pYffCC7C
Instead of using a lookbehind, you could match the + and use \K to forget what is matched so far.
Then use a negated character class to match any character except # or a whitespace char and use the positive lookahead to assert an # on the right.
\+\K[^\s#]+(?=#)
Explanation
\+ Match +
\K Clear the match buffer
[^\s#]+ Match 1+ times any char except a whitespace char or # (Or use \d+ to match only digits)
(?=#) Positive lookahead, assert # directly to the right of the current position
Regex demo
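For comparison, the same extraction can be sketched in Python. Python's built-in re module does not support \K, so this sketch swaps in a capture group instead, which is a different technique with the same result; the text is the sample line from the question.

```python
import re

TEXT = "starting-middle+31313131313#mysite.com starting-middle+4141414141#mysite.com"

# Capture group instead of \K: match the +, capture everything up to the
# next # or whitespace, and assert that a # follows.
usernames = re.findall(r"\+([^\s#]+)(?=#)", TEXT)
print(usernames)
```

Because the negated character class cannot cross a #, each match stops at the first # instead of running to the last one.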

Regex for all illegal filename characters before filetype extension

I'm looking for a regex that replaces all illegal filename characters, such as (, ), spaces and dots, that occur before the file extension (like .jpg) with a -
I got:
[^a-zA-Z0-9_-]+
which matches every illegal filename character, but also the dot of the file extension
and
.*(?=.)
which matches everything up to the last occurrence of .
How do I combine these?
One of my evil file names is
(800x800-png)MGC1000-03EPTD-021_RAL7035-5010.tif.png
After the regex replace it should look like
-800x800-png-MGC1000-03EPTD-021_RAL7035-5010-tif.png
The regex should work in LibreOffice / Excel search and replace.
Thanks for your help!
You could use your negated character class [^a-zA-Z0-9_-]+ and use a positive lookahead to assert that the string ends with a dot and 1+ word characters.
In the replacement use a hyphen -
[^a-zA-Z0-9_-]+(?=.*\.\w+$)
As per the comment from @Stein, you might shorten it to:
[^\w-]+(?=.*\.\w+$)
Explanation
[^a-zA-Z0-9_-]+ Match 1+ times any character that is not in the character class
(?= Positive lookahead, assert what is on the right is
.*\.\w+ Match any character 0+ times, then a dot and 1+ word chars
$ Assert the end of the string
) Close positive lookahead
Regex demo
If the extension itself could have special characters, then you might update \w+$ to [^.\s]+$ like:
[^\w-]+(?=.*\.[^.\s]+$)
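The replacement can be checked with a short Python sketch. The question asks about LibreOffice / Excel, so Python is only an assumption used here to illustrate the pattern; the file name is the one from the question.

```python
import re

# Shortened pattern from the answer: illegal characters that are still
# followed, somewhere to the right, by a dot and a final extension.
ILLEGAL_BEFORE_EXTENSION = re.compile(r"[^\w-]+(?=.*\.\w+$)")

name = "(800x800-png)MGC1000-03EPTD-021_RAL7035-5010.tif.png"

# Replace every run of illegal characters with a single hyphen.
cleaned = ILLEGAL_BEFORE_EXTENSION.sub("-", name)
print(cleaned)
```

The last dot is kept because nothing to its right contains another dot, so the lookahead fails at that position.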

RegEx: don't capture match, but capture after match

There are a thousand regular expression questions on SO, so I apologize if this is already covered. I did look first.
I have string:
Name Subname 11X22 88X620 AB33(20) YA5619 77,66
I need to capture this string: YA5619
What I am doing is finding AB33(20) and then capturing everything up to the first whitespace. But AB33(20) can also be AB-33(20), AB33(-20) or AB33(-1).
My preg_match regex is: (?<=\bAB\d{2}\(\d{2}\)\s).+?(?=\s)
Why am I getting an error when I change \d{2} to \d+?
For the final result I thought this regex would work, but it does not:
(?<=\bAB-?\d+\(-?\d+\)\s).+?(?=\s)
Any ideas what I am doing wrong?
With most regex flavors, lookbehind needs to evaluate to a fixed-length sequence, so you can't use variable quantifiers like * or + or even {1,2}.
Instead of using lookaround, you can simply match your marker pattern and then forget it with \K.
AB-?\d+(?:\(-?\d+\))? \K[^ ]+
demo: https://regex101.com/r/8XXngH/1
It depends on the language. In .NET, for example, your pattern matches because variable-length lookbehind is supported there.
Another solution might be to use a character class that lists the characters you allow after AB. Then match a whitespace character, use \K to forget what was matched so far, and match \S+, which matches 1+ non-whitespace characters.
\bAB[()\d-]+\s\K\S+
Explanation
\bAB Match AB literally, preceded by a word boundary to prevent AB being part of a larger word
[()\d-]+ Match 1+ times any of the characters listed in the character class
\s Match a whitespace char (or \s+ to match 1 or more)
\K Reset the starting point of the reported match (forget what was matched)
\S+ Match 1+ times a non-whitespace character
Regex demo | Php demo
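To see how the character-class approach handles the variants from the question, here is a minimal Python sketch. Python's re module has no \K, so the sketch uses a capture group in its place (a swapped-in technique, not the PHP pattern itself); the helper name value_after_marker is hypothetical.

```python
import re

# Same idea as \bAB[()\d-]+\s\K\S+, with a capture group replacing \K.
MARKER_THEN_VALUE = re.compile(r"\bAB[()\d-]+\s(\S+)")

# The string from the question plus the marker variants it mentions.
SAMPLES = [
    "Name Subname 11X22 88X620 AB33(20) YA5619 77,66",
    "Name Subname 11X22 88X620 AB-33(20) YA5619 77,66",
    "Name Subname 11X22 88X620 AB33(-20) YA5619 77,66",
    "Name Subname 11X22 88X620 AB33(-1) YA5619 77,66",
]


def value_after_marker(text):
    """Return the token that follows the AB marker, or None if absent."""
    match = MARKER_THEN_VALUE.search(text)
    return match.group(1) if match else None


values = [value_after_marker(sample) for sample in SAMPLES]
print(values)
```

All four variants yield the same token because the marker is matched, not asserted in a lookbehind, so its length is free to vary.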

Regex for spoof

I would like to ask for help with catching spoofed usernames using a regex.
For example, the correct username is:
rolf
and here are the spoofed versions that I could think of:
roooolf
r123olf
123rolf123
rolf5623
123rolf
rollllf
rrrrrrolf
rolffff
So basically I have this regex (which I know is not sufficient, because I've tried it on the regex101 website):
.+(?![rolf]).+
I'm using this as a baseline because it doesn't catch the correct username, which is:
rolf
but it also doesn't catch all the other "spoofed" versions of the username.
Any ideas how I can improve my regex?
Thanks in advance!
You may try this too
(?m)^(?![^\n]*?rolf[^\n]*$).*$
Demo
To match anything that is not exactly rolf, you can use a negative lookahead (?! to assert that what follows from the beginning of the string is not rolf followed by the end of the string.
^(?!rolf$).+$
That would match
^ Assert position at the beginning of the string
(?! Negative lookahead that asserts that what follows is not
rolf$ Match rolf literally, followed by the end of the string
) Close negative lookahead
.+ Match any character one or more times
$ Assert position at the end of the string
In your example regex you match .+ which, as @Ωmega fairly points out, also matches spaces.
Instead of .+ you could specify which characters you accept, for example \w+ to match one or more word characters, or a more specific character class.
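A small Python sketch of the \w+ variant, run against some of the usernames from the question. Using \w+ in place of .+ follows the suggestion above; the variable names are made up for illustration.

```python
import re

# Anything made of word characters that is not exactly "rolf".
NOT_EXACTLY_ROLF = re.compile(r"^(?!rolf$)\w+$")

candidates = ["rolf", "roooolf", "r123olf", "123rolf123", "rolf5623", "rollllf"]

# Every candidate except the genuine username passes the check.
suspects = [name for name in candidates if NOT_EXACTLY_ROLF.match(name)]
print(suspects)
```

Note that this only rejects the exact name: an unrelated username would also pass, so the pattern on its own does not measure how similar a name is to rolf.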
You can use a regex pattern
\b(?!rolf\b)\S+\b
\b Word boundary - Matches a word boundary position between a word character and a non-word character or position (start / end of string).
(?! Negative lookahead - Specifies a group that cannot match after the main expression (if it matches, the result is discarded).
\S Not whitespace - Matches any character that is not a whitespace character (spaces, tabs, line breaks).
+ Quantifier - Match 1 or more of the preceding token.
Test your inputs with this pattern here.