How do I write a regex that finds a word (a run of non-whitespace characters) that contains neither certain chars (like * or #) nor a certain sub-pattern (like level10 or level2, which should itself be a regex: level[0-9]+)? Excluding the chars alone is simple ([^\\s^#^*]+), but how do I exclude this 'level' example too?
I want to exclude chars AND level with number.
Examples:
weesdlevel3fv - shouldn't match because of 'level3'
we3rlevelw4erw - should match - there is level without number
dfs3leveldfvws#3vd - shouldn't match - level is good, but '#' char appeared
level4#level levelw4_level - treated as two words because of the whitespace - only the second one should match - no level with a number and no restricted chars like '#' or '*'
See this regex:
/(?<=\s)(?!\S*[#*])(?!\S*level[0-9])\S+/
Regex explanation:
(?<=\s) Asserts position after a whitespace sequence.
(?!\S*[#*]) Asserts that "#" or "*" is absent in the sequence.
(?!\S*level[0-9]) Asserts that level[0-9] is not matched in the sequence.
\S+ Now that our conditions pass, this sequence is valid. Go ahead and use \S+ or \S++ to match the entire sequence.
To exclude further patterns, you can add another (?!\S*<false_to_assert>) group.
View a regex demo!
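As a quick check, here is a minimal Python sketch that runs the lookahead pattern above against the sample words from the question. It assumes Python's built-in re module (the question does not name a regex flavour), and the test string starts with a space only because (?<=\s) requires whitespace before every word. The names PATTERN, text and matches are illustrative.

```python
import re

# Pattern from the answer above: a word that starts after whitespace,
# contains no '#' or '*', and contains no "level" followed by a digit.
PATTERN = re.compile(r"(?<=\s)(?!\S*[#*])(?!\S*level[0-9])\S+")

# Leading space added so the first word is also preceded by whitespace.
text = " weesdlevel3fv we3rlevelw4erw dfs3leveldfvws#3vd level4#level levelw4_level"

matches = PATTERN.findall(text)
print(matches)
```

If the first word of the input may not be preceded by whitespace, (?<!\S) in place of (?<=\s) would also accept the start of the string.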
For this specific case you can use a double negation trick:
/(?<=\s)(?!\S*level[0-9])[^\s#*]+(?=\s)/
Another regex demo.
Read more:
Regex for existence of some words whose order doesn't matter
You can simply OR the searches with the pipe character:
[^\s#*]+|level[0-9]+
Here are my potential inputs:
brian#muck.co, brian#gmail.com
brian#gmail.com, brian#muck.co
What I want to do is extract the #muck.co email address.
What I have tried is:
\s.*#muck.co
The problem is that this only grabs an email address if it is preceded by a space (so it would only match the second example input above). How would I write a regex to match either input?
\s matches a whitespace character, so you probably want something like [^\s]*#muck.co, which means any number of non-space characters. [] denotes a set of symbols, and ^ negates it.
It does not work for me, because \s in my regex flavour does not seem to include the regular space, but this works: [^[:space:]]\+#muck\.co. Note \+ instead of * to require one or more non-space characters rather than any number, and the escaped dot \., since an unescaped dot stands for any single character.
You can use a negated character class to not cross the # and a word boundary at the end to prevent a partial word match:
[^\s#]+#muck\.co\b
Regex demo
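To see the negated-class pattern on both inputs from the question, here is a small Python sketch. It assumes Python's built-in re module, which the thread does not specify; PATTERN, inputs and found are illustrative names.

```python
import re

# The negated class stops at '#', and \b rejects longer domains
# such as "#muck.com".
PATTERN = re.compile(r"[^\s#]+#muck\.co\b")

inputs = [
    "brian#muck.co, brian#gmail.com",
    "brian#gmail.com, brian#muck.co",
]

found = [PATTERN.search(line).group(0) for line in inputs]
print(found)
```

Because the pattern no longer depends on a leading \s, the position of the address in the line does not matter.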
I am not an expert in regex, but I tried the pattern below and would like to improve it. Can someone please help me?
The pattern can contain ASCII letters, spaces, commas, periods, apostrophes (') and hyphens (-), and there can be one digit at the end of the string.
So far, this works well:
/^[a-z ,.'-]+(\d{1})?$/i
But I want to add the condition that there must be at least 2 letters. Could you please tell me how to achieve this, and explain it a bit as well?
Note that {1} is always redundant in any regex, please remove it to make the regex pattern more readable. (\d{1})? is equal to \d? and matches an optional digit.
Taking into account the string must start with a letter, you can use
/^(?:[a-z][ ,.'-]*){2,}\d?$/i
Details:
^ - start of string
(?: - start of a non-capturing group (it is used here as a container for a pattern sequence to quantify):
[a-z] - an ASCII letter
[ ,.'-]* - zero or more spaces, commas, dots, single quotation marks or hyphens
){2,} - end of group, repeat two or more ({2,}) times
\d? - an optional digit
$ - end of string
i - case insensitive matching is ON.
See the regex demo.
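The breakdown above can be exercised with a short Python sketch. It assumes Python's built-in re module, with re.IGNORECASE standing in for the /i flag; the sample strings are made up for illustration.

```python
import re

# At least two letters, each optionally followed by separators,
# then an optional single digit at the very end.
PATTERN = re.compile(r"^(?:[a-z][ ,.'-]*){2,}\d?$", re.IGNORECASE)

samples = ["ab", "O'Neil, Jr. 2", "a", "a1", "-ab", "ab12"]
results = {s: bool(PATTERN.search(s)) for s in samples}
print(results)
```

Only the first two samples should pass: the others have a single letter, start with a non-letter, or end in more than one digit.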
The thing to change in your regex is the + after the list of allowed characters.
+ means one or more occurrences of the listed characters. If you want 2 or more, you can use {2,}.
So your regex should look something like
/^[a-z ,.'-]{2,}\d?$/i
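One caveat worth checking: {2,} applied to the whole character class counts characters, not letters. The sketch below (Python's built-in re module assumed, names and samples illustrative) suggests that input with fewer than two letters still passes this simpler pattern.

```python
import re

SIMPLE = re.compile(r"^[a-z ,.'-]{2,}\d?$", re.IGNORECASE)

# "--" and ", 1" contain no letters at all; "a-" contains only one.
samples = ["ab", "--", "a-", ", 1"]
simple_results = {s: bool(SIMPLE.search(s)) for s in samples}
print(simple_results)
```

If two actual letters are required, a pattern that quantifies a group starting with [a-z] is the safer choice.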
I want to create a regular expression to filter lines based on a word combination.
In the following example I want to match any lines that have wheel and ignore any lines that have steering in them. In the example below there are lines with both. I want to skip the line with steeringWheel but select all the rest.
chrysler::plastic::steeringWheel
chrysler::chrome::L_rearWheelCentre
chrysler::chrome::R_rearWheelCentre
If I do the following
.*(Wheel|^steering).*
It would find lines including steeringWheel.
You need to use a negative lookahead anchored at the start:
(?i)^(?!.*steering).*(wheel|tyre).*
     ^^^^^^^^^^^^^^
See the regex demo.
The pattern matches:
(?i) - make the pattern case insensitive
^ - start of string
(?!.*steering) - a negative lookahead that fails the match if there is steering substring after any 0+ chars
.* - any 0+ chars as many as possible up to the last occurrence of
(wheel|tyre) - either wheel or tyre
.* - any 0+ chars up to the end of line.
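Applied to the three lines from the question, the anchored lookahead can be sketched in Python as follows. This assumes Python's built-in re module, with re.IGNORECASE in place of the inline (?i); the names PATTERN, lines and kept are illustrative.

```python
import re

# Reject any line containing "steering", then require "wheel" or "tyre".
PATTERN = re.compile(r"^(?!.*steering).*(wheel|tyre).*", re.IGNORECASE)

lines = [
    "chrysler::plastic::steeringWheel",
    "chrysler::chrome::L_rearWheelCentre",
    "chrysler::chrome::R_rearWheelCentre",
]

kept = [line for line in lines if PATTERN.match(line)]
print(kept)
```

The lookahead is checked once at the start of each line, so "steering" is rejected wherever it appears, not only directly before "Wheel".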
This regex should work. It uses a negative lookbehind, assuming that the word steering will be immediately followed by the word 'wheel'.
.*(?<!steering)Wheel.*
I don't think you'll be able to write it all as one regex. My understanding is regex doesn't truly support not matching words. The negative look arounds are good, but it has to be right there next to it not just somewhere on the line. What you are trying to do with ^ is for character classes like:
[^abc0-9] #not a character a,b,c,0..9
If possible something like this should work:
import re

# Sample lines from the question
thelist = [
    "chrysler::plastic::steeringWheel",
    "chrysler::chrome::L_rearWheelCentre",
    "chrysler::chrome::R_rearWheelCentre",
]
theregex_wheel = re.compile("wheel", re.IGNORECASE)
theregex_steering = re.compile("steering", re.IGNORECASE)
for thestring in thelist:
    # Keep lines that mention wheel but not steering
    if theregex_wheel.search(thestring) and not theregex_steering.search(thestring):
        print("yep, want this")
    else:
        print("skip this guy")
I've got to rename our application and would like to search all strings in the source code for the use of it. Naturally the app name can appear anywhere within the strings and the strings can span multiple lines which complicates things.
I was using (["'])APP_NAME to find instances at the start of strings but now I need a more complete solution.
Essentially what I'd like to say is "find instances of APP_NAME enclosed by quotes" in regex speak.
I'm searching in Xcode in case anyone has any Xcode-specific alternatives...
You may use
"[^"]*APP_NAME[^"]*"|'[^']*APP_NAME[^']*'
See the regex demo.
Note that this regex is based on alternation (| means OR) and negated character classes ([^"]* matches any 0+ chars other than ").
Or, alternatively:
(["'])(?:(?!\1).)*APP_NAME.*?\1
See this regex demo. The pattern is a bit trickier:
(["']) - captures " or ' into Group 1
(?:(?!\1).)* - any 0+ occurrences of a char that is not equal to the one captured into Group 1
APP_NAME - literal char sequence
.*? - any 0+ chars other than line break chars, but as few as possible, up to the first occurrence of...
\1 - the value captured into Group 1.
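Both patterns can be tried side by side in Python. This is only a sketch under assumptions: the question is about searching in Xcode, so Python's built-in re module is a stand-in, and the sample source text and all variable names are invented for illustration.

```python
import re

# Alternation of two negated-class patterns, one per quote style.
DOUBLE = r'"[^"]*APP_NAME[^"]*"'
SINGLE = r"'[^']*APP_NAME[^']*'"
ALTERNATION = re.compile(DOUBLE + "|" + SINGLE)

# Backreference variant: \1 is whichever quote opened the string.
BACKREF = re.compile(r"""(["'])(?:(?!\1).)*APP_NAME.*?\1""")

source = "\n".join([
    'title = "Welcome to APP_NAME!"',
    "label = 'APP_NAME settings'",
    "ident = APP_NAME",  # unquoted, should be ignored
])

alt_hits = ALTERNATION.findall(source)
backref_hits = [m.group(0) for m in BACKREF.finditer(source)]
print(alt_hits)
print(backref_hits)
```

On this sample the two agree. They can differ on multi-line strings: [^"]* crosses line breaks, while . in the backreference variant does not unless a flag such as re.DOTALL is set.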
Which regex should be used to extract 'Manchester City' from the string?
String is:
Aston Villa - Manchester City
I tried -(.*)\w|-(.), but it grabs - .
Note that -(.*)\w|-(.) matches - since both the alternatives here start with matching a hyphen. You can usually check if something is present or not with a lookaround.
However, in this case, I'd suggest
-\s*\K[^-]+$
Since you only need to match the substring after the last - with the spaces trimmed off, you need something like a positive infinite-width lookbehind, (?<=-\s*). However, PCRE does not support infinite-width lookbehinds. Instead, there is the \K operator, which makes the engine drop everything matched so far from the returned match.
See a regex demo
Breakdown:
- - a literal hyphen
\s* - zero or more whitespace characters
\K - operator that discards everything matched so far from the reported match
[^-]+ - one or more characters other than - up to ...
$ - the end of the string.
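Python's built-in re module does not support \K, so if you are working there, a capturing group can play the same role. A minimal sketch, assuming the input always has the form shown in the question; the variable names are illustrative.

```python
import re

text = "Aston Villa - Manchester City"

# Same idea as -\s*\K[^-]+$, but the wanted part is captured in group 1
# instead of relying on \K to discard the hyphen and spaces.
match = re.search(r"-\s*([^-]+)$", text)
team = match.group(1) if match else None
print(team)
```

The \s* before the group keeps the leading space out of the captured value.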
The simplest is .*- (.*) and your data is in $1 or \1 or something else, depending on your tool. That assumes the data is in the format xxxxx-xxxxxx.
Another simple option is - (.*) see: https://regex101.com/r/fY3oE7/1. Use the first capturing group in your language to get the part after the dash.