I want to match 'marks', 'marks-for-days', 'reactions(98%)' and 'fun-for me'.
All of those should match, even without the brackets or hyphens. For example:
marksman - no match
marksreaction - no match
but
marks (98%) - match
reactions 98% - match
fun for me - match
there are no fun only marks - match
I tried basic word matching, but it doesn't work: \w*(marks|reactions|fun for me)\w*
Your pattern matches only spaces between the words, not the hyphen in fun-for me, and the surrounding \w* lets partial words such as marksman match.
Use a character class [- ] to match either a space or a hyphen, and word boundaries \b to prevent partial matches:
\b(?:marks|reactions|fun[- ]for[- ]me)\b
If you want to match the whole line you can add .* before and after the pattern:
.*\b(?:marks|reactions|fun[- ]for[- ]me)\b.*
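As a quick check, the first pattern can be run against the examples from the question. This is only a minimal sketch in Python; the question does not name a language, so that choice and the names PATTERN, SAMPLES and is_match are assumptions made for illustration.

```python
import re

# \b stops partial hits such as "marksman";
# [- ] accepts either a hyphen or a space between the words.
PATTERN = re.compile(r"\b(?:marks|reactions|fun[- ]for[- ]me)\b")

# The first eight samples should match, the last two should not.
SAMPLES = [
    "marks", "marks-for-days", "reactions(98%)", "fun-for me",
    "marks (98%)", "reactions 98%", "fun for me",
    "there are no fun only marks",
    "marksman", "marksreaction",
]

def is_match(text):
    """Return True if any keyword occurs as a whole word in text."""
    return PATTERN.search(text) is not None

for sample in SAMPLES:
    print(f"{sample!r}: {'match' if is_match(sample) else 'no match'}")
```

search() is used rather than match() so that the keyword may appear anywhere in the line, which makes the surrounding .* unnecessary when you only need a yes/no answer.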
For example we have a string:
asd/asd/asd/asd/1#s_
I need to match this part: /asd/1#s_ or asd/1#s_
How is it possible to do with plain regex?
I tried a negative lookahead like this, but it didn't work:
\/(?:.(?!\/))?(asd)(\/(([\W\d\w]){1,})|)$
it matches this '/asd/asd/asd/asd/asd/asd/1#s_'
from this 'prefix/asd/asd/asd/asd/asd/asd/1#s_'
and I need to match '/asd/1#s_' without all preceding /asd/'s
Match should work with plain regex
Without any helper functions of any programming language
https://regexr.com/
I use this site to check if regex matches or not
here's the possible strings:
prefix/asd/asd/asd/1#s
prefix/asd/asd/asd/1s#
prefix/asd/asd/asd/s1#
prefix/asd/asd/asd/s#1
prefix/asd/asd/asd/#1s
prefix/asd/asd/asd/#s1
and asd part could be replaced with any word like
prefix/a1sd/a1sd/a1sd/1#s
prefix/a1sd/a1sd/a1sd/1s#
...
So I need to match the last repeated part together with everything to its right.
Everything to the right can be word characters, non-word characters or digits, in any order.
A more complicated string example:
prefix/a1sd/a1sd/a1sd/1s#/ds/dsse/a1sd/22$$#!/123/321/asd
this should match that part:
/a1sd/22$$#!/123/321/asd
If you want the match only, you can use \K (in an engine that supports it, such as PCRE) to reset the match buffer right before the part that you want to match:
^.*\K/a\d?sd/\S+
The pattern will match
^ Start of string
.* Match any char except a newline until end of the line
\K Forget what is matched until now
/a\d?sd/ Match a, an optional digit and sd between forward slashes
\S+ Match 1+ non whitespace chars
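If your engine has no \K (Python's re module, for example, does not support it), the same idea can be sketched with a capturing group instead. The snippet below is an illustration under that assumption, not the pattern from the answer itself; the name last_repeated_part is made up for the example.

```python
import re

# Greedy .* runs to the end of the line and backtracks to the LAST
# "/a<optional digit>sd/" that is followed by non-whitespace characters.
# Python's re has no \K, so the wanted part is captured in group 1 instead.
PATTERN = re.compile(r"^.*(/a\d?sd/\S+)")

def last_repeated_part(text):
    """Return the last /a?sd/ segment plus everything to its right, or None."""
    m = PATTERN.search(text)
    return m.group(1) if m else None

print(last_repeated_part("asd/asd/asd/asd/1#s_"))
print(last_repeated_part("prefix/a1sd/a1sd/a1sd/1s#/ds/dsse/a1sd/22$$#!/123/321/asd"))
```

Like the \K version, this hard-codes the repeated word as a, an optional digit and sd, so it would need adjusting for other words.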
Given the input string asd3[bc]de, the regex \d\[\w*] matches 3[bc].
Given input with nested matches, such as 3[bc4[de]], it matches the inner pattern 4[de] and not the outer one. Why is this so? Is there a way to force the regex to match the outer pattern?
\w won't match a '['.
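The \d\[ matches the 3[, then \w* matches bc4, but it can't match the inner '['. The ] in the pattern has nothing to match there, even after \w* backtracks, so the regex engine gives up on that attempt and looks for another match for \d\[ further along. That matches the 4[, \w* matches de, and then the ] matches.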
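The behaviour described above is easy to reproduce. Here is a small sketch in Python (the question's own code is JavaScript, so the language and the helper name first_match are assumptions for illustration):

```python
import re

# \w matches letters, digits and underscore, but never "[" or "]".
PATTERN = re.compile(r"\d\[\w*]")

def first_match(text):
    """Return the first substring matched by PATTERN, or None."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

print(first_match("asd3[bc]de"))  # 3[bc]
print(first_match("3[bc4[de]]"))  # 4[de]
```

In the second call the attempt that starts at 3[ fails, so the first successful match is the inner 4[de].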
Some regex engines, such as PCRE, support recursive patterns that can match nested items. For a single level of nesting, you can also spell out the inner pattern explicitly:
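// Outer "digit[" and word characters, then the inner digit[...] captured in group 1,
// then the outer closing bracket. This handles exactly one level of nesting.
let re = /\d\[\w*]?(\d\[\w*])]/;
let str = "3[bc4[de]]";
// Logs the full match followed by group 1: ["3[bc4[de]]", "4[de]"]
console.log([...str.match(re)]);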
I want to create a regular expression to filter lines based on a word combination.
In the following example I want to match any lines that have wheel and ignore any lines that have steering in them. In the example below there are lines with both. I want to skip the line with steeringWheel but select all the rest.
chrysler::plastic::steeringWheel
chrysler::chrome::L_rearWheelCentre
chrysler::chrome::R_rearWheelCentre
If I do the following
.*(Wheel|^steering).*
It would find lines including steeringWheel.
You need to use a negative lookahead anchored at the start:
(?i)^(?!.*steering).*(wheel|tyre).*
     ^^^^^^^^^^^^^^
The pattern matches:
(?i) - make the pattern case insensitive
^ - start of string
(?!.*steering) - a negative lookahead that fails the match if there is steering substring after any 0+ chars
.* - any 0+ chars as many as possible up to the last occurrence of
(wheel|tyre) - either wheel or tyre
.* - any 0+ chars up to the end of line.
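To see the lookahead at work on the lines from the question, here is a minimal Python sketch. The re.IGNORECASE flag stands in for the inline (?i), and the names LINES, keep and wanted are assumptions made for the example.

```python
import re

# ^(?!.*steering) rejects the line up front if "steering" occurs anywhere in it;
# the rest of the pattern then requires "wheel" or "tyre" somewhere on the line.
PATTERN = re.compile(r"^(?!.*steering).*(wheel|tyre).*", re.IGNORECASE)

LINES = [
    "chrysler::plastic::steeringWheel",
    "chrysler::chrome::L_rearWheelCentre",
    "chrysler::chrome::R_rearWheelCentre",
]

def keep(line):
    """Return True for lines that mention wheel/tyre but not steering."""
    return PATTERN.match(line) is not None

wanted = [line for line in LINES if keep(line)]
print(wanted)
```

Because the lookahead is anchored at the start and scans with .*, it does not matter where on the line steering appears.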
This regex should work. It uses a negative lookbehind, assuming that the word steering will be immediately followed by the word 'wheel'.
.*(?<!steering)Wheel.*
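A short Python sketch shows what that assumption means in practice. The third test string is hypothetical (it is not from the question) and is there only to show where the lookbehind stops helping; the names LOOKBEHIND and keep are likewise made up.

```python
import re

# (?<!steering) only checks the 8 characters directly before "Wheel".
LOOKBEHIND = re.compile(r".*(?<!steering)Wheel.*")

def keep(line):
    """Return True if the line has a Wheel not directly preceded by steering."""
    return LOOKBEHIND.match(line) is not None

print(keep("chrysler::plastic::steeringWheel"))     # False
print(keep("chrysler::chrome::L_rearWheelCentre"))  # True
# Hypothetical line: "steering" is present, but not right before "Wheel".
print(keep("steering::rearWheel"))                  # True
```

So the lookbehind is fine for the sample data, but a line that mentions steering elsewhere would still be selected.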
I don't think you'll be able to write it all as one regex. My understanding is that regex doesn't truly support "not matching" a word. Negative lookarounds are good, but the excluded text has to be right next to the match, not just somewhere on the line. The ^ negation you are trying to use is meant for character classes, like:
[^abc0-9] #not a character a,b,c,0..9
If possible, something like this should work:
import re
thelist = [
    "chrysler::plastic::steeringWheel",
    "chrysler::chrome::L_rearWheelCentre",
    "chrysler::chrome::R_rearWheelCentre",
]
# Compile both patterns once; IGNORECASE lets "wheel" match "Wheel".
theregex_wheel = re.compile("wheel", re.IGNORECASE)
theregex_steering = re.compile("steering", re.IGNORECASE)
for thestring in thelist:
    # Keep lines that mention wheel but not steering.
    if theregex_wheel.search(thestring) and not theregex_steering.search(thestring):
        print("yep, want this")
    else:
        print("skip this guy")
I have a problem with my regex. I have two kinds of strings, as follows:
2.3.8.2.2.1.2.3.4.12345 = WORDS: "String to capture"
2.3.8.2.2.1.2.3.4.12345 = ""
Regex:
1\.2\.3\.4\.(\d+) = WORDS: (?|"([^"]*)|([^:]*))
https://regex101.com/r/kQ3wT5/10 - matching
https://regex101.com/r/kQ3wT5/9 - Not matching
This regex matches only the first string and not the second, where I have an empty string. The regex has to match in both scenarios. One more thing: I really don't want to use a "global" match.
Please help me with this.
You need to make WORDS:<space> optional by enclosing it with an optional non-capturing group:
1\.2\.3\.4\.(\d+) = (?:WORDS: )?(?|"([^"]*)|([^:]*))
The (?:WORDS: )? matches one or zero occurrences (due to the ? quantifier) of the WORDS: substring followed by a space.
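The optional group can be tried out in isolation. The sketch below uses Python, which is an assumption since the question does not name a language. Python's re module has no branch reset group (?|...), so this version keeps only the quoted alternative, which is enough for the two sample strings. The name parse is made up for the example.

```python
import re

# (?:WORDS: )? makes the "WORDS: " prefix optional without adding a capture group.
# Simplification: only the quoted alternative of the original (?|...) is kept.
PATTERN = re.compile(r'1\.2\.3\.4\.(\d+) = (?:WORDS: )?"([^"]*)"')

def parse(line):
    """Return (number, quoted text) for a matching line, or None."""
    m = PATTERN.search(line)
    return m.groups() if m else None

print(parse('2.3.8.2.2.1.2.3.4.12345 = WORDS: "String to capture"'))
print(parse('2.3.8.2.2.1.2.3.4.12345 = ""'))
```

Both calls use a single search() call, so no "global" matching is involved.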
I have this regex pattern which I made myself (I'm a noob though, and made it through following tutorials):
^([a-z0-9\p{Greek}].*)\s(Ε[0-9\p{Greek}]+|Θ)\s[\(]([a-z1-9\p{Greek}]+.*)[\)]\s-\s([a-z0-9\p{Greek}]+$)
And I'm trying to match the following sentences:
ΠΡΟΓΡΑΜΜΑΤΙΣΤΙΚΕΣ ΕΦΑΡΜ ΣΤΟ ΔΙΑΔΙΚΤΥΟ Ε2 (Ε.Β.Δ.) - ΔΗΜΗΤΡΙΟΥ
ΠΡΟΓΡΑΜΜΑΤΙΣΜΟΣ 1 Θ (ΑΜΦ) - ΜΑΣΤΟΡΟΚΩΣΤΑΣ
ΕΙΣΑΓΩΓΗ ΣΤΗΝ ΠΛΗΡΟΦΟΡΙΚΗ Θ (ΑΜΦ) - ΒΟΛΟΓΙΑΝΝΙΔΗΣ
And so on.
This pattern splits the string into 4 parts.
For example, for the string:
ΠΡΟΓΡΑΜΜΑΤΙΣΤΙΚΕΣ ΕΦΑΡΜ ΣΤΟ ΔΙΑΔΙΚΤΥΟ Ε2 (Ε.Β.Δ.) - ΔΗΜΗΤΡΙΟΥ
The first match is: ΠΡΟΓΡΑΜΜΑΤΙΣΤΙΚΕΣ ΕΦΑΡΜ ΣΤΟ ΔΙΑΔΙΚΤΥΟ (Subject's Name)
Second match is: Ε2 (Class)
Third match is: Ε.Β.Δ. (Room)
And the fourth match is: ΔΗΜΗΤΡΙΟΥ (Teacher)
Now in some entries E*/Θ is not defined, and I want to get the 3 matches without the E*/Θ. How should I modify my pattern so that (Ε[0-9\p{Greek}]+|Θ) is an optional match?
I have tried ? so far, but because the pattern has \s on both sides of that group, it requires two whitespace characters to get 3 matches, and I only have one in my string.
I think you need to do two things:
Make .* lazy (i.e. .*?)
Enclose \s(Ε[0-9\p{Greek}]+|Θ) in an optional non-capturing group: (?:\s(Ε[0-9\p{Greek}]+|Θ))?
The regex will look like
^([a-z0-9\p{Greek}].*?)(?:\s(Ε[0-9\p{Greek}]+|Θ))?\s[\(]([a-z1-9\p{Greek}]+.*)[\)]\s-\s([a-z0-9\p{Greek}]+)$
If you do not make the first .* lazy, it will eat up the second group that is optional. Making it lazy will ensure that if there is some text that can be matched by the second capturing group, it will be "set".
Note you call capture groups matches, which is wrong. Matches are whole texts matched by the entire regular expression and captures are just substrings matched by parts of regexp enclosed in unescaped round brackets. See more on capture groups at regular-expressions.info.
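Here is a rough Python sketch of the lazy subject plus optional class group. Python's re module does not support \p{Greek}, so the Greek and Coptic block (U+0370 to U+03FF) is used as a stand-in; that substitution, the \u escapes for Ε and Θ, and the name split_entry are assumptions made for illustration.

```python
import re

# Assumption: U+0370..U+03FF approximates \p{Greek}, which re does not support.
GREEK = r"\u0370-\u03FF"

PATTERN = re.compile(
    rf"^([a-z0-9{GREEK}].*?)"                # group 1: subject, lazy
    rf"(?:\s(\u0395[0-9{GREEK}]+|\u0398))?"  # group 2: optional class (Ε... or Θ)
    rf"\s\(([a-z1-9{GREEK}]+.*)\)"           # group 3: room, inside parentheses
    rf"\s-\s([a-z0-9{GREEK}]+)$"             # group 4: teacher
)

def split_entry(line):
    """Return the four capture groups (class may be None), or None if no match."""
    m = PATTERN.match(line)
    return m.groups() if m else None

print(split_entry("ΠΡΟΓΡΑΜΜΑΤΙΣΤΙΚΕΣ ΕΦΑΡΜ ΣΤΟ ΔΙΑΔΙΚΤΥΟ Ε2 (Ε.Β.Δ.) - ΔΗΜΗΤΡΙΟΥ"))
print(split_entry("ΕΙΣΑΓΩΓΗ ΣΤΗΝ ΠΛΗΡΟΦΟΡΙΚΗ (ΑΜΦ) - ΒΟΛΟΓΙΑΝΝΙΔΗΣ"))
```

When the class is missing, group 2 comes back as None while the other three captures are still filled in.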
You can use something like:
(Ε[0-9\p{Greek}]+|Θ)?
The whole group will be optional (?).