Regex 'OR' seems to not behave as expected


Hello, I am trying to build a regex for a string with the following constraints:
should only contain 'X', 'O', 'T', '_', ';'
'T' and 'O' should each occur exactly once and can be anywhere in the string
'X', '_', ';' may occur zero to n times
Here are a few valid examples:
"X__;O_T;___"
"T__;_XX_;_XO"
"T__;OX_;_X_"
"OT"
This is the regex I have right now:
/^([X;_]*T[X;_]*O)|([X;_]*O[X;_]*T);$ */i
However, it accepts the input below as valid:
T__;_X__OO; //which is not valid
Thanks for your time.

If you can use a lookahead you may use
^(?=[^O]*O[^O]*$)(?=[^T]*T[^T]*$)[TOX;_]*$
See the regex demo
Details
^ - start of string
(?=[^O]*O[^O]*$) - there must be any 0+ chars other than O, then O, and then any 0+ chars other than O up to the end of the string
(?=[^T]*T[^T]*$) - there must be any 0+ chars other than T, then T, and then any 0+ chars other than T up to the end of the string
[TOX;_]* - 0+ T, O, X, ;, _ chars
$ - end of string.
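As a quick check of the lookahead pattern above, here is a minimal Python sketch. The helper name is_valid and the sample list are illustrative assumptions, not part of the original answer; the pattern itself is copied unchanged.

```python
import re

# Lookahead pattern from the answer: exactly one O, exactly one T,
# and nothing outside the set T, O, X, ;, _
ONE_T_ONE_O = re.compile(r"^(?=[^O]*O[^O]*$)(?=[^T]*T[^T]*$)[TOX;_]*$")

def is_valid(s: str) -> bool:
    """Return True when s satisfies all three constraints (hypothetical helper)."""
    return ONE_T_ONE_O.search(s) is not None

# The four valid examples from the question, plus the invalid one.
samples = ["X__;O_T;___", "T__;_XX_;_XO", "T__;OX_;_X_", "OT", "T__;_X__OO;"]
for s in samples:
    print(s, is_valid(s))
```

Only the last sample, which contains two O chars, should be rejected.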
A non-lookaround approach based on alternation is also possible:
^[X;_]*(?:T[X;_]*O|O[X;_]*T)[X;_]*$
See the regex demo.
Details
^ - string start
[X;_]* - 0+ X, ;, _ chars
(?:T[X;_]*O|O[X;_]*T) - either of the two alternatives:
T[X;_]*O - T, any 0+ X, ;, _ chars, O
| - or
O[X;_]*T - O, any 0+ X, ;, _ chars, T
[X;_]* - 0+ X, ;, _ chars
$ - string end.
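The original attempt fails because | has the lowest precedence in a regex: ^ applies only to the first alternative and ;$ * only to the second, so the first branch can match a mere prefix of the string. The Python sketch below illustrates this under the assumption that the JavaScript-style /.../i literal translates to re.IGNORECASE; the names ORIGINAL and FIXED are made up for the example.

```python
import re

# The asker's pattern: the anchors are split across the two alternatives.
ORIGINAL = re.compile(r"^([X;_]*T[X;_]*O)|([X;_]*O[X;_]*T);$ *", re.IGNORECASE)

# The alternation-based pattern from the answer: both anchors wrap the group.
FIXED = re.compile(r"^[X;_]*(?:T[X;_]*O|O[X;_]*T)[X;_]*$")

bad = "T__;_X__OO;"

# The first alternative has no end anchor, so it matches only a prefix.
prefix = ORIGINAL.search(bad).group(0)
print(prefix)             # the matched prefix, not the whole string

print(FIXED.search(bad))  # None: the second O is not allowed
print(FIXED.search("X__;O_T;___") is not None)
```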

Related

Regex capture required and optional characters in any position only

I would like to match words made up only of a given set of characters, in any order, where one of those letters is required.
Example:
Optional letters: yujkfec
Required letter: d
Matches: duck dey feed yudekk dude jude dedededy jejeyyyjd
No matches (do not contain required): yuck feck
No matches (contain letters outside of set): sucked shock blah food bard
I've tried ^[d]+[yujkfec]*$ but this only matches when the required letter is in the front. I've tried positive lookaheads but this didn't do much.

You can use
\b[yujkfec]*d[dyujkfec]*\b
See the regex demo. Note that the d is included in the second character class, so further d chars may follow the required one.
Details:
\b - word boundary
[yujkfec]* - zero or more occurrences of y, u, j, k, f, e or c
d - a d char
[dyujkfec]* - zero or more occurrences of y, u, j, k, f, e, c or d.
\b - a word boundary.
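To see the pattern at work on the words from the question, here is a small Python sketch. The variable names are illustrative; the word list is the one the asker gave.

```python
import re

# Pattern from the answer: optional letters, one required d, then any allowed letter.
WORD = re.compile(r"\b[yujkfec]*d[dyujkfec]*\b")

text = (
    "duck dey feed yudekk dude jude dedededy jejeyyyjd "  # expected matches
    "yuck feck "                                          # no required d
    "sucked shock blah food bard"                         # letters outside the set
)

matches = WORD.findall(text)
print(matches)
```

Only the first eight words should be returned, since every other word either lacks a d or contains a letter outside the set.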

Find if either followed by non number or end of file

I want to match the string b5 with an optional $ in front of the b or the 5:
=b5
b$5
= $b$5
($b5)
But the 5 can't be followed by any digit, and the b can't be preceded by any letter. So these should not match:
b55
ab5
I tried this:
\W\$*b\$*5\W
It works fine and will match X=($b$5), but the problem is that it no longer matches when the 5 is the last character in the line, because \W needs a character to consume after the 5.

You can use either of the following:
(?:\W|^)\$*b\$*5(?:\W|$)
(?:\W|^)\$*b\$*5\b
See the RE2 regex demo.
Details
(?:\W|^) - a non-capturing group matching either a non-word char or start of string
\$* - zero or more $ chars
b - a b char
\$* - zero or more $ chars
5 - a 5 char
(?:\W|$) - a non-capturing group matching either a non-word char or end of string (first pattern), or
\b - a word boundary (second pattern).
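Both variants can be checked against the inputs from the question with a short Python sketch. The helper matches_all and the two pattern names are assumptions made for the example.

```python
import re

# The two variants from the answer.
WITH_GROUP = re.compile(r"(?:\W|^)\$*b\$*5(?:\W|$)")
WITH_BOUNDARY = re.compile(r"(?:\W|^)\$*b\$*5\b")

should_match = ["=b5", "b$5", "= $b$5", "($b5)", "X=($b$5)"]
should_fail = ["b55", "ab5"]

def matches_all(pattern: re.Pattern) -> bool:
    """True when the pattern accepts every positive and rejects every negative sample."""
    hits = all(pattern.search(s) for s in should_match)
    misses = not any(pattern.search(s) for s in should_fail)
    return hits and misses

print(matches_all(WITH_GROUP), matches_all(WITH_BOUNDARY))
```

Note that =b5 is accepted even though 5 is the last character, which is the case the original \W\$*b\$*5\W pattern missed.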

Regex match strings with different values

for i,v in array
for i , v in array
for i , v in array
for i, v in array
for i,v in array
for i, v in array
for[\s+,.](.+)
https://regex101.com/r/Vd3w7C/2
How could I match everything after the v?
Note that i, v and "in array" will have different values.
I mean something like:
for ppp,gflgkf heekd gfvb

You could use
\bfor\s+[^\s,]+(?:\s*,\s*[^\s,]+)*\s+(.+)
The pattern matches:
\bfor\s+ Match for and 1+ whitespace chars
[^\s,]+ Match 1+ times any char except a whitespace char or ,
(?: Non capture group
\s*,\s*[^\s,]+ Match a comma between optional whitespace chars, and match at least a single char other than a comma or whitespace chars
)*\s+ Close the group and optionally repeat it followed by 1+ whitespace chars
(.+) Capture 1+ times any char except a newline in group 1
See a regex demo.
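A minimal Python sketch of the pattern above, applied to a few of the inputs from the question. The helper name tail_after_names is hypothetical; group 1 is the part after the comma-separated names.

```python
import re

# Pattern from the answer; group 1 captures everything after the name list.
FOR_TAIL = re.compile(r"\bfor\s+[^\s,]+(?:\s*,\s*[^\s,]+)*\s+(.+)")

def tail_after_names(line: str):
    """Return the text following the comma-separated names, or None (hypothetical helper)."""
    m = FOR_TAIL.search(line)
    return m.group(1) if m else None

lines = [
    "for i,v in array",
    "for i , v in array",
    "for i, v in array",
    "for ppp,gflgkf heekd gfvb",
]
for line in lines:
    print(repr(tail_after_names(line)))
```

The first three lines should all yield "in array", and the last one "heekd gfvb", regardless of the spacing around the comma.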

command line grep finding words with exactly one vowel

How do you list all the lines that contain words with exactly one vowel?
I have tried
egrep -i '\<.*[aeiou]{1}.*\>' f3.txt
but I'm stuck and can't figure it out.

You may use
grep -i '\<[^[:digit:][:punct:][:space:]aeiou]*[aeiou][^[:digit:][:punct:][:space:]aeiou]*\>' f3.txt
Details
\< - start of a word
[^[:digit:][:punct:][:space:]aeiou]* - 0 or more chars other than digits, punctuation, whitespace, a, e, i, o, u
[aeiou] - 1 occurrence of a, e, i, o or u
[^[:digit:][:punct:][:space:]aeiou]* - 0 or more chars other than digits, punctuation, whitespace, a, e, i, o, u
\> - end of a word.
See an online demo.
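For comparison, the same idea can be approximated in Python. This is a sketch under stated assumptions, not a translation of the grep command: Python's re module has no POSIX bracket classes, so [^\W\d_aeiouAEIOU] (a word char that is not a digit, underscore or vowel) stands in for the negated bracket expression, \b replaces \< and \>, and the case is spelled out in the classes instead of using -i. The function name and sample lines are made up.

```python
import re

# Consonant-like chars: word chars minus digits, underscore and vowels.
# This only approximates the POSIX classes used in the grep command.
ONE_VOWEL_WORD = re.compile(
    r"\b[^\W\d_aeiouAEIOU]*[aeiouAEIOU][^\W\d_aeiouAEIOU]*\b"
)

def lines_with_one_vowel_word(lines):
    """Keep the lines that contain at least one word with exactly one vowel."""
    return [line for line in lines if ONE_VOWEL_WORD.search(line)]

sample = ["rhythm and blues", "queue audio", "xyz", "strength"]
print(lines_with_one_vowel_word(sample))
```

Here "and" and "strength" each contain a single vowel, while "queue", "audio" and "xyz" do not qualify (y is not treated as a vowel, as in the original command).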

Regex Pattern - Groovy

I need to create a regex - with the following requirements
starts with C, D, F, G, I, M or P
has at least one underscore (_)
eg. C6352_3
I've tried the following:
@Pattern(regexp = '^(\C|\D|\F|\G|\I\|\M|\P)+\_*' , message = "error")

You may use
/^[CDFGIMP][^_\s]*_\S*$/
Or, to only handle word chars (letters, digits and _),
/^[CDFGIMP]\w*_\w*$/
or a bit more efficient one with character class subtraction:
/^[CDFGIMP][\w&&[^_]]*_\w*$/
See the regex demo
Details
^ - start of a string
[CDFGIMP] - any char listed in the character set
[^_\s]* - zero or more chars other than _ and whitespace
\w* - matches 0+ word chars: letters, digits or _ ([\w&&[^_]]* matches 0+ letters and digits only)
_ - an underscore
\S* - 0+ non-whitespace chars (or \w* will match any letters, digits or _)
$ - end of string (or better, \z to only match at the very end of the string).
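The first two patterns can be tried out in Python as a rough cross-check. This is only a sketch: Groovy uses java.util.regex, whose && character class intersection is not supported by Python's re module, so the third pattern is left out. The inputs other than C6352_3 are invented for illustration.

```python
import re

# First two patterns from the answer (the && variant is Java-only).
NON_SPACE = re.compile(r"^[CDFGIMP][^_\s]*_\S*$")
WORD_ONLY = re.compile(r"^[CDFGIMP]\w*_\w*$")

def check(pattern: re.Pattern, value: str) -> bool:
    """True when value starts with an allowed letter and contains an underscore."""
    return pattern.search(value) is not None

# C6352_3 comes from the question; the rest are hypothetical inputs.
for value in ["C6352_3", "X6352_3", "C6352", "_C1", "C-1_2"]:
    print(value, check(NON_SPACE, value), check(WORD_ONLY, value))
```

The last input shows the difference between the two: the hyphen is a non-whitespace char but not a word char, so only the first pattern accepts it.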

You could skip regex, and make it readable:
boolean valid(String value) {
(value?.take(1) in ['C', 'D', 'F', 'G', 'I', 'M', 'P']) && value?.contains('_')
}
