How to match something not defined - regex

If I have defined something like this:
COMMAND = "HI" | "HOW" | "ARE" | "YOU"
How can I say "match something that is not a COMMAND"?
I tried with this one
But it didn't work.

As far as I can tell this is not possible with (current) JFlex.
We would need an effective tempered negative lookahead: ((?!bad).)*
There are two ways to do a negative lookahead in JFlex:
negation in the lookahead: x / !(y [^]*) (match x if it is not followed by y in the lookahead).
lookahead with negated elements: x / [^y]|y[^z] (match x if it is followed by something that is not y, or by y followed by something that is not z).
Otherwise, you may get some ideas from this answer (specifically the lookaround alternatives):

Well, you can just match anything else, then
COMMAND = "HI" | "HOW" | "ARE" | "YOU"
. {throw new RuntimeException("Illegal character: <" + yytext() + ">");}
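For comparison, here is a rough sketch of the "anything that is not a COMMAND" idea in an engine that does support negative lookahead (Python's re, not JFlex). The names NOT_COMMAND and non_commands are made up for this illustration, and treating a command as a whole word is an assumption:

```python
import re

# Same alternatives as the COMMAND macro above, as a non-capturing group.
COMMAND = r"(?:HI|HOW|ARE|YOU)"

# A whole word that is not exactly one of the commands:
# the negative lookahead rejects a position where a command followed by
# a word boundary starts, then \w+ consumes the word.
NOT_COMMAND = re.compile(rf"\b(?!{COMMAND}\b)\w+\b")

def non_commands(text):
    """Return every word in text that is not a COMMAND (hypothetical helper)."""
    return NOT_COMMAND.findall(text)

print(non_commands("HI HOW IS YOU HOWDY"))
```

HOWDY is kept because the lookahead only rejects HOW when a word boundary follows it. This does not carry over to JFlex directly; it only shows what the lookahead is meant to express.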


I have written a regex for matching the sub string with spaces around it but that's not working well

I was working on a regex problem where the task is to take a substring (||, &&) and replace it with another substring (or, and). I wrote code for it, but it's not working well.
question = x&& &&& && && x || | ||\|| x
Expected output = x&& &&& and and x or | ||\|| x
Here is the code I wrote
import re
for i in range(int(input())):
    print(re.sub(r'\s[&]{2}\s', ' and ', re.sub(r"\s[\|]{2}\s", " or ", input())))
My output = x&& &&& and && x or | ||\|| x
You need to use lookarounds. The problem with the current regex shows up at "&& &&": the first match consumes the space after the first &&, so there is no space left before the second && and it won't match. So we need zero-length matches (lookarounds).
Replace the regex
\s[&]{2}\s --> (?<=\s)[&]{2}(?=\s)
\s[\|]{2}\s --> (?<=\s)[\|]{2}(?=\s)
(?<=\s) - the match must be preceded by a whitespace character
(?=\s) - the match must be followed by a whitespace character
You're looking for a regex like (?<=\s)&&(?=\s) (Regex demo)
Using lookarounds to assert the position of space characters around your targeted replacement groups allows overlapping matches to occur - otherwise, it will match the spaces on both sides and block out the other options.
import re
in_str = r'x&& &&& && && x || | ||\|| x'
expect_str = r'x&& &&& and and x or | ||\|| x'
print(re.sub(r"(?<=\s)\|\|(?=\s)", "or", re.sub(r"(?<=\s)&&(?=\s)", "and", in_str)))
Python demo
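To see the overlap problem in isolation, here is a small sketch on a shorter, made-up input (the string "a && && b" is not from the question). It counts what each pattern finds before any replacement happens:

```python
import re

s = "a && && b"  # hypothetical minimal input with two adjacent && tokens

# Consuming version: each match eats the spaces on both sides,
# so the space between the two && tokens can only be used once.
consuming = re.findall(r"\s&&\s", s)

# Lookaround version: the spaces are only asserted, never consumed,
# so both && tokens can match.
lookaround = re.findall(r"(?<=\s)&&(?=\s)", s)

replaced_consuming = re.sub(r"\s&&\s", " and ", s)
replaced_lookaround = re.sub(r"(?<=\s)&&(?=\s)", "and", s)

print(consuming, lookaround)
print(replaced_consuming)
print(replaced_lookaround)
```

The consuming pattern finds one match and the lookaround pattern finds two, which is the difference between "a and && b" and "a and and b".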
Try using re.findall() instead of re.sub

R regex - removing pattern from ends

Suppose I have a string that looks like so:
How would I only remove the N's from the ends to get:
Just match and remove the N's that exist at the start or at the end using gsub.
gsub("^N+|N+$", "", x)
^N+ matches one or more N's at the start.
| Alternation operator.
N+$ matches one or more N's at the end.
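The same anchored alternation can be tried outside R as well. A minimal sketch in Python, assuming a made-up input string (the original sample string is not shown above) and a hypothetical helper name strip_n:

```python
import re

def strip_n(s):
    """Remove runs of N from both ends, leaving interior N's alone."""
    # ^N+ handles the leading run, N+$ the trailing run;
    # re.sub replaces every match, so both ends go in one call.
    return re.sub(r"^N+|N+$", "", s)

x = "NNNACGTNNACGTNN"  # hypothetical sample, not the asker's string
print(strip_n(x))
```

The interior NN survives because neither alternative can match away from the ends of the string.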
> gsub("^N+|N+$", "", x)
gsub("^N*([A-Z]*?)N*$", "\\1", x)
You can use \1 as a backreference here. See demo.
Use as

Regular Expression : Splitting a string of list of multivalues

My goal is splitting this string with regular expression:
in a list of:
This regex works with your sample string:
This splits on comma, but uses a negative lookahead to assert that the next bracket character is not a right bracket. It will still split even if there are no following brackets.
Here's some Java code demonstrating it working with your sample, plus some more general input showing its robustness:
String input = "AA(1.2,1.3)+,BB(125)-,FOO,CC(A,B,C)-,DD(QWE)+,BAR";
String[] split = input.split(",(?![^(]+\\))");
for (String s : split) System.out.println(s);
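The same split can be sketched in Python with re.split, reusing the lookahead from the Java snippet and its input string. The function name split_outside_parens is only for illustration:

```python
import re

def split_outside_parens(s):
    """Split on commas that are not inside a (...) group.

    The negative lookahead fails when the comma is followed by
    non-'(' characters and then ')', i.e. when we are inside brackets.
    Assumes brackets are balanced and not nested.
    """
    return re.split(r",(?![^(]+\))", s)

parts = split_outside_parens("AA(1.2,1.3)+,BB(125)-,FOO,CC(A,B,C)-,DD(QWE)+,BAR")
for p in parts:
    print(p)
```

Nested brackets would need a real parser or a counting loop; a single lookahead like this only handles one level.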
I don't know what language you are working with, but this makes it in grep:
$ grep -o '[A-Z]*([A-Z0-9.,]*)[^,]*' file
           ^^^^^^ ^^^^^^^^^^^ ^^^^^
             |   ^      |    ^  |
             |   |      |    |  everything but a comma
             |   ( char |    ) char
             |          A-Z 0-9 . or , chars
             list of chars from A to Z

Non-greedy regular expression match for multicharacter delimiters in awk

Consider the string "AB 1 BA 2 AB 3 BA". How can I match the content between "AB" and "BA" in a non-greedy fashion (in awk)?
I have tried the following:
awk '
str="AB 1 BA 2 AB 3 BA"
if (match(str,regex))
print substr(str,RSTART,RLENGTH)
with no output. I believe the reason for no match is that there is an odd number of characters between "AB" and "BA". If I replace str with "AB 11 BA 22 AB 33 BA" the regex seems to work..
Merge your two negated character classes and remove the [^A] from the second alternation:
regex = "AB([^AB]|B|[^B]A)*BA"
This regex fails on the string ABABA, though - not sure if that is a problem.
AB # Match AB
( # Group 1 (could also be non-capturing)
[^AB] # Match any character except A or B
| # or
B # Match B
| # or
[^B]A # Match any character except B, then A
)* # Repeat as needed
BA # Match BA
Since the only way to match an A in the alternation is by matching a character except B before it, we can safely use the simple B as one of the alternatives.
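As a sanity check, the character-class pattern can be compared against a real lazy quantifier in an engine that has one. This is a sketch in Python (not awk), using the string from the question; the group is written as non-capturing here, which is the only change to the pattern:

```python
import re

text = "AB 1 BA 2 AB 3 BA"

# The emulation from the answer above (group made non-capturing).
emulated = re.compile(r"AB(?:[^AB]|B|[^B]A)*BA")
# A true non-greedy match, available in Python but not in awk.
lazy = re.compile(r"AB.*?BA")

emulated_matches = [m.group(0) for m in emulated.finditer(text)]
lazy_matches = lazy.findall(text)

print(emulated_matches)
print(lazy_matches)

# The known weak spot: the emulation cannot consume an A that directly
# follows the opening AB, so it finds nothing in ABABA.
print(emulated.search("ABABA"), lazy.search("ABABA").group(0))
```

Both patterns agree on the sample string, and the ABABA case shows where the emulation and a real lazy match part ways.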
The other answer didn't really answer: how to match non-greedily?
Looks like it can't be done in (G)AWK. The manual says this:
awk (and POSIX) regular expressions always match the leftmost, longest
sequence of input characters that can match.
And the whole manual contains neither the word "greedy" nor "lazy". It mentions Extended Regular Expressions, but for non-greedy matching you'd need Perl-Compatible Regular Expressions. So… no, it can't be done.
For general expressions, I'm using this as a non-greedy match:
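function smatch(s, r,    m, n) {
    if (match(s, r)) {
        m = RSTART
        do {
            # keep shrinking while a shorter prefix of the match still matches
            n = RLENGTH
        } while (match(substr(s, m, n - 1), r))
        RSTART = m
        RLENGTH = n
        return RSTART
    } else return 0
}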
smatch behaves like match, returning:
the position in s where the regular expression r occurs, or 0 if it does not. The variables RSTART and RLENGTH are set to the position and length of the matched string.

Regular expression match decimal with letters

I have the following strings: 3.14, 123.56f, .123e5f, 123D, 1234, 343E12, 32.
What I want to do is match any of the above inputs. So far I started with the following:
I realize I have to escape the . since it's a special character in regular expressions.
This should also work, if not already proposed.
try {
    Pattern regex = Pattern.compile("\\.?\\b[0-9]*\\.?[0-9]+(?:[eE][-+]?[0-9]+)?[fD]?\\b", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        // matched text: regexMatcher.group()
        // match start: regexMatcher.start()
        // match end: regexMatcher.end()
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}
^ start of the string
(\d+(\.\d+)?|\.\d+) either one or more digits with an optional (. and one or more digits), or . and one or more digits
([eE]\d+)? an optional ( e or E and one or more digits)
[fD]? an optional f or D
$ end of the string
As a sidenote, I've made the D compatible with everything but the f.
If you need positive and negative sign, add [+-]? after the ^
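Putting the pieces from the breakdown above back together gives something like the pattern below. The full original pattern is not shown, so this assembly is an assumption, and it is tried here in Python rather than Java. The variant that also accepts a trailing dot (for the input 32.) is my own adjustment, not part of the answer:

```python
import re

# Assembled from the described pieces; the exact original is assumed.
NUMBER = re.compile(r"^(\d+(\.\d+)?|\.\d+)([eE]\d+)?[fD]?$")

# Hypothetical variant: \.\d* instead of \.\d+ so that "32." is accepted too.
NUMBER_TRAILING_DOT = re.compile(r"^(\d+(\.\d*)?|\.\d+)([eE]\d+)?[fD]?$")

samples = ["3.14", "123.56f", ".123e5f", "123D", "1234", "343E12", "32."]

strict = {s: bool(NUMBER.match(s)) for s in samples}
relaxed = {s: bool(NUMBER_TRAILING_DOT.match(s)) for s in samples}

for s in samples:
    print(s, strict[s], relaxed[s])
```

With the assembled pattern, every sample matches except 32., because a dot must be followed by at least one digit; the relaxed variant accepts all seven.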
This will match all of those:
Note that within a character class (square brackets), dot . is not a special character and should not be escaped.
Maybe this one?
^\d* #possibly a digit or sequence of digits at the start
(?:\.\d+)? #possibly followed by a dot and at least one digit
(?:[eE]\d+)? #possibly a 'e' or 'E' followed by at least one digit
(?:[fD])?$ #optionally followed by an 'f' or 'D' letter, up to the end
You can use regexpal to test it out, but this seems to work on all of those examples: