Remove certain letters in foma - regex

I am trying to write a rule to remove the non-start [a | e | h | i | o | u | w | y] letters in a string. The rule should keep the first letter, but remove given letters in other locations.
For example,
vave -> vv
aeiou -> a
My code is as below:
?* [ a | e | h | i | o | u | w | y ]+:0 ?* [ a | e | h | i | o | u | w | y ]+:0;
However, when applying the rule on vaavaa, it returns
vaav
vava
vava
vav
vava
vava
vav
vvaa
vva
vva
vv
while vv is what I want.
Please share some advice. Thanks!

You may use this regex for search:
(?!^)[aehiouwy]+
and replace it with an empty string "".
RegEx Details:
(?!^): Negative lookahead to make sure the match is not at the start of the string
[aehiouwy]+: Match one or more of these letters inside [...]

You can use a capturing group and an alternation:
^(.)|[aehiouwy]+
replace by \1
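Both suggestions can be checked quickly outside foma. The sketch below assumes Python's re module, which is not part of the question and whose syntax differs from foma's; the function names are made up for illustration.

```python
import re

LETTERS = "aehiouwy"

def strip_with_lookahead(word):
    # (?!^) rejects a match at position 0, so the first letter is always kept.
    return re.sub(rf"(?!^)[{LETTERS}]+", "", word)

def strip_with_group(word):
    # Either capture the first character and put it back via \1, or match a
    # run of the listed letters; in that case group 1 is unmatched and
    # expands to an empty string (Python 3.5+).
    return re.sub(rf"^(.)|[{LETTERS}]+", r"\1", word)

for w in ("vave", "aeiou", "vaavaa"):
    print(w, "->", strip_with_lookahead(w), strip_with_group(w))
```

For the three sample words, both functions give vv, a and vv.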

Related

regex numbers in arithmetic expression

I want to capture all numbers in a string
for example:
+================+============+
| string | match |
+================+============+
| 5*-33 = 75.3 | 5|-33|75.3 |
+----------------+------------+
| s44+2=7 | 2|7 |
+----------------+------------+
| ii2*-5 = 46 | -5|46 |
+----------------+------------+
| -2*-2.1 = 0.1 | -2|-2.1|0.1|
+================+============+
I tried the following expression, but it does not work with signed numbers.
\b([0-9]+(\.\d+)?)\b
Don't forget the optional minus sign. - is not a digit, so you have to match it explicitly.
\b(-?\d+(\.\d+)?)\b
Of course, this will have issues with valid expressions such as:
4-3
But that seems to be a different problem.
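One caveat is worth checking in code. In engines such as Python's re, \b cannot match between two non-word characters such as * and -, so a leading \b keeps the sign from being included in 5*-33. The sketch below is one possible variant, not the answerer's pattern: it assumes a - counts as a sign only when it does not directly follow a word character or a dot. The names NUMBER and find_numbers are hypothetical.

```python
import re

# Assumption: '-' is a sign only when it does not directly follow a word
# character or a dot, so "4-3" yields 4 and 3 while "5*-33" yields -33.
NUMBER = re.compile(r"(?<![\w.])-?\d+(?:\.\d+)?\b")

def find_numbers(expr):
    # Return every number in the expression as a string, sign included.
    return NUMBER.findall(expr)

for s in ("5*-33 = 75.3", "s44+2=7", "ii2*-5 = 46", "-2*-2.1 = 0.1", "4-3"):
    print(s, "->", find_numbers(s))
```

The lookbehind replaces the leading \b: it still rejects digits glued to letters (the 44 in s44) but lets a sign through after an operator.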

Match single character not enclosed by braces

I am making a property list syntax definition file (tmLanguage) for practice. It's in Sublime Text's YAML format, but I'll be using it in VS Code.
I need to match all characters (including unterminated { and }) that are not enclosed by {}.
I have tried using negative lookahead and lookbehind assertions, but that only excludes the first and last characters inside the braces.
(?<!{).(?!})
Adding a greedy quantifier to consume all characters just matches the full line.
(?<!{).+(?!})
Adding a lazy quantifier just matches everything except the first character after the {. It also matches {} exactly.
(?<!{).+?(?!})
| Test | Expected Matches |
| ----------------- | ----------------------- |
| `{Ctrl}{Shift}D` | `D` |
| `D{Ctrl}{Shift}` | `D` |
| `{Ctrl}D{Shift}` | `D` |
| `{Ctrl}{Shift}D{` | `D` `{` |
| `{Ctrl}{Shift}D}` | `D` `}` |
| `D}{Ctrl}{Shift}` | `D` `}` |
| `D{{Ctrl}{Shift}` | `D` `{` |
| `{Shift` | `{` `S` `h` `i` `f` `t` |
| `Shift}` | `S` `h` `i` `f` `t` `}` |
| `{}` | `{` `}` |
Sample text file: https://raw.githubusercontent.com/spikespaz/windows-tiledwm/master/hotkeys.conf
Full syntax highlighter:
# [PackageDev] target_format: plist, ext: tmLanguage
---
name: Hotkeys Config
scopeName: source.hks
fileTypes: ["hks", "conf"]
uuid: c4bcacab-0067-43db-a1d7-7a74fffe2989
patterns:
- name: keyword.operator.assignment
  match: \=
- name: constant.language
  match: "null"
- name: constant.numeric.integer
  match: \{(?:Alt|Ctrl|Shift|Super)\}
- name: constant.numeric.float
  match: \{(?:Menu|RMenu|LMenu|Tab|Enter|PgUp|PgDown|Ins|Del|Home|End|PrntScr|Esc|Back|Space|F[0-12]|Up|Down|Left|Right)\}
- name: comment.line
  match: \#.*
...
You can use the following regex to match:
(?:{(Ctrl|Shift|Alt)})+
Then simply replace the matches with an empty string; what remains is what you asked for.
The regex is self-explanatory, but here's a short description:
It creates a non-capturing group consisting of one of your modifier keys in curly brackets. The plus sign '+' at the right means the group repeats one or more times.
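The removal approach can be tried against the rows of the question's table. This is a minimal sketch in Python's re module (an assumption; the grammar itself is evaluated by the editor's regex engine), using a pattern along the lines of the one above. The name outside_braces is hypothetical, and only the three listed key names are handled.

```python
import re

# One or more consecutive {Ctrl}, {Shift} or {Alt} groups.
MODIFIER_GROUPS = re.compile(r"(?:\{(?:Ctrl|Shift|Alt)\})+")

def outside_braces(text):
    # Delete complete modifier groups; whatever is left was not enclosed.
    return MODIFIER_GROUPS.sub("", text)

for row in ("{Ctrl}{Shift}D", "D{{Ctrl}{Shift}", "{Shift", "{}"):
    print(row, "->", outside_braces(row))
```

Unterminated braces such as {Shift are left untouched because they never form a complete group. Note that this removes the enclosed parts rather than matching the remaining characters directly, which is what a tmLanguage match rule would need.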

regex for matching a word and numbers in same string

I don't know regex and need to find the expressions to isolate strings that have the word "comp" plus any price (number)
any ideas?
249.00 | 259.00 | 279.00 | comp | 349.00 | //I need to return this as match
369.00 | 359.00 | 599.00 | //don't want to return this as match
299.00 | 499.00 | //don't want to return this as match
329.00 | //don't want to return this as match
comp | 269.00 | 269.00 | //I need to return this as match
179.00 | 239.00 | comp | //I need to return this as match
comp | //don't want to return this as match
89.00 | 89.00 | 89.00 | //no match
249.00 | //don't want to return this as match
comp | 249.00 | //I need to return this as match
199.00 | comp | comp | //I need to return this as match
comp | comp | 99.00 | 99.00 | //I need to return this as match
comp | comp | comp | comp | comp | //I need to return this as match
Try
(\bcomp\b.+([0-9]*[.]?[0-9]+))|(([0-9]*[.]?[0-9]+).+\bcomp\b)
\bcomp\b matches comp as a whole word.
.+ matches one or more characters.
[0-9]*[.]?[0-9]+ matches a number with an optional decimal point.
| is the "or" condition: either a number followed by comp, or comp followed by a number.
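To see how this pattern behaves on the sample lines, here is a small sketch using Python's re module (an assumption, since the question does not name a language). The helper has_comp_and_price is a hypothetical name.

```python
import re

NUMBER = r"[0-9]*[.]?[0-9]+"
# comp followed later by a number, or a number followed later by comp.
COMP_WITH_PRICE = re.compile(
    rf"(\bcomp\b.+({NUMBER}))|(({NUMBER}).+\bcomp\b)"
)

def has_comp_and_price(line):
    # True when the line contains both the word comp and at least one number.
    return COMP_WITH_PRICE.search(line) is not None

samples = [
    "249.00 | 259.00 | 279.00 | comp | 349.00 | ",
    "369.00 | 359.00 | 599.00 | ",
    "179.00 | 239.00 | comp | ",
    "comp | ",
]
for line in samples:
    print(repr(line), has_comp_and_price(line))
```

A line that contains only comp entries and no price does not match under this pattern, because both alternatives require at least one digit.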
Try this:
/\d?+comp\d?+/g
It should match every "comp" string with numbers. I think it is right for you.

REGEX: why '^([a-z] | a)$' does not match 'a'?

I never used regular expressions before and I was testing some examples.
What I don't understand is why the regular expression ^([a-z] | a)$ doesn't match the string 'a'.
As I understand it, [a-z] is equivalent to (a | b | c | ... | y | z), so
[a-z] | a must be equivalent to (a | b | c | ... | y | z) | a, which is the same
as (a | b | c | ... | y | z) or [a-z].
For that reason, a string str matches ^([a-z] | a)$ iff it matches ^[a-z]$.
That's why I don't understand why that regular expression doesn't match string 'a' or 'e' for example.
PS: I was testing this in this page.
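Spaces matter in regular expressions: a space in the pattern matches a literal space. As written, the pattern matches a letter followed by a space, or a space followed by a. Remove the spaces around the pipe (|) and it should work.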
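The effect of the spaces is easy to verify. The sketch below assumes Python's re module without the verbose flag (the question's testing page is not specified), so spaces in the pattern are literal characters.

```python
import re

with_spaces = re.compile(r"^([a-z] | a)$")     # letter + space, or space + a
without_spaces = re.compile(r"^([a-z]|a)$")    # a single lowercase letter

print(bool(with_spaces.match("a")))     # False
print(bool(with_spaces.match("a ")))    # True
print(bool(with_spaces.match(" a")))    # True
print(bool(without_spaces.match("a")))  # True
```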

ANTLR4 - Match values over 9?

So, I've been working on my assembler again, and this time I'm stuck on the floating-point registers. Basically, there are 32 FP registers, and I want to match them when I write F0, F1, F2, ..., F31. I wrote the following in my lexer:
REG
: ('R0'|'r0')
| ('AT'|'at')
| ('v'[0-1]|'V'[0-1])
| ('a'[0-3]|'A'[0-3])
| ('t'[0-9]|'T'[0-9])
| ('s'[0-9]|'S'[0-8])
| ('k'[0-1]|'K'[0-1])
| ('GP'|'gp')
| ('SP'|'sp')
| ('FP'|'fp')
| ('ra'|'RA')
| ('f'[0-31]|'F'[0-31])+
;
Every other register here works without any problems, but F0-F31 does not. I tested it and noticed that it only matches F0-F3 and nothing higher. In hindsight that was fairly obvious, but I couldn't work out how to match values of 10 and above. I also tried some workarounds, like adding another [0-9] after the first one, but that didn't help, because it then also matches invalid values such as F36 or F39. So, any idea how I could handle this?
Thanks in advance.
The character class [0-31] matches a single character: 0, 1, 2, 3, or 1 (again). To emphasise: character classes do not match numeric values, only (text) characters.
To match F0, F1, F2, ..., F31 (and f0, f1, f2, ..., f31), do something like this:
FREG
: [fF] ( [0-9] // matches f0..f9 (and F0..F9)
| [1-2] [0-9] // matches f10..f29 (and F10..F29)
| '3' [01] // matches f30 or f31 (and F30 or F31)
)
;
Your complete REG rule could be written as follows:
REG
: [rR] '0'
| 'AT' | 'at'
| [vV] [01]
| [aA] [0-3]
| [tT] [0-9]
| [sS] [0-9]
| [kK] [01]
| 'GP' | 'gp'
| 'SP' | 'sp'
| 'FP' | 'fp'
| 'RA' | 'ra'
| [fF] ( [0-9] | [1-2] [0-9] | '3' [01] )
;
Note that [01] and [0-1] match the same: either '0' or '1'. Also be aware that 'ra' | 'RA' does not match 'Ra'. If you want 'Ra' and 'rA' to match as well, write it like this: [rR] [aA].
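The same character-class behaviour can be reproduced outside ANTLR. The sketch below uses Python's re module as a stand-in (an assumption; it is not ANTLR code) to show that [0-31] accepts a single character from 0 to 3, and that the three-way alternative covers exactly 0 through 31. The names CLASS_0_31, F_REGISTER and is_f_register are hypothetical.

```python
import re

# The class from the question: one character out of 0, 1, 2, 3.
CLASS_0_31 = re.compile(r"[fF][0-31]")

# The alternative from the answer: 0-9, then 10-29, then 30-31.
F_REGISTER = re.compile(r"[fF](?:[0-9]|[1-2][0-9]|3[01])")

def is_f_register(token):
    # fullmatch requires the whole token to be consumed.
    return F_REGISTER.fullmatch(token) is not None

print(CLASS_0_31.fullmatch("f3") is not None)   # True
print(CLASS_0_31.fullmatch("f4") is not None)   # False
print(is_f_register("F31"), is_f_register("f32"))
```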