This question already has answers here:
How can I "inverse match" with regex?
(10 answers)
Closed 6 months ago.
Regex: /^[0-9\p{L}.,\s]+$/u
I would like to replace every character that does not match this regex with "".
As I understand, you simply want to drop all chars not matching your regex. So the idea is to invert the class of chars:
/^[0-9\p{L}.,\s]+$/u should become /[^\d\p{L}.,\s]+/gu. I added the ^ after the [ to say "not in this list of chars", dropped the ^ and $ anchors so the pattern can match anywhere in the string, and replaced 0-9 with \d for digits. Use the g modifier (global) to match multiple times.
Running it: https://regex101.com/r/IQz6K5/1
I'm not sure that the comma, the period and whitespace will be enough punctuation. It would be interesting to have a complete example of what you are trying to achieve. You could use another Unicode character class for punctuation if needed, typically \p{P}. See more info about Unicode classes here: https://www.regular-expressions.info/unicode.html#category
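For anyone doing the same thing in Python, here is a minimal sketch of the inverted-class idea. It is an approximation, not a direct port: the standard-library re module has no \p{L}, so \w stands in for "letter or digit" (an assumption on my side), and the underscore that \w also covers is removed separately. The helper name strip_disallowed is made up for illustration.

```python
import re

# Approximation (assumption): re has no \p{L}, so \w is used for
# "letter or digit". \w also matches "_" and non-ASCII digits, which the
# original class does not; "_" is removed by the second alternative.
NOT_ALLOWED = re.compile(r"[^\w.,\s]+|_+")


def strip_disallowed(text: str) -> str:
    """Remove every run of characters outside the allowed set."""
    return NOT_ALLOWED.sub("", text)


cleaned = strip_disallowed("Héllo, wörld! 123 #ok_")
print(cleaned)  # Héllo, wörld 123 ok
```

re.sub already replaces every match, so no equivalent of the g modifier is needed here.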
Related
This question already has answers here:
Regular expression to match a line that doesn't contain a word
(34 answers)
Closed 2 years ago.
I am currently using the following character class:
[^\)\(] in my regex
I want to add the word 'hello' to this class so it is also not matched in my string.
I have tried
[^\)\((hello)]
but it does not work.
What can I do?
One typical way to enforce that hello does not appear is to use a negative lookahead, e.g.
^(?!.*hello)[^()]+$
If you only want to exclude hello when it appears as a bona fide word, then surround it with word boundaries in the lookahead:
^(?!.*\bhello\b)[^()]+$
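A quick way to see the difference between the two lookaheads is to run them against a few strings. This is only a sketch in Python, using the question's character class [^()] (no parentheses allowed); the function name is_allowed and the sample strings are mine.

```python
import re

# Reject any string that contains "hello" anywhere, or a parenthesis.
NO_HELLO = re.compile(r"^(?!.*hello)[^()]+$")
# Reject "hello" only when it stands alone as a word.
NO_HELLO_WORD = re.compile(r"^(?!.*\bhello\b)[^()]+$")


def is_allowed(text: str, pattern: re.Pattern = NO_HELLO) -> bool:
    """Return True when the whole string passes the given pattern."""
    return pattern.match(text) is not None


print(is_allowed("good morning"))            # True
print(is_allowed("say hello"))               # False
print(is_allowed("call(me)"))                # False
print(is_allowed("othello"))                 # False, contains "hello"
print(is_allowed("othello", NO_HELLO_WORD))  # True, not a separate word
```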
This question already has answers here:
Replace multiple characters by one character with regex
(3 answers)
Closed 2 years ago.
I'm trying to create a one-line regex to find multiple characters and replace them all with an underscore.
E.g.
Title :: How to sew
// would be
Title_How_to_sew
I've gotten this far:
const newTitle = event.replace(/[^a-zA-Z]/gm, '_')
which returns: Title____How_to_sew
How can I make that just one underscore per gap?
You can combine multiple regexes with the operator |
For example, /(:|\s)+/ where \s matches any whitespace character
Adding the + at the end means that it will greedily swallow as many repetitions as it finds, per match.
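The same idea can be sketched in Python, where re.sub replaces every match without needing a global flag. The function name collapse_gaps is just for illustration, and the pattern only covers colons and whitespace, as in the example above.

```python
import re


def collapse_gaps(title: str) -> str:
    """Replace each run of colons and whitespace with one underscore."""
    # (?::|\s)+ is the alternation from the answer, written as a
    # non-capturing group and repeated so that a whole run is one match.
    return re.sub(r"(?::|\s)+", "_", title)


new_title = collapse_gaps("Title :: How to sew")
print(new_title)  # Title_How_to_sew
```

The asker's original class works the same way once it gets a +: [^a-zA-Z]+ treats every run of non-letters as a single gap.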
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
I am very new to Python. I am trying to write a regex that will find all instances of a period, a space, then a capital letter in a corpus.
I have this:
print (re.findall(r'(\.|\!|\?) (A-Z\w+\b)',text))
I got it to print when there was only one capital (i.e. I went to the movie.) but not when it's a capitalized word.
Thoughts?
You could use findall with this pattern:
(\.|!|\?) ([A-Z]\w+)
The word boundary is not needed here.
The alternation can be replaced by the character class [.!?], but that is not necessary.
A-Z is a range inside a character class, so it needs to be enclosed in square brackets [].
findall will return two elements per match: the punctuation and the alphanumeric string.
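To make the last point concrete, here is a small sketch with a made-up sample text. With two capture groups, re.findall returns a list of tuples; with the character-class variant and a single group, it returns a list of plain strings.

```python
import re

# Made-up sample text for illustration.
text = "It rained. Then we left! Was it late? yes. Maybe."

# Two groups: each result is a (punctuation, word) tuple.
pairs = re.findall(r"(\.|!|\?) ([A-Z]\w+)", text)
print(pairs)  # [('.', 'Then'), ('!', 'Was'), ('.', 'Maybe')]

# One group with the class [.!?]: each result is just the word.
words = re.findall(r"[.!?] ([A-Z]\w+)", text)
print(words)  # ['Then', 'Was', 'Maybe']
```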
This question already has answers here:
regex: required character in brackets
(3 answers)
Closed 4 years ago.
I am working on something and writing a regular expression to capture a string which is either (numbers and letters) or only numbers.
I know a regex for only numbers is [0-9] and alphanumeric is [A-Za-z0-9]. But this would capture even the strings which are only letters. How do I force it to not have only letters? Is there a way to do it in a single regex?
([0-9]*[a-zA-Z]*[0-9])+([a-zA-Z]*)
This should solve your problem.
You can test it here
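Here is a rough Python check of that pattern, assuming the whole string has to match (hence fullmatch, since the pattern has no anchors of its own). The second pattern is not from the answer: it is an alternative I am adding for comparison, which uses a lookahead to require at least one digit. The sample strings are made up.

```python
import re

# Pattern from the answer above.
ANSWER = re.compile(r"([0-9]*[a-zA-Z]*[0-9])+([a-zA-Z]*)")
# Alternative (my assumption, not from the answer): the lookahead
# requires a digit somewhere after any leading letters.
LOOKAHEAD = re.compile(r"(?=[A-Za-z]*[0-9])[A-Za-z0-9]+")

samples = ["123", "abc123", "a1b", "abc", "", "12-3"]

# fullmatch succeeds only if the entire string fits the pattern.
results = {s: bool(ANSWER.fullmatch(s)) for s in samples}
alt_results = {s: bool(LOOKAHEAD.fullmatch(s)) for s in samples}

for s in samples:
    print(repr(s), results[s], alt_results[s])
```

On these samples both patterns accept "123", "abc123" and "a1b", and reject "abc", the empty string and "12-3".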
This question already has answers here:
Regex plus vs star difference? [duplicate]
(9 answers)
RegEx match open tags except XHTML self-contained tags
(35 answers)
Closed 4 years ago.
I am trying to match the string "menu-item" only when it has a digit after it.
<li id="menu-item-578" class="menu-item menu-item-type-post_type menu-item-object-page menu-item-578">
I can use this regex
menu-item-[0-9]*
However, it matches every menu-item string. I want to match only the "menu-item-578" in the class attribute, not the one in id="menu-item-578".
How can I do it?
Thank you
You should avoid menu-item-[0-9]*, not because it matches the expected substring more than once, but because it goes beyond that: the * allows zero digits, so it also matches menu-item- in menu-item-one.
Besides replacing the quantifier with +, you have to check that the preceding character is not a non-whitespace character:
(?<!\S)menu-item-[0-9]+(?=["' ])
Or, if your regex flavor doesn't support lookarounds, you may want to do this, which may not be precise either:
[ ]menu-item-[0-9]+
You may also constrain the character that follows by using a stricter pattern:
[ ]menu-item-[0-9]+["' ]
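To compare the loose pattern from the question with the lookaround version, here is a short Python sketch run against the sample tag from the question. Python's re module supports this fixed-width lookbehind; other flavors may not.

```python
import re

html = ('<li id="menu-item-578" class="menu-item menu-item-type-post_type '
        'menu-item-object-page menu-item-578">')

# Loose pattern from the question: * allows zero digits, so it also
# picks up the bare "menu-item-" prefixes and the id attribute.
loose = re.findall(r"menu-item-[0-9]*", html)
print(loose)
# ['menu-item-578', 'menu-item-', 'menu-item-', 'menu-item-578']

# Lookaround version: not preceded by a non-whitespace character,
# at least one digit, followed by a quote or a space.
strict = re.findall(r"""(?<!\S)menu-item-[0-9]+(?=["' ])""", html)
print(strict)  # ['menu-item-578']
```

The id value is skipped because it is preceded by a double quote, which is a non-whitespace character.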
Try this, it works too:
(\s)(menu-item-)\d+
https://regex101.com/
\s matches any whitespace character.
Use a space before, like this:
\ menu-item-[0-9]*
The first occurrence has a " right before it, while the second one has a space.
EDIT: use an online regex editor (like Regex tester) to try these things.