Removing single letters except a combination - regex

I'm trying to remove all single-letter words except the combination 'r d'. To be clearer, here are some examples:
a ball -> ball
r something -> something
d someone -> someone
r d something -> r d something
r something d -> something
So far I have managed to remove single letters other than r or d, but this is not what I want. I want to keep only the combination (example 4). I use this:
\b(?!r|d)\w{1}\b
Any idea how to do it?
Edit: The regex engine supports lookbehinds.

You may capture the r d combination and use a backreference in the replacement pattern to restore that combination, and remove all other matches:
\b(r d)\b|\b\w\b\s*
See the regex demo (replace with $1 that will put the r d back into the result).
Details:
\b(r d)\b - a "whole word" r d that is captured into Group 1
| - or
\b\w\b\s* - a single whole word consisting of 1 letter/digit/underscore (\b\w\b), followed by zero or more whitespace characters (\s*, only there to remove the excess whitespace; it might not be necessary).
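As a rough illustration, here is how the capture-and-restore idea might look in Python (the question does not name its regex engine, so the language is an assumption). The helper name strip_single_letters and the final .strip() call are additions for this sketch, not part of the answer. It relies on Python 3.5+, where an unmatched group in the replacement is substituted as an empty string.

```python
import re

# Pattern from the answer: "r d" is captured in Group 1,
# any other one-character word is matched without being captured.
PATTERN = re.compile(r"\b(r d)\b|\b\w\b\s*")

def strip_single_letters(text: str) -> str:
    # \1 puts "r d" back; it is empty when the second alternative matched.
    # .strip() is an addition that drops whitespace left at the edges.
    return PATTERN.sub(r"\1", text).strip()

for sample in ["a ball", "r something", "d someone",
               "r d something", "r something d"]:
    print(f"{sample!r} -> {strip_single_letters(sample)!r}")
```

Without the .strip() call, the last example would keep a trailing space, because \s* only removes whitespace after the removed letter, not before it.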

Related

How to write regular expression to check if substring starts with range of characters?

Thanks for your help in advance.
I want to check whether the substring that follows a prefix starts with a character in a given range.
For example, I have the following strings with prefix 'abc_xyz$'
abc_xyz$Item Ledger_Entry_CT
abc_xyz$Purchase
To check whether the string after the prefix starts with G through R, I have written the following regular expression:
abc_xyz\$^[G-Rg-r].*
Unfortunately it does not help.
Here are use cases
abc_xyz$Item Ledger_Entry_CT --> should match since the first char in 'Item' falls within G through R
abc_xyz$Purchase --> should match since the first char in 'Purchase' falls within G through R
abc_xyz$Customer --> should NOT match since the first char in 'Customer' does not fall within G through R
abc_xyz$Sales --> should NOT match since the first char in 'Sales' does not fall within G through R
Any help?
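The ^ anchor asserts the start of the string, so it cannot match after the prefix; move it to the front of the pattern. You can use any of: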
^abc_xyz\$[G-Rg-r].*
^abc_xyz\$(?i:[g-r]).*
^abc_xyz\$(?i)[g-r].*
See the regex demo.
The pattern matches
^ - start of string
abc_xyz\$ - the fixed string abc_xyz$
[G-Rg-r] - G to R or g to r letters
.* - the rest of the line.
Note the (?i:[g-r]) inline modifier group makes the [g-r] pattern part case insensitive.
The (?i) part makes all the pattern parts to the right of it case insensitive.
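A quick way to check these patterns against the use cases from the question is a small Python sketch (Python is an assumption; the question does not say which engine is used). The function name starts_g_to_r is hypothetical. Only the [G-Rg-r] and (?i:[g-r]) variants are shown, because Python 3.11+ rejects a global (?i) flag that is not at the very start of the expression.

```python
import re

# Two of the variants from the answer; both should behave the same.
EXPLICIT_CLASS = re.compile(r"^abc_xyz\$[G-Rg-r].*")
SCOPED_FLAG = re.compile(r"^abc_xyz\$(?i:[g-r]).*")  # needs Python 3.6+

def starts_g_to_r(name: str) -> bool:
    # True when the first character after the prefix is in G..R (any case)
    return SCOPED_FLAG.match(name) is not None

for name in ["abc_xyz$Item Ledger_Entry_CT", "abc_xyz$Purchase",
             "abc_xyz$Customer", "abc_xyz$Sales"]:
    print(name, "->", starts_g_to_r(name))
```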

Need a Regex that includes all char after expression

I am trying to figure out a regex that includes all the characters after a pattern, but stops when another occurrence of the pattern appears, so that the matches do not overlap.
This is my current regex
[a-zA-Z]{2}\d{1}\s?\w?
The pattern is always 2 letter followed by a number like AE1 or BE3 but I need all the characters following the pattern.
So AE1 A E F should be a single match, but if another pattern occurs in the string, as in
AE1 A D BE1 A D C, the matches cannot overlap and there must be two separate matches.
So to clarify
AB3 D T B should be one match on the regex
ABC D A F DE3 D CD A
should have 2 matches, each with all the characters following it, because of the two-letter word and number.
How do I achieve this?
I'm not quite following the logic here, yet my guess would be that we might want something similar to this:
([A-Z]{2}\d\s([A-Z]+\s)+)|([A-Z]{3}\s([A-Z]+\s)+)
which allows two letters followed by a digit, or three letters, both followed by ([A-Z]+\s)+.
Demo
You have to consider where your pattern will start. What is different between AE1 A E F and BE1 A D C in AE1 A D BE1 A D C? You don't want to treat both the same way, so you have to separate them, and the only way to tell these two pieces apart is to determine which one sits at the start of the text.
In short, just adding ^ at the start of your pattern will solve the problem.
So your regex should look like this:
^[a-zA-Z]{2}\d{1}\s?\w?
Demo
What you want is to split the string at every position where your pattern starts, so that each extracted substring begins with a pattern match.
You may use
(?!^)(?=[a-zA-Z]{2}\d)
to split the string. Details:
(?!^) - not at the start of the string
(?=[a-zA-Z]{2}\d) - a location in the string that is immediately followed with 2 ASCII letters and any digit.
See the Scala demo:
val s = "ABC D A F DE3 D CD A"
val rx = """(?!^)(?=[a-zA-Z]{2}\d)"""
val results = s.split(rx).map(_.trim)
println(results.mkString(", "))
// => ABC D A F, DE3 D CD A
You can just use this regex:
(?i)\b[a-z]{2}\d\b(?:(?:(?!\b[a-z]{2}\d\b).)+\s?)?
Demo and explanations: https://regex101.com/r/DtFU8j/1/
It uses a negative lookahead (?!\b[a-z]{2}\d\b) to add the constraint that none of the characters matched after the initial pattern (?i)\b[a-z]{2}\d\b may start another occurrence of this pattern.
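For comparison with the Scala split above, the lookahead-based pattern can be tried with a short Python sketch (the language choice and the helper name split_into_chunks are assumptions made for illustration). Each match runs from one two-letters-plus-digit token up to, but not including, the next one.

```python
import re

# Pattern from the answer: a token such as AE1, then any characters
# that do not begin another token of the same shape.
CHUNK = re.compile(r"(?i)\b[a-z]{2}\d\b(?:(?:(?!\b[a-z]{2}\d\b).)+\s?)?")

def split_into_chunks(text: str) -> list:
    # .strip() removes the whitespace that separates adjacent chunks
    return [m.group(0).strip() for m in CHUNK.finditer(text)]

print(split_into_chunks("AE1 A D BE1 A D C"))
print(split_into_chunks("AB3 D T B"))
```

Unlike the split approach, this pattern only starts a chunk at a token of the form two letters plus a digit, so for the input ABC D A F DE3 D CD A it would return just the DE3 chunk and skip the leading ABC D A F.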

Capture word from regex match

I am working on a regex in perl, which identifies what I want it to: word final g (but not following an 'n') or k (but not following an 'r') that precedes word-initial g (but not l or r), word-initial k, or word-initial c (but not c preceding i, e, y, or h):
(((?<!n)g)|(?<!r)k)\s(g(?!l|r)|k|c(?!i|e|y|h));
However, I want it to capture the word that has the g or k at the end of it, so I tried something like this:
(^|\s.*(((?<!n)g)|(?<!r)k))\s(g(?!l|r)|k|c(?!i|e|y|h));
so that $1 captures the beginning of the line or a white space (to signify the beginning of a word) until the next white space before the g, k, or c (the end of the word). Perhaps this is a parentheses problem, but I'm not sure how to keep the grouping I have while also specifying where I want $1 to capture.
What about /(\S*(((?<!n)g)|(?<!r)k))\s(g(?!l|r)|k|c(?!i|e|y|h))/?
EDIT: Looking at it, it could use some clean up :D
/(\S*([^n]g|[^r]k))\s(g[^lr]|k|c[^ieyh])/
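To see what the first suggestion captures, here is a small sketch in Python rather than Perl (a substitution made only so the example is easy to run; the lookaround syntax used here is the same in both). The function name word_before_velar and the sample phrases are made up for illustration.

```python
import re

# First pattern from the answer: Group 1 is the whole word ending in g or k
WORD_BEFORE = re.compile(
    r"(\S*(((?<!n)g)|(?<!r)k))\s(g(?!l|r)|k|c(?!i|e|y|h))"
)

def word_before_velar(text: str):
    # Returns the captured word, or None when nothing matches
    m = WORD_BEFORE.search(text)
    return m.group(1) if m else None

for phrase in ["the dog got", "big clock", "long game", "park kit", "back city"]:
    print(f"{phrase!r} -> {word_before_velar(phrase)!r}")
```

One caveat about the cleaned-up variant: character classes such as [^n] and [^lr] consume a character, while the lookarounds do not, so the two patterns are not strictly equivalent. For example, [^n] can match the space before a one-letter word, and g[^lr] needs some character after the g.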

How to select a part of a string OR another with REGEXP in MATLAB

I've been trying to solve this problem for the last few days with no success. I have the following string:
comment = '#disabled, Fc = 200Hz'
What I need to do is: if there's the string 'disabled' it needs to be matched. Otherwise I need to match the number that comes before 'Hz'.
The closest solution I found so far was:
regexpi(comment,'\<#disabled\>|\w*Hz\>','match') ;
It will match the word '#disabled' or anything that comes before 'Hz'. The problem is that when it finds '#disabled' first, it also returns the result '200Hz'.
So I'm getting:
ans = '#disabled' '200Hz'
Summing up, I need to select only the 'disabled' part of a string if there is one, otherwise I need to get the number before 'Hz'.
Can someone give me a hand?
Suppose your input is:
comment = {'#disabled, Fc = 200Hz';
'Fc = 300Hz'}
The regular expression (match disabled if it follows #, otherwise match the digits if they are followed by Hz):
regexp(comment, '(?<=^#)disabled|\d+(?=Hz)','match','once')
Explaining it:
^# - match # at the beginning of the line
(?<=expr)disabled - match disabled if it follows expr
expr1 | expr2 - otherwise match expr2
\d+ - match 1 or more digits, equivalently [0-9]+
expr(?=Hz) - match expr only if followed by 'Hz'
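The same alternation can be tried outside MATLAB. The sketch below uses Python's re.search, which returns only the first match and so plays roughly the role of the 'once' option; the Python translation and the function name disabled_or_frequency are assumptions made for illustration.

```python
import re

# Same pattern as in the answer: "disabled" right after a leading "#",
# otherwise the digits that come directly before "Hz".
STATUS_OR_FREQ = re.compile(r"(?<=^#)disabled|\d+(?=Hz)")

def disabled_or_frequency(comment: str):
    # search() stops at the first match, scanning left to right
    m = STATUS_OR_FREQ.search(comment)
    return m.group(0) if m else None

for comment in ["#disabled, Fc = 200Hz", "Fc = 300Hz"]:
    print(f"{comment!r} -> {disabled_or_frequency(comment)!r}")
```

Because the scan goes left to right, a leading #disabled is found before 200Hz, which is what keeps the frequency out of the result for the first string.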

String formatting, with Regex - Remove given character, and insert newline

I have a file full of URLs in a weird format, characters separated by a space character.
h t t p : / / w w w . y o u t u b e . c o m / u s e r / A S D
h t t p : / / m o r c c . c o m / f r m / i n d . p h p ? t o p i c = 5 7 . 0
I would like to make it look like:
http://www.youtube.com/user/ASD
http://morcc.com/frm/ind.php?topic=57.0
I use Notepad++, and I think regex could take care of this problem for me; unfortunately, I don't know regex.
I want to remove the ' ' character (space) between the characters and keep the URLs one per line, so replacing \s with '' is not a solution, because it becomes a mess :/
I think I should also insert a \n BEFORE each occurrence of "http".
Can you not just replace a space ' ' with an empty string ''? Replacing \s is not working how you want because newlines are also matched.
If that doesn't work you could, as you say, replace \s with '' and then replace http with \nhttp.
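Both suggestions can be sketched in a few lines of Python, used here only because the question has no scripting language of its own. The function names and the SAMPLE variable are hypothetical; in Notepad++ itself the equivalent would be done through the Replace dialog.

```python
import re

SAMPLE = (
    "h t t p : / / w w w . y o u t u b e . c o m / u s e r / A S D\n"
    "h t t p : / / m o r c c . c o m / f r m / i n d . p h p ? t o p i c = 5 7 . 0"
)

def remove_spaces(text: str) -> str:
    # Option 1: delete only the literal space, so newlines survive
    return text.replace(" ", "")

def remove_whitespace_then_split(text: str) -> str:
    # Option 2: delete all whitespace (newlines included), then put a
    # newline back before every "http" except one at the very start
    collapsed = re.sub(r"\s", "", text)
    return re.sub(r"(?!^)http", r"\nhttp", collapsed)

print(remove_spaces(SAMPLE))
print(remove_whitespace_then_split(SAMPLE))
```

The second option would also break a line at any "http" that appears inside a URL, so the first option is the safer choice when the line breaks are already correct.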
Regex is fairly basic. Check out the examples page. The second example seems to have what you're looking for: http://www.regular-expressions.info/examples.html
EDIT: Also, I assume you know this, but just to be sure, regex itself will not do what you want. What language are you planning on using regex with, so that people can provide more detailed responses?
Regex reference page [Bookmark it ;)] - http://www.regular-expressions.info/reference.html