enclosing regular expression in parentheses in Notepad++ [duplicate] - regex

I have a big list of items in Excel. I copied them to Notepad++ because it has regex built in.
An item could be AuAC21-XTS02L or BgUX20-C02S, etc. Basically, I want to replace these two with Au(AC21-XTS02)L and Bg(UX20-C02)S.
With the regular expression \D\D\d\d-(\D){1,3}\d\d I can find exactly the part of the text that I want to enclose in parentheses, but I don't know how to do the replacement.
I tried using (\D\D\d\d-(\D){1,3}\d\d) as the replacement, but then I just get something like Au(DDdd-D{1,3}dd)L.
Any help would be appreciated.

You can capture the whole match in a group and then replace it with ($1). Note that, depending on your Notepad++ version, you may need to use \ instead of $ to refer to a capture group (i.e. the replacement string would be (\1)).
Take a look at this Regex101 snippet: https://regex101.com/r/0e1Wcc/1
It will convert a sample input like,
it could be AuAC21-XTS02L or BgUX20-C02S etc.
it could be AuAC21-XTS02L or BgUX20-C02S etc.
it could be AuAC21-XTS02L or BgUX20-C02S etc.
into
it could be Au(AC21-XTS02)L or Bg(UX20-C02)S etc.
it could be Au(AC21-XTS02)L or Bg(UX20-C02)S etc.
it could be Au(AC21-XTS02)L or Bg(UX20-C02)S etc.
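As a rough cross-check outside Notepad++, the same whole-match replacement can be sketched in Python. This assumes Python's re module, where \g<0> in the replacement stands for the entire match; the function name wrap_codes is made up for illustration.

```python
import re

# Pattern from the question: two non-digits, two digits, a hyphen,
# one to three non-digits, then two digits.
CODE = re.compile(r"\D\D\d\d-\D{1,3}\d\d")


def wrap_codes(text: str) -> str:
    """Enclose every match of CODE in parentheses (hypothetical helper)."""
    # \g<0> refers to the whole match, so no capture group is needed.
    return CODE.sub(r"(\g<0>)", text)


if __name__ == "__main__":
    print(wrap_codes("it could be AuAC21-XTS02L or BgUX20-C02S etc."))
```

In Python the parentheses in the replacement string are literal, so they need no escaping here; Notepad++ may treat them differently.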

You don't need a capture group: match with the pattern \D\D\d\d-\D{1,3}\d\d and refer to the full match as $0 in the replacement, wrapped in escaped parentheses: \($0\)
The output will be
Au(AC21-XTS02)L or Bg(UX20-C02)S
Note that \D matches any character except a digit, so it could also match a space or a newline.
Looking at the example strings, a slightly more precise pattern (using the same replacement \($0\)) could be:
[A-Z][a-z]\K[A-Z]{2}\d\d-[A-Z0-9]{1,3}\d\d(?=[A-Z])
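For trying the stricter pattern outside Notepad++, here is a minimal Python sketch. Python's built-in re module does not support \K, so this version swaps it for a fixed-width lookbehind, which is a substitution on my part and not what the answer uses; the name wrap_precise is hypothetical.

```python
import re

# Same idea as the pattern above, but with (?<=...) instead of \K,
# because Python's re module has no \K.
PRECISE = re.compile(
    r"(?<=[A-Z][a-z])[A-Z]{2}\d\d-[A-Z0-9]{1,3}\d\d(?=[A-Z])"
)


def wrap_precise(text: str) -> str:
    """Wrap only codes that follow an 'Xx' prefix and precede a capital."""
    return PRECISE.sub(r"(\g<0>)", text)


if __name__ == "__main__":
    for sample in ("AuAC21-XTS02L", "BgUX20-C02S", "ab12-x 34"):
        print(sample, "->", wrap_precise(sample))
```

The last sample contains a space inside the code-like part; the looser \D-based pattern would match it, while this one leaves it alone.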

Notepad++ Regex Replace Makeshift Footnotes format With Proper Markdown format

In Word, I had to convert my footnotes to lines appearing at the end of each file to be able to make changes in formatting. A macro I found online used braces, and I ended up also using highlighting so I can easily see where my footnotes used to be. As a result, the following strings appear twice in my documents: once in the main text and once at the end of each document, as makeshift endnotes.
=={1}==
.
.
.
=={99}==
I want to be able to match those instances in the text and convert them to proper Markdown now. The problem is that the in-text format
[^1], [^2], etc.
differs from what needs to come at the bottom, which has a colon added:
[^1]:
etc.
So I'm guessing I'll have to live with replacing my old formatting with the new one including the colon everywhere, and deleting the colons from the in-text references individually while I edit/clean up my text in the future. Without the colon, the footnote definitions won't work.
My question is how to use a regex to match the one- or two-digit numbers enclosed in braces and equals signs.
This
==(\{d{1,2}\})==
did not work.
Also, as I am no pro, I would need the replacement as well. It probably will be
[^($1)]:
I reckon. Apparently, the equals sign doesn't have to be escaped.
Current format:
...some text...makeshift footnote in the format of
=={one- or two-digit number with no spaces in between}==
For example,
=={1}==
=={23}==
etc.
Desired result for all occurrences:
[^1]:
.
.
.
[^99]:
The Markdown format is single square brackets containing a caret and a number, plus a colon for the actual footnote text. The number usually goes up to 42-45 at most, but that doesn't matter; a regex for one- or two-digit numbers is what's needed. As I said, the colon will be needed in all instances.
Cheers
You have just a couple of errors in your regex: you forgot to escape the d for digit (it should be \d), and the capture group must not include the curly braces.
Use:
Ctrl+H
Find what: =={(\d{1,2})}==
Replace with: [^$1]:
TICK Wrap around
SELECT Regular expression
Replace all
Explanation:
=={ # literally
(\d{1,2}) # group 1, 1 or 2 digits
}== # literally
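The same find/replace pair can be sketched in Python for a quick check before running it in Notepad++. This assumes Python's re syntax, where the group is referenced as \1 instead of $1; the function name to_markdown_footnotes is hypothetical.

```python
import re

# =={ and }== are matched literally; group 1 holds the 1- or 2-digit number.
MARKER = re.compile(r"==\{(\d{1,2})\}==")


def to_markdown_footnotes(text: str) -> str:
    """Turn every =={N}== marker into [^N]: (colon always added)."""
    return MARKER.sub(r"[^\1]:", text)


if __name__ == "__main__":
    sample = "Some text=={1}== and more=={23}=="
    print(to_markdown_footnotes(sample))
```

Markers with three or more digits are left untouched, since the pattern only allows one or two.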

Regex - Skip characters to match

I'm having an issue with Regex.
I'm trying to match T0000001 (2, 3 and so on).
However, some of the lines it searches have what I can describe as positioners. These are shown as a question mark followed by 2 digits, such as ?21.
These positioners describe a new position if the document were to be printed off the website.
Example:
T123?214567
T?211234567
I need to disregard ?21 and match T1234567.
From what I can see, this is not possible.
I have looked everywhere and tried numerous attempts.
All we have to work from is the linked image. The creators can't even confirm which flavour of regex it is; they believe it's Python, but I'm unsure.
Update
Unfortunately, none of the patterns below have worked so far. I tested each one in the live system (rather than only in a regex tester, thinking it might behave differently), but they still didn't work.
There is no replace feature and, as mentioned before, I'm not sure if it is Python. I appreciate your help.
Do two regex operations
First do the regex replace to replace the positioners with an empty string.
(\?[0-9]{2})
Then do the regex match
T[0-9]{7}
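A minimal Python sketch of these two steps, assuming a replace operation is available (which the question suggests may not be the case in the actual tool); find_ids is a made-up helper name.

```python
import re

POSITIONER = re.compile(r"\?[0-9]{2}")  # a question mark plus two digits
TARGET = re.compile(r"T[0-9]{7}")       # T followed by seven digits


def find_ids(text: str) -> list[str]:
    """Strip positioners first, then collect every T-number."""
    cleaned = POSITIONER.sub("", text)
    return TARGET.findall(cleaned)


if __name__ == "__main__":
    print(find_ids("T123?214567\nT?211234567"))
```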
If there's only one occurrence of the 'positioners' in each match, something like this should work: (T.*?)\?\d{2}(.*)
This can be tested here: https://regex101.com/r/XhQXkh/2
Basically, match two capture groups before and after the '?21' sequence. You'll need to concatenate these two matches.
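The concatenation step might look like this in Python. It is only a sketch under the stated assumption of at most one positioner per line; strip_one_positioner is a hypothetical name.

```python
import re

# Group 1: everything from T up to the positioner; group 2: the rest.
SPLIT = re.compile(r"(T.*?)\?\d{2}(.*)")


def strip_one_positioner(line: str) -> str:
    """Join the text before and after a single ?NN positioner."""
    m = SPLIT.fullmatch(line)
    if m is None:
        # No positioner on this line: return it unchanged.
        return line
    return m.group(1) + m.group(2)


if __name__ == "__main__":
    for line in ("T123?214567", "T?211234567", "T1234567"):
        print(strip_one_positioner(line))
```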
First, match the ?21 and replace it with a distinctive character such as #:
\?21
Then you may try this regex to find what you want:
(T(?:\d{7}|[\#\d]{8}))\s
The target string is captured in group 1 (or \1).
Finally, replace # with ?21 or whatever you like.
A Python script may look like this:
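import re

ss = """T123?214567
T?211234567
T1234567
T1234434?21
T5435433"""
# Pattern for the positioner that gets swapped for a placeholder
rexpre = re.compile(r'\?21')
# T plus 7 digits, or T plus 8 characters made of digits and placeholders;
# the trailing \s means a final line without a newline is not matched
regx = re.compile(r'(T(?:\d{7}|[\#\d]{8}))\s')
for m in regx.findall(rexpre.sub('#', ss)):
    print(m)
print()
for m in regx.findall(rexpre.sub('#', ss)):
    # Put the positioner back into each match
    print(re.sub('#', r'?21', m))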
Output is
T123#4567
T#1234567
T1234567
T1234434#
T123?214567
T?211234567
T1234567
T1234434?21
If using replace functionality is an option for you, then this might be an approach to match T0000001 or T123?214567:
Capture a T followed by zero or more digits before the optional part in group 1: (T\d*)
Make the question mark followed by 2 digits optional: (?:\?\d{2})?
Capture one or more digits after it in group 2: (\d+)
Then in the replacement you could use group 1 followed by group 2: \1\2
Using word boundaries \b (or assertions for the start and the end of the line, ^ and $), this could look like:
\b(T\d*)(?:\?\d{2})?(\d+)\b
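A short Python sketch of this replacement, assuming the tool's flavour behaves like Python's re (which the question could not confirm); normalize is a hypothetical helper.

```python
import re

# Group 1: T plus leading digits; optional ?NN positioner; group 2: trailing digits.
PATTERN = re.compile(r"\b(T\d*)(?:\?\d{2})?(\d+)\b")


def normalize(text: str) -> str:
    """Drop a positioner that sits inside a T-number, keeping groups 1 and 2."""
    return PATTERN.sub(r"\1\2", text)


if __name__ == "__main__":
    print(normalize("T123?214567\nT?211234567\nT1234567"))
```

Because group 2 needs at least one digit after the positioner, a positioner at the very end of the number (as in T1234434?21) is not removed by this pattern.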
Is the below what you want?
Use RegExReplace with the multiline flag (m) and enable replacing all occurrences!
Pattern = (T\d*)\?\d{2}(\d*)
replace = $1$2

RegEx substract text from inside

I have an example string:
*DataFromAdHoc(cbgv)
I would like to extract by RegEx:
DataFromAdHoc
So far I have come up with something like this:
^[^#][^\(]+
Unfortunately, it does not give the result I want. Do you have any idea why it's not working?
The regex you tried, ^[^#][^\(]+, would match:
At the beginning of the string, one character that is not a #: ^[^#]
Then one or more characters up to the first opening parenthesis (I think you don't have to escape the parenthesis in a character class): [^\(]+
So this would match *DataFromAdHoc, including the *, because it is not a #.
What you could do is capture this part [^\(]+ in a group, like ([^(]+)
Then your regex would look like:
^[^#]([^(]+)
And the DataFromAdHoc would be in group 1.
Use ^\*(\w+)\(\w+\)$
It just gets everything between the * and the stuff in brackets.
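Since the question does not say which language is in use, here is a sketch in Python (an assumption) comparing the two group-based patterns suggested above on the example string. Both function names are made up for illustration.

```python
import re
from typing import Optional


def extract_loose(s: str) -> Optional[str]:
    """Skip one leading non-# character, capture up to the first '('."""
    m = re.match(r"^[^#]([^(]+)", s)
    return m.group(1) if m else None


def extract_strict(s: str) -> Optional[str]:
    """Require the exact shape *name(args) and capture the name."""
    m = re.match(r"^\*(\w+)\(\w+\)$", s)
    return m.group(1) if m else None


if __name__ == "__main__":
    sample = "*DataFromAdHoc(cbgv)"
    print(extract_loose(sample), extract_strict(sample))
```

The strict version rejects anything that does not start with * or lacks the parenthesised part, while the loose one accepts any first character other than #.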
Your answer may depend on which language you're running your regex in, please include that in your question.

How to do regular Expression in AutoIt Script

In an AutoIt script, I am unable to write a regular expression for the string below. The numbers in it always change.
Actual String = _WinWaitActivate("RX_IST2_AM [PID:942564 NPID:10991 SID:498702881] sbivvrwm060.dev.ib.tor.Test.com:30000","")
Here the PID, NPID and SID values keep changing, and everything else is constant.
What I have tried is:
_WinWaitActivate("RX_IST2_AM [PID:'([0-9]{1,6})' NPID:'([0-9]{1,5})' SID:'([0-9]{1,9})' sbivvrwm060.dev.ib.tor.Test.com:30000","")
Can someone please help me?
As stated in the documentation, you should add the prefix REGEXPTITLE: and surround everything with square brackets, but "escape" the literal square brackets inside, as well as the dots (.) and spaces ( ), with a backslash (\). Instead of [0-9] you might use \d. For example, pass "[REGEXPTITLE:RX_IST2_AM\ \[PID:(\d{1,6})\ NPID:(\d{1,5})\ SID:(\d{1,9})\] sbivvrwm060\.dev\.ib\.tor\.Test\.com:30000]" as the title parameter of the Win...(...) functions.
You can even omit the round brackets ((...)) but keep their content if you don't want to capture the content for further processing, as you would with StringRegExp(...) or StringRegExpReplace(...). With the _WinWaitActivate(...) function, capturing makes no sense anyway, as it only matches and does not replace or return anything from your regular expression.
According to regex101 both work, with the round brackets and without - you should always use a tool like this site to confirm that your expression is actually working for your input string.
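As an alternative to an online tester, the expression can be checked locally. The sketch below uses Python, not AutoIt, so it only verifies the pattern logic against the sample title; AutoIt's own regex engine and the REGEXPTITLE wrapper are not exercised here. Spaces are left unescaped because Python does not require escaping them.

```python
import re

# The inner pattern from the answer, without the [REGEXPTITLE:...] wrapper.
TITLE = re.compile(
    r"RX_IST2_AM \[PID:(\d{1,6}) NPID:(\d{1,5}) SID:(\d{1,9})\] "
    r"sbivvrwm060\.dev\.ib\.tor\.Test\.com:30000"
)

# Sample window title taken from the question.
sample_title = (
    "RX_IST2_AM [PID:942564 NPID:10991 SID:498702881] "
    "sbivvrwm060.dev.ib.tor.Test.com:30000"
)

match = TITLE.fullmatch(sample_title)
ids = match.groups() if match else None

if __name__ == "__main__":
    print(ids)
```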
Not familiar with AutoIt, but remember that the whole pattern has to match somewhere in your string before any group is captured. For example, (goat)s will NOT capture the word goat if your string is goat or goater, because there is no s after goat.
You have forgotten to add a ] in your regex, so your pattern doesn't match the string and the capture groups will not be extracted. Also, I'm not completely sold on the usage of '. Based on this page, you can do something like StringRegExp(yourstring, 'RX_IST2_AM \[PID:([0-9]{1,6}) NPID:([0-9]{1,5}) SID:([0-9]{1,9})\]', $STR_REGEXPARRAYGLOBALMATCH), with the literal square brackets escaped, and the three capture groups ($1, $2 and $3) would be your results respectively. But maybe your approach works too.

RegEx to match string between delimiters or at the beginning or end

I am processing a CSV file and want to search and replace strings as long as it is an exact match in the column. For example:
xxx,Apple,Green Apple,xxx,xxx
Apple,xxx,xxx,Apple,xxx
xxx,xxx,Fruit/Apple,xxx,Apple
I want to replace 'Apple' if it is the EXACT value in the column (if it is contained in text within another column, I do not want to replace). I cannot see how to do this with a single expression (maybe not possible?).
The desired output is:
xxx,GRAPE,Green Apple,xxx,xxx
GRAPE,xxx,xxx,GRAPE,xxx
xxx,xxx,Fruit/Apple,xxx,GRAPE
So the expression I want is: match the beginning of input OR a comma, followed by desired string, followed by a comma OR the end of input.
You cannot put ^ or $ in character classes, so I tried \A and \Z but that didn't work.
([\A,])Apple([\Z,])
This didn't work, sadly. Can I do this with one regular expression? Seems like this would be a common enough problem.
It will depend on your language, but if the one you use supports lookarounds, then you would use something like this:
(?<=,|^)Apple(?=,|$)
Replace with GRAPE.
Otherwise, you will have to put back the commas:
(^|,)Apple(,|$)
Or
(\A|,)Apple(,|\Z)
And replace with:
\1GRAPE\2
Or
$1GRAPE$2
Depending on what's supported.
The above are raw regex (and replacement) strings. Escape as necessary.
Note: The disadvantage of the latter solution is that it will not work on strings like:
xxx,Apple,Apple,xxx,xxx
Since the comma after the first Apple got consumed. You'd have to call the regex replacement at most twice if you have such cases.
Oh, and I forgot to mention, you can have some 'hybrids', since some languages have different levels of support for lookbehinds (in all of the below, ^ and \A, $ and \Z, \1 and $1 are interchangeable; I'm just not listing every variant so this doesn't get longer than it already is):
(?:(?<=,)|(?<=^))Apple(?=,|$)
The above is for flavors where lookbehinds cannot be of variable width. Replace with GRAPE.
(^|,)Apple(?=,|$)
And the above one is for flavors where lookaheads are supported but not lookbehinds. Replace with \1GRAPE.
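To make the difference concrete, here is a sketch in Python, chosen only as an example language since the question does not name one. Python's re module requires fixed-width lookbehinds, so the lookaround version is written as (?:^|(?<=,)), a small variation on the hybrid above. The function names are hypothetical.

```python
import re

# Capture-group version: consumes the surrounding commas.
CAPTURE = re.compile(r"(^|,)Apple(,|$)", re.MULTILINE)
# Lookaround version: asserts the delimiters without consuming them.
LOOKAROUND = re.compile(r"(?:^|(?<=,))Apple(?=,|$)", re.MULTILINE)


def replace_with_groups(text: str) -> str:
    """Replace exact-column Apple, putting the captured delimiters back."""
    return CAPTURE.sub(r"\1GRAPE\2", text)


def replace_with_lookarounds(text: str) -> str:
    """Replace exact-column Apple; delimiters are never part of the match."""
    return LOOKAROUND.sub("GRAPE", text)


if __name__ == "__main__":
    row = "xxx,Apple,Apple,xxx,xxx"
    print(replace_with_groups(row))       # misses the second Apple
    print(replace_with_lookarounds(row))  # replaces both
```

Running the capture-group version a second time on its own output picks up the remaining adjacent column, which is the "at most twice" point made above.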
This does as you wish:
Find what: (^|,)(?:Apple)(,|$)
Replace with: $1GRAPE$2
This works on regex101, in all flavors.
http://regex101.com/r/iP6dZ8
I wanted to share my original work-around (before the other answers), though it feels like more of a hack.
I simply prepend and append a comma on the string before doing the simpler:
/,Apple,/,GRAPE,/g
then cut off the first and last character.
PHP looks like:
$line = substr(preg_replace($search, $replace, ','.$line.','), 1, -1);
This still suffers from the problem of consecutive columns (e.g. ",Apple,Apple,").
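For comparison, the padding trick can be sketched in Python using plain string replacement instead of a regex (a substitution of technique, not a port of the PHP line). Doing the replacement twice is one possible way around the consecutive-column problem; replace_exact is a made-up name.

```python
def replace_exact(line: str, old: str = "Apple", new: str = "GRAPE") -> str:
    """Replace columns that equal `old` exactly, using comma padding."""
    padded = "," + line + ","
    target = "," + old + ","
    replacement = "," + new + ","
    # One pass skips every other column in a run like ",Apple,Apple,",
    # because the shared comma is consumed; a second pass catches the rest.
    for _ in range(2):
        padded = padded.replace(target, replacement)
    # Cut off the padding commas again.
    return padded[1:-1]


if __name__ == "__main__":
    print(replace_exact("xxx,Apple,Apple,xxx,xxx"))
```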