Notepad++ and delimiters: automatically replace ``string'' by \command{string} - regex

Within Notepad++, I want to replace many instances of the type ``string'' by \command{string} where string can be any string of characters. I am fairly close to what I want to achieve with:
Find: (?<=``)(.*?)(?='')
Replace: \\command{\1}
There is still a problem: with the regex above, instead of \command{string} I get ``\command{string}'', and I am not sure why the `` and '' are not removed.

This happens because you are using lookaround assertions. Lookarounds are zero-width: they only assert that a position can be matched and do not "consume" any characters of the string, so the `` and '' are never part of the match and are left in place. You can use the regular expression below instead.
Find: ``([^']+)''
Replace: \\command{\1}
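To see the difference outside Notepad++, here is a minimal sketch using Python's re module (an assumption on my part: Notepad++ uses its own regex engine, but for these two patterns the behaviour is the same). The sample text is only an illustration.

```python
import re

text = "run ``ls -l'' or ``ls -a''"

# Lookarounds only check the surrounding text; the `` and '' are not
# part of the match, so they survive the replacement.
with_lookaround = re.sub(r"(?<=``)(.*?)(?='')", r"\\command{\1}", text)

# Here the delimiters are consumed by the match and therefore replaced.
consumed = re.sub(r"``([^']+)''", r"\\command{\1}", text)

print(with_lookaround)
print(consumed)
```

The first call keeps the delimiters around each \command{...}, the second one drops them.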

You need to match the delimiters as well and wrap only the inner text in a capture group. Lookahead/lookbehind is not needed for this specific case anyway:
``([^']+)'' -> \\command{\1}
Because [^'] cannot cross a quote, this makes sure it does not match across two commands (the longest possible match) in something like:
run ``ls -l'' or ``ls -a''

Related

Search and convert to lower case on vim

I have code containing object.attribute, where attribute can be an array element (for example object.SIZE_OF_IMAGE[0]) or a simple string. I want to find all occurrences of "object.attribute" and replace each with self.lowercase(attribute), that is, self. followed by the attribute in lower case. I would like a regular expression in Vim to do that.
I can use :%s/object.*/self./gc and fix each match manually, but that is very slow.
Here are some examples:
object.SIZE to self.size
object.SIZE_OF_IMAGE[0] to self.size_of_image[0]
You basically just need two things:
Capture groups :help /\( let you store what's matched in between \(...\) and then reference it (via \1, \2, etc.) in the replacement (or even afterwards in the pattern itself).
The :help s/\L special replacement action that makes everything following lowercase.
This gives you the following command:
:%substitute/\<object\.\(\w\+\)/self.\L\1/g
Notes:
I've established a keyword start assertion (\<) at the beginning to avoid matching schlobject as well.
\w\+ matches letters, digits, and underscores (so it fulfills your example); various alternatives are possible here.
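If you ever need the same thing outside Vim, the capture-then-lowercase idea can be sketched in Python. Python's re has no \L, so this sketch passes a function as the replacement; the helper name to_self and the sample line are made up for illustration.

```python
import re

def to_self(match):
    # Lowercase only the captured attribute name.
    return "self." + match.group(1).lower()

source = "x = object.SIZE + object.SIZE_OF_IMAGE[0] + schlobject.KEEP"

# \b plays the role of Vim's \< here: it stops "schlobject" from matching.
result = re.sub(r"\bobject\.(\w+)", to_self, source)

print(result)
```

The [0] index is left alone because \w+ stops at the bracket.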
sed -E 's/object\.([^ \(]*)(.*)/self.lowercase(\1)\2/g' file_name.txt
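The above command assumes that your attribute is followed by a space or "(". Note that it inserts the literal text lowercase(...) around the attribute rather than changing its case.
You can tweak this command to suit your needs.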
Based on your comment above that the attribute part
"finishes by space or [ or (" you could match it with:
/object\.[^ [(]*
So, to replace it with self.attribute use a capturing
group and \L to make everything lowercase:
:%s/\vobject\.([^ [(]*)/self.\L\1/g
In command mode, try this:
:1,$ s/object.attribute/self.lowercase(attribute)/g

Regex - Skip characters to match

I'm having an issue with Regex.
I'm trying to match T0000001 (2, 3 and so on).
However, some of the lines it searches contain what I can describe as positioners. These are shown as a question mark followed by 2 digits, such as ?21.
These positioners describe a new position if the document were to be printed off the website.
Example:
T123?214567
T?211234567
I need to disregard ?21 and match T1234567.
From what I can see, this is not possible.
I have looked everywhere and tried numerous attempts.
All we have to work from is the linked image. The creators can't even confirm which flavour of regex it is; they believe it's Python, but I'm unsure.
Regex Image
Update
Unfortunately, none of the suggestions below have worked so far. I tested each one in the live system (rather than in a regex tester, thinking it might behave differently), but they still didn't work.
There is no replace feature, and as mentioned before I'm not sure if it is Python. Appreciate your help.
Do two regex operations
First do the regex replace to replace the positioners with an empty string.
(\?[0-9]{2})
Then do the regex match
T[0-9]{7}
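Assuming the flavour really is Python and you can run two steps, the idea can be sketched like this. The sample lines are taken from the question; the variable names are only for illustration.

```python
import re

lines = ["T123?214567", "T?211234567", "T1234567"]

positioner = re.compile(r"\?[0-9]{2}")  # step 1: the ?NN positioners
target = re.compile(r"T[0-9]{7}")       # step 2: the actual ID

found = []
for line in lines:
    cleaned = positioner.sub("", line)  # strip the positioner first
    match = target.search(cleaned)
    if match:
        found.append(match.group(0))

print(found)
```

All three sample lines end up as T1234567 once the positioner is stripped.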
If there's only one occurrence of the 'positioners' in each match, something like this should work: (T.*?)\?\d{2}(.*)
This can be tested here: https://regex101.com/r/XhQXkh/2
Basically, match two capture groups before and after the '?21' sequence. You'll need to concatenate these two matches.
First, match the ?21 and replace it with a distinctive character such as #:
\?21
and you may try this regex to find what you want
(T(?:\d{7}|[\#\d]{8}))\s
Here the target string is captured in group 1 (or \1).
Finally, replace # with ?21 or something you like.
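A Python script might look like this:
import re

ss = """T123?214567
T?211234567
T1234567
T1234434?21
T5435433"""

# Matches the literal positioner ?21 so it can be swapped for a placeholder.
rexpre = re.compile(r'\?21')
# T followed by 7 digits, or by 8 characters drawn from digits and '#'.
regx = re.compile(r'(T(?:\d{7}|[\#\d]{8}))\s')

# First pass: print the matches with the placeholder still in place.
for m in regx.findall(rexpre.sub('#', ss)):
    print(m)
print()
# Second pass: restore ?21 in each match before printing.
for m in regx.findall(rexpre.sub('#', ss)):
    print(re.sub('#', r'?21', m))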
Output is
T123#4567
T#1234567
T1234567
T1234434#
T123?214567
T?211234567
T1234567
T1234434?21
If using a replace functionality is an option for you then this might be an approach to match T0000001 or T123?214567:
Capture a T followed by zero or more digits before the optional part in group 1 (T\d*)
Make the question mark followed by 2 digits part optional (?:\?\d{2})?
Capture one or more digits after in group 2 (\d+).
Then in the replacement you could use group1group2 \1\2.
Using word boundaries \b (Or use assertions for the start and the end of the line ^ $) this could look like:
\b(T\d*)(?:\?\d{2})?(\d+)\b
Example Python
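A minimal Python sketch of this approach, assuming a replace step is available after all. The sample strings come from the question.

```python
import re

# Group 1: T plus any digits before the optional positioner.
# Group 2: the digits after it.
pattern = re.compile(r"\b(T\d*)(?:\?\d{2})?(\d+)\b")

samples = ["T0000001", "T123?214567", "T?211234567"]

# Joining group 1 and group 2 drops the positioner.
joined = [pattern.sub(r"\1\2", s) for s in samples]

print(joined)
```

A value without a positioner is left unchanged, because the optional part simply matches nothing.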
Is the below what you want?
Use RegExReplace with multiline tag (m) and enable replace all occurrences!
Pattern = (T\d*)\?\d{2}(\d*)
replace = $1$2

Notepad++ Regex to find group of lines with condition

Given this example text:
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ABB</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="FM"/>
...lines...
...........
</abr:rules>
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ADE</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="CM"/>
...lines...
...........
</abr:rules> (end of group)
I would like to find and remove everything from <abr:rules> to </abr:rules> when subapplication IS NOT "CM". Organization and application are always the same; <abr:code> can be any string.
What I tried so far is
<abr:rules>\n<abr:ruleTypeDefinition>\n<abr:code>[a-zA-Z0-9]{3,}<\/abr:code>\n<abr:ownership>\n<.*"(FM|PSD|SSC)"\/>\n(?s).*?\n<\/abr:rules>\n
which works but only because I know the other subapplication names.
Is there any way to do it with Regex only ?
Try the following find and replace:
Find:
<abr:rules>((?!subapplication=).)*subapplication="(?!CM")[^"]+"((?!</abr:rules>).)*</abr:rules>
Replace:
(empty string)
Note: The above pattern will only work if you enable dot in Notepad++ to match newlines. If you don't want to do that, then you may use [\S\s] instead of dot.
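To check the pattern outside Notepad++, here is a rough Python sketch. Two assumptions: re.DOTALL stands in for the "dot matches newline" option, and the sample is a shortened version of the text in the question.

```python
import re

pattern = re.compile(
    r'<abr:rules>((?!subapplication=).)*subapplication="(?!CM")[^"]+"'
    r'((?!</abr:rules>).)*</abr:rules>',
    re.DOTALL,  # lets the dot match newlines as well
)

sample = (
    '<abr:rules>\n<abr:code>ABB</abr:code>\n'
    '<abr:owner organization="NT" application="DCS" subapplication="FM"/>\n'
    '</abr:rules>\n'
    '<abr:rules>\n<abr:code>ADE</abr:code>\n'
    '<abr:owner organization="NT" application="DCS" subapplication="CM"/>\n'
    '</abr:rules>\n'
)

# Replace every non-CM block with an empty string.
kept = pattern.sub("", sample)

print(kept)
```

Only the block whose subapplication is "CM" remains. The newline that followed each removed block is left behind, so you may want to clean up blank lines afterwards.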
You should not use regex to process XML; you can read why here:
https://stackoverflow.com/a/1732454/3763374
Instead, use an XML parser, for example one that supports XPath queries.

How to select text between greater than and less than with an additional slash

I'm trying to select the text between > and </. In the example below I want "text":
>text</
but I'm unable to do so.
I tried the following, but it doesn't like the slash at the end of the regex:
\>(.*?)\<\
I'm trying to do this in TextPad. How is this supposed to be done?
Ultimately I want to delete all text between these two delimiters, so that I'm left with something like: <element></element>
RegEx wise, you can use 3 groupings and for the replace only use the first and 3rd group: \1\3.
Find: (>)(.*)(</)
Replace: \1\3
Try doing:
\>(.*?)\<\/
The regex you were trying would actually have given an error, because it ends with a \ and nothing after it.
You are close. Use the following:
(>).*?(<\/)
And replace with \1\2
OR
You can use lookbehind and lookaheads:
(?<=>)(.*?)(?=<\/)
And replace with '' (empty string)
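Both variants can be tried quickly in Python (an assumption: TextPad's engine may differ in details, so treat this only as a sketch of the idea). The sample is a single element, as in the question.

```python
import re

line = "<element>text</element>"

# Variant 1: capture the delimiters and put only them back.
with_groups = re.sub(r"(>).*?(</)", r"\1\2", line)

# Variant 2: lookarounds leave the delimiters out of the match,
# so replacing with an empty string removes only the inner text.
with_lookarounds = re.sub(r"(?<=>).*?(?=</)", "", line)

print(with_groups)
print(with_lookarounds)
```

Both calls give the same result for this input.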

VIM - Replace based on a search regex

I've got a file with several (1000+) records like :
lbc3.*'
ssa2.*'
lie1.*'
sld0.*'
ssdasd.*'
I can find them all with:
/s[w|l].*[0-9].*$
What I want to do is replace the final part of each match with \.*'
I can't do :%s//s[w|l].*[0-9].*$/\\\\\.\*' because it'll replace the whole string, and what I need is to replace only the end of it, from
.'
to
\.'
So the file output looks like:
lbc3\\.*'
ssa2\\.*'
lie1\\.*'
sld0\\.*'
ssdasd\\.*'
Thanks.
In general, the solution is to use a capture. Put \(...\) around the part of the regex that matches what you want to keep, and use \1 to include whatever matched that part of the regex in the replacement string:
s/\(s[w|l].*[0-9].*\)\.\*'$/\1\\.*'/
Since you're really just inserting a backslash between two strings that you aren't changing, you could use a second set of parens and \2 for the second one:
s/\(s[w|l].*[0-9].*\)\(\.\*'\)$/\1\\\2/
Alternatively, you could use \zs and \ze to delimit just the part of the string you want to replace:
s/s[w|l].*[0-9].*\zs\ze\.\*'$/\\/
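The two-capture version translates to other tools as well. Here is a small Python sketch of it; note that the pattern is a simplified stand-in of my own (^(\w+) instead of the search pattern from the question), and the records are copied from the sample.

```python
import re

records = ["lbc3.*'", "ssa2.*'", "sld0.*'"]

# Group 1: the part to keep at the start. Group 2: the trailing .*'
pattern = re.compile(r"^(\w+)(\.\*')$")

# \1 and \2 put both captures back; \\ inserts one backslash between them.
escaped = [pattern.sub(r"\1\\\2", record) for record in records]

for record in escaped:
    print(record)
```

Each record comes out with a single backslash inserted before the .*' suffix.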