I have some tex files with \section{text} and \subsection{text}, etc. And I want to convert them to # text and ## text in markdown files using regular expressions in Notepad++. How can I achieve that?
Open the replace window with Ctrl + H. Check the radio button "Regular Expression" and search for:
\\section\{([^}]*)}
And replace with:
# \1
For subsections:
\\subsection\{([^}]*)}
## \1
What we're doing:
\\ is an escaped backslash matching the literal backslash of your expression
{ needs to be escaped as well, otherwise it would be recognized as a quantifier, hence \{
([^}]*) is a group made of 0 or more characters that are NOT }
\1 is a reference to the first and only group of our regular expression
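For anyone who wants to script the same two replacements outside Notepad++, here is a minimal sketch in Python. The function name and the sample text are made up for illustration; the patterns are the ones above, with the closing brace escaped as well for clarity.

```python
import re

def tex_to_markdown_headings(text):
    # Same two replacements as above, applied one after the other.
    # "\section" cannot match inside "\subsection" because the pattern
    # requires the backslash to sit directly before "section".
    text = re.sub(r"\\section\{([^}]*)\}", r"# \1", text)
    text = re.sub(r"\\subsection\{([^}]*)\}", r"## \1", text)
    return text

sample = "\\section{Intro}\n\\subsection{Scope}"
print(tex_to_markdown_headings(sample))
```

Deeper levels such as \subsubsection would need their own replacement in this form.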
It can be done in a single pass:
Ctrl+H
Find what: \\(sub)?section{([^}]*)}
Replace with: (?1#)# $2
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
Explanation:
\\ # a backslash, has to be escaped
(sub)? # group 1, literally "sub", optional
section{ # literally
([^}]*) # group 2, 0 or more any character that is not "}"
} # "}" character
Replacement:
(?1#) # conditional replace, if group 1 exists, print a "#"
# $2 # "#", a space and content of group 2
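The conditional replacement (?1#) is specific to Notepad++ (Boost). As a rough equivalent elsewhere, the sketch below does the same single pass in Python, where a callback plays the role of the conditional. The names HEADING and convert are hypothetical.

```python
import re

# Group 1 is the optional "sub", group 2 is the heading text.
HEADING = re.compile(r"\\(sub)?section\{([^}]*)\}")

def convert(text):
    # Emit "##" when group 1 took part in the match, "#" otherwise.
    return HEADING.sub(
        lambda m: ("##" if m.group(1) else "#") + " " + m.group(2),
        text,
    )
```

Python's re module has no conditional syntax in replacement strings, which is why a function is used here instead.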
This is my text
BROKEN This is a "sentence".
This sentence is an actual normal sentence.
I wish to replace/filter the quotation marks out of every line that has the word BROKEN in it
I thought this would be simple but I couldn't do it
my regex
(?=BROKEN)"
could I get some help?
If you also want to match double quotes before the word BROKEN, you can skip the whole line that does not contain the word.
Find what:
^(?!.*\bBROKEN\b).*\R?(*SKIP)(*F)|"
Replace with: (leave empty)
Explanation
^ Start of line
(?!.*\bBROKEN\b) Negative lookahead, assert that the word BROKEN does not occur
.*\R?(*SKIP)(*F) Match the whole line including an optional newline and skip the match
| Or
" Match a double quote
See a regex101 demo.
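The (*SKIP)(*F) verbs are not available in every engine. As a sketch of the same idea (leave lines without BROKEN untouched, strip every double quote from the others), here is a line-by-line version in Python. The function name is hypothetical.

```python
import re

def strip_quotes_on_broken_lines(text):
    # keepends=True preserves the original line endings.
    lines = text.splitlines(keepends=True)
    return "".join(
        # Quotes are removed anywhere on the line, before or after BROKEN.
        line.replace('"', "") if re.search(r"\bBROKEN\b", line) else line
        for line in lines
    )
```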
Ctrl+H
Find what: (?:^.*?\bBROKEN\b|\G(?!^))[^"\r\n]*\K"
Replace with: LEAVE EMPTY
TICK Match case
TICK Wrap around
SELECT Regular expression
UNTICK . matches newline
Replace all
Explanation:
(?: # non capture group
^ # beginning of line
.*? # 0 or more any character but newline
\bBROKEN\b # literally
| # OR
\G # restart from last match position
(?!^) # not at the beginning of line
) # end group
[^"\r\n]* # 0 or more any character that is not a quote or linebreak
\K # forget all we have seen until this position
" # quote
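Note that this pattern only removes quotes that come after the word BROKEN, since anything before it is consumed by .*? and kept. Python's re module supports neither \G nor \K, so the sketch below swaps in a callback to get the same effect. The function name is hypothetical.

```python
import re

def strip_quotes_after_broken(text):
    # Group 1: start of line up to and including the first BROKEN (kept as is).
    # Group 2: the rest of the line, where the quotes are removed.
    return re.sub(
        r"^(.*?\bBROKEN\b)(.*)$",
        lambda m: m.group(1) + m.group(2).replace('"', ""),
        text,
        flags=re.MULTILINE,
    )
```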
I am trying to find in the Notepad++ strings like this:
'',
And convert them into this:
'',
I've made a regular expression to crop the string beginning with cards/ and ending with </a>:
(cards/)([^\s]{1,50})(([\s\.\?\!\-\,])(\w{1,50}))+(\.mp3"></a>)
Or an alternative approach:
(cards/)([^\s]{1,50})([\s\.\?\!\-\,]{0,})([^\s]{1,50})
Both work fine for search, but I can't get the replacement.
The problem is that the number of words in a sentence may vary.
And I can't get the ID of sub-expressions in the double parentheses.
The following format of replacement: \1\2\3... doesn't work, as I can't get the correct ID of the sub-expressions in the double parentheses.
I tried to google the topic, but couldn't find anything. Any advice, link or best of all a full replacement expression will be very much appreciated.
This will replace all spaces after /cards/ with a hyphen and lowercase the filename.
Ctrl+H
Find what: (?:href="/mp3files/cards/|\G)\K(?!\.mp3)(\S+)(?:\h+|(\.mp3))
Replace with: \L$1(?2$2:-)
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
(?: # non capture group
href="/mp3files/cards/ # literally
| # OR
\G # restart from last match position
) # end group
\K # forget all we have seen until this position
(?!\.mp3) # negative lookahead, make sure we don't have ".mp3" after this position
(\S+) # group 1, 1 or more non spaces
(?: # non capture group
\h+ # 1 or more horizontal spaces
| # OR
(\.mp3) # group 2, literally ".mp3"
) # end group
Replacement:
\L$1 # lowercase content of group 1
(?2 # if group 2 exists (the extension .mp3)
$2 # use it
: # else
- # put a hyphen
) # endif
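Outside Notepad++ the same result can be had without \G and \K by capturing the whole file name and rewriting it in a callback. A minimal sketch in Python follows; the path prefix is taken from the pattern above, while the function name and the sample link in the usage note are assumptions, since the original example strings are not shown.

```python
import re

# Group 1: the href prefix, group 2: the file name, group 3: extension and closing quote.
LINK = re.compile(r'(href="/mp3files/cards/)([^"]*?)(\.mp3")')

def normalize_links(html):
    def fix(m):
        # Runs of spaces or tabs become one hyphen, then the name is lowercased.
        name = re.sub(r"[ \t]+", "-", m.group(2)).lower()
        return m.group(1) + name + m.group(3)
    return LINK.sub(fix, html)
```

With a hypothetical input such as <a href="/mp3files/cards/Hello World Again.mp3"></a>, the file name part becomes hello-world-again.mp3 and the rest of the tag is left alone.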
I'm trying to write a regex that matches strings as the following:
translate("some text here")
and
translate('some text here')
I've done that:
preg_match ('/translate\("(.*?)"\)*/', $line, $m)
but how do I extend it to handle single quotes as well? It should match both single and double quotes.
You could go for:
translate\( # translate( literally
(['"]) # capture a single/double quote to group 1
.+? # match anything except a newline lazily
\1 # up to the formerly captured quote
\) # and a closing parenthesis
See a demo for this approach on regex101.com.
In PHP this would be:
<?php
$regex = '~
translate\( # translate( literally
([\'"]) # capture a single/double quote to group 1
.+? # match anything except a newline lazily
\1 # up to the formerly captured quote
\) # and a closing parenthesis
~x';
if (preg_match($regex, $string)) {
// do sth. here
}
?>
Note that you do not need to escape both of the quotes in square brackets ([]); I have only done it for the Stack Overflow prettifier.
Bear in mind though, that this is rather error-prone (what about whitespaces, escaped quotes ?).
In the comments the discussion came up that you cannot say anything BUT the first captured group. Well, yes, you can (thanks to Obama here), the technique is called a tempered greedy token which can be achieved via lookarounds. Consider the following code:
translate\(
(['"])
(?:(?!\1).)*
\1
\)
It opens a non-capturing group with a negative lookahead that makes sure not to match the formerly captured group (a quote in this example).
This eradicates matches like translate("a"b"c"d") (see a demo here).
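To try the tempered greedy token outside PHP, here is a small sketch in Python using the same pattern, with one extra capturing group added around the quoted text so it can be returned. The names TRANSLATE and extract are hypothetical.

```python
import re

# Tempered greedy token: "." is consumed only while the next character
# is not the quote captured in group 1.
TRANSLATE = re.compile(r"""translate\((['"])((?:(?!\1).)*)\1\)""")

def extract(line):
    # Returns the quoted text, or None when the call does not match.
    m = TRANSLATE.search(line)
    return m.group(2) if m else None
```

Mismatched quotes such as translate("some text here') and nested unescaped quotes such as translate("a"b"c"d") are both rejected by this version.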
The final expression to match all given examples is:
translate\(
(['"])
(?:
.*?(?=\1\))
)
\1
\)
#translate\(
([\'"]) # capture quote char
((?:
(?!\1). # not a quote
| # or
\\\1 # escaped one
)* #
[^\\\\]?)\1 # match unescaped last quote char
\)#gx
Fiddle:
ok: translate("some text here")
ok: translate('some text here')
ok: translate('"some text here..."')
ok: translate("a\"b\"c\"d")
ok: translate("")
no: translate("a\"b"c\"d")
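The cases in the list above can also be checked with a slightly different formulation, sketched here in Python. This is not the exact pattern from the answer: it is a common alternative that accepts either a backslash followed by any character, or any character that is neither a backslash nor the opening quote. The names ESCAPED and is_valid are hypothetical.

```python
import re

ESCAPED = re.compile(
    r"""translate\((['"])((?:\\.|(?!\1)[^\\])*)\1\)"""
)

def is_valid(call):
    # fullmatch requires the whole string to be one translate(...) call.
    return ESCAPED.fullmatch(call) is not None

cases = [
    'translate("some text here")',
    "translate('some text here')",
    "translate('\"some text here...\"')",
    r'translate("a\"b\"c\"d")',
    'translate("")',
    r'translate("a\"b"c\"d")',
]
for c in cases:
    print("ok:" if is_valid(c) else "no:", c)
```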
You can alternate expression components using the pipe (|) like this:
preg_match ('/translate(\("(.*?)"\)|\(\'(.*?)\'\))/', $line, $m)
Edit: previous also matched translate("some text here'). This should work but you will have to escape the quotes in some languages.
I have a snippet of text from EDI X12. I am trying to find lines where a BBQ segment is followed by another BBQ segment. I want to replace all BBQ segments in the second line with BBB
Orig text
HI*BBR<0Y6D0Z1<D8<20190816~
HI*BBQ<05BC0ZZ<D8<20190806*BBQ<05BB0ZZ<D8<20190729*BBQ<06UM07Z<D8<20190729~
HI*BBQ<0JBL0ZZ<D8<20190809*BBQ<0J9N0ZZ<D8<20190816*BBQ<0KBS0ZZ<D8<20190816~
HI*BI<71<RD8<20190716-20190722~
Needs to become
HI*BBR<0Y6D0Z1<D8<20190816~
HI*BBQ<05BC0ZZ<D8<20190806*BBQ<05BB0ZZ<D8<20190729*BBQ<06UM07Z<D8<20190729~
HI*BBB<0JBL0ZZ<D8<20190809*BBB<0J9N0ZZ<D8<20190816*BBB<0KBS0ZZ<D8<20190816~
HI*BI<71<RD8<20190716-20190722~
This targets what I am looking for in capturing group 3, but how to replace BBQ with BBB within that group?
(^HI\*BBQ.+?~\r\n)(^HI\*)(BBQ.+?~\r\n)
Thanks for any ideas!
Ctrl+H
Find what: (?:^HI\*BBQ\b.+?~\RHI\*BB|\G(?!^).*?\bBB)\KQ\b
Replace with: B
CHECK Match case
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
Explanation:
(?: # non capture group
^ # beginning of line
HI\*BBQ\b # literally, then a word boundary
.+? # 1 or more any character but newline, not greedy
~ # a tilde
\R # any kind of linebreak
HI\*BB # literally
| # OR
\G # restart from last match position
(?!^) # not at the beginning of line
.*?\bBB # 0 or more any character but newline, not greedy, followed by a word boundary and BB
) # end group
\K # forget all we have seen until this position
Q\b # the letter Q, then a word boundary
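If the file is processed by a script rather than in Notepad++, the same rule can be written as a line-by-line pass. The sketch below is in Python and assumes lines are handled in pairs, as the regex does: a line starting with HI*BBQ is rewritten only when the line before it is an unchanged HI*BBQ line. The function name is hypothetical.

```python
import re

def rewrite_second_bbq(text):
    out = []
    prev_is_unchanged_bbq = False
    for line in text.splitlines():
        is_bbq = re.match(r"HI\*BBQ\b", line) is not None
        if is_bbq and prev_is_unchanged_bbq:
            # Second line of a pair: every BBQ segment becomes BBB.
            out.append(re.sub(r"\bBBQ\b", "BBB", line))
            prev_is_unchanged_bbq = False
        else:
            out.append(line)
            prev_is_unchanged_bbq = is_bbq
    return "\n".join(out)
```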
Using a regex in Notepad++ I am trying to replace 53 characters on a line with spaces:
Find: (^RS.{192})(.{53})(.{265})
Replace: \1(\x20){53}\3
It's replacing group \2 with " {53}" but what I want is 53 spaces.
How do you do this?
Replacement terms are not regular expressions, although they may use backreferences.
Just code 53 literal spaces:
Replace: \1<53 literal spaces>\3 (type the 53 spaces in place of the angle-bracket placeholder)
A bit tedious, but it works.
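A space is \s in the search pattern, so 53 of them are matched by \s{53}.
This applies to the Find field only; the Replace field does not expand quantifiers such as {53}.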
Assuming there are always RS and 192 characters before, and 265 characters after:
Ctrl+H
Find what: (?:^RS.{192}|\G)\K.(?=.{265,}$)
Replace with: (a single space)
check Wrap around
check Regular expression
UNCHECK . matches newline
Replace all
Explanation:
(?: # start non capture group
^ # beginning of line
RS # literally RS
.{192} # 192 any character
| # OR
\G # restart from last match position
) # end group
\K # forget all we've seen until this position
. # 1 any character
(?= # positive lookahead, zero-length assertion to make sure we have after:
.{265,} # at least 265 any characters
$ # end of line
) # end lookahead
Replacement:
(a single space) # the character to insert
Given a shorter line to illustrate:
RSabcdefghijklmnopqrstuvwxyz
Result for given example:
RSabcdefghij      qrstuvwxyz
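In a script, the padding can be computed instead of typed out. Here is a minimal sketch in Python; the function name and its parameters skip, width and tail are made up for illustration and stand for the 192, 53 and 265 of the question.

```python
import re

def blank_field(line, skip, width, tail):
    # Group 1: "RS" plus the characters to keep, group 2: the tail to keep.
    pattern = re.compile(r"^(RS.{%d}).{%d}(.{%d})" % (skip, width, tail))
    # The callback builds the run of spaces that a replacement string cannot express.
    return pattern.sub(lambda m: m.group(1) + " " * width + m.group(2), line)

# Shortened example from above: keep 10, blank 6, keep 10.
print(blank_field("RSabcdefghijklmnopqrstuvwxyz", 10, 6, 10))
```

For the real records the call would be blank_field(line, 192, 53, 265).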