Regex to strip single-line comments and multi-line comments in Notepad++ [duplicate] - regex

This question already has answers here:
How to match c-style block comments in Notepad++ with a regex?
(2 answers)
Closed 9 years ago.
I want to strip comments like the following:
// comments
/******
comments
*******/
Is it possible to have a regex for them?

As the comments say, it's not possible to strip comments fully correctly with regexes. But the following regular expressions may still be good enough for you:
^\s*//.*$
/\*.*?\*/
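For anyone who wants to try the two patterns outside Notepad++, here is a minimal Python sketch. The sample text and the helper name strip_comments are made up for illustration, and re.DOTALL is assumed so that . can cross line breaks inside a block comment. Like the patterns themselves, it will mangle string literals that happen to contain // or /*.

```python
import re

# Whole-line "//" comments (first pattern above); MULTILINE makes ^ and $ per-line.
LINE_COMMENT = re.compile(r"^\s*//.*$", re.MULTILINE)
# "/* ... */" blocks (second pattern above); DOTALL lets "." match newlines.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_comments(source: str) -> str:
    """Remove block comments first, then whole-line // comments."""
    without_blocks = BLOCK_COMMENT.sub("", source)
    return LINE_COMMENT.sub("", without_blocks)


# Sample input modelled on the question; "int x = 1;" is an invented code line.
sample = "// comments\n/******\ncomments\n*******/\nint x = 1;\n"
print(strip_comments(sample))
```

The comment text is removed but the now-empty lines stay behind, so a second pass to drop blank lines may be needed.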

You can do this with a simple hack. Select Extended search mode and replace every \r\n with a character or character sequence that does not occur in your file and that .* can match. Then switch to Regular expression mode and run the replacement with the regular expression given by morja. Finally, switch back to Extended mode and replace the placeholder with \r\n.
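The three steps of this hack can be sketched in Python to see why it works. This is only an illustration, not something Notepad++ runs: the placeholder character \x01 and the function name are assumptions, and only the block-comment pattern is applied, because the ^...$ line-comment pattern no longer makes sense once the text has been joined into one line.

```python
import re

# Assumed placeholder: a control character that does not occur in the file.
PLACEHOLDER = "\x01"


def strip_block_comments_via_placeholder(text: str) -> str:
    # Step 1: replace every CRLF with the placeholder, giving one long line.
    flattened = text.replace("\r\n", PLACEHOLDER)
    # Step 2: apply the block-comment regex; "." happily matches the placeholder.
    stripped = re.sub(r"/\*.*?\*/", "", flattened)
    # Step 3: turn the placeholder back into CRLF.
    return stripped.replace(PLACEHOLDER, "\r\n")


# Invented sample with Windows line endings.
sample = "a = 1;\r\n/******\r\ncomments\r\n*******/\r\nb = 2;\r\n"
print(repr(strip_block_comments_via_placeholder(sample)))
```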

@Mohammad Currently you cannot do this (match across multiple lines) in Notepad++.
This is because newlines can only be matched in Extended search mode, while regular expressions are only available in Regular expression search mode.
You could, however, combine several steps to get what you want, as the other answers point out.

The easiest solution is not to use a regex in Notepad++ at all. Export the file as RTF (Plugins --> NppExport --> Export to RTF), then open it in Microsoft Word or another editor that supports searching by formatting. With that feature you can search for and replace only the green text, i.e. the highlighted comments.
I hope it helps.

Related

Notepad++ How to CUSTOM Regex this content [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
I have below text:
_pranay_:pranay:104.144.219.145:3128
_ridhoo:rihdonk:104.144.224.242:3128
_shintna_10:shinhana:104.144.235.149:3128
_waled_jr_:ismail:104.144.241.222:3128
Which represent USER:PASS:PROXY
now I want to use a regular expression to remove the USER and PASS and keep the proxy.
Output like:
104.144.219.145:3128
104.144.224.242:3128
104.144.235.149:3128
104.144.241.222:3128
I've tried my best, but all my attempts failed. I'm not that good at regex. I hope somebody can help me out. Thank you.
You can use https://regexr.com/ to play around with regex. For example, the expression below captures the part you want to remove:
([_a-zA-Z]+[0-9]*:)
Or try the expression below to match the parts you want to keep:
([\d+\.]*:[0-9]{4})
As I'm no regex-expert, there might be better expressions available. But I recommend the link I provided to learn some regex.
You can maybe simply replace,
^[^:\r\n]*:[^:\r\n]*:
with an empty string.
If you wish to simplify/modify/explore the expression, it's been explained on the top right panel of regex101.com. If you'd like, you can also watch in this link, how it would match against some sample inputs.
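To check the last pattern against the sample from the question, a small Python sketch like the one below can help. re.MULTILINE is assumed so that ^ anchors at the start of every line, which is how a line-based editor treats it. The variable names are illustrative.

```python
import re

# Pattern from the answer: two colon-terminated fields at the start of a line.
USER_PASS = re.compile(r"^[^:\r\n]*:[^:\r\n]*:", re.MULTILINE)

# Sample lines copied from the question (USER:PASS:PROXY).
lines = """_pranay_:pranay:104.144.219.145:3128
_ridhoo:rihdonk:104.144.224.242:3128
_shintna_10:shinhana:104.144.235.149:3128
_waled_jr_:ismail:104.144.241.222:3128"""

# Replacing the match with an empty string leaves only IP:PORT.
proxies = USER_PASS.sub("", lines)
print(proxies)
```

Because the character classes exclude the colon, the match stops after exactly two fields no matter which characters the user name or password contain.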

Regular Expression Replace on Notepad++ [duplicate]

This question already has answers here:
Notepad++ v4.2.2. regular expressions to match and replace all text between two tags
(2 answers)
Closed 3 years ago.
I need a regular expression to replace the value in XML tags. I need to find * and replace it with XXXXX. I made an attempt to do this but it's giving me "invalid regex".
<TAG>\('(.*?'\)</TAG>
// replace with:
<TAG>XXXXX</TAG>
I suspect that your actual starting content is something like this:
<TAG>some content here</TAG>
If you want to mask the content of such tags, you may try the following find and replace, in regex mode:
Find: <TAG>(.*?)</TAG>
Replace: <TAG>XXXXX</TAG>
Note that in general it is not desirable to manipulate nested content like XML/HTML using regex. But sometimes, e.g. when using tools like NPP, we are forced to do this. My answer should work fine assuming you are only targeting <TAG> elements which have no other child tags inside them.
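The same find and replace can be tried in Python as a quick sanity check. This is a sketch under the answer's assumption that <TAG> elements have no child tags. The <ROOT> wrapper and the second <TAG> are invented for the example.

```python
import re

# Lazy match of everything between an opening and closing <TAG>.
TAG_CONTENT = re.compile(r"<TAG>(.*?)</TAG>")

# Hypothetical input; only "<TAG>some content here</TAG>" comes from the answer.
xml = "<ROOT><TAG>some content here</TAG><TAG>other</TAG></ROOT>"

# Every <TAG> body is replaced by the fixed mask.
masked = TAG_CONTENT.sub("<TAG>XXXXX</TAG>", xml)
print(masked)
```

The lazy .*? matters here: a greedy .* would run from the first <TAG> to the last </TAG> on the line and mask both elements as one.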

Non-greedy regexp matching too much in pandoc-generated markdown file [duplicate]

This question already has answers here:
Regular expression to get text between square brackets including disparity?
(4 answers)
Closed 3 years ago.
The Problem
I'm trying to write a simple intermediary step in a Pandoc workflow. I have an original document in .docx which I'm converting to .md using the --track-changes switch (see Pandoc reader options for more information) to produce a markdown file which has MS word insertions/deletions/comments wrapped in span tags, e.g.
[Insertion text]{.insertion id="1" author="Jamie Bowman" date="2019-04-01T11:05:00Z"}
[Deletion text]{.deletion id="1" author="Jamie Bowman" date="2019-04-01T11:05:00Z"}
[Comment body]{.comment-start id="1" author="Jamie Bowman" date="2019-04-01T11:05:00Z"}[]{.comment-end id="1"}
I want to run a regexp find and replace on the markdown file which effectively 'accepts' insertions and deletions but leaves the comment spans. (This is so when I convert back to .docx, I have a clean .docx file with comments only.)
What I've tried
I have been able to accept all insertion spans and delete all deletion spans, but only when the body text does not carry across more than one line. My attempt at matching across new lines matches too much and I can't work out how to match the exact text only.
The following regexp matches almost all deletions which I can then replace with nothing:
Find: \[(.*?)\]{.deletion(.|\n)*?}
Replace:
Same for insertions which I can then use a backreference to retain the text but remove the span:
Find: \[(.*?)\]{.insertion(.|\n)*?}
Replace: $1
The patterns are matching too much, though, as you can see here.
Please let me know if anything is unclear. I've been working on this quite a bit today and it's difficult to explain the problem plainly! Thanks in advance.
The following regex should match the deletion fragments:
\[[^[]*?\]{\.deletion.*?}
The regex for the insertions is mostly the same, except that it needs a capturing group ([^[]*?) around the body text:
\[([^[]*?)\]{\.insertion.*?}
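Here is a rough Python sketch of how the two patterns behave on spans like those in the question, written with the capturing group closed as ([^[]*?) and the closing bracket escaped. The sample text, the shortened author name "A" and the variable names are invented. The idea is that [^[] can never cross an opening bracket, so a match cannot start at an earlier span and swallow text in between, while it can still cross line breaks inside the body.

```python
import re

# Deletion spans are removed entirely.
DELETION = re.compile(r"\[[^[]*?\]{\.deletion.*?}")
# Insertion spans keep their body text via group 1.
INSERTION = re.compile(r"\[([^[]*?)\]{\.insertion.*?}")

# Hypothetical markdown fragment in the shape pandoc --track-changes produces.
text = (
    'Keep [new words]{.insertion id="1" author="A" date="2019-04-01T11:05:00Z"} and '
    'drop [old words]{.deletion id="2" author="A" date="2019-04-01T11:05:00Z"} here. '
    '[Comment body]{.comment-start id="3" author="A" date="2019-04-01T11:05:00Z"}'
    '[]{.comment-end id="3"}'
)

# "Accept" the tracked changes: delete deletions, unwrap insertions.
accepted = INSERTION.sub(r"\1", DELETION.sub("", text))
print(accepted)

# A body that spans two lines is still unwrapped, since [^[] matches newlines.
multiline = INSERTION.sub(r"\1", '[first\nsecond]{.insertion id="4"}')
print(repr(multiline))
```

Comment spans are left alone because neither pattern matches .comment-start or .comment-end. Body text that itself contains a [ would not be matched by this sketch.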

How to use RegEx (in Notepad++) to remove unnecessary information? [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
I have opened a list of log files in Notepad++, and would like to use Regular Expressions to remove everything (on each line), which precedes the log file name itself (as attached).
If anybody can offer advice on how to go about removing the unnecessary information on each line (using RegEx) it would be greatly appreciated.
Kind Regards,
Davo
Try these steps:
Press Ctrl+H.
In Find what, type .*IN
In Replace with, type IN
Select the Regular expression search mode.
Click Replace All.
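The effect of these steps can be previewed in Python. Note that the original log lines were only shown in a screenshot, so the listing below is purely hypothetical: it assumes each line ends in a file name starting with IN and contains no other "IN" before it.

```python
import re

# Hypothetical lines standing in for the screenshot from the question.
listing = (
    "02/03/2015  10:15    1,024 INLOG_0001.txt\n"
    "02/03/2015  10:16    2,048 INLOG_0002.txt\n"
)

# Greedy ".*" runs up to the last "IN" on each line; the replacement puts "IN" back.
cleaned = re.sub(r".*IN", "IN", listing)
print(cleaned)
```

Because .* is greedy, a file name that contains a second "IN" later on would be cut at that later position, so it is worth checking a few lines before using Replace All.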
In this case, you don't even need to get fancy with search and replace: just use block selection and press Delete. Block selection works like normal selection, except that you hold down the ALT key while selecting.

Capture everything after one word [duplicate]

This question already has an answer here:
Learning Regular Expressions [closed]
(1 answer)
Closed 6 years ago.
I am trying to make a regular expression capture any words in the specific line after the word Attachment:
This question is for work, so it is not a homework or test question. I took the paragraph below as an example from www.regular-expressions.info. I majored in Psychology, not computing, so this is completely foreign to me. I've read the manuals for the last two days, and because this is going over my head, I don't know how to begin.
I have a task which involves me linking the attachments to a specific file with the same name saved in a folder (at least 500 attachments) on Adobe PDF. What I did before was to manually select the words and link it to a specific file in a folder, but it is tedious to do when they can go up to 500 attachments.
I was aware of an application plug-in called EVERMAP that you can download for Adobe to automatically link specific words to a specific file in a folder. However, it requires me to use regular expressions which again, I don't know how to use.
I will bold the words I want to capture in the paragraph below.
The repetition operator manual expand the match as far as they, and only come back if they must to satisfy the remainder.
Attachment: The repetition operator manual
The asterisk or star tells the engine to attempt to match the preceding token zero or more times. The plus tells the engine to attempt to match the preceding token once or more.
Attachment: Asterisk and stars engine
Attachment: (.+) should work in your case unless there are other exceptions to this rule. The regex simply tells the engine to capture one or more characters after the word Attachment:.
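As a minimal sketch of what that capture returns, here it is in Python on the paragraph from the question. Whether Evermap exposes captures the same way is not something this example can confirm.

```python
import re

# Sample text copied from the question.
text = (
    "The repetition operator manual expand the match as far as they, "
    "and only come back if they must to satisfy the remainder.\n"
    "Attachment: The repetition operator manual\n"
    "The asterisk or star tells the engine to attempt to match the preceding "
    "token zero or more times. The plus tells the engine to attempt to match "
    "the preceding token once or more.\n"
    "Attachment: Asterisk and stars engine"
)

# "." does not match a newline, so (.+) stops at the end of each line.
names = re.findall(r"Attachment: (.+)", text)
print(names)
```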
Like @Kevin said, the regex is simple. Use Attachment: (.+).
Maybe you are unsure how to apply the regex. I don't know the Evermap plugin, but you can copy all the text from the PDF into Sublime Text (a text editor with many extra features) and do the regex part there. Since you are not a programmer, the simplest approach is to remove the other, irrelevant lines in the same pass. So the regex will be:
`^\s*Attachment:\s*(.+)$|^(?!Attachment:).+$`
And replace it with:
`\1`
\1 is a backreference that holds the text captured by the first group, i.e. the part inside ( ).
In Sublime Text, open Find and Replace and apply the regex there. Don't forget to turn on regex mode.
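The same find and replace can be simulated in Python to see what is left afterwards. This is a sketch: re.MULTILINE is assumed so that ^ and $ work per line as they do in an editor, and the shortened sample lines are stand-ins for the real PDF text.

```python
import re

# First alternative captures the attachment name; second matches any other line.
PATTERN = re.compile(
    r"^\s*Attachment:\s*(.+)$|^(?!Attachment:).+$", re.MULTILINE
)

# Shortened stand-in for the text copied out of the PDF.
text = (
    "The repetition operator manual expand the match as far as they.\n"
    "Attachment: The repetition operator manual\n"
    "The asterisk or star tells the engine to attempt to match.\n"
    "Attachment: Asterisk and stars engine"
)

# Non-attachment lines have no group 1, so they are replaced by nothing.
replaced = PATTERN.sub(r"\1", text)
# The emptied lines remain as blank lines; filter them out to get the names.
names = [line for line in replaced.splitlines() if line]
print(names)
```

In the editor the emptied lines also stay behind as blank lines, so a final cleanup of blank lines is needed there too.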