Notepad++ highlighting anything between two square brackets - regex

I have a document containing a series of strings between hundreds of [] and I want to highlight the strings and copy the information into a spreadsheet.
I have tried the Find tool but cannot figure out the regular expression.
The final goal of this would be to be able to copy the information in one go into a new file, or highlight it and copy into an excel spreadsheet.
Text file something like:
>X_343435353.3 words like foo bar [Wanted text]
TGATGATGCCATGCTAGCCATCGACTAGCGACTAGCATCGACTAGCATCAGCTACGACTAGCATCGACTACGA
>XP_543857836.3 other information [Text that I want]
TAGCATCGACTAGCTACTACCTGAGCGAGAAATTTTGGCTATCGACATCGACTATCGAGCACAGCTAGGAATT
>NP_3843875938.2 interesting words [Third desired text]
ATCGCATAGCGCGCTTAGAAGGCCTTAGAGGCATCATCTATCGAGCGACGATATCGCGAGGCAGCGCTATACC
The output I desire is as follows:
Wanted text
Text that I want
Third desired text
I am not sure if it is possible to do this in Notepad++ or if I need a cmd/shell tool. I am using Windows 10. The thought was that it may be possible to highlight all of the desired text with a regex so that it can then be copied elsewhere.

To match just the text and not the brackets:
(?<=\[).*?(?=\])
To delete everything in a document and leave just the wanted text on each line:
Set the cursor at the start of the document.
Macro, Start recording.
Ctrl-F (Find), .*?\[, Select regular expression and . matches newline.
Click Find Next and close the dialog.
Delete the highlighted text.
Ctrl-F (Find), \], Select regular expression and . matches newline.
Click Find Next and close the dialog.
Hit Enter to replace the highlighted text with a line break.
Macro, Stop recording.
Macro, Run a macro multiple times, select Until end of file.
Click Run.
Result:
Wanted text
Text that I want
Third desired text
You'll need to delete the last bit after the final match (if any) once the macro completes.
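For anyone who would rather check the lookaround pattern outside Notepad++, here is a minimal sketch in Python. The function name `bracketed` and the shortened sequence lines are made up for illustration; only the regex comes from the answer above.

```python
import re

# Sample input modelled on the question (sequence lines shortened).
sample = (
    ">X_343435353.3 words like foo bar [Wanted text]\n"
    "TGATGATGCCATGCTAGCC\n"
    ">XP_543857836.3 other information [Text that I want]\n"
    "TAGCATCGACTAGCTACTA\n"
    ">NP_3843875938.2 interesting words [Third desired text]\n"
    "ATCGCATAGCGCGCTTAGA\n"
)

def bracketed(text):
    """Return every run of text that sits between '[' and ']'."""
    # (?<=\[) requires a '[' just before the match, (?=\]) a ']' just after,
    # so the brackets themselves are not part of the result.
    return re.findall(r"(?<=\[).*?(?=\])", text)

print("\n".join(bracketed(sample)))
# Wanted text
# Text that I want
# Third desired text
```

The printed lines can be pasted straight into a spreadsheet column.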

Maybe this expression,
.*\[(.*?)\][\s\S]+?([\r\n]|$)
with a replacement of $1\n also might work.
The expression is explained on the top right panel of this demo if you wish to explore/simplify/modify it.

This one is working fine for me ....
Find what: >.*?\[(.*?)\]\n.*
Replace with: $1
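The same find/replace can be tried in Python as a rough sketch. It assumes Unix line endings (a plain \n after the closing bracket), and the name `headers_to_labels` is hypothetical; Python writes the replacement as \1 where Notepad++ uses $1.

```python
import re

def headers_to_labels(text):
    """Replace each header line plus the sequence line after it
    with just the bracketed label from the header."""
    # ">.*?\[" skips the header up to the first '[', "(.*?)" captures the label,
    # "\]\n.*" consumes the closing bracket, the line break and the next line.
    return re.sub(r">.*?\[(.*?)\]\n.*", r"\1", text)

sample = (
    ">X_343435353.3 words like foo bar [Wanted text]\n"
    "TGATGATGCCATGCTAGCC\n"
    ">XP_543857836.3 other information [Text that I want]\n"
    "TAGCATCGACTAGCTACTA\n"
)

print(headers_to_labels(sample))
```

If the file has Windows line endings, the \n in the pattern would likely need to become \r\n.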

Related

Replace line breaks

I am using visual studio code for several things. Everything is working fine, but I cannot get one specific thing to work.
I need the ability to remove line breaks from the text.
Example:
first line
second line
Should become:
first linesecond line
Since a recent update it is possible to search for line breaks using ^$.
It is described here: https://github.com/Microsoft/vscode/pull/314
The problem I have is that when I use this for replacing, it does actually "add" to the line break and does not "replace" it.
The latest version of VS Code has a shortcut to join lines (some may say remove breaks) from selection: CTRL + J.
I found that (at least on Windows) the solution was to use search and replace with a regular expression. Search for $\n and replace with nothing to get rid of the newlines. Note that the newline character that we want to replace is placed after the end of line matcher ($).
@tripleonard's hint did not work for me (no shortcut key assigned), so I first pressed Ctrl+Shift+P to list all commands and then typed Join Lines.
I'm able to manage this with the search and replace tool and "Use Regular Expression" enabled. Search for the pattern \n$ and replace with $
In my case the shortcut in VS Code was not set. It took me a while to find out which command I was looking for. For others with the same problem, it is "Join Lines".
Turn on regex mode and find and replace.
Search for \n and replace with nothing.
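To see what that replacement does to the example from the question, here is a small Python sketch. The function name `join_lines` is made up, and the optional \r is an addition so that Windows line endings are removed as well.

```python
import re

def join_lines(text):
    """Remove every line break, joining all lines into one."""
    # \r?\n matches both Unix (\n) and Windows (\r\n) line endings.
    return re.sub(r"\r?\n", "", text)

print(join_lines("first line\nsecond line"))
# first linesecond line
```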
Select the new line, and press ctrl+D (and hold it).
Then press ctrl+h, you will be able to replace it with whatever you need.
On Mac, use cmd+a to select all lines. Then, use cmd+shift+p to open commands and type Join Line and click on it.
You can use \n to search for new lines
but while finding/searching,
the Use Regular Expression option should be enabled
Press ctrl+f or ctrl + h
Copy and paste the expression ^(\s)*$\n into the top input field.
Then click the .* (Use Regular Expression) icon; all blank lines are highlighted.
In the bottom input field, enter \n (one line break).
That is what each matched run of blank lines will be replaced with.
Then click the Replace or Replace All button.
https://bitcoden.com/answers/visual-studio-code-delete-all-blank-lines-regex
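As a rough illustration of what that blank-line pattern matches, here is a Python sketch. It is an assumption that the goal is to delete blank lines outright, so the replacement here is an empty string rather than \n; the name `drop_blank_lines` is hypothetical.

```python
import re

def drop_blank_lines(text):
    """Delete lines that are empty or contain only whitespace."""
    # re.MULTILINE lets ^ and $ match at every line, as the editor's search does.
    # The capture group from the original pattern is dropped; it is not needed here.
    return re.sub(r"^\s*$\n", "", text, flags=re.MULTILINE)

print(drop_blank_lines("one\n\n  \ntwo\n"))
# one
# two
```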

Notepad++ Regex inverse match

I'm new to Regex and trying to figure out how to remove all text from file open in Notepad++ that does not match #LCxxxx or #LAxxxx. Example below (text wanting to keep in bold):
1.In rare cases, reinstalling this MSP file can cause the Citrix Display Driver.....
[From ICAWS760WX86][#0528688]
30.This release includes an enhancement...
[From ICAWS760WX86022][#LA3014]
New Fixes in This Release
1.Windows Server 2008 R2 and Windows Server 2012 R2,...
[From ICAWS760WX86026][#LC2179]
Fixes from Replaced Hotfixes
1.If the Windows Remote Desktop Session Host....
[From ICAWS760WX86004][#LC1180]
I think this is what you're looking for:
(?:[\S\s]*?)(\#L[AC]\d{4})(?:.*)
Replace with:
$1\n
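A quick way to check this pattern outside Notepad++ is a Python sketch like the one below. The name `keep_codes` and the shortened sample are made up; the pattern is the one above with the redundant non-capturing groups and the escape on # removed.

```python
import re

def keep_codes(text):
    """Replace everything up to and including each #LAxxxx / #LCxxxx tag
    (and the rest of that line) with the tag on a line of its own."""
    return re.sub(r"[\S\s]*?(#L[AC]\d{4}).*", r"\1\n", text)

sample = (
    "1.In rare cases, reinstalling this MSP file...\n"
    "[From ICAWS760WX86][#0528688]\n"
    "30.This release includes an enhancement...\n"
    "[From ICAWS760WX86022][#LA3014]\n"
    "1.Windows Server 2008 R2 and Windows Server 2012 R2,...\n"
    "[From ICAWS760WX86026][#LC2179]\n"
    "1.If the Windows Remote Desktop Session Host....\n"
    "[From ICAWS760WX86004][#LC1180]\n"
)

# Text after the last tag is not matched, so whatever follows it stays in place.
print(keep_codes(sample).split())
# ['#LA3014', '#LC2179', '#LC1180']
```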
You could do a regular expression search and replace, searching for
(#L[AC]....)
where "dot matches newline" is NOT selected. Replace with
\r\n\1\r\n
That will put all the wanted pieces of text on a line on their own.
Next use the "Mark" tab in the find window. Select "Bookmark line", use the same search string as above (the capture brackets are not needed this time, but they are harmless and so can be left), and then click "Mark all". Now all the wanted lines are bookmarked. Use menu => Search => Bookmark => Remove unmarked lines.
There may be a way of doing it all in one go, but that would be a complex regular expression. The method above uses two simple steps.
remove all text from file open in Notepad++ that does not match #LCxxxx or #LAxxxx
^.*(\[#L[CA]\d+\])$|^.*$
DEMO
https://regex101.com/r/hO1aL8/2
Notepad++
Do a search and replace as described in the screenshot below:
Alternatively, if you want to get rid of the empty lines during the replace operation, use the regular expression below:
^[\S\s]+?(\[#L[CA]\d+\])$
\s : Whitespaces (\t,\r,\n ...)
\S : Any character except whitespaces.
Tested on Notepad++ 6.6.9

Multiline search replace with regexp in Eclipse

Eclipse regexp search works pretty well, so for example in search box I have this:
(?s)(myMethod.*?;)\}\);
Now I want to copy multiline text in the IDE and in replace box, for example I want to paste \1PASTE_MULTILINE_TEXT_HERE. However Eclipse does not allow me to directly copy-paste multiline text without manually inserting newline characters.
In Vim (Gvim, Macvim) it works perfectly well, keeping all the spaces; how can I do the same thing in Eclipse?
For searching multiple lines in Eclipse, you must use the 's' parameter in search expression:
(?s)someExpressionToMatchInAnyLine
To replace with a multi-line expression, use \R, e.g.:
line1\Rline2\Rline3
This will replace the matched exp with:
line1
line2
line3
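The effect of the (?s) flag can be sketched in Python, whose re module accepts the same inline flag. The sample source text and the name `find_call` are invented for illustration, and the \R replacement token is specific to Eclipse, so it is not reproduced here.

```python
import re

# Hypothetical source: a call to myMethod that spans two lines.
source = "foo(function () {\n    myMethod(1,\n        2);});\n"

def find_call(text, dotall=True):
    """Search for the question's pattern, with or without the (?s) flag."""
    prefix = "(?s)" if dotall else ""
    return re.search(prefix + r"(myMethod.*?;)\}\);", text)

# With (?s), '.' also matches line breaks, so the match can span lines.
print(find_call(source).group(1))

# Without it, '.' stops at the end of the line and nothing is found.
print(find_call(source, dotall=False))
```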
Generally, the approach I've taken to doing this sort of thing is to type out what I want to use as a replacement, select that, open up the Find/Replace dialog, and copy the contents of the Find text box. I proceed from there and paste what I copied into the Replace text box. There is still a little work to be done (removing backslashes from in front of regex special characters that don't apply in the Replace box), but it gives me a hand up.

Notepad++ replace two lines with other (10) lines in open documents

I want to replace two lines with other lines in Notepad++.
The main problem is that I am not able to copy all the lines which should be replaced. Only the first line is inserted in the "Replace with:" input field if I paste all lines in the field. It seems that the line break is not correctly copied.
Selecting the lines (with the line break) which should be inserted in the "Find what:" field is quite easy because I can select them in the document and simply hit "CTRL + H".
What to do? Please no solutions how it could work with command line tools.
Regards
Albeit a bit late for an answer, I think it's OK.
You cannot search for a multi-line string in Notepad++ using the normal search mode. Use the extended search mode instead.
You just have to escape the new lines. Better still, you can use Notepad++ itself to prepare the escaped text to be searched and replaced.
I assume you are using Windows text files, meaning the new line is represented by \r\n.
To achieve what you want:
1.
Create a new document and paste your multiline text to be replaced
Do a replace on it using the extended search mode. Find what: \r\n Replace With:\\r\\n
The result will be your "Find what" string.
2.
Create a new document and paste your multiline replacement text
Do a replace on it using the extended search mode. Find what: \r\n Replace With:\\r\\n
The result will be your "Replace with" string.
3.
Now that you have your escaped data, do a Replace on all the open documents using the extended search mode AND the results from the previous steps.
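The escaping done in steps 1 and 2 amounts to the following, shown as a small Python sketch. The name `to_extended` is made up, and Windows line endings are assumed, as in the answer.

```python
def to_extended(text):
    """Turn real Windows line breaks into the literal four characters
    backslash-r-backslash-n that Extended search mode expects."""
    return text.replace("\r\n", "\\r\\n")

block = "This is line 1\r\nThis is line 2\r\nThis is line 3"
print(to_extended(block))
# This is line 1\r\nThis is line 2\r\nThis is line 3
```

The printed single line is what would go into the "Find what" or "Replace with" field.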
Hope this helps.
None of these suggestions are acceptable! TextFX's Ctrl+R replace plugin falls way short.
What EVERYONE wants, everyone that wants to perform a replacement of multiline blocks of text with another multiline block of text, is this...
2 large text boxes:
Find This:
This is line one
This is LIKE two
This is line TREE
Replace with This:
This is line 1
This is line 2
This is line 3
A checkbox for "All Open Documents"
And/Or...
Option for "Find-Replace in all Files of Type"
Then a GO button............
How hard could that be to create in Notepad++? It was done way back in 1998, a freeware utility called Search-Replace 98.
UPDATE:
The plugin suggested by numediaweb DOES EXACTLY what I needed! Hats off to numediaweb for the tip and standing applause for paul at phdesign!
ToolBucket multi-line search plugin for Notepad++
http://www.phdesign.com.au/programming/toolbucket-multi-line-search-plugin-for-notepad/
ToolBucket contains the following features:
Multi-line search and replace dialog.
Change indentation dialog.
Generate GUID
Generate Lorem Ipsum
Compute MD5 Hash
Compute SHA1 Hash
Base 64 encode
Base 64 decode
Download
The latest version is available here:
https://github.com/phdesign/NppToolBucket/downloads
For regular expressions you can use Ctrl-R, aka TextFX -> TextFX Quick -> Find/Replace.
If not, check this plugin; it does what you want!
Based on the response of Nikanos Polykarpou below is my...
Notepad++ - Replace by a multiple lines string
Select the string to replace (can have multiple lines).
Follow...
Ctrl+h -> Replace (tab) -> Enable "Extended (\n, \r, \t, \0, \x...)"
... in "Replace with:" enter a string to do the replace with "\r\n" (if Windows) instead of real line breaks as this example...
"model" "models/aztec100500/flo_grass.mdl"\r\n"framerate" "10"\r\n"angles" "0 30 0"\r\n"classname" "cycler_sprite"
... do the replace!

Regex: remove lines not starting with a digit

I have been fighting this problem with the help of a regex cheat sheet, but I give up... I have this lengthy file open in Notepad++ and would like to remove all lines that do not start with a digit (0..9). I would use the Find/Replace functionality of N++. I am only mentioning this because I am not sure which regex implementation N++ uses... Thank you
Example. From the following text:
1hello
foo
2world
bar
3!
I would like to extract
1hello
2world
3!
not the same lines with blank lines left where the removed lines were:
1hello
2world
3!
by doing a find/replace on a regular expression.
You can clear those lines with ^[^0-9].* but it will leave blank lines.
Notepad++ uses Scintilla, including its regex engine, to match those.
\r and \n are never matched because in Scintilla, regular expression searches are made line per line (stripped of end-of-line chars).
http://www.scintilla.org/SciTERegEx.html
To clear up those blank lines, the only way is to choose Extended mode and replace \n\n with \n. If the file uses Windows line endings, replace \r\n\r\n with \r\n.
[^0-9] is a regular expression that matches pretty much anything, except digits. If you say ^[^0-9] you "anchor" it to the start of the line, in most regular expression systems. If you want to include the rest of the line, use ^[^0-9].+.
^[^\d].* marks a whole line whose first character is not a digit. Check if there are really no whitespaces in front of the digits. Otherwise you'd have to use a different expression.
UPDATE:
You will have to do it in two steps. First empty the lines that do not start with a digit. Then remove the empty lines in extended mode.
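The two steps can be sketched in Python as follows. This is only an illustration with a made-up function name, `keep_digit_lines`. Two details differ from the Notepad++ recipe and are assumptions: \n is excluded from the character class because Python's [^0-9] would otherwise match a line break, and runs of line breaks are collapsed in one pass instead of replacing \n\n repeatedly.

```python
import re

def keep_digit_lines(text):
    """Keep only the lines whose first character is a digit."""
    # Step 1: empty every line that starts with a non-digit character.
    emptied = re.sub(r"^[^0-9\n].*", "", text, flags=re.MULTILINE)
    # Step 2: collapse the blank lines left behind into single line breaks.
    return re.sub(r"\n{2,}", "\n", emptied)

print(keep_digit_lines("1hello\nfoo\n2world\nbar\n3!"))
# 1hello
# 2world
# 3!
```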
One could also use the technique of bookmarking in Notepad++. I started benefiting from this feature (long time present but only more recently made somewhat more visible in the UI) not very long ago.
Simply bring up the find dialogue, type regex for lines not starting with digit ^\D.*$ and select Mark All. This will place blue circles, like marbles, in the left gutter - these are line bookmarks. Then just select from main menu Search -> Bookmark -> Remove bookmarked lines.
Bookmarks are cool, you could extract these lines by simply selecting to copy bookmarked lines, opening new document and pasting lines there. I sometimes use this technique when reviewing log files.
I'm not sure what you are asking, but the regex for finding the lines with a digit at the beginning would be
^\d.*
You can remove all the lines that match the above, or alternatively keep all the lines that match this expression:
^[^\d].*