How to use Regex to move everything onto one line in notepad++

I'm trying to figure out how to use Regex to merge the contents of my text file
(25 lines of data) into one line.
So far, I can get Notepad++ to find the lines I'm looking for by searching for (^), but I'm unsure what to replace it with.
Syntax-wise, I'm looking for the expression that essentially attaches the beginning of one line to the end of the previous one. Can anyone help? Thanks

Find \R and replace with empty string.
\R matches multiple line-break styles, including the most common ones, \r\n and \n.
Search Mode must be set to Regular expression.
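For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch. Python's built-in re module does not support \R, so the pattern below spells out the common line-break styles explicitly. The sample text and the name join_lines are made up for illustration.

```python
import re

# Stand-in for \R: Python's re module has no \R, so list the break styles.
# \r\n comes first so a Windows break is matched as one unit.
LINE_BREAK = re.compile(r"\r\n|\n|\r")

def join_lines(text: str, separator: str = "") -> str:
    """Replace every line break with `separator` (empty by default)."""
    return LINE_BREAK.sub(separator, text)

sample = "one\r\ntwo\nthree\rfour"
print(join_lines(sample))        # all four pieces on one line
print(join_lines(sample, " "))   # same, separated by spaces
```

Passing a space as the separator keeps words from running together, which is what the empty replacement does not do.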

Highlight the lines you want to join (or use Ctrl + A to select everything)
Choose Edit → Line Operations → Join Lines from the menu or press Ctrl + J.
It will put in spaces automatically if necessary to prevent words from getting stuck together.
As an alternative, you can:
Press Ctrl+H.
In Search Mode, pick Extended.
Find what: \r\n. Replace with: leave it empty.

^ is an anchor, which means it does not match characters (it matches the position after a \n, or the start of the string). So there is nothing to replace.
If you need to use regex (aelor's answer sounds good => +1), then search for
[\n\r]+
and replace with nothing or a space, according to your needs.
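To see why replacing ^ changes nothing, here is a small Python sketch. Python's re module is not the Notepad++ engine, so treat this only as an illustration of the anchor idea; the sample text is invented.

```python
import re

text = "alpha\nbeta\r\ngamma"

# ^ is zero-width: it matches positions, not characters, so replacing it
# with an empty string leaves the text untouched.
anchors_removed = re.sub(r"^", "", text, flags=re.MULTILINE)

# The line-break characters themselves are what has to be replaced.
joined = re.sub(r"[\n\r]+", " ", text)

print(anchors_removed == text)
print(joined)
```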

You can replace
[\r\n]+
with an empty string (or replace \n+ if you know your newlines are \n)

You can do the following:
Highlight all the lines you wish to join, then press Ctrl+J.

Related

NOTEPAD ++ List: How to put each word on new line

If this can be done with Notepad++, I'm sure it's something simple I'm overlooking. Or if there is another way, I'm all ears.
I have a list of 10,000 - 20,000 words. Each word is a single word. No spaces in any one word, but a single space between each and every word.
All the words are in one continuous run that wraps around. I would like to put each word on a new line all the way down my txt file. I need this because I need to append something to the front and back of each word. That I can do. But I do not have the 24 hours it's going to take to break each word onto its own line manually. Any ideas? Thanks!
Use the Replace function: search for a space and replace it with \n. Remember to select the Extended option.
I tried the Extended mode but it didn't work for me, so I tried Regular expression mode, which did. Here is the step-by-step process:
Press Ctrl+H on Windows (Find & Replace).
In the Find section, type: [ ]+ (there is a single space between the brackets)
In the Replace section, type: \n
Select the Regular expression option.
Finally, click Replace All.
It will automatically put each word on a new line.
Hope it will work for you as well!
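The same replacement can be sketched in Python for anyone scripting this instead of using the dialog. The word list and the quote characters used for wrapping are invented for the example.

```python
import re

words_on_one_line = "apple banana  cherry date"

# [ ]+ matches one or more spaces, so double spaces do not create empty lines.
one_word_per_line = re.sub(r"[ ]+", "\n", words_on_one_line.strip())

# Appending something to the front and back of each word, as the question
# mentions; the quote characters here are just an example.
wrapped = "\n".join(f'"{w}"' for w in one_word_per_line.split("\n"))

print(one_word_per_line)
print(wrapped)
```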

Remove everything before and after variable=int

I'm terrible at regex and need to remove everything from a large portion of text except for a certain variable declaration that occurs numerous times. I'd like to remove everything except for instances of mc_gross=anyint.
Generally we'd need to use "negative lookarounds" to find everything but a specified string. But these are fairly inefficient (although that's probably of little concern to you in this instance), and lookaround is not supported by all regex engines (not sure about notepad++, and even then probably depends on the version you're using).
If you're interested in learning about that approach, refer to How to negate specific word in regex?
But regardless, since you are using notepad++, I'd recommend selecting your target, then inverting the selection.
This will select each instance, allowing for optional white space either side of the '=' sign.
mc_gross\s*=\s*\d+
The following answer over on super user explains how to use bookmarks in notepad++ to achieve the "inverse selection":
https://superuser.com/questions/290247/how-to-delete-all-line-except-lines-containing-a-word-i-need
Substitute the regex they're using over there, with the one above.
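If scripting is an option, the "keep only the matches" idea can also be sketched in Python by collecting the matches instead of deleting what surrounds them. The sample text is invented; only the pattern comes from the answer above.

```python
import re

PATTERN = re.compile(r"mc_gross\s*=\s*\d+")

sample = "txn_id=9 mc_gross=12 fee=1\npayer=x\nmc_gross = 7&mc_gross=305"

# Instead of deleting everything around the matches, collect the matches.
matches = PATTERN.findall(sample)
print("\n".join(matches))
```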
You could do a regular expression replace of ^.*\b(mc_gross\s*=\s*\d+)\b.*$ with \1. That will remove everything other than the wanted text on each line. Note that on lines where the wanted text occurs two or more times, only one occurrence will be retained. In the search the ^.*\b matches from start-of-line to a word boundary before the wanted text; the \b.*$ matches everything from a word boundary after the wanted text until end of line; the round brackets capture the wanted text for the replacement text. If text such as abcmc_gross=13def should be matched and retained as mc_gross=13 then delete the \bs from the search.
To remove unwanted lines do a regular expression search for ^mc_gross\s*=\s*\d+$ from the Mark tab, tick Bookmark line and click Mark all. Then use Menu => Search => Bookmark => Remove unmarked lines.
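The two steps above can be approximated in Python as follows. This is a sketch with invented sample text, and Python's re module is not the Notepad++ engine, although this particular pattern behaves the same way here. Note how the line with two occurrences keeps only one of them, as the answer warns.

```python
import re

sample = "id=1 mc_gross=12 fee=2\nno match here\na mc_gross = 7 b mc_gross=9 c"

# Step 1: on each line that contains the text, keep only the captured part.
# The leading .* is greedy, so the LAST occurrence on a line survives.
step1 = re.sub(r"^.*\b(mc_gross\s*=\s*\d+)\b.*$", r"\1", sample, flags=re.MULTILINE)

# Step 2: drop lines that are not exactly the wanted text
# (the equivalent of "Remove unmarked lines").
keep = re.compile(r"mc_gross\s*=\s*\d+")
result = [line for line in step1.split("\n") if keep.fullmatch(line)]

print(result)
```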
Find what: [\s\S]*?(mc_gross=\d+|\Z)
Replace with: \1
Position the cursor at the start of the text then Replace All.
Add word boundaries \b around mc_gross=\d+ if you think it's necessary.

How to replace . in patterned strings with / in Visual Studio

I have lot of code in our solution like this:
Localization.Current.GetString("abc.def.gih.klm");
I want to replace it with:
Localization.Current.GetString("/abc/def/gih/klm");
the number of dots (.) is variable.
How can I do this in Visual Studio (2010)?
Edit: I want to replace strings in code (in VS 2010 editor), not when I run my application
Thank you very much
Misread your request.
If you press Ctrl+Shift+H and put this as your find string
{Localization\.Current\.GetString\("[A-Za-z\/]+}(\.)
Then put this as your replace with:
\1/
And then in find options tick use regular expressions.
This will find the first dot and replace it. Clicking find next will get the second one etc. You will have to keep doing a replace all until they are all done. Someone can probably improve that!
Try this in the "Replace in Files" Dialogue with "Use Regular expressions"
Find what:
{[^"]*"[^"]*}\.
If you want to be a bit more strict on the allowed characters between the quotes then try this
{[^"]*"[A-Za-z.]*}\.
This would allow only ASCII letters and dots between the quotes.
Replace with
\1/
It will find the first " in a row and replace the last dot before the next " with /.
The problem is that it replaces only the last occurrence of a dot within the first set of "" in each row, so you would have to run this a few times until you get the message "The text was not found".
Be careful if there is a wanted dot between the "": it will be replaced too.
EDIT
You can't use this in Visual Studio, as it has its own flavour of regex, not the one used in the .NET regex classes, and I don't think you can do lookbehind with it.
You can use this regex:
(?<=\("[\w.]+)\.
in the find and replace, replacing by /
Breaking it down:
Match a dot (the . at the end)
Which is preceded by (positive lookbehind) a bracket ( followed by a " and then any number of characters which are letters or a dot (dots don't need to be escaped in a group)
If you are sure that the text you want to replace only ever has the Localization.Current.GetString bit, then you could include that in the lookbehind of the regex:
(?<=Localization\.Current\.GetString\("[\w.]+)\.
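For comparison, here is how the same rewrite could be done in a script rather than in the editor. Python's re module only allows fixed-width lookbehind, so the pattern above cannot be copied verbatim; this sketch captures the quoted argument and rewrites it in a callback instead. The sample line and the function name dots_to_slashes are invented for illustration.

```python
import re

# Capture the call prefix, the dotted key, and the closing quote separately.
CALL = re.compile(r'(Localization\.Current\.GetString\(")([\w.]+)(")')

def dots_to_slashes(match):
    prefix, key, suffix = match.groups()
    # Leading slash plus one slash per dot, as in the question's target form.
    return f"{prefix}/{key.replace('.', '/')}{suffix}"

code = 'var s = Localization.Current.GetString("abc.def.gih.klm");'
converted = CALL.sub(dots_to_slashes, code)
print(converted)
```

Because the key class does not include "/", running the substitution a second time leaves already converted calls alone.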

Regex: remove lines not starting with a digit

I have been fighting this problem with the help of a RegEx cheat sheet, trying to figure out how to do this, but I give up... I have this lengthy file open in Notepad++ and would like to remove all lines that do not start with a digit (0..9). I would use the Find/Replace functionality of N++. I am only mentioning this as I am not sure which regex implementation N++ uses... Thank you
Example. From the following text:
1hello
foo
2world
bar
3!
I would like to extract
1hello
2world
3!
not:
1hello

2world

3!
by doing a find/replace on a regular expression.
You can clear up those lines with ^[^0-9].* but it will leave blank lines.
Notepad++ uses Scintilla, and also uses its regex engine to match those:
\r and \n are never matched because in Scintilla, regular expression searches are made line per line (stripped of end-of-line chars).
http://www.scintilla.org/SciTERegEx.html
To clear up those blank lines, the only way is to choose Extended mode and replace \n\n with \n. If the file is in Windows format, replace \r\n\r\n with \r\n.
[^0-9] is a regular expression that matches pretty much anything, except digits. If you say ^[^0-9] you "anchor" it to the start of the line, in most regular expression systems. If you want to include the rest of the line, use ^[^0-9].+.
^[^\d].* marks a whole line whose first character is not a digit. Check if there are really no whitespaces in front of the digits. Otherwise you'd have to use a different expression.
UPDATE:
You will have to do it in two steps. First empty the lines that do not start with a digit. Then remove the empty lines in Extended mode.
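The two-step idea can be sketched in Python using the question's sample text. This is only an approximation of what the editor does: Python searches the whole string rather than line by line, which is why the character class below has to exclude \n.

```python
import re

text = "1hello\nfoo\n2world\nbar\n3!"

# Step 1: empty every line whose first character is not a digit.
# \n is excluded from the class because, unlike a line-by-line search,
# Python's [^0-9] would also match a line break.
emptied = re.sub(r"^[^0-9\n].*$", "", text, flags=re.MULTILINE)

# Step 2: collapse the runs of line breaks left behind.
cleaned = re.sub(r"\n{2,}", "\n", emptied).strip("\n")

print(cleaned)
```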
One could also use the technique of bookmarking in Notepad++. I started benefiting from this feature (long time present but only more recently made somewhat more visible in the UI) not very long ago.
Simply bring up the find dialogue, type regex for lines not starting with digit ^\D.*$ and select Mark All. This will place blue circles, like marbles, in the left gutter - these are line bookmarks. Then just select from main menu Search -> Bookmark -> Remove bookmarked lines.
Bookmarks are cool, you could extract these lines by simply selecting to copy bookmarked lines, opening new document and pasting lines there. I sometimes use this technique when reviewing log files.
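In script form, the bookmark technique amounts to splitting the lines into "marked" and "unmarked" groups. The following Python sketch uses the pattern from this answer on an invented list of lines. In this sketch an empty line is not marked, because \D needs one character to match.

```python
import re

NOT_DIGIT_START = re.compile(r"^\D.*$")

lines = ["1hello", "foo", "", "2world", "bar", "3!"]

# "Mark All" + "Remove bookmarked lines": drop every line the pattern marks.
kept = [line for line in lines if not NOT_DIGIT_START.match(line)]

# "Copy bookmarked lines": the marked lines themselves.
marked = [line for line in lines if NOT_DIGIT_START.match(line)]

print(kept)
print(marked)
```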
I'm not sure what you are asking, but the regex for finding the lines with a digit at the beginning would be
^\d.*
You can remove all the lines that match the above, or alternatively keep all the lines that match this expression:
^[^\d].*

Find CRLF in Notepad++

How can I find/replace all CR/LF characters in Notepad++?
I am looking for something equivalent to the ^p special character in Microsoft Word.
[\r\n]+ should work too
Update, March 26th, 2012 (release date of Notepad++ 6.0):
OMG, it actually does work now!!!
Original answer 2008 (Notepad++ 4.x) - 2009-2010-2011 (Notepad++ 5.x)
Actually no, it does not seem to work with regexp...
But if you have Notepad++ 5.x, you can use the 'extended' search mode and look for \r\n. That does find all your CRLF.
(I realize this is the same answer as the others, but again, 'extended mode' is only available with Notepad++ 4.9, 5.x and later)
Since April 2009, you have a wiki article on the Notepad++ site on this topic:
"How To Replace Line Ends, thus changing the line layout".
(mentioned by georgiecasey in his/her answer below)
Some relevant extracts includes the following search processes:
Simple search (Ctrl+F), Search Mode = Normal
You can select an EOL in the editing window.
Just move the cursor to the end of the line, and type Shift+Right Arrow.
or, to select EOL with the mouse, start just at the line end and drag to the start of the next line; dragging to the right of the EOL won't work.
You can manually copy the EOL and paste it into the field for Unix files (LF-only).
Simple search (Ctrl+F), Search Mode = Extended
The "Extended" option shows \n and \r as characters that could be matched.
As with the Normal search mode, Notepad++ is looking for the exact character.
Searching for \r in a UNIX-format file will not find anything, but searching for \n will. Similarly, a Macintosh-format file will contain \r but not \n.
Simple search (Ctrl+F), Search Mode = Regular expression
Regular expressions use the characters ^ and $ to anchor the match string to the beginning or end of the line. For instance, searching for return;$ will find occurrences of "return;" that occur with no subsequent text on that same line. The anchor characters work identically in all file formats.
The '.' dot metacharacter does not match line endings.
[Tested in Notepad++ 5.8.5]: a regular expression search with an explicit \r or \n does not work (contrary to the Scintilla documentation).
Neither does a search on an explicit (pasted) LF, or on the (invisible) EOL characters placed in the field when an EOL is selected.
Advanced search (Ctrl+R) without regexp
Ctrl+M will insert something that matches newlines. They will be replaced by the replace string.
I recommend this method as the most reliable, unless you really need to use regex.
As an example, to remove every second newline in a double spaced file, enter Ctrl+M twice in the search string box, and once in the replace string box.
Advanced search (Ctrl+R) with Regexp.
Neither Ctrl+M, $ nor \r\n are matched.
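The Extended-mode notes above depend on knowing which end-of-line style a file uses. A rough way to check that outside the editor is to count the raw byte sequences, as in this Python sketch. The function name and the labels in the result are invented for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone LF, and lone CR sequences in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF (Windows)": crlf,
        "LF (Unix)": data.count(b"\n") - crlf,      # LF not preceded by CR
        "CR (old Mac)": data.count(b"\r") - crlf,   # CR not followed by LF
    }

print(count_line_endings(b"a\r\nb\r\nc"))
print(count_line_endings(b"a\nb\nc"))
```

For a real file, the bytes could come from open(path, "rb").read(), so that Python does not translate the line endings while reading.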
The same wiki also mentions the Hex editor alternative:
Type the new string at the beginning of the document.
Then select to view the document in Hex mode.
Select one of the new lines and hit Ctrl+H.
While you have the Replace dialog box up, select on the background the new replacement string and Ctrl+C copy it to paste it in the Replace with text input.
Then Replace or Replace All as you wish.
Note: the character selected for new line usually appears as 0a.
It may have a different value if the file is in Windows Format. In that case you can always go to Edit -> EOL Conversion -> Convert to Unix Format, and after the replacement switch it back and Edit -> EOL Conversion -> Convert to Windows Format.
It appears that this is a FAQ, and the resolution offered is:
Simple search (Ctrl+H) without regexp
You can turn on View/Show End of Line or View/Show All, and select the now visible newline characters. Then when you start the command some characters matching the newline character will be pasted into the search field. Matches will be replaced by the replace string, unlike in regex mode.
Note 1: If you select them with the mouse, start just before them and drag to the start of the next line. Dragging to the end of the line won't work.
Note 2: You can't copy and paste them into the field yourself.
Advanced search (Ctrl+R) without regexp
Ctrl+M will insert something that matches newlines. They will be replaced by the replace string.
On the Replace dialog, you want to set the search mode to "Extended". Normal or Regular Expression modes won't work.
Then just find "\r\n" (or just \n for unix files or just \r for mac format files), and set the replace to whatever you want.
I've not had much luck with \r\n regular expressions from the find/replace window.
However, this works in Notepad++ v4.1.2:
Use the "View | Show end of line" menu to enable display of end of line characters.
(Carriage return line feeds should show up as a single shaded CRLF 'character'.)
Select one of the CRLF 'characters' (put the cursor just in front of one, hold down the SHIFT key, and then press the RIGHT CURSOR key once).
Copy the CRLF character to the clipboard.
Make sure that you don't have the find or find/replace dialog open.
Open the find/replace dialog.
The 'Find what' field shows the contents of the clipboard: in this case the CRLF character - which shows up as 2 'box characters' (presumably it's an unprintable character?)
Ensure that the 'Regular expression' option is OFF.
Now you should be able to count, find, or replace as desired.
The way I found it to work is by using the Replace function, and using "\n", with the "Extended" mode. I'm using version 5.8.5.
In 2013, v6.13 or later, use:
Menu Edit → EOL Conversion → Windows Format.
To find any kind of a line break sequence use the following regex construct:
\R
To find and select consecutive line break sequences, add + after \R: \R+.
Make sure you turn on Regular expression mode.
It matches:
U+000D U+000A - CRLF sequence
U+000A - LINE FEED, LF
U+000B - LINE TABULATION, VT
U+000C - FORM FEED, FF
U+000D - CARRIAGE RETURN, CR
U+0085 - NEXT LINE, NEL
U+2028 - LINE SEPARATOR
U+2029 - PARAGRAPH SEPARATOR
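Python's re module has no \R, but the list above can be written out by hand. The sketch below is such an approximation, not the Notepad++ engine; the sample string is invented and contains one of each break type.

```python
import re

# CRLF first, so it is consumed as one break rather than two.
ANY_BREAK = re.compile("\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")

sample = "a\r\nb\nc\x0bd\x0ce\rf\x85g\u2028h\u2029i"
pieces = ANY_BREAK.split(sample)
print(pieces)

# Two consecutive CRLF sequences count as two breaks, not four.
print(len(ANY_BREAK.findall("x\r\n\r\ny")))
```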
Assuming it has a "regular expressions" search, look for \r\n. I prefer \r?\n, because some files don't use carriage returns.
EDIT: Thanks for the feedback, whoever voted this down. I have learned that... well, nothing, because you provided no feedback. Why is this wrong?
Use the advanced search option (Ctrl + R) and use the keyboard shortcut for CRLF (Ctrl + M) to insert a carriage return.
If you need to do a complex regexp replacement including \r\n, you can workaround the limitation by a three-step approach:
Replace all \r\n by a tag, let's say #GO# → Check 'Extended', replace \r\n by #GO#
Perform your regexp, example removing multiline ICON="*" from an html bookmarks → Check regexp, replace ICON=.[^"]+.> by >
Put back \r\n → Check 'Extended', replace #GO# by \r\n
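The three steps can be mirrored in Python to see what each one does. This is only an illustration: in Python the regex could match across line breaks directly, so the placeholder is not actually needed there. The HTML fragment is invented; the tag and the pattern are the ones from the steps above.

```python
import re

TAG = "#GO#"  # placeholder from the steps above; must not occur in the text

html = '<DT><A HREF="x" ICON="data:abc\r\ndef">Link</A>\r\n<DT>next'

step1 = html.replace("\r\n", TAG)                 # 1. hide the line breaks
step2 = re.sub(r'ICON=.[^"]+.>', ">", step1)      # 2. run the regex
step3 = step2.replace(TAG, "\r\n")                # 3. restore the line breaks

print(repr(step3))
```

The line break that sat inside the ICON value disappears together with the attribute, while the break between the two entries is restored.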
Go to View → Show Symbol → Show All Characters.
It worked for me.
Change this setting: Menu → View → Show Symbol → uncheck Show End of Line.
I opened the file in Notepad++ and did a replacement in a few steps:
Replace all "\r\n" with " \r\n"
Replace all "; \r\n" with "\r\n"
Replace all " \r\n" with " "
This puts all the breaks where they should be and removes those that are breaking up the file.
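Here is the same sequence as plain string replacements in Python, applied to an invented sample in which one statement is broken across two lines. Taken literally, the second step also drops the semicolon before each break it keeps, which the sketch reproduces as written.

```python
broken = "first statement;\r\nsecond statement\r\ncontinues here;\r\n"

step1 = broken.replace("\r\n", " \r\n")    # mark every break with a space
step2 = step1.replace("; \r\n", "\r\n")    # breaks after ';' stay as real breaks
step3 = step2.replace(" \r\n", " ")        # remaining marked breaks become a space

print(repr(step3))
```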
It worked for me.
I was totally unable to do this in Notepad++ v6.9.
I found it easy enough in Microsoft Word (2000).
Open the doc, go to Edit -> Replace.
Then at the bottom of the search box, click "More", then find the "Special" button, which offers several options. For DOS-style line endings, I used the "paragraph" one. This is a CR LF pair in Windows land.
Just find \r and leave the replace field blank so everything goes onto one line. Then do a find and replace on your delimiter (in my case a semicolon) and replace it with ;\n
:)
-T&C
To change a document of separate lines into a single line, with each line forming one entry in a comma separated list:
Press Ctrl+F to open the Find/Replace dialog.
Click the "Replace" tab.
Fill the "Find what" entry with "\r\n".
Fill the "Replace with" entry with "," or ", " (depending on preference).
Un-check the "Match whole word" checkbox (the important bit that eludes logic).
Check the "Extended" radio button.
Click the "Replace all" button.
These steps turn e.g.
foo bar
bar baz
baz foo
into:
foo bar,bar baz,baz foo
or: (depending on preference)
foo bar, bar baz, baz foo
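The same transformation as a short Python sketch, assuming the text uses \r\n line endings as in the steps above. The function name lines_to_csv is invented for illustration.

```python
def lines_to_csv(text: str, separator: str = ",") -> str:
    """Join CRLF-separated lines into one line with the given separator."""
    return text.replace("\r\n", separator)

sample = "foo bar\r\nbar baz\r\nbaz foo"
print(lines_to_csv(sample))
print(lines_to_csv(sample, ", "))
```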
Maybe you can use the TextFX plugin.
In TextFX, go to TextFX Edit → Delete Blank Lines.