Notepad++ matching end of line in regexp - regex

I want to transform this
a
b
b
into this
a
b
b
The number of empty lines is variable and can be huge, and the empty lines may contain spaces. I want to use a regexp like \r\n( *\r\n)+, but Notepad++ does not seem to accept those special characters in a regexp. I also tried \\r\\n( *\\r\\n)+ without success.

You can do 'Replace All' multiple times on
\r\n\r\n -> \r\n
That's with the 'Extended' search mode selected, not 'Regular expression'.
If the empty lines contain spaces, first replace every whitespace-only line with nothing using the regex ^\s+$ -> ''. Then do the extended replacement above.
Alternatively:
You can also replace every \r\n with a sequence of characters that doesn't exist in the document, e.g. ###, then apply the regex replacement '###(\s*###)+' -> '###', and finally replace the sequence ('###') back with \r\n.
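Outside Notepad++, the asker's pattern can be tried directly. The following is only a minimal sketch in Python, whose `re` engine is not the one Notepad++ uses; the function name `collapse_blank_lines` and the sample text are made up for illustration, and Windows (\r\n) line endings are assumed.

```python
import re

# Pattern from the question: a line break followed by one or more lines
# that hold only spaces (or nothing) before their own line break.
BLANK_RUN = re.compile(r"\r\n(?: *\r\n)+")

def collapse_blank_lines(text: str) -> str:
    """Replace each run of empty or space-only lines with one line break."""
    return BLANK_RUN.sub("\r\n", text)

sample = "a\r\n   \r\n\r\nb\r\n \r\nb"
print(repr(collapse_blank_lines(sample)))  # 'a\r\nb\r\nb'
```

Because the whole run is matched at once, a single pass is enough and no repeated 'Replace All' is needed.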

Related

Regex remove everything else except one number

I have a string consisting of multiple lines (output from PowerShell) like
....
junk line
junk line
MyVaraible=xxxx
junk line
junk line
....
I need to use one regex to get rid of all the junk lines and extract the variable value.
It would be very easy if I could loop through all the lines, where I could just do
"MyVaraible=(\d+)" replace with "$1"
But I'm severely restricted by this ancient system, where a single regex replacement is all I am allowed to do.
You may use this regex-based replacement.
Search using this regex:
/^\w+=(.*)$|.*(\n|\z)/
Replace using back-reference:
$1
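The regex ^\w+=(.*)$|.*(\n|\z) matches either a name-value pair separated by = or a full line followed by a line break or the end of the string. It needs multiline mode so that ^ and $ match at line boundaries. Junk lines match the second alternative, where group 1 is empty, so they are replaced with nothing.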
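As a rough check of this single-replacement approach, here is a minimal Python sketch. It assumes Python 3.5 or later (where an unmatched group in the replacement expands to an empty string) and \n line endings; `extract_value` and the sample value 1234 are hypothetical, and `\Z` is used because that is how Python's `re` spells the end-of-string anchor written as `\z` above.

```python
import re

# re.MULTILINE lets ^ and $ match at every line boundary.
EXTRACT = re.compile(r"^\w+=(.*)$|.*(\n|\Z)", re.MULTILINE)

def extract_value(output: str) -> str:
    # Junk lines match the second alternative, where group 1 is unset
    # and therefore expands to an empty string in the replacement.
    return EXTRACT.sub(r"\1", output)

sample = "junk line\njunk line\nMyVaraible=1234\njunk line\njunk line\n"
print(extract_value(sample))  # 1234
```

Note that any other line of the form name=value would also survive, since the first alternative is not tied to one particular variable name.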

regex match file with multiple extension

I have several strings like this
XYZ_TEST_2017.txt
ASD_TEST_2017.txt.tmp
I need to extract only those strings ending with .txt
So I'm using this regex:
[A-Z]{3}_TEST_[0-9]{4}.txt
However I still get the strings with multiple extensions like the second one (.txt.tmp)
How can I handle it?
To have your regex match everything up to the end, append an "end-of-text marker" ($) to your pattern like this:
[A-Z]{3}_TEST_[0-9]{4}\.txt$
As you may have noticed, I also escaped the dot, otherwise this filename would match as well:
SOM_TEST_1234Etxt
An unescaped dot (.) matches any character (depending on your flags, even newline and carriage return); in this case it would match the E before txt.
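The effect of the anchor and the escaped dot can be seen in a short Python sketch. The helper name `txt_only` is made up for illustration, and `search` is used on the assumption that each filename is tested as its own string.

```python
import re

# Answer's pattern: escaped dot plus end-of-string anchor.
ANCHORED = re.compile(r"[A-Z]{3}_TEST_[0-9]{4}\.txt$")
# Question's pattern: unescaped dot, no anchor.
UNESCAPED = re.compile(r"[A-Z]{3}_TEST_[0-9]{4}.txt")

def txt_only(candidates):
    """Keep only the names that end in a literal .txt."""
    return [name for name in candidates if ANCHORED.search(name)]

names = ["XYZ_TEST_2017.txt", "ASD_TEST_2017.txt.tmp", "SOM_TEST_1234Etxt"]
print(txt_only(names))                               # ['XYZ_TEST_2017.txt']
print(bool(UNESCAPED.search("SOM_TEST_1234Etxt")))   # True
```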

Remove <space><comma> but not <space><space><comma>

I have a CSV file that has un-encapsulated text strings, some that contain commas. This of course throws off the CSV parser.
My CSV has the following patterns:
A column with no value will contain 2 spaces
A column with a value will look like <comma><value><comma>, with no spaces between the value and the commas.
All of the errant commas that I need to remove (that are contained in text strings) are either preceded by or followed by a single space. Example:
<somevalue>,Check this out, I think you'll like it.,<somevalue>
I need a regex to replace that <space><comma> with just a <space> or a <hyphen>. But I can't just search on <comma> because that will catch all of the valid instances.
You can use the following to match:
(?<![ ])[ ],
And replace with '' (empty string)
Another option would be to match on non-space followed by space and comma and replace with the non-space:
... -replace '(^|[^ ]) ,', '$1'
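The lookbehind version can be tried in Python as a minimal sketch. The function name `fix_errant_commas`, its `replacement` parameter, and the sample row are assumptions for illustration; the default of an empty string follows the answer, while passing a space gives what the question asked for.

```python
import re

# A space followed by a comma, where the space is NOT itself preceded
# by another space (so the two-space empty column is left alone).
LONE_SPACE_COMMA = re.compile(r"(?<! ) ,")

def fix_errant_commas(line: str, replacement: str = "") -> str:
    """Replace each <space><comma> that is not part of <space><space><comma>."""
    return LONE_SPACE_COMMA.sub(replacement, line)

row = "a,  ,Check this ,out,b"
print(fix_errant_commas(row, " "))  # a,  ,Check this out,b
print(fix_errant_commas(row))       # a,  ,Check thisout,b
```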

Remove multiple commas at the end of lines using Notepad++ regex replace

I have a text below with data in a CSV file
2,3
4,5
6,7
When I save and open this in Notepad++, it has extra commas like
2,3,,,,
4,5,,
6,7,,,,,
As you can see, there is a variable number of trailing commas.
I tried a regex match using:
/,{2,}/
I have selected the 'Regular expression' search mode in the Ctrl+H Replace dialog.
For some reason this is not working. What do I need to do to match multiple commas without getting rid of single commas?
Is there a better way to get this done in notepad++?
Regex:
,{2,}$
Replacement string:
empty string
This will replace two or more trailing commas with an empty string. To remove all trailing commas, including a single one, use the regex ,+$ instead.
\d+(?:,\d+)?\K.*$
You can use this. Replace with an empty string. This will also work with data like 2,3,
See demo.
https://regex101.com/r/iS6jF6/9
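The two anchored patterns from the first answer, `,{2,}$` and `,+$`, can be compared in a small Python sketch. The function `strip_trailing_commas` and its `keep_single` flag are hypothetical names, and \n line endings are assumed (with \r\n, Python's `$` would not match before the \r). The `\K` pattern is not shown because Python's standard `re` module does not support `\K`.

```python
import re

def strip_trailing_commas(text: str, keep_single: bool = True) -> str:
    """Remove runs of commas at the end of each line.

    keep_single=True  uses ,{2,}$ and leaves a lone trailing comma alone.
    keep_single=False uses ,+$ and removes every trailing comma.
    """
    pattern = r",{2,}$" if keep_single else r",+$"
    return re.sub(pattern, "", text, flags=re.MULTILINE)

sample = "2,3,,,,\n4,5,,\n6,7,,,,,"
print(strip_trailing_commas(sample).splitlines())  # ['2,3', '4,5', '6,7']
```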

regex in Notepad++ to remove blank lines

I have multiple HTML files, and some of them contain blank lines. I need a regex that collapses every run of blank lines into a single blank line: it should remove anything beyond one blank line and leave alone the places that have just one blank line or none (that is, lines with text in them).
It also needs to handle lines that are not totally blank, since some lines contain spaces or tabs (characters that don't show). Such lines should count as blank and be removed whenever there is more than one of them in a row.
Search for
^([ \t]*)\r?\n\s+$
and replace with
\1
Explanation:
^ # Start of line
([ \t]*) # Match any number of spaces or tabs, capture them in group 1
\r?\n # Match one linebreak
\s+ # Match any following whitespace
$ # until the last possible end of line.
\1 will then contain the first line of whitespace characters, so when you use that as the replacement string, only the first line of whitespace will be preserved (excluding the linebreak at the end).
This worked for me on Notepad++ v6.5.1 (Unicode) on Windows 7:
Search for: ^[ \t]*\r\n
Replace with: nothing, leave blank
Search mode: Regular expression.
Search for (\r?\n(\t| )*){3,}, replace with \r\n\r\n, and check "Regular expression" and ". matches newline".
Tested with Notepad++ 6.2.
This replaces successive blank lines, whether or not they contain whitespace, with a single blank line.
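For experimenting outside Notepad++, here is a minimal Python sketch of the same idea. It is a variant rather than the exact pattern above: the whitespace is matched before each later line break instead of after it, so indentation on the next non-blank line is not consumed. The names `SURPLUS_BLANKS` and `squeeze_blank_lines` are made up for illustration.

```python
import re

# One line break, then two or more lines that are empty or contain only
# spaces/tabs. Group 1 remembers the line-ending style (\n or \r\n).
SURPLUS_BLANKS = re.compile(r"(\r?\n)(?:[ \t]*\r?\n){2,}")

def squeeze_blank_lines(text: str) -> str:
    """Collapse each run of two or more blank lines into one blank line."""
    return SURPLUS_BLANKS.sub(r"\1\1", text)

sample = "a\n\n  \n\t\n  b\n\nc\nd"
print(repr(squeeze_blank_lines(sample)))  # 'a\n\n  b\n\nc\nd'
```

A single existing blank line (as between b and c) is left untouched, and lines with no blank line between them (c and d) stay adjacent.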
Search for
(\s*\r?\n){3,}
replace with
\r\n
You can work out for yourself which replacement you need
(\n\n, \n\r\n, \r\n\r\n, etc.), and you can also modify the regular expression ^([ \t]*)\r?\n\s+$ according to your needs.
I tested all of the above suggestions, and each one deleted either too little or too much: either no blank line was left where there had been at least one before, or not enough was deleted (whitespace was left behind, etc.). Unfortunately I cannot write comments yet. I tested with 6.1.5, then updated to 6.2 and tested again. Depending on how many files there are, I would suggest using
Edit->Blank Operations->Trim trailing whitespace
Followed by Ctrl+A and
TextFX -> TextFX Edit -> Delete surplus blank lines
A macro I tried to record didn't work. There's even a macro just for removing trailing whitespace (Alt+Shift+S, see Settings | Shortcut Mapper... | Macros). There's also
Edit->Blank Operations->Remove unnecessary EOL and whitespace
but that deletes every EOL and puts everything on a single line.
In Notepad++ v8.4.7 there are these options:
Edit > Line Operations > Remove Empty Lines (Containing Blank characters)
or
Edit > Line Operations > Remove Empty Lines
So there is no need to use a regular expression for this. But it only works on one file at a time.
I searched for ^\r\n and clicked "Replace All" with nothing (empty) in the "Replace with" textbox.