RegEx in Notepad++ to find and replace strings

I'm trying to find and replace strings using RegEx on Notepad++, but cannot identify the right expression to do it:
Here is the data:
TRAIN-II
TRAIN
TRAIN-I
AIRPLANE-II
AIRPLANE
AIRPLANE-I
SHIP-II
SHIP
SHIP-I
Well, I want to keep only the string which has "-II" as suffix. In simpler words, I want to retain only:
TRAIN-II
AIRPLANE-II
SHIP-II
Can anyone please help?

Ctrl+H
Find what: ^.*(?<!-II)(?:\R|\Z)
Replace with: LEAVE EMPTY
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
Explanation:
^ # beginning of line
.* # 0 or more any character but newline
(?<!-II) # negative lookbehind, make sure there is no -II immediately before what follows
(?:\R|\Z) # non capture group, any kind of linebreak (i.e. \r, \n, \r\n) or end of file
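For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of the expression above. It is an adaptation rather than the exact Notepad++ pattern: Python's re module has no \R, so \r?\n stands in for it, and [^\r\n]* replaces .* because Python's dot also matches \r. The function name and sample data are only illustrative.

```python
import re

# Adapted pattern: a whole line that does NOT end in -II, plus its line break
# (or the end of the text, for the last line).
NOT_II_LINE = re.compile(r'^[^\r\n]*(?<!-II)(?:\r?\n|\Z)', re.MULTILINE)

def keep_ii_lines(text):
    """Delete every line that does not end with -II."""
    return NOT_II_LINE.sub('', text)

sample = ("TRAIN-II\nTRAIN\nTRAIN-I\nAIRPLANE-II\nAIRPLANE\n"
          "AIRPLANE-I\nSHIP-II\nSHIP\nSHIP-I")
print(keep_ii_lines(sample))
```

Only the three -II lines should remain, each still followed by its own line break.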

Use a regex to mark the lines, then remove unmarked lines.
Open the search window and then the "Mark" tab.
Enter -II$ into the "Find what" box.
Select "Regular expressions".
Select "Match case" if the "II" must be in upper case.
Select "Bookmark line".
Click on "Mark all". Expect to see the wanted lines marked with a blue circle.
Use menu => Search => Bookmark => Remove unmarked lines.
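The mark-and-remove steps amount to a simple filter: a line is kept only if -II$ is found on it. A minimal Python sketch of that logic, with illustrative names and case-sensitive matching (the equivalent of ticking "Match case"):

```python
import re

def keep_marked(lines, pattern=r'-II$'):
    """Return only the lines on which the pattern is found (the 'bookmarked' ones)."""
    marker = re.compile(pattern)
    return [line for line in lines if marker.search(line)]

data = ["TRAIN-II", "TRAIN", "TRAIN-I", "AIRPLANE-II", "AIRPLANE",
        "AIRPLANE-I", "SHIP-II", "SHIP", "SHIP-I"]
print(keep_marked(data))
```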


How to encase words with quotations?

I am currently trying to convert a list of 1000 words into this format:
'known', 'buss', 'hello',
and so on.
The list I have is currently in this format:
known
worry
claim
tenuous
porter
I am trying to use Notepad++ to do this. If anybody could point me in the correct direction, that would be great!
Use this if you want a comma delimited list but no extra comma at the end.
Ctrl+H
Find what: (\S+)(\s+)?
Replace with: '$1'(?2,:)
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
(\S+) # group 1, 1 or more non spaces
(\s+)? # group 2, 1 or more spaces, optional
Replacement:
'$1' # content of group 1 enclosed in quotes
(?2,:) # if group 2 exists, add a comma, else, do nothing
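Python's re module does not have the conditional replacement syntax (?2,:), so a rough equivalent has to use a replacement function instead. The sketch below assumes that approach; the function names are made up for illustration.

```python
import re

def quote_words(text):
    """Wrap each word in single quotes; add a comma only if whitespace follows it."""
    def repl(match):
        # group 2 is the trailing whitespace; it is None after the last word
        comma = ',' if match.group(2) else ''
        return f"'{match.group(1)}'{comma}"
    return re.sub(r'(\S+)(\s+)?', repl, text)

words = "known\nworry\nclaim\ntenuous\nporter"
print(quote_words(words))
```

As with the Notepad++ version, a line break after the last word counts as trailing whitespace, so input that ends with a newline would still get a final comma.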
How about replacing (\S+) with '$1'? Make sure your Regular Expression button is selected in the Find and Replace tool inside Notepad++
Explanation
(\S+) is regex for repeating non-whitespace characters (1 or more). Wrapping it in parentheses puts it in a capture group which can be accessed in numerical order by using a dollar sign ($1).
'$1' will take that found text from the Find above and replace it with capture group #1 ($1) wrapped in single quotes '.
Sample
Input: known worry claim tenuous porter
Output: 'known' 'worry' 'claim' 'tenuous' 'porter'

Regex to match the last character if it is a dot

In OpenOffice Calc how do I remove the last character if it is a dot?
Example:
dog
cat.
rabbit
It becomes:
dog
cat
rabbit
What regex would you use to obtain that?
Regular expression:
To match a ., you have to write \., because . has a special meaning in regular expressions.
To match the end of the line, you have to write $.
Putting it all together, I would use this regular expression: \.$. This will match a dot if it's at the end of the line.
Removing the matched dots:
Open the "Find & Replace" window by pressing Ctrl+H.
Type the regular expression from above into the "Search for" field; this is what you want to be replaced.
Leave the "Replace with" field empty, as you want to replace with nothing, which removes the matched content (the dots).
Under "More/Other options", turn on "Regular expressions".
Click on the "Replace" or "Replace all" button to perform the replacements.
Please see documentation of this Calc feature for further details.
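To see what \.$ does to each value, here is a small Python sketch that applies the same expression to the sample cells. It only illustrates the pattern and is not how Calc runs it; the function name is hypothetical.

```python
import re

def strip_trailing_dot(cell):
    """Remove a single dot at the very end of the value, if there is one."""
    return re.sub(r'\.$', '', cell)

cells = ["dog", "cat.", "rabbit"]
print([strip_trailing_dot(c) for c in cells])
```

A dot in the middle of a value is left alone because $ only matches at the end.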
Without a regular expression, a cell formula can do the same (here for a value in A1):
=IF(RIGHT(A1)=".";LEFT(A1;LEN(A1)-1);A1)

RegEx : Remove line breaks if lookbehind shows a lowercase

I'm doing a CTRL+H (Find & Replace) in Notepad++
I want to find all line breaks followed by lowercase characters and replace them with a space character, thereby removing unwanted line breaks in my text.
Find : \r\n+(?![A-Z]|[0-9])
Replace : insert space character here
*Make sure you selected "Match case" and "Regular expression".
It works perfectly.
Now, I'd like to do the same in Microsoft Office Word documents. Any clues?
In Microsoft Word, do the following:
On the Home tab, in the Editing group, click Replace to open the Find and Replace dialog box.
Check the Use wildcards check box. If you don't see the Use wildcards check box, click More, and then select the check box.
In the Find what: box, enter the following regular expression: ([a-z])^13
In the Replace with: box, enter: \1 - that's: (backslash 1 SPACE) (don't forget the space!)
And that's it! Then click either the Replace button or the Replace All button.
Note: In MS Word, the ^13 character matches the paragraph mark at the end of each line.
Here's more information about Microsoft Word and Regular Expressions - http://office.microsoft.com/en-us/word-help/find-and-replace-text-by-using-regular-expressions-advanced-HA102350661.aspx
Edit:
Oh, the above matches a lowercase letter PRECEDING the line break.
If you want to match line break FOLLOWED by lowercase letter, do the following:
In the Find what: box, enter the following regular expression: ^13([a-z])
In the Replace with: box, enter: \1 - that's: (SPACE backslash 1) (don't forget the space!)
Tested both ways, and they both work in Microsoft Word 2010; however, the documentation says that regular expressions are supported in all versions 97 - 2013.
Good luck! :)
In VS Code, open Find and Replace with regular expressions enabled, press Ctrl+Enter in the Find box to insert a line break, then type (?=[a-z]) on the second line. In the Replace box, enter a single space.
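The lookahead idea (a line break followed by a lowercase letter, with the letter itself left untouched) can be tried in Python too. This is only a sketch; \r?\n is an assumption made so that both Windows and Unix line endings are covered, and the function name is illustrative.

```python
import re

def join_broken_lines(text):
    """Replace a line break with a space when the next line starts in lowercase."""
    return re.sub(r'\r?\n(?=[a-z])', ' ', text)

sample = "The quick\nbrown fox.\nNext sentence\r\ncontinues here."
print(join_broken_lines(sample))
```

Because the lookahead does not consume the letter, nothing has to be put back in the replacement.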

Find lines not starting with " in Notepad++

I try to verify a CSV file where we had problems with line breaks.
I want to find all lines not starting with a ".
I am trying with /!^"/gim but the ! negation is not working.
How can I negate /^"/gim properly?
In regex, the ! does not mean negation; instead, you want to negate a character set with [^"]. The brackets, [], denote a character set and if it starts with a ^ that means "not this character set".
So, if you wanted to match things that are not double-quotes, you would use [^"]; if you don't want to match any quotes, you could use [^"'], etc.
With Notepad++, you should be able to search with the following to find lines that don't start with the " character:
^[^"]
If you want to highlight the full line, use:
^[^"].*
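For checking a CSV file from a script, the same character-set test can be applied line by line. In this Python sketch, re.match plays the role of the ^ anchor because it only matches at the start of each line; the function name and the sample text are made up for illustration.

```python
import re

def lines_not_starting_with_quote(text):
    """Return (line_number, line) pairs for lines whose first character is not a double quote."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if re.match(r'[^"]', line):   # needs one real, non-quote first character
            hits.append((number, line))
    return hits

csv_text = '"a","b"\nbroken continuation\n"c","d"\n\n"e","f"'
print(lines_not_starting_with_quote(csv_text))
```

Empty lines are not reported here, since [^"] has to consume one character.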
In Notepad++ you can use the very useful negative lookahead.
In your case you can try the following:
^(?!")
If you want to match whole lines, add .+ or .{1,7} or anything similar, e.g.:
^(?!").*
Note that this version will also match empty lines.
Explanation part
^ line start
(?!regexp) negative lookahead part: if regexp matches at this position, the overall match fails, so the line is not reported
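The practical difference between a negated character set and a negative lookahead shows up on empty lines. A small Python sketch, using made-up sample lines and testing each line separately:

```python
import re

lines = ['"quoted"', 'plain text', '']

# Character set: needs one real character that is not a quote.
by_class = [ln for ln in lines if re.match(r'[^"]', ln)]

# Negative lookahead: only asserts that the next character is not a quote,
# so it also succeeds on an empty line.
by_lookahead = [ln for ln in lines if re.match(r'(?!")', ln)]

print(by_class)
print(by_lookahead)
```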
Step 1 - Match lines. Find dialog > Mark tab, you can bookmark lines that match.
Step 2 - Remove lines bookmarked OR Remove lines not bookmarked. Search > Bookmark > Remove Unmarked Lines or Remove Bookmarked lines

regex in Notepad++ to remove blank lines

I have multiple HTML files, and some of them have runs of blank lines. I need a regex that collapses every run of more than one blank line into a single blank line, leaving alone the places that have just one blank line or none (that is, lines with text in them).
It also needs to treat lines that are not totally blank as blank, since some lines may contain spaces or tabs (characters that don't show); these lines should be removed by the regex as well, as long as the run is more than one line.
Search for
^([ \t]*)\r?\n\s+$
and replace with
\1
Explanation:
^ # Start of line
([ \t]*) # Match any number of spaces or tabs, capture them in group 1
\r?\n # Match one linebreak
\s+ # Match any following whitespace
$ # until the last possible end of line.
\1 will then contain the first line of whitespace characters, so when you use that as the replacement string, only the first line of whitespace will be preserved (excluding the linebreak at the end).
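The same search and replace can be tried in Python, where re.MULTILINE gives ^ and $ their per-line meaning. This is a sketch under that assumption; the function name and sample HTML are illustrative.

```python
import re

def collapse_blank_lines(text):
    """Squeeze each run of two or more blank lines down to a single blank line."""
    return re.sub(r'^([ \t]*)\r?\n\s+$', r'\1', text, flags=re.MULTILINE)

html = "<p>one</p>\n\n\n\n<p>two</p>\n\n<p>three</p>\n"
print(repr(collapse_blank_lines(html)))
```

The three blank lines after the first paragraph become one, while the single blank line before the last paragraph is left as it is.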
This worked for me on Notepad++ v6.5.1 (Unicode), Windows 7.
Search for: ^[ \t]*\r\n
Replace with: nothing, leave blank
Search mode: Regular expression.
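A quick way to see what this pattern does is to run it in Python with re.MULTILINE. Note two things about the sketch: it removes every blank or whitespace-only line rather than leaving one, and because the pattern requires \r\n it only applies to text with Windows line endings. The function name is illustrative.

```python
import re

def drop_blank_lines(text):
    """Delete lines that are empty or hold only spaces/tabs (CRLF line endings)."""
    return re.sub(r'^[ \t]*\r\n', '', text, flags=re.MULTILINE)

sample = "first\r\n \t\r\n\r\nsecond\r\n"
print(repr(drop_blank_lines(sample)))
```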
Search for (\r?\n(\t| )*){3,}, replace with \r\n\r\n, and check "Regular expression" and ". matches newline".
Tested with Notepad++ 6.2
This replaces successive blank lines (whether or not they contain whitespace) with a single blank line.
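Here is a minimal Python sketch of the same substitution, assuming the replacement should always be Windows line endings as in the answer. The function name and sample text are made up.

```python
import re

def squeeze_line_breaks(text):
    """Replace 3+ consecutive line breaks (with optional spaces/tabs) by exactly two."""
    return re.sub(r'(\r?\n(\t| )*){3,}', '\r\n\r\n', text)

sample = "first\r\n\r\n \r\n\t\r\nsecond\r\n\r\nthird"
print(repr(squeeze_line_breaks(sample)))
```

One side effect to be aware of: the trailing (\t| )* also consumes any indentation at the start of the next non-blank line whenever a run is replaced.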
Search for
(\s*\r?\n){3,}
replace with
\r\n
You can work out for yourself what to replace with:
\n\n or \n\r\n or \r\n\r\n, etc. You can also modify the regular expression ^([ \t]*)\r?\n\s+$ according to your needs.
I tested all of the above suggestions; each deleted either too little or too much. Either no blank line was left where there had been at least one, or not enough was deleted (whitespace was left behind, etc.). Tested with both 6.1.5 and 6.2. Depending on how many files there are, I would suggest using
Edit->Blank Operations->Trim trailing whitespace
Followed by Ctrl+A and
TextFX -> TextFX Edit -> Delete surplus blank lines
A macro I tried to record didn't work. There is even a macro just for removing trailing whitespace (Alt+Shift+S, see Settings | Shortcut Mapper... | Macros). There's a
Edit->Blank Operations->Remove unnecessary EOL and whitespace
but that deletes every EOL and puts everything in a single line.
In Notepad++ v8.4.7 there is the option:
Edit > Line Operations > Remove Empty Lines (Containing Blank characters)
or
Edit > Line Operations > Remove Empty Lines
So there is no need to use a regular expression for this. But this only works for one file at a time.
I searched for ^\r\n and clicked "Replace All" with nothing (empty) in the "Replace with" textbox.