search and replace \n or \t with an empty space - regex

Can anyone teach me how to search for \n or \t and replace it with an "empty" space? I tried using search and replace in gedit, but nothing changes.
sample:
Hi I am \n\t\t\t\t\t\t\t\t John Doe \n\t\t\t\t\t I live in BLAH BLAH.
Output:
Hi I am John Doe I live in BLAH BLAH.
How to do this?

Although this thread is now 5 years old, the existing answers do not really suffice, so here is a working answer:
The simplest way to replace the newline and tab escape sequences (\n and \t) is to escape the backslash. This tells the gedit search that you are looking for the characters that represent a newline, not for an actual newline:
\\n or \\t

Replace \s+ with a single space.
\s means 'any white space character'
+ means 'one or more'
So \s+ means 'one or more white space characters'. You'd want to replace those with a single space.
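As a rough illustration outside the editor, the same substitution can be sketched in Python. The function names and the shortened sample string are made up for this example, and the second function assumes the text contains the literal two-character sequences \n and \t rather than real control characters.

```python
import re

def collapse_whitespace(text: str) -> str:
    """Replace each run of real whitespace (spaces, tabs, newlines) with one space."""
    return re.sub(r"\s+", " ", text)

def collapse_literal_escapes(text: str) -> str:
    """Same idea, but also treat the two-character sequences \\n and \\t as whitespace."""
    return re.sub(r"(?:\\[nt]|\s)+", " ", text)

# Real newline and tab characters inside the string.
sample = "Hi I am \n\t\t\t John Doe \n\t\t I live in BLAH BLAH."
print(collapse_whitespace(sample))  # Hi I am John Doe I live in BLAH BLAH.

# A raw string, so \n and \t stay as backslash + letter.
literal = r"Hi I am \n\t\t John Doe"
print(collapse_literal_escapes(literal))  # Hi I am John Doe
```

Which of the two applies depends on whether the file holds real line breaks or the visible backslash sequences.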

Replace text in gedit
Open the Replace tool by clicking Search ▸ Replace or press Ctrl+H.
Enter the text that you wish to replace into the 'Search for:' field.
Enter the new, replacement text into the 'Replace with:' field.
Once you have entered the original and replacement text, select your desired replacement options:
To replace only the next matching portion of text, click Replace.
To replace all occurrences of the searched-for text, click Replace All.
Source

Related

RegEx in Notepad++ to find and replace strings

I'm trying to find and replace strings using RegEx on Notepad++, but cannot identify the right expression to do it:
Here is the data:
TRAIN-II
TRAIN
TRAIN-I
AIRPLANE-II
AIRPLANE
AIRPLANE-I
SHIP-II
SHIP
SHIP-I
Well, I want to keep only the string which has "-II" as suffix. In simpler words, I want to retain only:
TRAIN-II
AIRPLANE-II
SHIP-II
Can anyone please help?
Ctrl+H
Find what: ^.*(?<!-II)(?:\R|\Z)
Replace with: LEAVE EMPTY
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
Explanation:
^ # beginning of line
.* # 0 or more any character but newline
(?<!-II) # negative lookbehind, make sure the text just before this point (the end of the line) is not -II
(?:\R|\Z) # non capture group, any kind of linebreak (i.e. \r, \n, \r\n) or end of file
Use a regex to mark the lines, then remove unmarked lines.
Open the search window and then the "Mark" tab.
Enter -II$ into the "Find what" box.
Select "Regular expressions".
Select "Match case" if the "II" must be in upper case.
Select "Bookmark line".
Click on "Mark all". Expect to see the wanted lines marked with a blue circle.
Use menu => Search => Bookmark => Remove unmarked lines.
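For anyone scripting this instead of using the editor, here is a minimal Python sketch of the delete-the-other-lines approach. Python's re module has no \R, so \r?\n stands in for it, and [^\r\n]* replaces .* so that a carriage return is never swallowed into the line; both are adaptations, and the names are invented for the example.

```python
import re

# Matches a whole line (including its line break) that does NOT end in -II.
DROP_LINE = re.compile(r"^[^\r\n]*(?<!-II)(?:\r?\n|\Z)", re.MULTILINE)

def keep_suffix_ii(text: str) -> str:
    """Delete every line that does not end with the suffix -II."""
    return DROP_LINE.sub("", text)

data = (
    "TRAIN-II\nTRAIN\nTRAIN-I\n"
    "AIRPLANE-II\nAIRPLANE\nAIRPLANE-I\n"
    "SHIP-II\nSHIP\nSHIP-I"
)
print(keep_suffix_ii(data).splitlines())
# ['TRAIN-II', 'AIRPLANE-II', 'SHIP-II']
```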

Regex remove everything before the first space occurrence in a line?

I would like to remove all the characters before the first space occurrence for each line.
Sample of initial text:
2:2 My dog is good.
1:234 My cat is bad
14:2 My frog is bad but it loves my garden.
Result must be:
My dog is good.
My cat is bad
My frog is bad but it loves my garden.
What regular expression would you use to achieve this result using OpenOffice Calc or Notepad++?
Ctrl+H
Find what: ^\S+\s+(.+)$
Replace with: $1
check Wrap around
check Regular expression
DO NOT CHECK . matches newline
Replace all
Explanation:
^ : beginning of line
\S+ : 1 or more non space character
\s+ : 1 or more space character
(.+) : group 1, 1 or more any character (ie. rest of the line)
$ : end of line
Replacement:
$1 : content of group 1
Result for given example:
My dog is good.
My cat is bad
My frog is bad but it loves my garden.
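The same replacement can be tried quickly in Python. This is only a sketch: the function name is hypothetical, and [ \t]+ is used in place of \s+ so that, in multi-line mode, the separator cannot match a line break and merge two lines.

```python
import re

# First run of non-space characters, the blanks after it, then the rest of the line.
FIRST_FIELD = re.compile(r"^\S+[ \t]+(.+)$", re.MULTILINE)

def strip_first_field(text: str) -> str:
    """Remove everything up to and including the first run of blanks on each line."""
    return FIRST_FIELD.sub(r"\1", text)

sample = (
    "2:2 My dog is good.\n"
    "1:234 My cat is bad\n"
    "14:2 My frog is bad but it loves my garden."
)
print(strip_first_field(sample))
# My dog is good.
# My cat is bad
# My frog is bad but it loves my garden.
```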
Press Ctrl + h to open Find and replace dialog
Write ^.*?\s+(.*)$ in 'find what' textbox
Write $1 in 'Replace With' textbox
Check Regular Expression radio button OR press ALT+g
You can click the Find Next button to verify that it's finding the right matches
Click on Replace All button if it looks good OR press ALT+a
Explanation:
^ : Match from beginning of line
.*?\s+: Match anything, any number of times until a space (or more spaces) is encountered
(.*): Capture everything after those spaces
$: Match till end of the line
$1: Access the above captured strings from line

RegEx for expression between 2 white space characters

I have a file directory listing from an embedded target that looks like this:
Directory of D:\
D 0 19-Jan-15 16:12:16 FILE1
D 0 19-Jan-15 16:09:31 FILE2
D 0 21-Jan-15 14:10:33 FILE3
94951/218985 MB unused/total
And I am looking to only get the file names. The string in c# will look like this:
\r\nDirectory of D:\\\r\nD \t 0\t19-Jan-15 16:12:16\tFILE1\r\nD \t 0\t19-Jan-15 16:09:31\tFILE2\r\nD \t 0\t21-Jan-15 14:04:15\tFILE3\r\n94969/218985 MB unused/total\r\n
I noticed that the file names are always contained between a \t and a \r\n, so I thought the easiest approach would be \t(.*?)\r\n, but this matches the whole line. What is the best way to change this regex so that it skips the first 2 \t in the line?
You can use this regex:
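\t([^\t\r\n]*)\r\n
i.e. find all characters that are neither tabs nor line breaks between \t and \r\n, which gives you the file name in each line. (A plain [^\t] also matches \r and \n, so the last file name would run on into the summary line that follows it.)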
Because file names cannot include tab characters, you can replace the . in \t(.*?)\r\n with a negated character class. Use [^\t\r\n] rather than [^\t], because [^\t] also matches line breaks and would let the last file name run on into the summary line. Also, you can use lookarounds to not match the \t at the start and the \r at the end, eliminate the unnecessary capturing group, and change *? to +:
(?<=\t)[^\t\r\n]+(?=\r)
This regex will match a sequence of characters that does not include any tab or line-break characters, as long as the sequence is between a tab (\t) and a carriage return (\r).
Note that to make this work on regex101, I had to change the \r to a \n; you will most likely still need the \r in your regex.
You can do it with a capturing group or like this:
(?<=\t)[^\t\r\n]+(?=[\r\n])
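To check the idea against the string from the question, here is a small Python sketch (the question uses C#, but the patterns carry over). The function names are invented for the example. The character class excludes \r and \n as well as \t, so that a match cannot run across a line break into the next line.

```python
import re

# The listing as it appears in the question, with real tabs and CRLF line ends.
listing = (
    "\r\nDirectory of D:\\\r\n"
    "D \t 0\t19-Jan-15 16:12:16\tFILE1\r\n"
    "D \t 0\t19-Jan-15 16:09:31\tFILE2\r\n"
    "D \t 0\t21-Jan-15 14:04:15\tFILE3\r\n"
    "94969/218985 MB unused/total\r\n"
)

def file_names(text: str) -> list[str]:
    """Capturing-group variant: text between the last tab and the CRLF."""
    return re.findall(r"\t([^\t\r\n]+)\r\n", text)

def file_names_lookaround(text: str) -> list[str]:
    """Lookaround variant: the tab and the carriage return are not consumed."""
    return re.findall(r"(?<=\t)[^\t\r\n]+(?=\r)", text)

print(file_names(listing))             # ['FILE1', 'FILE2', 'FILE3']
print(file_names_lookaround(listing))  # ['FILE1', 'FILE2', 'FILE3']
```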

Regex to replace a character between two l

My text file has more than ten thousand lines. Each line starts with a word or a phrase followed by a tab and the content, such as:
[line 1] This is the first line. [tab] Here is the content.[end of line]
I want to find character s in all the words between the beginning of each line and a tab (\t), and replace it by a pipe (|) so that the text will look like:
[line 1] Thi| i| the fir|t line. [tab] Here is the content.[end of line]
Here is what I have done:
Search: ^(.*)s+(.*)?\t
Replace: \1|\2\t
It works, but it does not replace every s in one pass: I have to click Replace All several times before the s in all the words is replaced.
So it comes to my question: how can I replace all the occurrences of character s in just one search and replace?
Note that I'm working on TextWrangler but I'm OK with other editors.
Thanks a lot.
Your pattern matches the whole start of each line that contains an s, so each pass can replace only one s per line. Instead you should search for the s directly, and use a lookahead to ensure that it is followed by a tab.
Search: s(?=.*\t)
Replace: |
Note that this catches all s's up to the last tab. - This will be a problem if your main content can contain tabs.
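A short Python sketch of the lookahead version, including the caveat about content that itself contains tabs. The function name and the second test string are made up for illustration.

```python
import re

def pipe_before_last_tab(line: str) -> str:
    """Replace every s that still has a tab somewhere after it on the same line."""
    return re.sub(r"s(?=.*\t)", "|", line)

print(pipe_before_last_tab("This is the first line.\tHere is the content."))
# Thi| i| the fir|t line.	Here is the content.

# Caveat: with a second tab in the content, the s between the tabs is replaced too.
print(repr(pipe_before_last_tab("as\tbs\tcs")))
# 'a|\tb|\tcs'
```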
To stop catching s's after the first tab you have to cheat, since variable-length lookbehind is not supported in most regex dialects.
However if we can ensure that the last s catches the whole line...
Search: (?:(^[^s\t]*\t.*$)|s([^s\t]*(?:(?=s.*\t)|\t.*$)))
Replace: |\1\2
This will catch the whole line in the case where no s occurs before the first tab, and put a | in front of that line. I see no way around this.
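If a script is acceptable instead of an editor, the first-tab restriction is easier to express without a regex at all. The sketch below is a different technique from the pattern above: it splits each line at the first tab and replaces s only in the part before it. The function name is hypothetical.

```python
def pipe_before_first_tab(line: str) -> str:
    """Replace s with | only in the text before the first tab of the line."""
    head, tab, rest = line.partition("\t")
    if not tab:
        # No tab on this line: leave it untouched.
        return line
    return head.replace("s", "|") + tab + rest

print(repr(pipe_before_first_tab("as\tbs\tcs")))
# 'a|\tbs\tcs'
```

Unlike the single-pattern version, this leaves lines with no s before the first tab exactly as they were.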

regex in Notepad++ to remove blank lines

I have multiple HTML files, and some of them contain runs of blank lines. I need a regex that collapses every run of more than one blank line into a single blank line, and leaves alone the places that have just one blank line or none (none meaning the lines have text in them).
It also needs to treat lines that are not totally blank as blank: some lines contain only spaces or tabs (characters that don't show), so these lines should be removed too whenever the run is longer than one line.
Search for
^([ \t]*)\r?\n\s+$
and replace with
\1
Explanation:
^ # Start of line
([ \t]*) # Match any number of spaces or tabs, capture them in group 1
\r?\n # Match one linebreak
\s+ # Match any following whitespace
$ # until the last possible end of line.
\1 will then contain the first line of whitespace characters, so when you use that as the replacement string, only the first line of whitespace will be preserved (excluding the linebreak at the end).
This worked for me on notepad++ v6.5.1. UNICODE windows 7
Search for: ^[ \t]*\r\n
Replace with: nothing, leave blank
Search mode: Regular expression.
Search for (\r?\n(\t| )*){3,}, replace with \r\n\r\n, and check "Regular expression" and ". matches newline".
Tested with Notepad++ 6.2
This replaces successive blank lines, whether or not they contain whitespace, with a single newline.
Search for
(\s*\r?\n){3,}
replace with
\r\n
You can work out for yourself which replacement you need:
\n\n OR \n\r\n or \r\n\r\n etc. You can also modify the regular expression ^([ \t]*)\r?\n\s+$ according to your needs.
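Since the question mentions multiple files, here is a hedged Python sketch of the same idea for use in a script: collapse each run of two or more blank or whitespace-only lines into one blank line. The pattern is an adaptation rather than any of the exact Notepad++ expressions above, and the function name and the newline parameter are assumptions of this example.

```python
import re

# Two or more consecutive lines that hold nothing but spaces or tabs.
BLANK_RUN = re.compile(r"(?:^[ \t]*\r?\n){2,}", re.MULTILINE)

def collapse_blank_lines(text: str, newline: str = "\n") -> str:
    """Replace each run of 2+ blank lines with a single empty line."""
    return BLANK_RUN.sub(newline, text)

html = "a\n\n\n \t\nb\n\nc\n"
print(repr(collapse_blank_lines(html)))
# 'a\n\nb\n\nc\n'
```

For files with Windows line endings, pass newline="\r\n" so the inserted blank line matches the rest of the file.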
I tested all of the suggestions above, and each deleted either too little or too much: either no blank line remained where there had been at least one, or not enough was deleted (whitespace was left behind, etc.). Unfortunately, I cannot write comments yet. I tested with 6.1.5, then updated to 6.2 and tested again. Depending on how many files there are, I would suggest using
Edit->Blank Operations->Trim trailing whitespace
Followed by Ctrl+A and
TextFX -> TextFX Edit -> Delete surplus blank lines
A macro I tried to record didn't work. There is even a macro just for removing trailing whitespace (Alt+Shift+S, see Settings | Shortcut Mapper... | Macros). There is also
Edit->Blank Operations->Remove unnecessary EOL and whitespace
but that deletes every EOL and puts everything in a single line.
In Notepad++ v8.4.7 there is the option:
Edit > Line Operations > Remove Empty Lines (Containing Blank characters)
or
Edit > Line Operations > Remove Empty Lines
So there is no need to use a regular expression for this. But this only works for one file at a time.
I searched for ^\r\n and clicked "Replace All" with nothing (empty) in the "Replace with" textbox.