Regular Expression Search Replace all non leading tabs with single space Notepad++ - regex

Regular expressions have never been my strong suit, so I need some help here. I have a text file and I want to replace any "embedded" tabs with a space (a single space for any run of consecutive tabs), but leave any "leading" tabs alone.
So for a line that looks like this:
\t\t\tThis is a\t\ttest to see\thow things\t will work.
would come out looking like this:
\t\t\tThis is a test to see how things will work.
So the only tabs left in the file would be at the beginning of any lines and there could be x number of tabs at the beginning of any line. Can anybody help me figure this one out?
I'm doing this with Notepad++ Search/Replace, but I could use Visual Studio or some other tool if that would work better.

Find what:
(?<!\t)(?!^)\t+
The sequence of tabs \t+ must not be preceded by a tab (?<!\t), and also must not start from the beginning of a line (?!^).
Replace with:
<space>
Demo on regex101 (Notepad++ also uses a PCRE-style syntax; in the demo I use the letter t instead of a tab for clarity)
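A rough way to check the pattern outside Notepad++ is Python's re module. This is only a sketch: Python's engine is not the one Notepad++ uses, and the function name collapse_embedded_tabs is made up for illustration. The re.MULTILINE flag is assumed to be needed so that ^ matches at the start of every line, as it does in Notepad++.

```python
import re

# Same pattern as in the answer; re.MULTILINE makes ^ match at each line start.
EMBEDDED_TABS = re.compile(r"(?<!\t)(?!^)\t+", re.MULTILINE)


def collapse_embedded_tabs(text: str) -> str:
    """Replace each run of non-leading tabs with a single space."""
    return EMBEDDED_TABS.sub(" ", text)


line = "\t\t\tThis is a\t\ttest to see\thow things\twill work."
print(repr(collapse_embedded_tabs(line)))
```

Note that a tab followed by a literal space (as in the question's sample line) would leave two spaces behind, because only the tabs are replaced.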

Related

Deleting every 2nd line from a file using Notepad++

I am looking for some regex help.
I have a textfile, nothing super important but I would like to delete every second line from it - I have tried following this guide: Delete every other line in notepad++
However, I just can't get it to work. Is the regex I am using OK? I am a noob with regex.
Find:
([^\n]*\n)[^\n]*\n
Replace with:
$1
No matter what I try (mouse position at the beginning, ctrl+a and Replace All) I just can't get it to work. I appreciate any help.
I've put the regex into here: http://regexpal.com/ and if I remove the final \n it highlights the individual rows.
Make sure you select regular expression for the search mode...
Also, you may want to make that final newline optional. In the case that there are an even number of lines and you do not have a trailing newline, it won't remove the last line.
([^\n]*\n)[^\n]*\n?
Update:
Windows handles newlines with \r\n instead of just \n, so try updating the expression to take this into account:
([^\r\n]*[\r\n]+)[^\r\n]*[\r\n]*
Final Update:
Thanks to @zx81, I now know that N++ uses PCRE, so \R can be used for Unicode newline characters. However, [^\R] won't work (this looks for anything except R literally), so you will need to keep [^\r\n]. This can be simplified as:
([^\r\n]*\R)[^\r\n]*\R?
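The same idea can be tried in Python as a sanity check. This is a sketch under one assumption worth stating: Python's re module has no \R, so the newline alternatives are written out by hand. The function name drop_every_second_line is hypothetical.

```python
import re

# Python's re module has no \R, so the newline alternatives are spelled out.
NEWLINE = r"(?:\r\n|\r|\n)"
# Group 1 keeps one line plus its newline; the line after it is dropped.
PAIR = re.compile(rf"([^\r\n]*{NEWLINE})[^\r\n]*{NEWLINE}?")


def drop_every_second_line(text: str) -> str:
    """Keep lines 1, 3, 5, ... and remove lines 2, 4, 6, ..."""
    return PAIR.sub(r"\1", text)


print(repr(drop_every_second_line("a\r\nb\r\nc\r\nd")))
```

Because the final newline is optional, the last line is still removed when the text has an even number of lines and no trailing newline.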

Replace line of text Notepad++ or UltraEdit

Real quick question here that I can't work out.
I have a bunch of text files across many directories. Within these dirs are text files named init.txt
In these many text files, are lots of lines starting with
Effective =
What I need to do is replace any line that contains that string with another string,
preferably in Notepad++, or UltraEdit if need be.
In Notepad++, I've found Search -> Replace in Files... which lets me specify a starting directory, but I can't get it to replace the entire line with my new line.
I have never used regular expressions before (if that's the best way to do this) as I've never needed to, so any help would be very much appreciated.
Thank you for helping me out.
For your problem, a little regular expression may help a lot. I use regex search in Notepad++ nearly every day, and it is really useful.
I do not want to intimidate you with complicated regex grammar. Instead, I hope that after reading my answer you will see that the basics of regular expressions are not so exotic, and that they are for regular people's everyday use.
Follow these instructions:
In Notepad++ press Ctrl-F and switch to the Find in Files tab. In the Search Mode section (at the bottom of the dialog), select Regular expression.
In the Find what field, what you need to input here may vary according to the specific pattern of the text you want to replace.
If the text fragment you want to substitute always
shows up at the beginning of a line,
has NO LEADING WHITESPACE before the text, and
contains EXACTLY ONE SPACE CHARACTER before the = character,
then ^Effective = should be used as the pattern in the Find what field.
The ^ symbol in ^Effective = matches the beginning of a line (so if Effective = appears in the middle of a line, it will be ignored), and the rest is the exact text to be matched. Note that this pattern matches only that prefix; to replace the entire line, as the question asks, append .* so that the rest of the line is included: ^Effective =.* (with ". matches newline" unchecked).
However, if the above conditions are not all satisfied, e.g.
the text segment may contain leading whitespace, or
the number of whitespace characters between the word Effective and the = symbol may vary, from one to unlimited,
then you may need to use ^[ \t]*Effective\s+= instead.
The leading [ \t]* allows optional spaces or tabs before the text, and the \s+ part matches one or more whitespace characters (including spaces \x20, tabs \t, carriage returns \r, and newlines \n).
If you want to match zero or more whitespace characters between Effective and =, replace \s+ with \s*.
In the Replace with field, input changeLine
In the Filters field, select the file type you want to search
Check In all sub-folders
Click Replace in Files button
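For readers who prefer a script, the steps above can be sketched in Python. This is a minimal sketch, not the answer's own method: the function name replace_effective_lines is hypothetical, changeLine is the placeholder replacement from the steps, and the pattern is assumed to tolerate leading spaces or tabs and to swallow the rest of the line so the whole line is replaced. Applying it to every init.txt would additionally require reading and writing each file, which is not shown.

```python
import re

# Optional leading spaces/tabs, "Effective", optional spaces/tabs, "=",
# then the rest of the line (without its line ending).
EFFECTIVE_LINE = re.compile(r"^[ \t]*Effective[ \t]*=[^\r\n]*", re.MULTILINE)


def replace_effective_lines(text: str, new_line: str = "changeLine") -> str:
    """Replace every whole line that starts with 'Effective ='."""
    # A function replacement keeps backslashes in new_line from being
    # interpreted as regex escapes.
    return EFFECTIVE_LINE.sub(lambda _match: new_line, text)


sample = "Name = x\nEffective = 2010\n  Effective   = 2011\nNotEffective = 1\n"
print(replace_effective_lines(sample))
```

Lines where Effective appears after other text (such as NotEffective = 1) are left alone because the pattern is anchored with ^.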
Set the search mode in Notepad++
Find: Effective =
Replace with: changeLine
Search Mode: Extended (\n, \t, etc)
From: https://superuser.com/questions/34451/notepad-find-and-replace-string-with-a-new-line

Remove everything before and after variable=int

I'm terrible at regex and need to remove everything from a large portion of text except for a certain variable declaration that occurs numerous times. I'd like to remove everything except for instances of mc_gross=anyint.
Generally we'd need to use "negative lookarounds" to find everything but a specified string. But these are fairly inefficient (although that's probably of little concern to you in this instance), and lookaround is not supported by all regex engines (not sure about notepad++, and even then probably depends on the version you're using).
If you're interested in learning about that approach, refer to How to negate specific word in regex?
But regardless, since you are using notepad++, I'd recommend selecting your target, then inverting the selection.
This will select each instance, allowing for optional white space either side of the '=' sign.
mc_gross\s*=\s*\d+
The following answer over on super user explains how to use bookmarks in notepad++ to achieve the "inverse selection":
https://superuser.com/questions/290247/how-to-delete-all-line-except-lines-containing-a-word-i-need
Substitute the regex they're using over there, with the one above.
You could do a regular expression replace of ^.*\b(mc_gross\s*=\s*\d+)\b.*$ with \1. That will remove everything other than the wanted text on each line. Note that on lines where the wanted text occurs two or more times, only one occurrence will be retained. In the search the ^.*\b matches from start-of-line to a word boundary before the wanted text; the \b.*$ matches everything from a word boundary after the wanted text until end of line; the round brackets capture the wanted text for the replacement text. If text such as abcmc_gross=13def should be matched and retained as mc_gross=13 then delete the \bs from the search.
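The "only one occurrence will be retained" caveat can be seen in a small Python sketch. It assumes re.MULTILINE so that ^ and $ work per line, as in Notepad++; because the leading .* is greedy, the occurrence that survives is the last one on the line. The sample text is invented for illustration.

```python
import re

# Greedy .* means the LAST occurrence on a line is the one captured.
KEEP_ONE = re.compile(r"^.*\b(mc_gross\s*=\s*\d+)\b.*$", re.MULTILINE)

text = "x mc_gross=1 y mc_gross=2 z\nno match here"
result = KEEP_ONE.sub(r"\1", text)
print(result)
```

Lines without the wanted text are not matched at all, so they stay as they are; that is why a second step is needed to remove them.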
To remove unwanted lines do a regular expression search for ^mc_gross\s*=\s*\d+$ from the Mark tab, tick Bookmark line and click Mark all. Then use Menu => Search => Bookmark => Remove unmarked lines.
Find what: [\s\S]*?(mc_gross=\d+|\Z)
Replace with: \1
Position the cursor at the start of the text then Replace All.
Add word boundaries \b around mc_gross=\d+ if you think it's necessary.
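If the text can be processed outside the editor, collecting the matches directly is often simpler than deleting everything around them. The following is a sketch of that alternative in Python, not a translation of the replacement above; the function name extract_mc_gross and the sample text are hypothetical.

```python
import re

# The wanted text, with optional whitespace on either side of "=".
TARGET = re.compile(r"\bmc_gross\s*=\s*\d+\b")


def extract_mc_gross(text: str) -> list:
    """Return every mc_gross=<int> occurrence, in order of appearance."""
    return TARGET.findall(text)


sample = "a=1&mc_gross=12&b=2\nmc_gross = 7 junk\nnothing"
print("\n".join(extract_mc_gross(sample)))
```

Unlike the line-based replacement, this keeps every occurrence, including several on the same line.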

Notepad++ Regex: Find all 1 and 2 letter words

I’m working with a text file with 200.000+ lines in Notepad++. Each line has only one word. I need to strip out all words which contain only one letter (e.g.: I) and words which contain only two letters (e.g.: as).
I thought I could just paste in a regular regex like this [a-zA-Z]{1,2} but it does not recognize anything (I’m trying to Mark them).
I’ve done a manual search and I know that words of that length do exist, so it can only be my regex that’s wrong. Does anyone know how to do this in Notepad++?
Cheers,
- Mestika
If you want to remove only the words but leave the lines empty, this works:
^[a-zA-Z]{1,2}$
Replace this with an empty string. ^ and $ are anchors for the beginning and the end of a line (because Notepad++'s regexes work in multi-line mode).
If you want to remove the lines completely, search for this:
^[a-zA-Z]{1,2}\r\n
And replace with an empty string. However, this won't work before Notepad++ 6, so make sure yours is up-to-date.
Note that you will have to replace \r\n with the specific line-endings of your file!
As Tim Pietzker suggested, a platform independent solution that also removes empty lines would be:
^[a-zA-Z]{1,2}[\r\n]+
A platform-independent solution that does not remove empty lines but only those with one or two letters would be:
^[a-zA-Z]{1,2}(\r\n?|\n)
I don't use Notepad++ but my guess is it could be because you have too many matches - try including word boundaries (your exp will match every set of 2 letters)
\b[a-zA-Z]{1,2}\b
The regex you specified should find 1-or-2 characters (even in Notepad++'s Find dialog), but not in the way you'd think. You want the regex to make sure it starts at the beginning of the line and ends at the end, with ^ and $ respectively:
^[a-zA-Z]{1,2}$
Notepad++ version 6.0 introduced the PCRE engine, so if this doesn't work in your current version try updating to the most recent.
You seem to use the version of Notepad++ that doesn't support explicit quantifiers: that's why there's no match at all (as { and } are treated as literals, not special symbols).
The solution is to use their somewhat more lengthy replacement:
\w\w?
... but that's only part of the story, as this regex will match one or two word characters anywhere, and not just short words. To match only short words, you need something like this:
^\w\w?$
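To see the anchored approach on a small sample, here is a Python sketch that removes whole lines consisting of one or two ASCII letters. It is an illustration only: the function name drop_short_words is hypothetical, and the trailing alternation is an assumption added so that a short word on the last line (with no line ending) is also removed.

```python
import re

# One or two ASCII letters alone on a line, plus the line ending if present.
SHORT_WORD_LINE = re.compile(r"^[a-zA-Z]{1,2}(?:\r\n?|\n|$)", re.MULTILINE)


def drop_short_words(text: str) -> str:
    """Delete lines made up of only one or two letters; keep empty lines."""
    return SHORT_WORD_LINE.sub("", text)


print(repr(drop_short_words("I\nas\ncat\n\nto\ndog")))
```

Without the ^ anchor the pattern would also bite two letters out of longer words, which is the "too many matches" problem mentioned in the answers.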

Regex: remove lines not starting with a digit

I have been fighting this problem with the help of a RegEx cheat sheet, trying to figure out how to do this, but I give up... I have this lengthy file open in Notepad++ and would like to remove all lines that do not start with a digit (0..9). I would use the Find/Replace functionality of N++. I am only mentioning this as I am not sure what Regex implementation is N++ using... Thank you
Example. From the following text:
1hello
foo
2world
bar
3!
I would like to extract
1hello
2world
3!
not (with blank lines left in between):
1hello

2world

3!
by doing a find/replace on a regular expression.
You can clear those lines with ^[^0-9].* but it will leave blank lines.
Notepad++ uses Scintilla, and also uses its regex engine for matching.
"\r and \n are never matched because in Scintilla, regular expression searches are made line per line (stripped of end-of-line chars)."
http://www.scintilla.org/SciTERegEx.html
To clear those blank lines, the only way is to choose Extended mode and replace \n\n with \n. If the file uses Windows line endings, replace \r\n\r\n with \r\n.
[^0-9] is a regular expression that matches pretty much anything, except digits. If you say ^[^0-9] you "anchor" it to the start of the line, in most regular expression systems. If you want to include the rest of the line, use ^[^0-9].+.
^[^\d].* marks a whole line whose first character is not a digit. Check that there really is no whitespace in front of the digits. Otherwise you'd have to use a different expression.
UPDATE:
You will have to do it in two steps. First empty the lines that do not start with a digit. Then remove the empty lines in Extended mode.
One could also use the technique of bookmarking in Notepad++. I started benefiting from this feature (long time present but only more recently made somewhat more visible in the UI) not very long ago.
Simply bring up the find dialogue, type regex for lines not starting with digit ^\D.*$ and select Mark All. This will place blue circles, like marbles, in the left gutter - these are line bookmarks. Then just select from main menu Search -> Bookmark -> Remove bookmarked lines.
Bookmarks are cool, you could extract these lines by simply selecting to copy bookmarked lines, opening new document and pasting lines there. I sometimes use this technique when reviewing log files.
I'm not sure what you are asking, but the regex for finding the lines with a digit at the beginning would be
^\d.*
You can keep all the lines that match the above or, alternatively, remove all the lines that match this expression:
^[^\d].*
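For comparison, outside Notepad++ the whole task (remove the unwanted lines and leave no blank lines behind) can be done in one pass. The following Python sketch assumes the text is already in a string; the function name keep_digit_lines is hypothetical, and the sample is the one from the question.

```python
import re

# A line qualifies if its very first character is an ASCII digit.
STARTS_WITH_DIGIT = re.compile(r"[0-9]")


def keep_digit_lines(text: str) -> str:
    """Keep only lines that start with a digit, line endings included."""
    lines = text.splitlines(keepends=True)
    return "".join(line for line in lines if STARTS_WITH_DIGIT.match(line))


sample = "1hello\nfoo\n2world\nbar\n3!\n"
print(keep_digit_lines(sample))
```

Filtering whole lines, line ending included, avoids the second clean-up step for blank lines.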