Find and Partially Replace Notepad++ Regex - regex

I have a file with lines containing a space, 9 digits, 6 spaces and 5C18. Finding them is easy; I'm using
\s\d{9}\s{6}5C18
The problem is that I need to replace the space at the beginning of the line with a letter, say F, so that everything else remains intact. Every time I try to do it, the entire line is replaced with the expression. I know this is probably something stupidly basic, but any help would be appreciated.

Move the part that you do not wish to replace into a lookahead expression:
^\s(?=\d{9}\s{6}5C18)
Now the portion in (?= ... ) is not considered part of the match; only the initial space is. Hence, running a replace with this regex would let you replace the initial space with whatever characters that you want.
It's text on a single line. The F needs to go where that first space is at the beginning of the line.
Note the use of ^ anchor to ensure that the match of the initial space is tied to the beginning of the line.
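The same lookahead idea can be tried outside Notepad++. Here is a minimal Python sketch, assuming Python's re module handles (?= ... ) the same way for this pattern; the function name mark_lines and the sample text are made up for illustration.

```python
import re

# The answer's pattern: one leading whitespace character, followed (but not
# consumed) by 9 digits, 6 whitespace characters and the literal 5C18.
PATTERN = re.compile(r"^\s(?=\d{9}\s{6}5C18)", re.MULTILINE)


def mark_lines(text, letter="F"):
    """Replace only the leading space of matching lines with `letter`."""
    return PATTERN.sub(letter, text)


if __name__ == "__main__":
    # Hypothetical sample: the first line matches, the second has too few digits.
    sample = (
        " " + "123456789" + " " * 6 + "5C18 rest of line\n"
        + " " + "12345" + " " * 6 + "5C18 not enough digits\n"
    )
    print(mark_lines(sample))
```

Because the lookahead is not part of the match, only the single leading space is handed to the replacement, and the digits and 5C18 stay untouched.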

Related

reuse last matched character of regex in sed

Many of you with a certain leaning towards proper formatting will know the pain of having a lot of space characters instead of a tab character in the beginning of indented lines after another person edited a file and added lines. I seem to be unable to teach my colleagues how to use vim's integrated line pasting function, so I'm searching for some simple ways to automatically correct lines beginning with a certain pattern. ;)
I'm using a regex to find the corresponding lines, but I can't work out how to "reuse" the last matched character in sed when using "find and replace". The regex matching the lines is
'^\ *[A-Z]'
I would like to replace those space characters, but keep the uppercase letter. My idea would be something like
sed 's|^\ *[A-Z]|\t$|g'
or so, but I guess that would replace the whole line with a single tab character since $ usually matches the line ending?
Is there a simple way to reuse parts of the matched regex in sed?
How about simply not including the first non-space character in the match in the first place?
This matches all spaces at the beginning of a line:
^ *
Edit (quote from the comments):
obviously I don't want to replace spaces in front of other characters than uppercase letters
A look-ahead could do that, but unfortunately sed does not support them. But you can use the next best thing, an address expression that determines which lines sed operates on:
sed '/^ *[A-Z]/ s|^ *|\t|'
Of course a back-reference would do it as well:
sed 's|^ *\([A-Z]\)|\t\1|'
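For comparison, here is a rough Python sketch of both ideas: the back-reference from the sed command, and the look-ahead that sed lacks. The function names and the sample text are my own, and Python's re module is only standing in for sed here.

```python
import re


def retab_backreference(text):
    # Capture the uppercase letter and put it back with \1, as in the sed answer.
    return re.sub(r"^ *([A-Z])", r"\t\1", text, flags=re.MULTILINE)


def retab_lookahead(text):
    # The look-ahead checks for the uppercase letter without consuming it,
    # so nothing has to be put back.
    return re.sub(r"^ *(?=[A-Z])", "\t", text, flags=re.MULTILINE)


if __name__ == "__main__":
    sample = "    Indented\n    lowercase stays\nNoIndent\n"
    print(repr(retab_backreference(sample)))
    print(repr(retab_lookahead(sample)))
```

Note that, like the sed commands, both variants also put a tab in front of a line that starts with an uppercase letter and has no leading spaces, because ` *` matches zero spaces.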

Regex to remove the first 2 lines of a text file

I am trying to delete only the first 2 lines of a text file.
I tried using \A.*, but this gets the first line and deletes the rest.
Is there a way to do the inverse?
It is maybe not the most convenient way, but it is possible with Regex:
^.*\n.*\n([\s\S]*)$
With default settings (neither single-line nor multi-line modifiers), the '.' matches everything except a newline. Therefore, .*\n matches one line, including the newline character. Repeat it twice, and we are at the beginning of the third line. Now capture all characters, including newlines ([\s\S] is a nice workaround for this behavior), until the end of the file $.
Then substitute by the first capturing group
\1
and you have everything but the first 2 lines.
How you write the substitute string depends on your regex engine. And depending on the platform or the newline character used in the file, you might need to replace the \n with \r\n or \r, or with a pattern that matches them all: (\r\n?|\n).
Here is a working Demo.
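As a sketch of how this could look in one particular engine, here is a Python version. Two things are my own assumptions rather than part of the answer: the name drop_first_two_lines, and the use of [^\r\n]* in place of .* (Python's . also matches \r, which would give wrong results for files that use \r alone as the line break).

```python
import re

# A line break is \r\n, \r or \n, following the "matches it all" alternative.
NEWLINE = r"(?:\r\n?|\n)"
# Stand-in for .* that stops at either kind of line-break character.
LINE = r"[^\r\n]*"

FIRST_TWO_LINES = re.compile(
    r"^" + LINE + NEWLINE + LINE + NEWLINE + r"([\s\S]*)$"
)


def drop_first_two_lines(text):
    """Return text without its first two lines.

    Text containing fewer than two line breaks is returned unchanged.
    """
    # \1 is the capturing group holding everything after the second line break.
    return FIRST_TWO_LINES.sub(r"\1", text, count=1)


if __name__ == "__main__":
    print(repr(drop_first_two_lines("one\ntwo\nthree\nfour\n")))
```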

How can I remove all the text between matches on a line?

I have this problem:
Input text:
this is my text text text and more text
this is my text myspace this is my text
this space is my text space this is my
this is my text this is my text
this space is my text space space myspace
Let's say I want to search for "space".
I would like to have this as output:
this is my text text text and more text
space
space space
this is my text this is my text
space space space space
Matches on the same line have to be separated with a space.
Line without matches must remain as it is.
Same for all other search items.
I've been trying to do this all afternoon, but without success.
Can anyone help me?
Solution:
:g/space/s/\(.*space\).*$/\1/|s/.\{-}space/ space/g|s/^ //
Explanation:
This is tricky, but it can be done. It can't be done with a single regular expression, though.
The first thing we do is get rid of anything after the last match (we actually exploit the fact that regular expressions are greedy by default here):
s/\(.*space\).*$/\1/
Then we remove anything between all the internal matches (notice we use the lazy version of * here, \{-}):
s/.\{-}space/ space/g
The previous step will leave an initial space in the result, so we get rid of that:
s/^ //
Fortunately, in vim, we can chain replacements together with the | character. So, putting it all together:
:g/space/s/\(.*space\).*$/\1/|s/.\{-}space/ space/g|s/^ //
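To see the three steps in isolation, here is a small Python sketch that mirrors them one by one. It is only an approximation of the vim command (Python's .*? plays the role of vim's .\{-}), and the function name keep_matches is made up.

```python
import re


def keep_matches(line, word="space"):
    """Mirror the three chained substitutions for a single line."""
    if word not in line:  # plays the role of :g/space/
        return line
    w = re.escape(word)
    # Step 1: greedy .* runs to the LAST match; drop everything after it.
    line = re.sub(r"(.*" + w + r").*$", r"\1", line)
    # Step 2: lazy .*? swallows the text before each match; keep " word".
    line = re.sub(r".*?" + w, lambda m: " " + word, line)
    # Step 3: remove the leading space that step 2 leaves behind.
    return re.sub(r"^ ", "", line)


if __name__ == "__main__":
    for sample in (
        "this is my text text text and more text",
        "this space is my text space space myspace",
    ):
        print(keep_matches(sample))
```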
Is this tricky line OK for you?
:g/space/s/space/^G/g|s/[^^G]//g|s/^G/space /g
To enter the ^G above, you need to press Ctrl-V Ctrl-G.
The output of the above command is the same as your example, except for the trailing whitespace after the pattern (space in this case). But that is easy to fix, e.g. chain another s/ $// after the :g line.
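The placeholder trick is easy to try in Python as well. This is just a sketch under the same assumption the vim command makes, namely that the Ctrl-G character does not occur in the text; the names PLACEHOLDER and keep_matches_placeholder are mine.

```python
import re

PLACEHOLDER = "\x07"  # Ctrl-G (BEL); assumed not to occur in the text


def keep_matches_placeholder(line, word="space"):
    """Keep only the occurrences of a fixed string, separated by spaces."""
    if word not in line:  # plays the role of :g/space/
        return line
    line = line.replace(word, PLACEHOLDER)             # s/space/^G/g
    line = re.sub("[^" + PLACEHOLDER + "]", "", line)  # s/[^^G]//g
    line = line.replace(PLACEHOLDER, word + " ")       # s/^G/space /g
    return line.rstrip(" ")                            # s/ $//


if __name__ == "__main__":
    print(keep_matches_placeholder("this space is my text space this is my"))
```

As with the vim version, this only works for a fixed search string, since the placeholder replaces literal text rather than a pattern.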
Kent's solution uses a nice trick that makes it work only for fixed strings, but it's clean and short. Ethan Brown's answer is more general, but also adds complexity with its three steps. I think the best solution can be developed based on the accepted answer in this very similar question.
Contrary to what Ethan Brown assumes, this can indeed be done with a single regular expression substitution. Here it is, in all its ugliness:
:g/space/s/\%(^\|\%(space \)*space\%( \%(.*space\)\#=\)\?\)\zs\%(\%(space \)*space\%( \%(.*space\)\#=\)\?\)\#!.\{-1,}\ze\%(\%(space \)*space\%( \%(.*space\)\#=\)\?\|$\)//g
It becomes somewhat more readable when you use the :DeleteExcept command from my PatternsOnText plugin:
:g/space/DeleteExcept/\%(space \)*space\%( \%(.*space\)\#=\)\?/
Explanation
This deletes everything except
potentially multiple sequential occurrences \%(space \)*
of the word space
including the trailing whitespace when it's not the last match in the line, i.e. there's a following match \%(.*space\)\#= so that the whitespace is not swallowed
or excluding (i.e. deleting) it \? after the last match in the line.
More practical alternative
Though it's a nice challenge to come up with the above solution, in practice, I would also favor a two-step approach, just because it's way simpler:
:g/space/DeleteExcept/space\%( \|$\)/
This leaves behind trailing whitespace that can be pruned with
:%s/ $//

regular expression to remove the first word of each line

I am trying to make a regular expression that grabs the first word (including possible leading white space) of each line. Here it is:
/^([\s]+[\S]*).*$/\1//
This code does not seem to be working (see http://regexr.com?34o6m). The code is supposed to
Begin at the start of the line
Create a capturing group where it places the first word (with possible leading white space)
Grab the rest of the line
Substitute the entire line with just the inside of the first capturing group
I tried another version also:
/\S(?<=\s).*^//
It looks like this one fails too (http://regexr.com?34o6s). The goal here was to
Find the first non-whitespace character.
Look behind to make sure it has a whitespace character behind it (i.e. not the first letter of the line).
Grab the rest of the line.
Erase everything the expression just grabbed.
Any insight to what is going wrong would be greatly appreciated. Thanks!
Try this regular expression
^(\s*.*?\s).*
Demo: gskinner
You mixed up your + and *.
/^([\s]*[\S]+).*$/\1/
This means zero or more spaces followed by one or more non-spaces.
You might also want to use $1 instead of \1:
/^([\s]*[\S]+).*$/$1/
Okay, well this seems to work using replace() in Javascript:
/^([\s]*[\S]+).*$/
I tested it on www.altastic.com/regexinator, which as far as I know is accurate [I made it though, so it may not be ;-) ]
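Here is a short Python sketch that contrasts the original quantifiers with the swapped ones. It assumes the line is processed on its own, without a trailing newline; the names BROKEN, FIXED and keep_first_word are made up for the example.

```python
import re

BROKEN = re.compile(r"^([\s]+[\S]*).*$")  # requires leading whitespace
FIXED = re.compile(r"^([\s]*[\S]+).*$")   # leading whitespace is optional


def keep_first_word(line, pattern=FIXED):
    """Replace the whole line with its first word (plus leading whitespace)."""
    # Python's re.sub uses \1 for the group; JavaScript's replace() uses $1.
    return pattern.sub(r"\1", line)


if __name__ == "__main__":
    print(repr(keep_first_word("hello world again")))          # fixed pattern
    print(repr(keep_first_word("hello world again", BROKEN)))  # no match at all
```

The original pattern only matches lines that begin with whitespace, so a line starting directly with a word is left as it is.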
Remove the first two words:
#"^.*? .*? "
This works for me.
(When I posted this, the asterisk signs did not show, so the pattern was originally written with the word "asterisk" in their place. I have no idea why.)
If you want to remove only the first word, simply build the regex as follows:
a dot sign
an asterisk sign
a question mark
a space
and replace with "".
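A quick way to check the lazy dot-asterisk-question-mark-space idea is a few lines of Python. This is only a sketch: the helper remove_first_words is hypothetical, and it assumes words are separated by single spaces and the line has no leading space.

```python
import re


def remove_first_words(line, count=1):
    """Remove the first `count` space-terminated words from a line."""
    # Each ".*? " lazily matches up to and including the next space.
    pattern = "^" + ".*? " * count
    return re.sub(pattern, "", line)


if __name__ == "__main__":
    print(remove_first_words("one two three"))           # first word removed
    print(remove_first_words("one two three", count=2))  # first two removed
```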

clarification on vim pattern matching

I want to convert the 5 space indentation in a python file to 4 space indentation. I want the command to do the following
remove a single space in all the lines which starts with a space followed by characters.
I issued the command %s/^\ [a-zA-Z]*// which seems to work. Later I figured out that the command should actually be
remove a single space in all the lines which starts with a space followed by any number of spaces followed by characters.
However, I still cannot figure out how the command above is working. For the following code it should basically report a "pattern not found" error, but it still works.
class H:
     def __init__():
          hell()
It's working because * means "match zero or more of the previous atom". In your case, it's matching zero. You probably wanted to use \+ instead which means "match one or more of the previous atom".
In actuality, you could have just dropped the * entirely because just a space followed by a single character would have matched what you were originally searching for. There are better regular expressions for what you're trying to accomplish, but that's not what you're asking here.
Edit (clarification):
Your regex as it stands (^\ [a-zA-Z]*) translates to:
^: From the start of the line
\ : Match a space (a backslash followed by a space)
[a-zA-Z]: Followed by a letter
*: Zero or more times (of the previous atom - a letter)
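The difference between "zero or more" and "one or more" can be reproduced in Python. This is a sketch only: Python writes the quantifier as + where vim writes \+, and the names STAR, PLUS, strip_star and strip_plus are made up.

```python
import re

STAR = re.compile(r"^ [a-zA-Z]*")  # a space, then zero or more letters
PLUS = re.compile(r"^ [a-zA-Z]+")  # a space, then one or more letters


def strip_star(line):
    return STAR.sub("", line)


def strip_plus(line):
    return PLUS.sub("", line)


if __name__ == "__main__":
    line = " " * 5 + "def __init__():"
    # STAR matches the first space plus zero letters, so one space is removed.
    print(repr(strip_star(line)))
    # PLUS needs a letter right after the first space, so nothing matches.
    print(repr(strip_plus(line)))
    # With a single leading space, STAR also eats the letters that follow it.
    print(repr(strip_star(" def f():")))
```

The last call shows why the original pattern is risky: on a line with exactly one leading space it removes the following word as well.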