How to replace a sentence - regex

How to replace the text
input:
This glass contains 1.5l of milk. But i don't like milk
output:
Replace_text But i don't like milk
I need to replace the text from the beginning of the line up to the first full stop that ends a sentence, but I don't know in advance what the sentence will be.

Vim has its own definition of a sentence; you can read it with :h sentence.
For your example, I would do the following (assuming your cursor is at the beginning of the line):
c)WHATEVER
The c) deletes the first sentence and enters INSERT mode.
For batch replacement, you can record a macro or make use of the :normal command.

What about this substitution command?
:%s/\(.\{1,}\)\.\(\w*\)/Replace_text\2/g
or using character classes (you might need to add special characters etc):
([\w\s]+\.\S[\w\s]+)\.(.*)
http://regexr.com/3cc0r
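Outside Vim, the same idea can be sketched in Python. This is only an illustration, not part of the answers above: the function name replace_first_sentence is hypothetical, and it assumes a sentence ends at the first full stop that is followed by whitespace or the end of the string, so the dot in 1.5l is skipped.

```python
import re

# Assumption: a sentence ends at the first "." that is followed by
# whitespace or by the end of the string (so "1.5l" does not end it).
FIRST_SENTENCE = re.compile(r"^.*?\.(?=\s|$)")


def replace_first_sentence(text: str, replacement: str) -> str:
    """Replace everything up to and including the first sentence-ending dot."""
    return FIRST_SENTENCE.sub(replacement, text, count=1)


print(replace_first_sentence(
    "This glass contains 1.5l of milk. But i don't like milk",
    "Replace_text",
))
# prints: Replace_text But i don't like milk
```

Text without a sentence-ending dot is returned unchanged, because the pattern simply does not match.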

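You can do the following:
^cf.
It jumps to the first non-blank character, deletes the text up to and including the next . and puts you in insert mode so that you can type in Replace_text.
Note that f. stops at the first . on the line, which in the example is the one in 1.5l; ^c2f. reaches the end of the sentence there.
If you want to change only up to the . without including it, use t instead of f.
You can also make use of text objects: cis lets you change the sentence.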

Related

Remove columns from CSV

I don't know anything about Notepad++ Regex.
This is the data I have in my CSV:
6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|0|0|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389
How can I use a Regex to remove the 5th and 6th column? The numbers in the 5th and 6th column are variable in length.
Another problem is the User row can also contain a |, to make it even worse.
I can use a macro to fix this, but the file is a few millions lines long.
This is the final result I want to achieve:
6454345|User1-2ds3|62562012032|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|411b0fdf54fe29745897288c6ad699f7be30f389
I am open for suggestions on how to do this with another program, command line utility, either Linux or Windows.
Match \|[^|]+\|[^|]+(\|[^|]+$)
Replace $1
Basically, anchor to the end of the line, keep the last column and remove the two columns before it. (I assume columns can't be empty. Replace + with * if they can.)
If you need finer control than that, I'd recommend writing a Java or Python script to manually parse and rewrite the file for you.
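For the script route, a minimal Python sketch could split each line from the right, so that extra | characters inside the user field do not matter. The function name drop_two_before_last is hypothetical, and the sketch assumes that every row ends with the two numeric columns followed by the hash, as in the sample data.

```python
def drop_two_before_last(line: str, sep: str = "|") -> str:
    """Remove the two columns that precede the last column."""
    # Split at most 3 times, starting from the right-hand side.
    parts = line.rsplit(sep, 3)
    if len(parts) < 4:
        # Fewer than four columns: leave the line untouched.
        return line
    head, _fifth, _sixth, last = parts
    return head + sep + last


rows = [
    "6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1",
    "8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389",
]
for row in rows:
    print(drop_two_before_last(row))
```

For a file with millions of lines, the same function can be applied while iterating over the file object line by line, which avoids loading the whole file into memory.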
I've captured three groups and given them names. If you use a replace utility like sed or Vim's regex, you can replace the remove group with nothing. Or you can use a programming language to concatenate keep_before and keep_after for the desired result.
^(?<keep_before>(?:[^|]+\|){3})(?<remove>(?:[^|]+\|){2})(?<keep_after>.*)$
You may have to remove the group names and use \1 etc. instead, depending on what environment you use.
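As a sketch of the "programming language" route, here is the same pattern in Python, which spells named groups as (?P<name>...) rather than (?<name>...). The function name drop_columns is hypothetical. Because this pattern counts columns from the left, it gives the wanted result only for rows whose user field contains no |; the third sample row comes out wrong, which the example prints for comparison.

```python
import re

# Same pattern as above, using Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"^(?P<keep_before>(?:[^|]+\|){3})"
    r"(?P<remove>(?:[^|]+\|){2})"
    r"(?P<keep_after>.*)$"
)


def drop_columns(line: str) -> str:
    """Concatenate keep_before and keep_after, dropping the remove group."""
    match = PATTERN.match(line)
    if match is None:
        return line
    return match.group("keep_before") + match.group("keep_after")


# Works when the user field has no "|" in it.
print(drop_columns(
    "6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1"
))
# Counting from the left removes the wrong columns here.
print(drop_columns(
    "8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389"
))
```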
From Notepad++ hit ctrl + h then enter the following in the dialog:
Find what: \|\d+\|\d+(\|[0-9a-z]+)$
Replace with: $1
Search mode: Regular Expression
Click replace and done.
Regex explanation:
\|\d+ : matches the first string that starts with | followed by digits
\|\d+ : matches the second string that starts with | followed by digits
(\|[0-9a-z]+) : matches and captures the string after the second number
$ : forces the match to end at the end of the line
Replacement:
$1 : replaces the matched text with the contents of the captured group, that is, whatever matched between the parentheses (\|[0-9a-z]+)
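To check the pattern outside Notepad++, a small Python sketch can apply it to several lines at once. This is an illustration under assumptions: the function name strip_numeric_columns is hypothetical, Python writes the backreference as \1 instead of $1, and re.MULTILINE is used so that $ matches at the end of every line.

```python
import re

# Two numeric columns followed by the final lowercase-hex column.
TRAILING_COLUMNS = re.compile(r"\|\d+\|\d+(\|[0-9a-z]+)$", re.MULTILINE)


def strip_numeric_columns(text: str) -> str:
    """Keep only the captured last column of each matching line ending."""
    return TRAILING_COLUMNS.sub(r"\1", text)


sample = "\n".join([
    "3305611|User2-42g563dgsdbf|22023001345|0|0|c36dedfa12634e33ca8bc0ef4703c92b73d9c433",
    "8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389",
])
print(strip_numeric_columns(sample))
```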

Regex expression that selects only specific word

Welcome guys, I am just new to this community!
Here is the case, I am having some strings like these
thatisanappleaaa
thatisanappleaaa bad
thatisanappleaaa.bad
thatisanapplebadaaa
thatisanbadappleaaa
thatisanbadbadappleaaa
badthatisanappleaaa
and I am trying to use the Find and Replace function in Sublime Text 3 to achieve the following (note that only the first line is replaced)
thatisanorangeaaa
thatisanappleaaa bad
thatisanappleaaa.bad
thatisanapplebadaaa
thatisanbadappleaaa
thatisanbadbadappleaaa
badthatisanappleaaa
Is there a regex that matches "apple" in "thatisanappleaaa" (which is line one) only when "bad" does not appear at any position in the string (other than inside "apple"), given that the string "bad" is spelled the same way every time it appears?
Try
(\w+)apple(\w+).*
This selects all the text wrapped around apple.
If you only want to select the text trailing after apple, use
apple(\w+).*
After reading your description I'm assuming you want to replace the word apple only in sentences which do not have any occurrences of the word bad.
I've used a regex which uses a negative lookahead and used parentheses to capture apple which can then be replaced with any word, in your case orange.
Regex: ^(?!.*bad).*(apple)
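A minimal Python sketch of the negative-lookahead idea follows. It is a variation rather than the exact regex above: since a replacement overwrites the whole match, the text before apple is captured as group 1 and written back. The function name swap_fruit is hypothetical.

```python
import re

# (?!.*bad) rejects any line that contains "bad" anywhere;
# group 1 keeps the text in front of "apple" so it can be restored.
NO_BAD_APPLE = re.compile(r"^(?!.*bad)(.*)apple")


def swap_fruit(line: str) -> str:
    """Replace apple with orange only on lines that contain no 'bad'."""
    return NO_BAD_APPLE.sub(r"\1orange", line, count=1)


for line in ["thatisanappleaaa", "thatisanappleaaa bad", "badthatisanappleaaa"]:
    print(swap_fruit(line))
```

In an editor whose replace field supports backreferences, the equivalent would be to search for the same pattern and replace with group 1 followed by orange.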

Regex - replace folder details with filename

Completely new to Regex so I was hoping I could find an answer here.
I'm using Notepad++, and I have a big bulk of file details from a folder in a text document, like so:
01/01/2015 08:00 1,000,000 filename.exe
01/02/2015 08:30 1,450,000 aDifferentFilename.exe
And I want to do a find and replace so that the whole thing is replaced by:
filename.exe
aDifferentFilename.exe
I could delete them manually, but there's over a thousand lines!
I've used ^(.*)% to find the lines one by one, but what would I put in the replace field to keep the filename, i.e filename.exe?
Any help/explanation would be great!
In Notepad++'s find dialog, click on the tab for "replace" (probably obvious, but to be complete). Make sure the radio button for "Regular expression" is checked (again, probably obvious). In the "Find what:" text box enter:
^([^ ]+[ ]+){3}(.*)$
if the pattern in your file is consistently four fields of information (including the file name), each separated by spaces. Explanation: this finds three groups of one or more non-space characters followed by one or more spaces, then everything else on the line. "Everything else on the line" is assigned to group 2 (it is enclosed in the second set of parentheses in the expression). We will use this fact below to specify the "Replace with:" string. Matching the rest of the line is also necessary to advance the search position past the text we want to keep; otherwise, after the replacement, the kept text could itself match the expression and be replaced.
Alternatively, enter this:
^(.{34})(.*)$
if the consistent pattern in your file is that the file name always starts in the 35th column (both patterns could hold true, in which case you could use either). Explanation: this finds the first 34 characters at the start of each line, followed by everything else on the line. See the explanation above for why we want to "find everything else on the line." Note that it is not necessary to group ".{34}" in parentheses; I simply did this so that in both examples the "Replace with:" text would be group 2.
In the "Replace with:" text box enter \2
Explanation: This tells Notepad++ to replace what we matched with the group 2 subset of what we matched, in other words, "everything else on the line", which in this case is the file name.
Click "Replace"
Another option: If the text you want to keep always starts in column 35 (like required for the approach immediately above), you can select the column of text you want to delete by holding down ctrl+alt+shift and then left clicking with your mouse and dragging. Once the text is selected, hit delete
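The first pattern and its \2 replacement can be tried outside Notepad++ with a short Python sketch. The function name keep_filename is hypothetical, and the sketch assumes the four space-separated fields shown in the question, with the file name last.

```python
import re

# Three "non-spaces then spaces" fields, then the rest of the line as group 2.
LEADING_FIELDS = re.compile(r"^([^ ]+[ ]+){3}(.*)$")


def keep_filename(line: str) -> str:
    """Replace the whole line with group 2, i.e. everything after field 3."""
    return LEADING_FIELDS.sub(r"\2", line)


print(keep_filename("01/01/2015 08:00 1,000,000 filename.exe"))
# prints: filename.exe
```

Because only the first three fields are consumed explicitly, a file name that itself contains spaces is kept whole.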
You can try matching on either 3 sets of spaces, or assume the comma is always at a fixed position. Here is something quick and dirty which greedily matches up to the last comma and the 4 characters after it (three digits and a space, as in the sample lines). Replace the match with nothing.
^(.*,....)

How to search a word using regex and concatenate it to other words also found by using regex on a per line basis?

I have a file in format:
has | have | had\tmeaning of have\n
apple\tmeaning of apple\n
write | wrote\tmeaning of write\n
I want to have it in the following format:
has\tmeaning of have\n
have\tmeaning of have\n
had\tmeaning of have\n
apple\tmeaning of apple\n
etc. The word(s) (has, have, had) can be single or multiple. Multiple words are separated by space, pipe character, space. The meaning is preceded by a tab character and ended by a newline. I am not sure, but I want to assume that the meaning may contain a pipe or tab character (or, better, any character except newline). Can it be done in Notepad++? If not, is there another easy alternative?
My input file uses actual newline and tab characters. Since I can't paste them in Stack Overflow, I have presented them as \n and \t (escape sequences) in the examples instead.
EDIT
It sounds like the \t and \n in your input are actual tab and newline characters rather than literal backslash sequences. In that case, this should work:
Search: \s*([^ |]+) \|\s*(?=.*?\t(.*?)(?=(?:\R|$)))
Replace: \1\t\2\n
Original
In the Replace tab, make sure to check the "regex" box at the bottom left, then use this (it matches \t and \n written out as literal backslash sequences):
Search: \s*([^ |]+) \|\s*(?=.*?\\t(.*?)(?=(?:\\n|$)))
Replace: \1\t\2\n
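As an alternative outside Notepad++, the transformation can be sketched in a few lines of Python. This is an illustration under assumptions: the function name expand_entry is hypothetical, words are taken to be everything before the first tab, and the meaning is everything after it, so the meaning may contain further tabs or pipes.

```python
def expand_entry(line: str) -> list[str]:
    """Turn 'w1 | w2<TAB>meaning' into one 'word<TAB>meaning' line per word."""
    # Split on the first tab only, so tabs inside the meaning survive.
    words, meaning = line.split("\t", 1)
    return [f"{word}\t{meaning}" for word in words.split(" | ")]


for entry in ["has | have | had\tmeaning of have", "apple\tmeaning of apple"]:
    for out_line in expand_entry(entry):
        print(out_line)
```

A line without any tab would raise a ValueError here, which may be preferable to silently passing through malformed entries.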

auto indent in vim string replacement new line?

I'm using the following command to auto replace some code (adding a new code segment after an existing segment)
%s/my_pattern/\0, \r some_other_text_i_want_to_insert/
The problem is that with the \r, some_other_text_i_want_to_insert gets inserted right after the new line:
mycode(
    some_random_text my_pattern
)
would become
mycode(
    some_random_text my_pattern
some_other_text_i_want_to_insert <--- this line is NOT indented
)
instead of
mycode(
    some_random_text my_pattern
    some_other_text_i_want_to_insert <--- this line is now indented
)
i.e. the new inserted line is not indented.
Is there any option in vim or trick that I can use to indent the newly inserted line?
Thanks.
Try this:
:let @x="some_other_text_i_want_to_insert\n"
:g/my_pattern/normal "x]p
Here it is, step by step:
First, place the text you want to insert in a register...
:let @x="some_other_text_i_want_to_insert\n"
(Note the newline at the end of the string -- it's important.)
Next, use the :global command to put the text after each matching line...
:g/my_pattern/normal "x]p
The ]p normal-mode command works just like the regular p command (that is, it puts the contents of a register after the current line), but also adjusts the indentation to match.
More info:
:help ]p
:help :global
:help :normal
%s/my_pattern/\=submatch(0).", \n".matchstr(getline('.'), '^\s*').'some_other_text'/g
Note that you will have to use submatch() and string concatenation instead of & and \N (\1, \2, and so on). This answer relies on the fact that the substitute command puts the cursor on the line where it does the substitution, so getline('.') returns that line.
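The logic of that substitution, reusing the current line's leading whitespace for the inserted line, can be sketched in Python to make it easier to follow. This is not Vim code: the function name insert_indented is hypothetical, and the sketch inserts a comma without the trailing space.

```python
import re


def insert_indented(text: str, pattern: str, new_text: str) -> str:
    """After each match, break the line and add new_text at the line's indent."""
    result = []
    for line in text.split("\n"):
        # Same idea as matchstr(getline('.'), '^\s*'): the leading whitespace.
        indent = re.match(r"[ \t]*", line).group(0)
        result.append(
            re.sub(pattern, lambda m: m.group(0) + ",\n" + indent + new_text, line)
        )
    return "\n".join(result)


source = "mycode(\n    some_random_text my_pattern\n)"
print(insert_indented(source, "my_pattern", "some_other_text"))
```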
How about normal =``?
:%s/my_pattern/\0, \r some_other_text_i_want_to_insert/ | normal =``
<equal><backtick><backtick>: re-indent from the cursor to the position before the latest jump
(Sorry about the strange formatting, escaping backtick is really hard to use here)
To keep them as separate commands, you could use one of these mappings:
" Equalize and move cursor to end of change - more intuitive for me"
nnoremap =. :normal! =````<CR>
" Equalize and keeps cursor at beginning of change"
nnoremap =. :keepjumps normal! =``<CR>
I read the mapping as "equalize last change" since dot already means "repeat last change".
Or skip the mapping altogether since =`` is only 3 keys with 2 of them being repeats. Easy peasy, lemon squeezy!
References
:help =
:help mark-motions
A somewhat roundabout way of achieving the same thing: you could record a macro which finds the next occurrence of my_pattern and inserts after it a newline and your replacement string. If autoindent is turned on, the indent level will be maintained regardless of where the occurrence of my_pattern is found.
Something like this key sequence:
q1 # begin recording into register 1
/my_pattern/e<CR> # find my_pattern, set cursor to end of match
a # append
<CR>some_other_text... # press Enter, then type the text to append
<esc> # exit insert mode
q # stop recording
Repeat it by pressing @1
You can do it in two steps. This is similar to Bill's answer but simpler and slightly more flexible, since you can use part of the original string in the replacement.
First substitute and then indent.
:%s/my_pattern/\0, \r some_other_text_i_want_to_insert/
:%g/some_other_text_i_want_to_insert/normal ==
If you use part of the original string with \0,\1, etc. just use the common part of the replacement string for the :global (second) command.
I achieved this by using \s* at the beginning of my pattern to capture the preceding whitespace.
I'm using the vim addon for VSCode, which doesn't seem to match standard vim completely, but for me,
:%s/(\s*)(existing line)/$1$2\n$1added line/g
turns this
mycode{
    existing line
}
into this
mycode{
    existing line
    added line
}
The parentheses in the search pattern define groups which are referenced by $1 and $2. In this case $1 is the whitespace captured by (\s*). I'm not an expert on different implementations of Vim or regex, but as far as I can tell, this way of referencing regex groups is specific to VSCode (or at least not general). Using \s* to capture a group of whitespace should be general, though, or at least have a close analog in your environment.
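As one such analog, here is a minimal Python sketch of the same capture-the-indent technique. The function name add_line_after is hypothetical; it uses [ \t]* rather than \s* so that the captured group cannot swallow a preceding newline, and re.MULTILINE so that ^ and $ work per line.

```python
import re


def add_line_after(text: str, existing: str, added: str) -> str:
    """Insert `added` below each `existing` line, reusing its indentation."""
    pattern = re.compile(
        r"^([ \t]*)(" + re.escape(existing) + r")$", re.MULTILINE
    )
    # Group 1 is the captured indent; it is emitted twice, once per line.
    return pattern.sub(
        lambda m: f"{m.group(1)}{m.group(2)}\n{m.group(1)}{added}", text
    )


print(add_line_after("mycode{\n    existing line\n}", "existing line", "added line"))
```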