I have a STRING
"wordride plain fire "
I have tried to replace with Regular Expressions:
Find what: (?>(word)|\G(?<!^))\K\S
Replace with: $1$2$0
In Notepad++ it does not change the text, but it works in regex101 (https://regex101.com/r/aI6gE1/2), where it replaces the characters after word one at a time, as follows:
First replace: wordwordide plain fire
Second replace: wordwordwordde plain fire
Third replace: wordwordwordworde plain fire
Fourth replace: wordwordwordwordword plain fire
Fifth replace: wordwordwordwordwordwordplain fire
Sixth replace: wordwordwordwordwordwordwordlain fire
Can you help me find the error, or give me a workaround in Notepad++ for this purpose: replacing the string after "word" character by character, using a group that is not part of the current match?
Please help me
The answer is yes, it is possible to do this with Notepad++, BUT only with the help of the PythonScript plug-in.
Get the plugin ready, and create the following script:
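import re
regex = r"^(word)(.+)"
def process_match(match):
    # Keep Group 1 once, then emit one more copy of it for every character in Group 2
    return "{0}{1}".format(match.group(1), "".join([match.group(1) for x in list(match.group(2))]))
# "editor" is the object PythonScript provides for the current Notepad++ document
editor.rereplace(regex, process_match)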
The ^(word)(.+) pattern will match a line with word at its start into Group 1 and all the rest of the line into Group 2.
The "{0}{1}".format(match.group(1), "".join([match.group(1) for x in list(match.group(2))])) call pastes the Group 1 value into the result first (the first match.group(1) argument), and then "".join([match.group(1) for x in list(match.group(2))]) replaces each character in Group 2 with the value of Group 1.
This text:
word1
word1 2
wordride plain fire
will turn into:
wordword
wordwordwordword
wordwordwordwordwordwordwordwordwordwordwordwordwordwordwordword
NOTE: You can control how many chars after word are replaced with word by adjusting the (.+) pattern.
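To check the replacement logic outside Notepad++, the same idea can be sketched with Python's standard re module. This is only an illustration: repeat_word is a hypothetical helper name, and it uses re.sub instead of the PythonScript editor.rereplace call.

```python
import re

def repeat_word(line: str, word: str = "word") -> str:
    """Replace every character that follows a leading `word` with `word` itself."""
    pattern = re.compile(r"^(" + re.escape(word) + r")(.+)")
    # Group 1 is kept once; Group 2 contributes one copy of Group 1 per character.
    return pattern.sub(lambda m: m.group(1) + m.group(1) * len(m.group(2)), line)

for sample in ["word1", "word1 2", "wordride plain fire"]:
    print(repeat_word(sample))
```

Lines that do not start with word are returned unchanged, because the pattern is anchored with ^.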
It's hard to understand exactly what you want to do, but the following works with your examples:
Find: ^((word)+).
Replace with: $1$2
I am trying to use a regex in Notepad++ to select everything after v+(number|character)*, but the selection should exclude the v+(num|char)* part itself.
e.g. master\_\move_consolidate_archives_html_to_move_base_v2kjkj_(2021_01_19_11h43m59s-fi_m_dt xx-) - Copy (2).bat"
I am expecting
_(2021_01_19_11h43m59s-fi_m_dt xx-) - Copy (2).bat"
So far I can use this pattern: (?i)(v\d[0-9a-z]*)
to select v2kjkj,
but I can't get it to work inside a lookbehind (?<=xxxx).
I also tried an if-then-else conditional, but had no luck; I still don't understand it well enough to use it.
The issue is that the "v" part follows different patterns, so I can't hard-code a fixed string:
v2
v23
v2kjkj
v2343434
Test string:
mmaster\_\move_consolidate_archives_html_to_move_base_v2_16_.bat"
master\_\move_consolidate_archiv es_html_to_move_base_v23_17_.bat"
master\_\move_consolidate_archives_html_to_move_base_v2_17_(2021_01_19_12h37m19s-fi_m_dt xx-).bat"
master\_\move_consolidate_archives_html_to_move_base_v2_(2021_01_19_11h43m59s-fi_m_dt xx-) - CopyCopy.bat"
master\_\move_consolidate_archives_html_to_move_base_v2kjkj_(2021_01_19_11h43m59s-fi_m_dt xx-) - Copy (2).bat"
master\_\move_consolidate_archives_html_to_move_base_v2343434_(2021_01_19_11h43m59s-fi_m_dt xx-) - Copy (3).bat"
I have been reading and searching for a day, but I can't apply anything I have seen so far.
The closest ones I found were:
Regexp match everything after a word
Getting the text that follows after the regex match
Any comments are welcome.
Ctrl+H
Find what: v\d[0-9a-z]*\K.*$
Replace with: LEAVE EMPTY
UNCHECK Match case
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
Explanation:
v # a "v"
\d # a digit
[0-9a-z]* # 0 or more alphanumeric characters
\K # forget all we have seen until this position
.* # 0 or more any character but newline
$ # end of line
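For comparison, here is a minimal sketch of the same match in Python. Python's re module does not support \K, so this version uses capturing groups instead; split_at_version and VERSION_TAIL are hypothetical names chosen for the example.

```python
import re

# Group 1 is the version tag (v + digit + alphanumerics), group 2 is the rest of the line.
VERSION_TAIL = re.compile(r"(v\d[0-9a-z]*)(.*)$", re.IGNORECASE)

def split_at_version(line: str):
    """Return (text up to and including the version tag, text after it), or None."""
    m = VERSION_TAIL.search(line)
    if m is None:
        return None
    return line[:m.end(1)], m.group(2)

sample = r'master\_\move_consolidate_archives_html_to_move_base_v2kjkj_(2021_01_19_11h43m59s-fi_m_dt xx-) - Copy (2).bat"'
head, tail = split_at_version(sample)
print(tail)
```

Here tail is the part the question wants to select, and head is what remains after the Replace All above deletes it.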
I don't know anything about Notepad++ Regex.
This is the data I have in my CSV:
6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|0|0|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389
How can I use a Regex to remove the 5th and 6th column? The numbers in the 5th and 6th column are variable in length.
Another problem is that the User column can also contain a |, to make it even worse.
I could use a macro to fix this, but the file is a few million lines long.
This is the final result I want to achieve:
6454345|User1-2ds3|62562012032|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|411b0fdf54fe29745897288c6ad699f7be30f389
I am open to suggestions on how to do this with another program or command line utility, on either Linux or Windows.
Match \|[^|]+\|[^|]+(\|[^|]+$)
Replace $1
Basically, anchor to the end of the line, keep the last column, and remove the two columns before it, [-3] and [-2]. (I assume columns can't be empty. Replace + with * if they can.)
If you need finer control than that, I'd recommend writing a Java or Python script to manually parse and rewrite the file for you.
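A script along those lines could look like the following sketch. It splits from the right, so extra | characters in the user column do not matter. The function names, the file paths passed to convert, and the UTF-8 encoding are assumptions made for illustration.

```python
def drop_two_before_last(line: str, sep: str = "|") -> str:
    """Remove the second- and third-to-last columns, counting from the right."""
    parts = line.rsplit(sep, 3)
    if len(parts) < 4:
        return line  # fewer than four columns: leave the line untouched
    head, _col5, _col6, last = parts
    return head + sep + last

def convert(src_path: str, dst_path: str) -> None:
    # Stream line by line so a multi-million-line file never has to fit in memory.
    # The encoding is an assumption; adjust it to match the real file.
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(drop_two_before_last(line.rstrip("\r\n")) + "\n")
```

Calling convert("in.csv", "out.csv") (both paths are placeholders) would write the cleaned file next to the original instead of editing it in place.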
I've captured three groups and given them names. If you use a replace utility like sed or vimregex, you can replace the remove group with nothing. Or you can use a programming language to concatenate keep_before and keep_after for the desired result.
^(?<keep_before>(?:[^|]+\|){3})(?<remove>(?:[^|]+\|){2})(?<keep_after>.*)$
You may have to remove the group names and use \1 etc. instead, depending on what environment you use. Note that this pattern counts columns from the left, so it assumes the user column contains no |.
In Notepad++, press Ctrl+H, then enter the following in the dialog:
Find what: \|\d+\|\d+(\|[0-9a-z]+)$
Replace with: $1
Search mode: Regular Expression
Click replace and done.
Regex explanation:
\|\d+ : match the 1st string, which starts with | followed by digits
\|\d+ : match the 2nd string, which starts with | followed by digits
(\|[0-9a-z]+): match and capture the string after the 2nd number.
$ : force the match to end at the end of the line.
Replacement:
$1 : replace the whole match with the contents of the captured group, that is, whatever (\|[0-9a-z]+) matched
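The same find/replace pair can be tried in Python as a quick check. This is a sketch only: strip_columns and TRAILING are hypothetical names, and \1 is Python's spelling of the $1 backreference.

```python
import re

# Two numeric columns followed by the final lowercase-alphanumeric column.
TRAILING = re.compile(r"\|\d+\|\d+(\|[0-9a-z]+)$")

def strip_columns(line: str) -> str:
    """Drop the two numeric columns that precede the last column."""
    return TRAILING.sub(r"\1", line)

print(strip_columns("3305611|User2-42g563dgsdbf|22023001345|0|0|c36dedfa12634e33ca8bc0ef4703c92b73d9c433"))
```

Because the pattern is anchored only at the end of the line, a | inside the user column does not affect the match.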
To start off, I want to be able to do 2 things:
1st Thing:
To extract foo_abc (and similarly every other line, for example, goo_zxy, and doo_fgh), I needed to remove some text appended BEFORE foo_abc, and AFTER foo_abc.
For example:
TEXTBEFOREfoo_abcTEXTAFTER
TEXTBEFOREgoo_zxyTEXTAFTER
TEXTBEFOREdoo_fghTEXTAFTER
to obtain:
foo_abc
goo_zxy
doo_fgh
2nd Thing:
I now need to append different text before and after foo_abc again.
Like so:
TextAfoo_abcTextB
So what I've done is:
Find: ^
Replace: TextA
Find: $
Replace: TextB
Which works well, but I have to perform a find&replace TWICE which is not very efficient. To avoid that, I found this: Multiple word search and replace in notepad++
And applied it like so:
Find: (^)|($)
Replace: (?1TextA)(?2TextB)
But it doesn't work out too well.
AND, as mentioned, I need this to work for EACH and every line:
For example:
foo_abc
goo_zxy
doo_fgh
I need to insert TextA at the beginning for each of those lines, and TextB at the end of each line, like so:
TextAfoo_abcTextB
TextAgoo_zxyTextB
TextAdoo_fghTextB
Can this be done? (Yes, I actually need to do this to over 10000 lines, not just 3 and wanting an efficient way to do so).
Have I missed a quicker way to do all of this? Perhaps by performing a search and replace above in '1st Thing' on the TEXTBEFORE and TEXTAFTER, with TextA and TextB, respectively, in one-go?
Many thanks.
EDIT: Yes, they are literal strings. Yes, they do contain special characters because they represent parts of a URL.
There are two scenarios: 1) you want to replace TEXTBEFORE or TEXTAFTER wherever either one occurs, regardless of whether the other exists; 2) both TEXTBEFORE and TEXTAFTER must exist on the line.
Scenario 1
You may use a single search and replace operation for this:
Find What: ^(TEXTBEFORE)|TEXTAFTER$
Replace With: (?{1}TextA:TextB)
NOTE: If the TEXTBEFORE and TEXTAFTER contain special chars, you may use
Find What: ^(\QTEXTBEFORE\E)|\QTEXTAFTER\E$
Details:
^(TEXTBEFORE) - match and capture into Group 1 TEXTBEFORE at the start of a line
| - or
TEXTAFTER$ - match TEXTAFTER at the end of a line.
Replacement pattern:
(?{1} - if Group 1 is matched, then
TextA - return TextA
: - else
TextB - replace with TextB
) - end of the conditional replacement pattern.
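Python's re module has no (?{1}...:...) replacement syntax, but the same branch can be sketched with a replacement function. The helper name swap_affixes and its parameters are made up for this example; re.escape plays the role of \Q...\E.

```python
import re

def swap_affixes(text: str, before: str, after: str,
                 new_before: str, new_after: str) -> str:
    """Replace `before` at line starts and `after` at line ends, independently."""
    pattern = re.compile(
        r"^(" + re.escape(before) + r")|" + re.escape(after) + r"$",
        re.MULTILINE,
    )
    # Group 1 is set only when the first alternative (the line-start text) matched.
    return pattern.sub(
        lambda m: new_before if m.group(1) is not None else new_after, text
    )

sample = "TEXTBEFOREfoo_abcTEXTAFTER\nTEXTBEFOREgoo_zxyTEXTAFTER"
print(swap_affixes(sample, "TEXTBEFORE", "TEXTAFTER", "TextA", "TextB"))
```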
Scenario 2
If you need to match lines starting with some text and ending with another, use
Find What: ^TEXTBEFORE(.*?)TEXTAFTER$
Replace With: TextA$1TextB
Details:
^ - start of a line
TEXTBEFORE - some text here
(.*?) - Group 1 (that can be referred to with $1 backreference from the replacement pattern) matching any 0+ chars other than line break chars
TEXTAFTER - some text at the...
$ - end of line.
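A rough Python equivalent of this second scenario, again with hypothetical names (rewrap and its parameters), could look like this. Lines that lack either the leading or the trailing text are left alone.

```python
import re

def rewrap(text: str, before: str, after: str,
           new_before: str, new_after: str) -> str:
    """Swap the leading/trailing text only on lines that contain both."""
    pattern = re.compile(
        "^" + re.escape(before) + "(.*?)" + re.escape(after) + "$",
        re.MULTILINE,
    )
    # A function replacement avoids having to escape backslashes in the new text.
    return pattern.sub(lambda m: new_before + m.group(1) + new_after, text)

sample = "TEXTBEFOREfoo_abc\nTEXTBEFOREgoo_zxyTEXTAFTER"
print(rewrap(sample, "TEXTBEFORE", "TEXTAFTER", "TextA", "TextB"))
```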
Try:
TEXTBEFORE(.+?)TEXTAFTER
replace with
TextA$1TextB
See this for example and explanation
If you need to find whole line:
^TEXTBEFORE(.+?)TEXTAFTER$
Replace is the same as before.
For example, I have a text file with this content:
qqqqaa
qqss
ss00
I want to remove only one q at the beginning of each line, that is, to get
qqqaa
qss
ss00
I tried replacing ^q in Notepad++, but after I clicked Replace All, I got
aa
ss
ss00
What is wrong? Is my regex wrong? What is the correct form?
The issue is that Notepad++ Replace All functionality replaces in a loop using the modified document.
The solution is to consume both what we need to remove and what we need to keep within a single regex, like
^q(q*)
and replace with $1.
The pattern will find a q at the beginning of the line and then will capture into Group 1 zero or more occurrences of q after the first q, and in the replacement part the $1 will insert these qs inside Group 1 back into the string.
You can use ^q(.+) and replace with $1 if you also want to replace single q's.
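The capture-and-reinsert pattern can be checked in Python with the sample text from the question. This is just a sketch; drop_first_q is a hypothetical helper name, and \1 is Python's form of $1.

```python
import re

def drop_first_q(text: str) -> str:
    """Remove exactly one leading q from every line that starts with q."""
    # The first q is consumed and dropped; any further q's are captured and put back.
    return re.sub(r"^q(q*)", r"\1", text, flags=re.MULTILINE)

result = drop_first_q("qqqqaa\nqqss\nss00")
print(result)
```

Since the whole run of leading q's is consumed by a single match, there is nothing left at the start of the line for a second match to pick up.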
I am fixing a corrupted DB export to a txt file, and I am new to regular expressions.
My corrupted lines can be found using Notepad++ regular expression:
\r\n[^"]
(find line breaks followed by anything that is not " )
I need to delete these \r\n but I need to preserve the characters following it (in my data these are digits)
Desired data:
"USERNAME"|"Text1"|"Text2"|"Spreadsheet" (CR)(LF)
"USERNAME"|"Text1"|"Text2"|"Spreadsheet" (CR)(LF)
Corrupted data:
"USERNAME"|"Text1"|"Text2line1 - #3.50 (CR)(LF)
1 x text2line2 - #5.40 (CR)(LF)
2 x text2line3 #6.75 (CR)(LF)
|"Spreadsheet" (CR)(LF)
Therefore this does not work:
FIND: \r\n[^"]
REPLACE: [^"]
Because this way I would get rid of the "1" and "2" at the beginning of the new line.
I will be grateful for your help :)
Make a minor change to the expression so that it reads \r\n([^"]) (notice the extra ( and )). This will place the match in a regex group.
Then, simply replace that by \1, which is the regex group you are matching in the expression above.
You could use a positive lookahead:
Find what: \R(?=[^"])
Replace with: NOTHING
\R stands for any kind of linebreak.
(?=[^"]) is a zero-width assertion that requires the character after the linebreak to be something other than a double quote
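Both suggestions come down to removing a line break only when the next character is not a double quote. Here is a minimal Python sketch of that idea, with join_broken_lines as a hypothetical helper name. Python's re module does not support \R, so \r?\n stands in for it here.

```python
import re

def join_broken_lines(data: str) -> str:
    """Delete line breaks that are followed by anything other than a double quote."""
    # The lookahead checks the next character without consuming it,
    # so the digits at the start of the continuation lines are preserved.
    return re.sub(r'\r?\n(?=[^"])', "", data)

corrupted = (
    '"USERNAME"|"Text1"|"Text2line1 - #3.50\r\n'
    '1 x text2line2 - #5.40\r\n'
    '2 x text2line3 #6.75\r\n'
    '|"Spreadsheet"\r\n'
    '"USERNAME"|"Text1"|"Text2"|"Spreadsheet"\r\n'
)
print(join_broken_lines(corrupted))
```

The final line break is kept because nothing follows it, so the lookahead has no character to match.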