Using regex for repeating text in Notepad++ - regex

I have links like this:
https://d2ynliea65eb6o.cloudfront.net/6100052500-STXMLOPEN/sub_1.m3u8
https://d2ynliea65eb6o.cloudfront.net/6100052499-STXMLOPEN/sub_1.m3u8
https://d2ynliea65eb6o.cloudfront.net/6100052498-STXMLOPEN/sub_1.m3u8
How can I use a regex in Notepad++ to make them like this:
https://d2ynliea65eb6o.cloudfront.net/6100052500-STXMLOPEN/6100052500-STXMLOPENsub_1.m3u8
https://d2ynliea65eb6o.cloudfront.net/6100052499-STXMLOPEN/6100052499-STXMLOPENsub_1.m3u8
https://d2ynliea65eb6o.cloudfront.net/6100052498-STXMLOPEN/6100052498-STXMLOPENsub_1.m3u8
I want to repeat what is between net/ and /sub for each link.

I am assuming you want to repeat the characters before the last /.
You may try this regex:
Regex
([^/\n]+)/(?=[^/\n]+$)
Substitution
$1/$1
([^/\n]+) // any consecutive non-slash and non-linebreak characters, and capture them in group 1
/ // a slash
(?=[^/\n]+$) // lookahead, there must be non-slash and non-linebreak characters followed by the end of a line ahead
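If you want to sanity-check the pattern outside Notepad++, here is a minimal Python sketch. It assumes Python's re module treats this particular pattern the same way Notepad++ does; re.MULTILINE is needed so that $ matches at the end of every line, and \1 stands in for $1. The function name repeat_last_dir is made up for illustration.

```python
import re

# Same pattern as above; MULTILINE lets $ match at the end of each line.
PATTERN = re.compile(r"([^/\n]+)/(?=[^/\n]+$)", re.MULTILINE)


def repeat_last_dir(text: str) -> str:
    """Duplicate the path segment that sits before the last slash of each line."""
    # Python writes the group reference as \1 where Notepad++ accepts $1.
    return PATTERN.sub(r"\1/\1", text)


links = "\n".join([
    "https://d2ynliea65eb6o.cloudfront.net/6100052500-STXMLOPEN/sub_1.m3u8",
    "https://d2ynliea65eb6o.cloudfront.net/6100052499-STXMLOPEN/sub_1.m3u8",
])
print(repeat_last_dir(links))
```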

If you literally want to find what sits between "net/" and "/sub" and repeat it, you can use:
(net/(.*?))/sub
replace with:
$1/$2sub
The inner parentheses, (.*?), create group $2, which holds the variable text between net/ and /sub.
The outer parentheses, which DO NOT contain the /sub, create group $1, which holds everything from "net/" up to, but not including, "/sub". If you want $1 to include the "/sub", put the ")" on the right side of "/sub".
The replacement $1/$2sub is then $1, a "/", $2 and the literal "sub"; the remainder of the line is left untouched.
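As a rough check of how the two groups nest, the same find/replace can be tried in Python. This is only a sketch: it assumes Python's re module in place of Notepad++, \1 and \2 in place of $1 and $2, and the helper name repeat_between is hypothetical.

```python
import re


def repeat_between(url: str) -> str:
    """Repeat the text found between 'net/' and '/sub'."""
    # Group 1: 'net/' plus the variable text; group 2: the variable text only.
    return re.sub(r"(net/(.*?))/sub", r"\1/\2sub", url)


m = re.search(r"(net/(.*?))/sub", "https://d2ynliea65eb6o.cloudfront.net/6100052500-STXMLOPEN/sub_1.m3u8")
print(m.group(1))  # outer group
print(m.group(2))  # inner group
print(repeat_between("https://d2ynliea65eb6o.cloudfront.net/6100052500-STXMLOPEN/sub_1.m3u8"))
```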

Related

Extend string between strings

startABCend
->
startABC123end
I seek to capture text between start and end, and extend it, as shown. I tried:
find = start.*end, replace = \1 123: will capture start and end and between, but replace them all
find = (?s)(?<=start).+?(?=end), replace = \1 123: will keep start and end but replace captured
How to accomplish this with regex in N++?
The exact use case is
func_name(a, b=1) -> func_name(a, b=1, c=2)
# can also be
func_name(g=5, k=7) -> func_name(g=5, k=7, c=2)
# so capture between `func_name(` and `)` and extend with `, c=2`
You could do this without capture groups, and match what you want to replace.
\bstart\K.*?(?=end\b)
The pattern matches:
\bstart Match start preceded by a word boundary
\K Forget what is matched until now
.*? Match as few characters as possible
(?=end\b) Positive lookahead, assert end to the right followed by a word boundary
In the replacement use the full match followed by 123
$&123
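Python's built-in re module has no \K, so the sketch below swaps in a capture group for the part before \K and keeps the lookahead as is. It is an approximation of the Notepad++ pattern, not the same expression, and the name extend_between is made up for illustration.

```python
import re


def extend_between(text: str, extra: str = "123") -> str:
    """Append `extra` to whatever sits between 'start' and 'end'."""
    # Group 1 covers 'start' plus the text up to 'end'; the lookahead keeps
    # 'end' out of the match, so it survives the replacement unchanged.
    return re.sub(r"(\bstart.*?)(?=end\b)", lambda m: m.group(1) + extra, text)


print(extend_between("startABCend"))
```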
For the updated example data, you could match the format of key with an optional =value, and optionally repeat that asserting a ) to the right.
\bfunc_name\([^\s,=]+(?:=[^\s,=]+)?(?:,\h*[^\s,=]+(?:=[^\s,=]+)?)*(?=\))
And replace with
$&, c=2
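The func_name pattern can be tried in Python with two small substitutions, both assumptions of this sketch: \h is not available in Python's re module, so [ \t] is used instead, and $& becomes m.group(0). The helper name add_kwarg is hypothetical.

```python
import re

# key, optional =value, then any number of ", key[=value]" repeats,
# with a closing parenthesis asserted (but not consumed) at the end.
CALL = re.compile(
    r"\bfunc_name\([^\s,=]+(?:=[^\s,=]+)?"
    r"(?:,[ \t]*[^\s,=]+(?:=[^\s,=]+)?)*(?=\))"
)


def add_kwarg(source: str, extra: str = ", c=2") -> str:
    """Append `extra` just before the closing parenthesis of func_name(...)."""
    return CALL.sub(lambda m: m.group(0) + extra, source)


print(add_kwarg("func_name(a, b=1)"))
print(add_kwarg("func_name(g=5, k=7)"))
```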
Your example target does not include the white space that your replace string has. To use the group AND append digits right after it, you can wrap the group reference in brackets.
Basically:
Find: (?<=start)(.+?)(?=end)
Replace: (\1)123
or just
Find: start(.+?)end
Replace: start(\1)123end

How to use regex to remove semi colons?

Hello I have this kind of data :
a;b;c;d
1;2;3;4
a;g;h;j
f;g;f;d
a;d;8;d
And I would like to modify to have this :
a;bc;d
1;23;4
a;gh;j
f;gf;d
a;d8;d
Obviously I have a lot of lines, but the semicolons are always in the same position. I tried selecting the columns in Notepad++ and replacing the semicolon with nothing, but the box is greyed out...
Do you have a solution ?
Thank you !
I worked this out in an online regex tester; you can swap the sample data there for your own.
Regex : (\w+;)((\w+);(\w+))(;\w+)
Replace with : $1$3$4$5
Hold ALT+SHIFT and use the arrow keys to select second semicolon and delete it.
OR
Hold ALT and click and drag the mouse to select a second semicolon and delete it.
OR
Find : ^(...)(.)
Replace with: \1
Ctrl+H
Find what: ^[^;]+;[^;]+\K;
Replace with: LEAVE EMPTY
check Wrap around
check Regular expression
Replace all
Explanation:
^ # beginning of line
[^;]+ # 1 or more non semicolon
; # 1 semi colon
[^;]+ # 1 or more non semicolon
\K # forget all we have seen until this position
; # 1 semi colon
Result for given example:
a;bc;d
1;23;4
a;gh;j
f;gf;d
a;d8;d
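The same result can be reproduced in Python as a quick check. Since Python's re module lacks \K, this sketch captures the part before the second semicolon and puts it back, which is a substitute for the \K approach rather than the same pattern. \n is added to the character classes so a match cannot run across lines, and drop_second_semicolon is a made-up name.

```python
import re


def drop_second_semicolon(text: str) -> str:
    """Remove the second semicolon on every line."""
    # Group 1 is "field;field"; the semicolon after it is matched and dropped.
    return re.sub(r"^([^;\n]+;[^;\n]+);", r"\1", text, flags=re.MULTILINE)


data = "a;b;c;d\n1;2;3;4\na;g;h;j\nf;g;f;d\na;d;8;d"
print(drop_second_semicolon(data))
```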
If the possible values of the data are lowercase characters a-z or a digit you could also capture the first 3 characters using a capturing group and a character class [a-z0-9] and after that match a semicolon. If there can be more than 1 character you could use a quantifier + for the character class like [a-z0-9]+
Then replace with the first capturing group.
Find what
^([a-z0-9];[a-z0-9]);
Replace with
$1
Or Using \K you could find ^[a-z0-9];[a-z0-9]\K; and leave Replace with empty.

How to replace specific character one time

I want to replace character - using regular expression in my text so it would work like this:
Original text: abcd-efg-hijk-lmno
Text after replacing: abcd-efg-hijk/lmno
As you can see I want to replace character - starting from the end just one time with character /.
Thanks in advance for any tips
Find what: -([^-]*)$
Replace with: /$1
Search Mode: Regular Expression
Explanation:
- : a dash
([^-]*) : text with no dash,
zero or more times,
put in the $1 variable
$ : the end of the line
/$1 : literal "/", contents of $1
Good resource: http://www.grymoire.com/Unix/Regular.html
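For anyone who wants to test the idea outside Notepad++, here is a small Python sketch of the same find/replace. It assumes Python's re module, uses \1 in place of $1, and adds \n to the character class so a match stays on one line; the function name replace_last_dash is hypothetical.

```python
import re


def replace_last_dash(text: str) -> str:
    """Replace the last dash on each line with a slash."""
    # A dash followed only by non-dash characters up to the end of the line
    # can only be the last dash on that line.
    return re.sub(r"-([^-\n]*)$", r"/\1", text, flags=re.MULTILINE)


print(replace_last_dash("abcd-efg-hijk-lmno"))
```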
To replace characters in Notepad++, you can open the Replace window using Ctrl+H, or under the "Search" menu. Once open, enter the following regular expression:
(.{4}-.{3}-.{4})(-)(.{4})
This will find:
a group of four characters (the "." being any character, the "{4}" being the quantity),
a dash,
a group of three characters,
another dash,
a group of four characters,
again another dash,
then a group of four characters.
The parentheses group this search into captured groups, which we will use for the replacement part. See https://www.regular-expressions.info/brackets.html for more info.
If you want to restrict the search to lowercase letters as in your example, you would replace the "." with "[a-z]", or for upper and lower "[a-zA-Z]" (no comma inside the brackets, otherwise a literal comma would match too).
Now for the replacement. The groups from earlier are referenced by the dollar sign then the number, e.g. $1 would be the first. So we will replace the characters found with the first group ($1), disregard the second group containing the dash and insert the "/" instead, then include the third group ($3):
$1/$3
The settings in the replace window need to have "Regular expression" and "Wrap around" checked, and ". matches newline" unchecked.
You can then click Replace all to replace all occurrences, or go through using Replace individually.
Since the beginning and end of line characters are not included, you can find multiple occurrences of this pattern on a single line.
Note: This answer follows the same procedure as Toto's, but uses a different regular expression.
Ctrl+H
Find what: ^(.+)-([^-]+)$
Replace with: $1/$2
check Wrap around
check Regular expression
DO NOT CHECK . matches newline
Replace all
Explanation:
^ : beginning of line
(.+) : 1 or more any character, catch in group 1
- : a dash
([^-]+) : 1 or more any character but dash, catch in group 2
$ : end of line

Regex - replace blank spaces in line (Notepad++)

I have a document with multiple information. What I want is to build a Notepad++ Regex replace function, that finds the following lines in the document and replaces the blank spaces between the "" with an underline (_).
Example:
The line is:
&LOG Part: "NAME TEST.zip"
The result should be:
&LOG Part: "NAME_TEST.zip"
The perfect solution would be that the regex finds the &LOG Part: "NAME TEST.zip" lines and replaces the blank space with an underline.
What I have tried for now is this expression to find the text between the " ":
\"[^"]*\"
It should do it, but I don't know which expression to use to replace the blank spaces with an underline.
Anyone could help with a solution?
Thanks!
The \"[^"]*\" will only match whole substrings from " up to another closest " without matching individual spaces you want to replace.
Since Notepad++ does not support infinite width lookbehind, the only possible solution is using the \G - based regex to set the boundaries and use multiple matching (this one will replace consecutive spaces with 1 _):
(?:"|(?!^)\G)\K([^ "]*) +(?=[^"]*")
Or (if each space should be replaced with an underscore):
(?:"|(?!^)\G)\K([^ "]*) (?=[^"]*")
And replace with $1_. If you need to restrict to replacing inside &LOG Part only, just add it to the beginning:
(?:&LOG Part:\s*"|(?!^)\G)\K([^ "]*) (?=[^"]*")
A human-readable explanation of the regex:
(?:"|(?!^)\G)\K - Find a ", or, with each subsequent successful match, the end of the previous successful match position, and omit all the text in the buffer (thanks to \K)
([^ "]*) - (Group 1, accessed with $1 from the replacement pattern) 0+ characters other than a space and "
 + - one or more literal spaces (replace with \h to match all horizontal whitespace, or \s to match any whitespace)
(?=[^"]*") - check if there is a double quote ahead of the current position
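Python's built-in re module supports neither \G nor \K, so the patterns above cannot be pasted into it directly. The sketch below uses a different technique to reach the same result for comparison: it matches the whole quoted string after &LOG Part: and rewrites the spaces in a callback. The name underscore_spaces is made up, and this is not how the Notepad++ regex works internally.

```python
import re

# Group 1: the "&LOG Part:" prefix; group 2: the text inside the quotes.
QUOTED = re.compile(r'(&LOG Part:\s*)"([^"\n]*)"')


def underscore_spaces(text: str) -> str:
    """Replace each space inside the quoted &LOG Part value with an underscore."""
    def fix(m: re.Match) -> str:
        return f'{m.group(1)}"{m.group(2).replace(" ", "_")}"'
    return QUOTED.sub(fix, text)


print(underscore_spaces('&LOG Part: "NAME TEST.zip"'))
```

Lines that do not start the quoted value with &LOG Part: are left alone, which mirrors the restricted variant of the regex.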

Notepad++, replace the first and second comma to ":"

In Notepad++, I'd like to replace only the first and second comma (","), by ":".
Example :
blue,black,red -> blue:black:red (2 first commas replaced)
blue,black,red,yellow -> blue:black:red,yellow (third comma still here)
Thanks!
I believe you can do this by replacing this regex:
^([^,]*),([^,]*),(.*)$
With this:
$1:$2:$3
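For lines that may contain fewer than 2 commas, you could make the comma groups optional:
^(([^,]*),)?(([^,]*),)?(.*)$
$2:$4:$5
Be aware that this replacement always writes both colons, so a line with only one comma or none ends up with extra colons (for example, blue becomes ::blue). The first pattern simply leaves such lines unchanged.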
Something along this line,
^([^,]*),([^,]*),(.*)$
And replace with
$1:$2:$3
Or \1:\2:\3
Just two capturing groups is enough.
Regex:
^([^,]*),([^,]*),
Replacement string:
$1:$2:
Explanation:
^ Asserts that we are at the start of the line.
([^,]*) Captures zero or more characters other than , and stores them in a group (i.e., group 1).
, Matches a literal , symbol.
([^,]*) Captures zero or more characters other than , and stores them in a group (i.e., group 2).
, Matches a literal , symbol.
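The two-group version is easy to try in Python as well. This sketch assumes Python's re module, writes \1 and \2 for $1 and $2, and adds \n to the character classes so the match cannot cross a line break; first_two_commas is a hypothetical name.

```python
import re


def first_two_commas(text: str) -> str:
    """Turn the first two commas of each line into colons."""
    return re.sub(r"^([^,\n]*),([^,\n]*),", r"\1:\2:", text, flags=re.MULTILINE)


print(first_two_commas("blue,black,red"))
print(first_two_commas("blue,black,red,yellow"))
```

A line with fewer than two commas does not match at all, so it is left as it is.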
Well you can try to capture the parts in groups and then replace them as follows:
/^([^,]*),([^,]*),(.*)$/$1:$2:$3
How does it work: each line is matched such that the first part contains all data before the first comma, the second part in between the two commas and the third part all other characters (including commas).
This is simply replaced by joining the groups with colons.
A no-brainer; virtually "GREP 1-0-1". Not really an effort.
Just find
^([^,]+),([^,]+),
and replace with
\1:\2:
Click on the menu item: Search > Replace
In the dialog box that appears, set the following values...
Find what: ^([^,]+),([^,]+),
Replace with: $1:$2:
Search Mode: Regular expression