I have this file:
example-site1.com site-site1.com user1 password1
example-site2.com site-site2.com user2 password2
example-site3.com site-site3.com user3 password3
There are irregular spaces between the fields. I want to add port 80 after the site and separate the fields with the # character.
I need to arrange the words like this:
example-site1.com:80#user1#password1
example-site2.com:80#user2#password2
example-site3.com:80#user3#password3
Can someone help me?
Thanks.
Ctrl+H
Find what: ^(\S+)\h+\S+\h+(\S+)\h+(\S+)
Replace with: $1:80#$2#$3
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
^ # beginning of line
(\S+) # group 1, 1 or more non spaces (domain)
\h+ # 1 or more horizontal spaces
\S+ # 1 or more non spaces
\h+ # 1 or more horizontal spaces
(\S+) # group 2, 1 or more non spaces (user)
\h+ # 1 or more horizontal spaces
(\S+) # group 3, 1 or more non spaces (password)
Replacement:
$1 # content of group 1
:80# # literally :80#
$2 # content of group 2
# # literally #
$3 # content of group 3
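For readers who want to script this outside Notepad++, here is a minimal Python sketch of the same replacement. It is not part of the answer above: the function name `add_port` is hypothetical, and `[ \t]+` stands in for `\h+`, which Python's `re` module does not support.

```python
import re

# [ \t]+ replaces Notepad++'s \h+ (horizontal whitespace), which Python's re lacks.
# re.MULTILINE lets ^ match at the start of every line, as in the editor.
LINE = re.compile(r'^(\S+)[ \t]+\S+[ \t]+(\S+)[ \t]+(\S+)', re.MULTILINE)


def add_port(text, port=80):
    """Keep fields 1, 3 and 4 of each line and join them as site:port#user#password."""
    return LINE.sub(rf'\g<1>:{port}#\g<2>#\g<3>', text)


if __name__ == '__main__':
    sample = (
        'example-site1.com   site-site1.com  user1    password1\n'
        'example-site2.com site-site2.com\tuser2 password2'
    )
    print(add_port(sample))
```

The second field is matched but not captured, which is what drops it from the output.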
I am trying to find strings like this in Notepad++:
'',
And convert them into this:
'',
I've made a regular expression to crop the string beginning with cards/ and ending with </a>:
(cards/)([^\s]{1,50})(([\s\.\?\!\-\,])(\w{1,50}))+(\.mp3"></a>)
Or an alternative approach:
(cards/)([^\s]{1,50})([\s\.\?\!\-\,]{0,})([^\s]{1,50})
Both work fine for searching, but I can't get the replacement to work.
The problem is that the number of words in a sentence may vary, so a replacement of the form \1\2\3... doesn't work: I can't get the correct IDs of the sub-expressions inside the nested parentheses.
I tried to google the topic, but couldn't find anything. Any advice, link or best of all a full replacement expression will be very much appreciated.
This will replace all spaces after /cards/ with a hyphen and lowercase the filename.
Ctrl+H
Find what: (?:href="/mp3files/cards/|\G)\K(?!\.mp3)(\S+)(?:\h+|(\.mp3))
Replace with: \L$1(?2$2:-)
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
(?: # non capture group
href="/mp3files/cards/ # literally
| # OR
\G # restart from last match position
) # end group
\K # forget all we have seen until this position
(?!\.mp3) # negative lookahead, make sure we don't have ".mp3" after this position
(\S+) # group 1, 1 or more non spaces
(?: # non capture group
\h+ # 1 or more horizontal spaces
| # OR
(\.mp3) # group 2, literally ".mp3"
) # end group
Replacement:
\L$1 # lowercase content of group 1
(?2 # if group 2 exists (the extension .mp3)
$2 # use it
: # else
- # put a hyphen
) # endif
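The same idea can be sketched in Python, with one caveat: Python's `re` module has neither `\G` nor `\K`, so this sketch swaps in a different technique, a replacement callback that rewrites the file name in one go. The function name `slugify_cards` and the sample link are assumptions inferred from the regex above, not taken from the question.

```python
import re

# Capture the fixed prefix, the file name, and the extension separately.
LINK = re.compile(r'(href="/mp3files/cards/)(.*?)(\.mp3)')


def slugify_cards(text):
    """Replace runs of spaces/tabs in the file name with a hyphen and lowercase it."""
    def fix(match):
        name = re.sub(r'[ \t]+', '-', match.group(2)).lower()
        return match.group(1) + name + match.group(3)

    return LINK.sub(fix, text)


if __name__ == '__main__':
    # Hypothetical input line.
    print(slugify_cards('<a href="/mp3files/cards/Hello World Again.mp3"></a>'))
```

A callback avoids the conditional replacement `(?2$2:-)`, which is a Boost feature that Python does not offer.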
I have a dataset:
1.
Name1
Name2
Name3
2.
Name1
Name2.
Name3
and so on.
Using regex, I want the output to be:
Name1,Name2,Name3
Name1,Name2.,Name3
I'm trying to import this into a Google Sheet, so I need a comma-delimited file. I believe the steps are to replace each number followed by a period with \n and then add a comma after each name. Note that some fields (for example Name2.) also contain a digit followed by a period, so I'm having issues with \d+[.]
Ctrl+H
Find what: ^(.+)\R(.+)\R(.+)$
Replace with: $1,$2,$3
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
Explanation:
^ # beginning of line
(.+) # group 1, 1 or more any character but newline
\R # any kind of linebreak
(.+) # group 2, 1 or more any character but newline
\R # any kind of linebreak
(.+) # group 3, 1 or more any character but newline
$ # end of line
Replacement:
$1 # content of group 1
, # comma
$2 # content of group 2
, # comma
$3 # content of group 3
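As a scripted alternative, here is a minimal Python sketch that builds the rows directly. It assumes, as in the sample, that each record marker (`1.`, `2.`, ...) sits alone on its own line; the function name `to_csv_rows` is hypothetical. Matching the marker against the whole line is what keeps a field such as `Name2.` from being mistaken for a marker.

```python
import re

# A record marker is a line made only of digits followed by a period.
MARKER = re.compile(r'\d+\.')


def to_csv_rows(text):
    """Group the lines between markers and join each group with commas."""
    rows, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if MARKER.fullmatch(line):
            # A new record starts: flush the previous one, if any.
            if current:
                rows.append(','.join(current))
            current = []
        else:
            current.append(line)
    if current:
        rows.append(','.join(current))
    return rows


if __name__ == '__main__':
    data = '1.\nName1\nName2\nName3\n2.\nName1\nName2.\nName3'
    print('\n'.join(to_csv_rows(data)))
```

Unlike the fixed three-group regex, this does not depend on every record having exactly three names.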
I'm using Notepad++ to replace some lines. Basically what I want to do is:
line 1 -
STR::P=FOOXPATTERN=5 AND MORETHINGS YPATTERN=9 BUT XPATTERN=3 AND YPATTERN=20
line 2 -
MOR::P=BAR XPATTERN=1 STRSTR MORETHINGS YPATTERN=1BUT XPATTERN=10 AND YPATTERN=40
...
So this must be transformed in:
line 1 -
XPATTERN=5|YPATTERN=9|XPATTERN=3|YPATTERN=20
line 2 -
XPATTERN=1|YPATTERN=1|XPATTERN=10|YPATTERN=40
The point is that I can have many XPATTERN and many YPATTERN occurrences in the same line, and I would like to replace the whole line with just the patterns found.
I tried to use negation in the regex, but with no success.
Ctrl+H
Find what: (?:^|\G(?!^)).*?((?:XPATTERN|YPATTERN)=\d+)(?:(?!(?:XPATTERN|YPATTERN)=).)*($)?
Replace with: $1(?2:|)
CHECK Match case
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
Explanation:
(?: # non capture group
^ # beginning of line
| # OR
\G(?!^) # restart from last match position, not at the beginning of line
) # end group
.*? # 0 or more any character but newline, not greedy
( # group 1
(?: # non capture group
XPATTERN # XPATTERN
| # OR
YPATTERN # YPATTERN
) # end group
=\d+ # equal sign followed by 1 or more digits
) # end group 1
(?: # non capture group
(?! # negative lookahead, make sure we haven't after:
(?: # non capture group
XPATTERN # XPATTERN
| # OR
YPATTERN # YPATTERN
) # end group
= # equal sign
) # end lookahead
. # any character but newline
)* # end group, may appear 0 or more times
($)? # group 2, end of line, optional
Replacement:
$1 # content of group 1 (i.e. X or Y PATTERN = digits)
(?2 # IF group 2 exists (end of line), do nothing
: # ELSE
| # add a pipe character
) # ENDIF
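Outside Notepad++ the same result can be had without `\G`, which Python's `re` module does not support. A minimal sketch, assuming the lines are processed one at a time (the function name `keep_patterns` is hypothetical): collect every match and join the matches with a pipe.

```python
import re

# One XPATTERN=digits or YPATTERN=digits item; case-sensitive, like "Match case".
ITEM = re.compile(r'(?:XPATTERN|YPATTERN)=\d+')


def keep_patterns(line):
    """Return only the X/Y pattern items of a line, separated by pipes."""
    return '|'.join(ITEM.findall(line))


if __name__ == '__main__':
    lines = [
        'STR::P=FOOXPATTERN=5 AND MORETHINGS YPATTERN=9 BUT XPATTERN=3 AND YPATTERN=20',
        'MOR::P=BAR XPATTERN=1 STRSTR MORETHINGS YPATTERN=1BUT XPATTERN=10 AND YPATTERN=40',
    ]
    for line in lines:
        print(keep_patterns(line))
```

Joining the list takes care of the "no pipe after the last item" case that the conditional replacement handles in the editor.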
Use a regexp that matches the pattern and anything before it, and replaces it with just the pattern.
Replace: .*?((XPATTERN|YPATTERN|ZPATTERN|...)=\d+)
With: |\1
If there's something after all the patterns, you can remove the rest after the above replacements with:
Replace: ^((\|(XPATTERN|YPATTERN|ZPATTERN|...)=\d+)*).*
With: \1
This will leave a | at the beginning of each line. You can remove that as a third step:
Replace: ^\|
With: empty string
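The three replacements above can be traced in a short Python sketch, applied to one line at a time. The function name `three_step` is hypothetical, and the name list is cut down to the two patterns from the question.

```python
import re

NAMES = r'(?:XPATTERN|YPATTERN)'


def three_step(line):
    # Step 1: drop whatever precedes each pattern and prefix the pattern with a pipe.
    line = re.sub(rf'.*?({NAMES}=\d+)', r'|\1', line)
    # Step 2: keep the leading run of |NAME=digits items and drop any trailing text.
    line = re.sub(rf'^((?:\|{NAMES}=\d+)*).*', r'\1', line)
    # Step 3: remove the pipe left at the start of the line.
    return re.sub(r'^\|', '', line)


if __name__ == '__main__':
    print(three_step('STR::P=FOOXPATTERN=5 AND MORETHINGS YPATTERN=9 BUT XPATTERN=3 AND YPATTERN=20 tail'))
```

Note that a line without any pattern comes out empty, because step 2 removes everything that is not part of the leading run.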
I want an expression that allows a number and one dash, OR a number and one space. The space or dash is optional.
I tried this:
/^([0-9]+(-[0-9]+)?)|([0-9]+(\s[0-9]+)?)$/
Strings that should be accepted:
11-222
444 99
You can put the OR in the middle of your expression: ^([0-9]+)(\s|-)([0-9]+)$ works with your examples in Notepad++.
Let's explain your regex.
^ # beginning of line
( # start group 1
[0-9]+ # 1 or more digits
( # start group 2
- # a hyphen
[0-9]+ # 1 or more digits
)? # end group 2, optional
) # end group 1
| # OR
( # start group 3
[0-9]+ # 1 or more digits
( # start group 4
\s # a space
[0-9]+ # 1 or more digits
)? # end group 4, optional
) # end group 3
$ # end of line
Alternation has the lowest precedence, so the OR splits the whole expression: group 1 is anchored only at the beginning of the line and group 3 only at the end of the line. But you want both group 1 and group 3 anchored at the beginning and at the end.
Add a group over group 1 and 3:
^(([0-9]+(-[0-9]+)?)|([0-9]+(\s[0-9]+)?))$
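The difference is easy to check with a small Python sketch. The names `LOOSE`, `STRICT` and `accepts` are hypothetical, and the test strings with extra letters are made up for illustration.

```python
import re

# Original regex: ^ binds only to the first alternative, $ only to the second.
LOOSE = re.compile(r'^([0-9]+(-[0-9]+)?)|([0-9]+(\s[0-9]+)?)$')
# With the extra group, both anchors apply to both alternatives.
STRICT = re.compile(r'^(([0-9]+(-[0-9]+)?)|([0-9]+(\s[0-9]+)?))$')


def accepts(pattern, text):
    """True if the pattern matches somewhere in text (anchors decide where)."""
    return pattern.search(text) is not None


if __name__ == '__main__':
    for text in ['11-222', '444 99', '11-222abc', 'abc 444 99']:
        print(repr(text), accepts(LOOSE, text), accepts(STRICT, text))
```

The loose version accepts the last two strings because one side of the OR is always free of an anchor.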
You can use non-capturing groups (more efficient) instead of capturing groups:
^(?:(?:[0-9]+(?:-[0-9]+)?)|(?:[0-9]+(?:\s[0-9]+)?))$
Combine the hyphen and the space in a character class and remove the superfluous groups:
^[0-9]+(?:[-\s][0-9]+)?$
If your regex flavour supports it, change the [0-9] into \d. Finally your regex becomes:
^\d+(?:[-\s]\d+)?$
Much simpler, no?
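A quick way to try the final expression is a few lines of Python. This is only a sketch: `is_valid` is a hypothetical name, `fullmatch` replaces the `^` and `$` anchors, and `re.ASCII` is added because Python's `\d` would otherwise also match non-ASCII digits.

```python
import re

# Digits, optionally followed by exactly one hyphen or whitespace and more digits.
NUMBER = re.compile(r'\d+(?:[-\s]\d+)?', re.ASCII)


def is_valid(text):
    """True if the whole string has the form 123, 123-456 or 123 456."""
    return NUMBER.fullmatch(text) is not None


if __name__ == '__main__':
    for text in ['11-222', '444 99', '123', '11-22-33', '11-', '11 - 22']:
        print(repr(text), is_valid(text))
```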
My file has 4000k lines and I need to reformat it, so I am trying Notepad++ (or awk). The structure of every line is:
acc|GENBANK|ABJ91977.1|GENBANK|DQ876324|pol protein Tabulator[Human immunodeficiency virus 1]TabulatorTLWQRPFVTIKVGGQLKEALLDTGADDTVLEEIELPGRWKPKMIGGIGGFIKVRQYDQIXVEICGHKAIGTVLVGPTPVNVIGRNLMTQIGCTLN
The text between the 4th vertical bar | and the first [ has variable length. I am only looking for tips or where to focus so that I can do it myself. I tried printing fields with awk, but because one part varies in length, I got inconsistent results. I can't select by columns either.
I would like to obtain a file with this structure:
acc|GENBANK|ABJ91977.1|GENBANK|DQ876324,acc|GENBANK|ABJ91977.1|GENBANK|DQ876324,pol protein
and another file with this structure:
acc|GENBANK|ABJ91977.1|GENBANK|DQ876324TabulatorTLWQRPFVTIKVGGQLKEALLDTGADDTVLEEIELPGRWKPKMIGGIGGFIKVRQYDQIXVEICGHKAIGTVLVGPTPVNVIGRNLMTQIGCTLN
The TAB characters are written out as the word Tabulator.
Here is a way to do it for the first file.
Ctrl+H
Find what: (^[^|]+(?:\|[^|]+){4})\|(.+?)\h+\[.+$
Replace with: $1,$1,$2
check Wrap around
check Regular expression
UNCHECK . matches newline
Replace all
Explanation:
( # group 1
^ # beginning of line
[^|]+ # 1 or more non pipe
(?: # start non capture group
\| # a pipe
[^|]+ # 1 or more non pipe
){4} # end group, must appear 4 times
) # end group 1
\| # a pipe
(.+?) # group 2, 1 or more any character but newline, not greedy
\h+ # 1 or more horizontal spaces (space or tabulation)
\[ # 1 opening square bracket
.+ # 1 or more any character but newline
$ # end of line
Replacement:
$1 # content of group 1
, # a comma
$1 # content of group 1
, # a comma
$2 # content of group 2
Result for given example:
acc|GENBANK|ABJ91977.1|GENBANK|DQ876324,acc|GENBANK|ABJ91977.1|GENBANK|DQ876324,pol protein
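With a file of this size a script may be more comfortable than an editor. Here is a minimal Python sketch of the same regex, applied line by line. The function name `first_file_line` is hypothetical, `[ \t]+` stands in for `\h+` (not supported by Python's `re`), and the sequence in the sample is shortened.

```python
import re

# Group 1: the first five pipe-separated fields; group 2: the text before the "[".
HEADER = re.compile(r'^([^|]+(?:\|[^|]+){4})\|(.+?)[ \t]+\[.+$')


def first_file_line(line):
    """Return 'id,id,description' for one input line, or None if it does not match."""
    match = HEADER.match(line.rstrip('\n'))
    if match is None:
        return None
    return f'{match.group(1)},{match.group(1)},{match.group(2)}'


if __name__ == '__main__':
    sample = ('acc|GENBANK|ABJ91977.1|GENBANK|DQ876324|pol protein'
              '\t[Human immunodeficiency virus 1]\tTLWQRPFVTIKVGG')
    print(first_file_line(sample))
```

Calling the function inside a loop over an open file handle keeps memory use flat, since only one line is held at a time.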
For the second file:
Ctrl+H
Find what: (^[^|]+(?:\|[^|]+){4})\|.+?\h+\[.+?\](.+)$
Replace with: $1$2
check Wrap around
check Regular expression
UNCHECK . matches newline
Replace all
Explanation:
( # group 1
^ # beginning of line
[^|]+ # 1 or more non pipe
(?: # start non capture group
\| # a pipe
[^|]+ # 1 or more non pipe
){4} # end group, must appear 4 times
) # end group 1
\| # a pipe
.+? # 1 or more any character but newline, not greedy
\h+ # 1 or more horizontal spaces (space or tabulation)
\[ # 1 opening square bracket
.+? # 1 or more any character but newline, not greedy
\] # a closing square bracket
(.+) # group 2, 1 or more any character but newline
$ # end of line
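The second transformation can be sketched in Python in the same way. Again this is an illustration rather than the answer's method: `second_file_line` is a hypothetical name, `[ \t]+` replaces `\h+`, and the sample sequence is shortened.

```python
import re

# Group 1: the first five pipe-separated fields; group 2: everything after "]",
# which includes the TAB that precedes the sequence.
SEQUENCE = re.compile(r'^([^|]+(?:\|[^|]+){4})\|.+?[ \t]+\[.+?\](.+)$')


def second_file_line(line):
    """Return 'id<TAB>sequence' for one input line, or None if it does not match."""
    match = SEQUENCE.match(line.rstrip('\n'))
    if match is None:
        return None
    return match.group(1) + match.group(2)


if __name__ == '__main__':
    sample = ('acc|GENBANK|ABJ91977.1|GENBANK|DQ876324|pol protein'
              '\t[Human immunodeficiency virus 1]\tTLWQRPFVTIKVGG')
    print(second_file_line(sample))
```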