I have strings like the ones below:
nn"h11p3ppppvxq3b288N1 m 227"]
{vanxtageendganmesbhorgtgt(1702)}
d3zd6xf8dz8xd6dz8f6zd8
[nn"5rvh11p3ppppvxq3b288N1 n 227"]
{vanxtageendganmesbhorgtgt(1802)}
d3zd6xf8dz8xd6dz8f6zd8
My first capturing group starts at m 227 and runs to the end of the third line,
and my second group starts at n 227 and runs to the end of its third line.
Now I want to append some text to the end of the first captured group, say -22,
and some text to the end of the second captured group, say -11.
Each regex matches and works on its own, but when I combine them with | it doesn't work.
Search: (m\s.*\n.*\n.*)
Replace: $1 -22
My combined regex is as below
(m\s.*\n.*\n.*|n\s.*\n.*\n.*)
Replace: $1-22 $2-11
But this appends both suffixes (-22 -11) to each match, not just the intended one.
I want the output to be as below:
nn"h11p3ppppvxq3b288N1 m 227"]
{vanxtageendganmesbhorgtgt(1702)}
d3zd6xf8dz8xd6dz8f6zd8 -22
[nn"5rvh11p3ppppvxq3b288N1 n 227"]
{vanxtageendganmesbhorgtgt(1802)}
d3zd6xf8dz8xd6dz8f6zd8 -11
I used | (or) to combine both regexes into one in order to save time.
Any help will be appreciated.
You can use
Find What: ([mn])\s.*\R.*\R.*
Replace With: $& -$1
Details:
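([mn]) - Group 1 ($1): m or n
\s - a whitespace character
.*\R.*\R.* - the rest of the line, a line break, another line, a line break, and one more line.
The $& in the replacement is a backreference to the whole match, and $1 inserts the captured letter, so this replacement appends -m or -n. To append the literal -22 and -11 instead, Notepad++'s conditional replacement syntax can be used:
Find What: (?:(m)|(n))\s.*\R.*\R.*
Replace With: $& (?1-22)(?2-11)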
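To check the per-match logic outside Notepad++, here is a minimal Python sketch. It illustrates the idea and is not the answer's own method: Python's re module has no \R, so \n stands in for the line break, and the names SUFFIXES and append_suffixes are made up for this example.

```python
import re

# Hypothetical mapping from the captured letter to the suffix the question asks for.
SUFFIXES = {"m": "-22", "n": "-11"}

# Same shape as the Notepad++ pattern, with \n in place of \R.
PATTERN = re.compile(r"([mn])\s.*\n.*\n.*")


def append_suffixes(text: str) -> str:
    """Append the suffix selected by the captured letter to each three-line match."""
    return PATTERN.sub(lambda m: f"{m.group(0)} {SUFFIXES[m.group(1)]}", text)


sample = (
    'nn"h11p3ppppvxq3b288N1 m 227"]\n'
    "{vanxtageendganmesbhorgtgt(1702)}\n"
    "d3zd6xf8dz8xd6dz8f6zd8\n"
    '[nn"5rvh11p3ppppvxq3b288N1 n 227"]\n'
    "{vanxtageendganmesbhorgtgt(1802)}\n"
    "d3zd6xf8dz8xd6dz8f6zd8"
)
print(append_suffixes(sample))
```

A callback is used here because Python's replacement templates have no conditional syntax; the dictionary lookup plays the role of choosing the suffix per alternative.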
I have the following string
020075307354H 021133360876 981497910079937800ABC CDE FGH THY 0M19780403015001O+2¹qujzh_¢o\piVN¤«²µerNA¥\^?©E|=V_®¢Zu<£;Æ^TV½IÌc¤±·Gl.ÁEÊO·9y¹Bs¾Ë©ºFT¥*ÉA¬=iÚÒ®{æ*»¨;ÄNÕ®Ûòæ¦'Ñ…9>ÙYKè¹t/R{(>ÔÕBã2½7q¹|u…nztf~¦spw_ZX£\¦~Qa²mn¡¨QX«W±¯¯¦¨d£¾}·`B¶M}Qc|AµOÇ~Äd¤·¯HÇaI_¶²ÂÆYC?xÄR²>½HpÃjÁNLifm#ÕEí¾)ZvÇÊzØ)D&¦áÑM¡ç…1F¥Åh9R[9Fä¤Ãå<÷¼T}Ã…©ÎCDNs«E`É?¤eñ/ï´¯Åíÿt
and I want to use 1 Regex substitution to do the following 2 tasks:
Get the substring from position 49 to 58 -> 0079937800
Strip leading zeros from this substring -> 79937800
The desired end result is 79937800.
I figured out that I can extract the substring for task 1 with .{48}(.{10}).+.
For the second task, removing leading zeros, I figured I can use (\b0*([1-9][0-9]*|0)\b), but how can I combine both tasks into one working substitution?
You can capture the "marker" that follows the 10-character string in a capture group in a positive lookahead, then match the desired substring with an arbitrary number of leading zeroes, and follow it with another positive lookahead to ensure that it is followed by the marker captured in the first capture group. The desired substring will then be in the second capture group:
^.{48}(?=.{10}(.*))0*(.*?)(?=\1)
Demo: https://regex101.com/r/Q61KYJ/1
Since you commented that the requirement for a substitution is mandated by your software, you can simply add .* at the end of the above regex and substitute the match with the second capture group:
^.{48}(?=.{10}(.*))0*(.*?)(?=\1).*
Demo: https://regex101.com/r/FcRAGB/1
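The substitution variant can be tried in Python, whose re module also allows a backreference to a group captured inside a lookahead. This is only a sketch: the input is synthetic (48 filler characters, the 10-character field, then a tail), and the name extract_field is made up for this example.

```python
import re

# Skip 48 characters, remember everything after the next 10 characters as the
# "marker" (group 1), consume leading zeros, then lazily capture up to the marker.
PATTERN = re.compile(r"^.{48}(?=.{10}(.*))0*(.*?)(?=\1).*")


def extract_field(line: str) -> str:
    """Replace the whole line with group 2, the field without its leading zeros."""
    return PATTERN.sub(r"\2", line)


# Synthetic input, not the string from the question.
line = "x" * 48 + "0079937800" + "ABC CDE FGH THY"
print(extract_field(line))  # prints 79937800
```

Note that a field consisting only of zeros would come out empty with this pattern, since 0* consumes all ten characters.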
I am very new to the world of regular expressions. I am trying to use Notepad++ using Regex for the following:
Input file is something like this and there are multiple such files:
Code:
abc
1
2
3
4
5
***END***
abc
6
7
8
9
***END***
abc
10
11
12
***END***
I need to be able to edit the text in all of these files so that all the files look like this:
Code:
abc
1,2,3,4,5
***END***
abc
6,7,8,9
***END***
abc
10,11,12
***END***
Also:
In some files the number of * around the word END varies, is there a way to generalize the number of * so I don't have to worry about it?
There is some additional data before each abc which does not need to be transposed. How do I keep that data as it is while transposing the data between abc and ***END***?
Kindly help me. Your help is much appreciated!
Try the following find and replace, in regex mode:
Find: ^(\d+)\R(?!\*{1,}END\*{1,})
Replace: $1,
Here is an explanation of the regex pattern:
^ from the start of the line
(\d+) match AND capture a number
\R followed by a platform independent newline, which
(?!\*{1,}END\*{1,}) is NOT followed by ***END***
Note carefully the negative lookahead at the end of the pattern, which makes sure that we don't do the replacement on the final number in each section. Without this, the last number would bring the END marker onto the same line.
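The same find and replace can be sketched in Python to see the effect of the lookahead. Two things are assumptions of this sketch: \n replaces \R (which Python's re module does not support), and re.MULTILINE is set so that ^ matches at every line start. The name join_numbers is made up.

```python
import re

# A number on its own line, then a line break that is NOT followed by an END marker.
PATTERN = re.compile(r"^(\d+)\n(?!\*+END\*+)", re.MULTILINE)


def join_numbers(text: str) -> str:
    """Turn the line break after each number into a comma, except before END."""
    return PATTERN.sub(r"\1,", text)


sample = "abc\n1\n2\n3\n***END***\nabc\n6\n7\n*END*\n"
print(join_numbers(sample))
```

Like the pattern it mirrors, this also joins number-only lines that appear outside an abc block.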
This will replace line breaks only between "abc" and "***END***", with any number of asterisks.
Ctrl+H
Find what: (?:(?<=^abc)\R|\G(?!^)).+\K\R(?!\*+END\*+)
Replace with: ,
CHECK Match case
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
Explanation:
(?: # non capture group
(?<=^abc) # positive look behind, make sure we have "abc" at the beginning of line before
\R # any kind of linebreak
| # OR
\G # restart from last match position
(?!^) # negative look ahead, make sure we are not at the beginning of line
) # end group
.+ # 1 or more any character but newline
\K # forget all we have seen until this position
\R # any kind of linebreak
(?!\*+END\*+) # negative lookahead, make sure we haven't ***END*** after
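Python's re module supports neither \K nor \G, so the pattern above cannot be ported as is. As a rough equivalent using a different technique, the sketch below matches each whole block from abc to the END line and joins its body in a callback. The names BLOCK and transpose_blocks are made up, and \n stands in for \R.

```python
import re

# One block: an "abc" line, the body (lazy), and an END line with any number of asterisks.
BLOCK = re.compile(r"^(abc\n)(.*?)(\n\*+END\*+)$", re.MULTILINE | re.DOTALL)


def transpose_blocks(text: str) -> str:
    """Join the lines between abc and END with commas; leave everything else alone."""
    def join_body(m: re.Match) -> str:
        return m.group(1) + m.group(2).replace("\n", ",") + m.group(3)

    return BLOCK.sub(join_body, text)


sample = "7\n8\nabc\n1\n2\n3\n***END***\nabc\n6\n7\n*END*\n"
print(transpose_blocks(sample))
```

Because only text inside a matched block is rewritten, the lines before the first abc stay on separate lines.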
I have some lines which I need to alter. They are protein sequences. How would I copy the first 4 characters of the line to the end of the line, and also copy the last 4 characters to the beginning of the line?
The strings are variable which complicates it, for example:
>X
LTGLGIGTGMAATIINAISVGLSAATILSLISGVASGGAWVLAGAKQALKEGGKKAGIAF
>Y
LVATGMAAGVAKTIVNAVSAGMDIATALSLFSGAFTAAGGIMALIKKYAQKKLWKQLIAA
Moreover, how could I exclude lines with a '>' at the beginning (these are names of the corresponding sequence)?
Does anyone know a regex which will allow this to work?
I've already tried some regex solutions but I'm not very experienced with this sort of thing and I can find the end string but can't get it to replace:
Find:
(...)$
Replace:
^$2$1"
An example of what I want to achieve is:
>1
ABCDEFGHIJKLMNOPQRSTUVWXYZ
becomes:
>1
WXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCD
Thanks
Try doing a find, in regex mode, on the following pattern:
^([A-Z]{4}).*([A-Z]{4})$
Then replace with the last four characters, followed by the whole match, followed by the first four characters:
$2$0$1
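A quick way to check this replacement is a small Python sketch. It assumes re.MULTILINE so that ^ and $ work per line, and uses \g<0>, Python's spelling of the whole match. The name wrap_ends is made up for this example.

```python
import re

# First four and last four capital letters of a line made of capitals.
PATTERN = re.compile(r"^([A-Z]{4}).*([A-Z]{4})$", re.MULTILINE)


def wrap_ends(text: str) -> str:
    """Prefix each sequence line with its last 4 letters and append its first 4."""
    return PATTERN.sub(r"\2\g<0>\1", text)


sample = ">1\nABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(wrap_ends(sample))
```

The >1 header line is left alone because it does not start with four capital letters.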
You can use the regex below.
^(([A-Z]{4})([A-Z]*)([A-Z]{4}))$
^ asserts the position at the start of the line, so nothing can come before it.
( is the start of a capture group, this is group 1.
( is the start of a capture group, this is group 2. This group is inside group 1.
[A-Z]{4} means exactly 4 capital characters from A to Z.
) is the end of capture group 2.
( is the start of a capture group, this is group 3.
[A-Z]* matches capital characters from A to Z between zero and infinite times.
) is the end of capture group 3.
( is the start of a capture group, this is group 4.
[A-Z]{4} means exactly 4 capital characters from A to Z.
) is the end of capture group 4.
$ asserts the position at the end of the line, so nothing can come after it.
See how it works with a replace here: https://regex101.com/r/W786uL/3.
$4$1$2
$4 means put capture group 4 here. Which is the last 4 characters.
$1 means put capture group 1 here. Which is everything in the entire string.
$2 means put capture group 2 here. Which is the first 4 characters.
You can use
^(.{4})(.*?)(.{4})$
^ - start of string
(.{4}) - Match any four characters except newline (used for both the first and the last group)
(.*?) - Match any character zero or more times (lazy mode)
$ - End of string
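This pattern comes without a replacement string, so the sketch below fills one in as an assumption: last four, first four, middle, last four, first four. Because . also matches >, a (?!>) guard is added here to skip header lines. Neither the guard nor the name wrap_any is part of the answer above.

```python
import re

# (?!>) is an added guard so that header lines starting with ">" are skipped.
PATTERN = re.compile(r"^(?!>)(.{4})(.*?)(.{4})$", re.MULTILINE)


def wrap_any(text: str) -> str:
    """Rebuild each line as last4 + first4 + middle + last4 + first4."""
    return PATTERN.sub(r"\3\1\2\3\1", text)


sample = ">sequence_name\nABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(wrap_any(sample))
```

Without the guard, a header of eight or more characters such as >sequence_name would be rewritten as well.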
I have a file with the following lines (condensed example, real file is 1.000+ lines):
...
type1.value1=60 <-- replace 60 with 72 from line 5
type1.value2=15 <-- replace 15 with 14 from line 6
type2.value1=50 <-- replace 50 with 72 from line 5
type2.value2=18 <-- replace 18 with 14 from line 6
type3.value1=72
type3.value2=14
...
I want to replace all values from type(x) with the values from type3. There are many type/value combinations, so I would like to avoid manual work. Also, I have to do this really often.
Is that possible with Notepad++ Regex find/replace?
The matching expression is the following, where the first group should stay the same and the second should be replaced by the result of yet another regex.
^type1.([\w]+)=([\S]+)
Regex:
type(?!3\.)\d+\.value(\d+)=\K\d+(?=[\s\S]*?type3\.value\1=(\d+))
Replace with:
\2
Explanation:
type(?!3\.)\d+ Match a type other than 3
\.value(\d+)= Match every thing up to = but capture digits
\K Forget matches up to now
\d+ Match following digits
(?= Start of positive lookahead
[\s\S]*? Match anything lazily
type3\.value\1= Up to the same value of type3
(\d+) Then capture its value in capturing group #2
) End of positive lookahead
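The lookup can be reproduced in Python with one change: the re module has no \K, so this sketch captures the prefix in an extra group and writes it back. That shifts the group numbers (the value index becomes group 2, the type3 value group 3). The name copy_type3_values is made up.

```python
import re

# Group 1: the "typeN.valueX=" prefix (kept), group 2: X, group 3: type3's value for X.
PATTERN = re.compile(
    r"(type(?!3\.)\d+\.value(\d+)=)\d+(?=[\s\S]*?type3\.value\2=(\d+))"
)


def copy_type3_values(text: str) -> str:
    """Overwrite each non-type3 value with the matching value from type3."""
    return PATTERN.sub(r"\1\3", text)


sample = (
    "type1.value1=60\n"
    "type1.value2=15\n"
    "type2.value1=50\n"
    "type2.value2=18\n"
    "type3.value1=72\n"
    "type3.value2=14\n"
)
print(copy_type3_values(sample))
```

Since the type3 value is found through a lookahead, this only works when the type3 lines come after the lines being replaced, as in the example file.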
The point is to match valueX from a type other than 3 and then look ahead for the same valueX under type3. If valueX is hypothetical, or there is nothing specific to look for, then there is no pure-regex approach within a find/replace dialog.
I've got a document that looks something like this:
# Document ID 8934
# Last updated 2018-05-06
52 84 12 70 23 2 7 20 1 5
4 2 7 81 32 98 2 0 77 6
(..and so on..)
In other words, it starts off with a few comment lines, then the rest of the document is just a bunch of numbers separated by spaces.
I'm trying to write a regex that gets all digits on all lines that don't start with #, but I can't seem to get it.
I've read over answers such as
Regular Expressions: Is there an AND operator?
Regex: Find a character anywhere in a document but only on lines that begin with a specific word
and pawed through sites such as http://regular-expressions.info, but I still can't get an expression that works (the best I can get is a lengthy version of ^[^#].*).
So how can I match digits (or text, or whatever) in a string, but only on lines that don't start with a certain character?
Your regex ^[^#].* uses a negated character class that matches any character other than # at the start of the line (^), and then matches any character zero or more times.
For example, this would also match the whole line t test.
What you might do instead is use an alternation that either matches a whole line starting with # (^#.*$) or captures one or more digits in a group (\d+).
Your digits are then in capturing group 1. You could change (\d+) to, for example, a character class ([\w+.]+) to match more than only digits.
(?:^#.*$|(\d+))
Details
(?: Non capturing group
^#.*$ Match from the start of the line ^ a # followed by any character zero or more times .* until the end of the line $
| Or
(\d+) capture one or more digits in a group
) Close non capturing group
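In code, the alternation trick means keeping only the matches where group 1 took part. A minimal Python sketch, assuming re.MULTILINE so that ^ and $ apply per line (the name digits_outside_comments is made up):

```python
import re

# Either swallow a whole comment line (no capture) or capture a run of digits.
PATTERN = re.compile(r"^#.*$|(\d+)", re.MULTILINE)


def digits_outside_comments(text: str) -> list[str]:
    """Return the digit runs found on lines that do not start with #."""
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]


sample = (
    "# Document ID 8934\n"
    "# Last updated 2018-05-06\n"
    "52 84 12 70 23 2 7 20 1 5\n"
    "4 2 7 81 32 98 2 0 77 6\n"
)
print(digits_outside_comments(sample))
```

Comment lines still produce matches, but their group 1 is None, which is why the filter is needed.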
I think a much simpler method would be to first replace the comment lines with an empty string using this regex:
^#.*
And then you can just match all the numbers with this:
-?\d+ (-? is for negative)
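The two steps translate directly to Python. This is a sketch that assumes re.MULTILINE for the first step; the name numbers_after_stripping is made up.

```python
import re


def numbers_after_stripping(text: str) -> list[str]:
    """Blank out comment lines first, then collect optionally signed integers."""
    stripped = re.sub(r"^#.*", "", text, flags=re.MULTILINE)
    return re.findall(r"-?\d+", stripped)


sample = (
    "# Document ID 8934\n"
    "# Last updated 2018-05-06\n"
    "52 84 12 70 23 2 7 20 1 5\n"
    "4 2 7 81 32 98 2 0 77 6\n"
)
print(numbers_after_stripping(sample))
```

Stripping first matters here: run on the raw text, -?\d+ would also pick up 8934 and read the date as 2018, -05 and -06.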