Notepad++ regex replace with captures from the same file - regex

I have a file with the following lines (condensed example; the real file has 1,000+ lines):
...
type1.value1=60 <-- replace 60 with 72 from line 5
type1.value2=15 <-- replace 15 with 14 from line 6
type2.value1=50 <-- replace 50 with 72 from line 5
type2.value2=18 <-- replace 18 with 14 from line 6
type3.value1=72
type3.value2=14
...
I want to replace all values of type(x) with the corresponding values from type3. There are many type/value combinations, so I would like to avoid doing this by hand. I also have to do this very often.
Is that possible with Notepad++ Regex find/replace?
The matching expression is the following, where the first group should stay the same and the second should be replaced by the result of yet another regex:
^type1.([\w]+)=([\S]+)

Regex:
type(?!3\.)\d+\.value(\d+)=\K\d+(?=[\s\S]*?type3\.value\1=(\d+))
Replace with:
\2
Explanation:
type(?!3\.)\d+ Match a type other than 3
\.value(\d+)= Match everything up to =, capturing the digits
\K Forget matches up to now
\d+ Match following digits
(?= Start of positive lookahead
[\s\S]*? Match anything lazily
type3\.value\1= Up to the same value of type3
(\d+) Then capture its value in capturing group #2
) End of positive lookahead
The idea is to match valueX under a type other than 3, then look ahead for the same valueX under type3. If the valueX naming is only hypothetical, or there is nothing distinctive to look for, then there is no pure-regex approach within a find/replace dialog.
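For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch. Python's re module does not support \K, so the prefix is kept in an extra capturing group instead, which shifts the group numbers by one. The function name and the sample text are made up for illustration. Like the pattern above, it assumes the type3 lines come after the lines being replaced, because the lookahead only searches forward.

```python
import re

# Same idea as the Notepad++ pattern, minus \K (unsupported in Python's re):
# group 1 = prefix to keep, group 2 = the valueX number, group 3 = type3's value.
PATTERN = re.compile(
    r"(type(?!3\.)\d+\.value(\d+)=)\d+(?=[\s\S]*?type3\.value\2=(\d+))"
)

def copy_type3_values(text):
    """Replace every non-type3 value with the type3 value of the same valueX."""
    return PATTERN.sub(r"\g<1>\g<3>", text)

sample = (
    "type1.value1=60\n"
    "type1.value2=15\n"
    "type2.value1=50\n"
    "type2.value2=18\n"
    "type3.value1=72\n"
    "type3.value2=14\n"
)
print(copy_type3_values(sample))
```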

Related

Search & replace Notepad++ regex to be used as one

I have strings like the ones below:
nn"h11p3ppppvxq3b288N1 m 227"]
{vanxtageendganmesbhorgtgt(1702)}
d3zd6xf8dz8xd6dz8f6zd8
[nn"5rvh11p3ppppvxq3b288N1 n 227"]
{vanxtageendganmesbhorgtgt(1802)}
d3zd6xf8dz8xd6dz8f6zd8
My first capturing group runs from m 227 to the end of the third line,
and my second group runs from n 227 to the end of its third line.
Now I want to append some digits to the end of the first captured group, say -22,
and some digits to the end of the second captured group, say -11.
Each regex matches and works on its own, but when I combine them with | it no longer works.
Search: (m\s.*\n.*\n.*)
Replace: $1 -22
My combined regex is as below
(m\s.*\n.*\n.*|n\s.*\n.*\n.*)
Replace: $1-22 $2-11
But this adds both suffixes (-22 -11) to both intended matches.
I want the output to be as below
nn"h11p3ppppvxq3b288N1 m 227"]
{vanxtageendganmesbhorgtgt(1702)}
d3zd6xf8dz8xd6dz8f6zd8 -22
[nn"5rvh11p3ppppvxq3b288N1 n 227"]
{vanxtageendganmesbhorgtgt(1802)}
d3zd6xf8dz8xd6dz8f6zd8 -11
I used | (or) to combine both regexes into one in order to save time.
Any help will be appreciated.
You can use
Find What: ([mn])\s.*\R.*\R.*
Replace With: $& -$1
Details:
([mn]) - Group 1 ($1): m or n
\s - a whitespace
.*\R.*\R.* - a line, a line break, another line, a line break, and then a third line.
The $& in the replacement is the backreference to the whole match.
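As a rough Python counterpart, the sketch below uses a replacement function instead of a replacement string. Note that $& -$1 as written appends the captured letter itself (" -m" or " -n"); to get the -22 and -11 from the question, the sketch assumes a small lookup table, which is not part of the answer above. Python's re has no \R, so the line break is written as \n here, assuming Unix line endings. All names are hypothetical.

```python
import re

# \R is not available in Python's re; \n is used instead (assumes "\n" line endings).
PATTERN = re.compile(r"([mn])\s.*\n.*\n.*")

# Hypothetical lookup from the captured letter to the suffix the question asks for.
SUFFIXES = {"m": " -22", "n": " -11"}

def append_suffixes(text):
    # m.group(0) plays the role of $& (the whole match).
    return PATTERN.sub(lambda m: m.group(0) + SUFFIXES[m.group(1)], text)

sample = (
    'nn"h11p3ppppvxq3b288N1 m 227"]\n'
    "{vanxtageendganmesbhorgtgt(1702)}\n"
    "d3zd6xf8dz8xd6dz8f6zd8\n"
    '[nn"5rvh11p3ppppvxq3b288N1 n 227"]\n'
    "{vanxtageendganmesbhorgtgt(1802)}\n"
    "d3zd6xf8dz8xd6dz8f6zd8"
)
print(append_suffixes(sample))
```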

Regex to parse 15 mins files only and skip 60 mins files

We have the following file formats:
60min-->
A20210217.0300-0000-0400-0000_GBM053.xml.gz
15min -->
A20210217.0300-0000-0315-0000_GBM053.xml.gz ,A20210217.0315-0000-0330-0000_GBM053.xml.gz, A20210217.0330-0000-0345-0000_GBM053.xml.gz , A20210217.0345-0000-0400-0000_GBM053.xml.gz
I tried the regex below, but it is not working:
!(^A[0-9]{8}.[0-9]{2}[0]{2}-[0-9]{4}-[0-9]{2}[0]{2}-[0-9]{4}_.*.xml(|\.gz)$)
The ! at the start of the pattern matches a ! literally which is not there in the example data. If it was meant as a delimiter, it should also be at the end.
You could make the second part match either 15, 30 or 45 and use an alternation so that those values are accepted in either the first or the third part of the hyphenated string.
^A\d{8}\.(?:\d\d(?:[14]5|30)(?:-\d{4}){3}|\d{4}-\d{4}-\d\d(?:[14]5|30)-\d{4})_.*\.xml\.gz$
The pattern matches
^ Start of string
A\d{8}\. Match A and 8 digits followed by a .
(?: Non capture group for the alternation to match either
\d\d(?:[14]5|30) Match 2 digits and either 15 or 45 or 30
(?:-\d{4}){3} Match 3 times - and 4 digits
| Or
\d{4}-\d{4}- Match 2 times 4 digits and -
\d\d(?:[14]5|30)-\d{4} Match 2 digits and either 15 or 45 or 30 followed by 4 digits
) Close non capture group
_.*\.xml\.gz Match _, 0+ times any char except a newline and .xml.gz
$ End of string
Regex demo
https://regex101.com/r/KqB81T/2
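To check the pattern against the filenames from the question, a small Python sketch could look like this. The pattern is copied from the answer above; the function name and the list of names are illustrative only.

```python
import re

# Pattern from the answer above, split over two lines for readability.
FIFTEEN_MIN = re.compile(
    r"^A\d{8}\.(?:\d\d(?:[14]5|30)(?:-\d{4}){3}"
    r"|\d{4}-\d{4}-\d\d(?:[14]5|30)-\d{4})_.*\.xml\.gz$"
)

def is_15min_file(name):
    """True if the start or end time has 15, 30 or 45 as its minutes."""
    return FIFTEEN_MIN.match(name) is not None

names = [
    "A20210217.0300-0000-0400-0000_GBM053.xml.gz",  # 60 min
    "A20210217.0300-0000-0315-0000_GBM053.xml.gz",
    "A20210217.0315-0000-0330-0000_GBM053.xml.gz",
    "A20210217.0330-0000-0345-0000_GBM053.xml.gz",
    "A20210217.0345-0000-0400-0000_GBM053.xml.gz",
]
selected = [n for n in names if is_15min_file(n)]
print(selected)
```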
^A\d{8}\.(\d{2}(?:[14]5|30)-0000-\d{4}-0000|\d{4}-0000-\d{2}(?:[14]5|30)-0000)_.*\.xml(|\.gz)$
Break down structure:
First two entries are matched: \d{2}(?:[14]5|30)-0000-\d{4}-0000
Last two entries are matched: \d{4}-0000-\d{2}(?:[14]5|30)-0000
Combine the two (a union of the two sets of matches): (FIRST_MATCH|SECOND_MATCH). Also make sure there is no character or space at the end (between gz and $).
Let me be the first to say: Welcome to SO, Muskan Garg Bansal!

How would I find values in a file, but only on lines that don't start with #?

I've got a document that looks something like this:
# Document ID 8934
# Last updated 2018-05-06
52 84 12 70 23 2 7 20 1 5
4 2 7 81 32 98 2 0 77 6
(..and so on..)
In other words, it starts off with a few comment lines, then the rest of the document is just a bunch of numbers separated by spaces.
I'm trying to write a regex that gets all digits on all lines that don't start with #, but I can't seem to get it.
I've read over answers such as
Regular Expressions: Is there an AND operator?
Regex: Find a character anywhere in a document but only on lines that begin with a specific word
and pawed through sites such as http://regular-expressions.info, but I still can't get an expression that works (the best I can get is a lengthy version of ^[^#].*).
So how can I match digits (or text, or whatever) in a string, but only on lines that don't start with a certain character?
Your regex ^[^#].* uses a negated character class, which matches any character other than # at the start of the string (^), and then matches any character zero or more times.
This would, for example, also match t test
What you might do is use an alternation that either matches a whole line starting with # (^#.*$) or captures one or more digits in a group ((\d+)).
The digits are captured in group 1. You could change (\d+) to, for example, a character class such as ([\w+.]+) to match more than only digits.
(?:^#.*$|(\d+))
Details
(?: Non capturing group
^#.*$ Match from the start of the line ^ a # followed by any character zero or more times .* until the end of the string $
| Or
(\d+) capture one or more digits in a group
) Close non capturing group
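A minimal Python sketch of this "match the unwanted lines, capture the wanted digits" technique follows. It assumes multiline mode so that ^ and $ work per line; the function name and sample text are made up for illustration.

```python
import re
from typing import List

# Comment lines are consumed by the first alternative; digits elsewhere land in group 1.
PATTERN = re.compile(r"(?:^#.*$|(\d+))", re.MULTILINE)

def digits_outside_comments(text: str) -> List[str]:
    # Matches of a whole comment line leave group 1 unset (None), so skip those.
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]

sample = (
    "# Document ID 8934\n"
    "# Last updated 2018-05-06\n"
    "52 84 12 70 23 2 7 20 1 5\n"
    "4 2 7 81 32 98 2 0 77 6\n"
)
print(digits_outside_comments(sample))
```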
I think a way simpler method would be to first replace the comment lines with "" using this regex:
^#.*
And then you can just match all the numbers with this:
-?\d+ (-? is for negative)
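The two-step variant can be sketched in Python as follows, again assuming multiline mode so that ^ matches at the start of each line. The function name and sample input are hypothetical.

```python
import re

def numbers_after_stripping_comments(text):
    # Step 1: blank out every line that starts with #.
    without_comments = re.sub(r"^#.*", "", text, flags=re.MULTILINE)
    # Step 2: collect the remaining (optionally negative) integers.
    return re.findall(r"-?\d+", without_comments)

sample = "# Document ID 8934\n52 -84 12\n4 2 7\n"
print(numbers_after_stripping_comments(sample))
```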

Regex is possible to match?

I have files with these filename:
ZATR0008_2018.pdf
ZATR0018_2018.pdf
ZATR0218_2018.pdf
Where the 4 digits after ZATR are the issue number of the magazine.
With this regex:
([1-9][0-9]*)(?=_\d)
I can extract 8, 18 or 218 but I would like to keep minimum 2 digits and max 3 digits so the result should be 08, 18 and 218.
How is it possible to do that?
You may use
0*(\d{2,3})_\d
and grab Group 1 value. See the regex demo.
Details
0* - zero or more 0 chars
(\d{2,3}) - Group 1: two or three digits
_\d - a _ followed with a digit.
Here is a PCRE variation that grabs the value you need into a whole match:
0*\K\d{2,3}(?=_\d)
See another regex demo
Here, \K makes the regex engine omit the text matched so far (zeros) and then matches 2 to 3 digits that are followed with _ and a digit.
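Python's re module does not support \K, so the capturing-group version is the one that carries over directly. A small sketch, with a hypothetical helper name:

```python
import re

# Group 1 holds the issue number with leading zeros trimmed down to at least two digits.
ISSUE = re.compile(r"0*(\d{2,3})_\d")

def issue_number(filename):
    m = ISSUE.search(filename)
    return m.group(1) if m else None

for name in ("ZATR0008_2018.pdf", "ZATR0018_2018.pdf", "ZATR0218_2018.pdf"):
    print(name, issue_number(name))
```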
(?:[1-9][0-9]?)?[0-9]{2}(?=_[0-9])
or perhaps:
(?:[1-9][0-9]+|[0-9]{2})(?=_[0-9])
(https://www.freeformatter.com/regex-tester.html, which claims to use the XRegExp library and which you mention in another answer, doesn't seem to backtrack into the (?:)? in my first suggestion where necessary. That makes it very different from any regex engine I've encountered before: it prefers to match just the 18 of 218 even though that match starts later in the string. It does work with my second suggestion.)
([1-9]\d{2,3})(?=_\d)
{x,y} will match from x to y times the previous pattern, in this case \d
Edit: from your own regex it looked as if you wanted the part of the number which starts with a non-zero digit. However, since your examples include leading 0s, maybe you really wanted:
(\d{2,3})(?=_\d)
Which will give you the last 3 digits before the underscore unless there are only 2 digits.
I propose:
^ZATR0*(\d{2,3})_\d+\.pdf$
demo code here. Result:
Match 1 Full match 0-17 ZATR0008_2018.pdf Group 1. 6-8 08
Match 2 Full match 18-35 ZATR0018_2018.pdf Group 1. 24-26 18
Match 3 Full match 36-53 ZATR0218_2018.pdf Group 1. 41-44 218
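The listing above can be reproduced with a short Python sketch, assuming the three filenames are joined by single newlines and the pattern is run in multiline mode. The variable names are illustrative.

```python
import re

PATTERN = re.compile(r"^ZATR0*(\d{2,3})_\d+\.pdf$", re.MULTILINE)

text = "ZATR0008_2018.pdf\nZATR0018_2018.pdf\nZATR0218_2018.pdf"

# For each match: (span of full match, group 1 text, span of group 1).
results = [(m.span(), m.group(1), m.span(1)) for m in PATTERN.finditer(text)]
for full_span, issue, issue_span in results:
    print(full_span, issue, issue_span)
```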

Select digits on the end of line

I need to replace only digits at the end of line with semicolon ; using RegEx in Notepad++.
Before:
ddd 66 ffff 5
d 44 dds 55
After:
ddd 66 ffff;
d 44 dds;
I'm trying to find digits at the end of lines with expression
($)(\d+)
but Notepad++ can't find anything with this expression. How can I achieve this?
Find:
\s\d+$
Replace:
;
\d+ will match one or more digits. $ will match the end of the line. It is a zero-width assertion, so it consumes no characters (don't worry, the line break will not be replaced in a find/replace operation). And so \d+$ will match one or more digits immediately followed by the end of the line.
I included \s (a single whitespace character) because it looks like you want to replace the space preceding the digits as well.
Note that you will need to do "Replace All" for this to work like you want. (because each regex match is for one instance only)
Try this find/replace:
find:
^(.*) \d+$
replace:
\1;
The find regex above matches anything up to and excluding a final space followed by at least one digit. If the end pattern for a given line is not a space followed by one or more digits, the regex should not match. The replacement is the capture group (what is in parentheses), which is everything up to but excluding the final space and number.
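Both suggestions can be tried in Python with multiline mode, as sketched below. The original attempt ($)(\d+) cannot match because $ asserts the end of the line before the digits, so no digits can follow it on the same line. One deliberate deviation: the first function uses [ \t] instead of \s, since in Python \s can also match a line break. The function names are hypothetical.

```python
import re

def strip_trailing_number(text):
    # [ \t] rather than \s, so the match cannot start on the previous line's break.
    return re.sub(r"[ \t]\d+$", ";", text, flags=re.MULTILINE)

def strip_trailing_number_with_group(text):
    # Keep everything before the final " <digits>" via group 1.
    return re.sub(r"^(.*) \d+$", r"\1;", text, flags=re.MULTILINE)

sample = "ddd 66 ffff 5\nd 44 dds 55"
print(strip_trailing_number(sample))
print(strip_trailing_number_with_group(sample))
```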