Search and replace with regex to increment numbers in Visual Studio Code

I'm currently working on a big SVG sprite.
The different images are always 2000px apart.
What I have is:
<g transform="translate(0,0)">
<g transform="translate(0,2000)">
<g transform="translate(0,4000)">
After the regex replacement I want this, i.e. 2000 added to the second number:
<g transform="translate(0,2000)">
<g transform="translate(0,4000)">
<g transform="translate(0,6000)">
My problem is that some new images have to be put at the top of the document, which means I would need to change all the numbers, and there are quite a lot of them.
I was thinking about using regular expressions and found out that they work in the search bar of VS Code. The thing is, I have never worked with regex and I'm kind of confused.
Could someone give me a solution and an explanation for incrementing all the sample numbers by 2000?
I hope to understand it afterwards so I can get a foot in the door on this topic.
I'm also happy with just links to tutorials, either general ones or ones for my specific use case.
Thank you very much :)

In VSCode, you can't replace with an incremented value inside a match/capture. You can only do that inside a callback function passed as the replacement argument to a regex replace function/method.
You may use Notepad++ to perform these replacements after installing Python Script plugin. Follow these instructions and then use the following Python code:
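def increment_after_openparen(match):
    # Keep Group 1 unchanged and add 2000 to the number captured in Group 2
    return "{0}{1}".format(match.group(1), str(int(match.group(2)) + 2000))

# editor is the object the Python Script plugin provides for the current document
editor.rereplace(r'(transform="translate\(\d+,\s*)(\d+)', increment_after_openparen)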
See the regex demo.
Note:
(transform="translate\(\d+,\s*)(\d+) captures into Group 1 the text transform="translate( followed by one or more digits, a comma, and zero or more whitespace characters (the (transform="translate\(\d+,\s*) part), and then captures into Group 2 one or more digits (the (\d+) part).
match.group(1) is the Group 1 contents, match.group(2) is the Group 2 contents.
Basically, any group is formed with a pair of unescaped parentheses and the group count starts with 1. So, if you use a pattern like (Item:\s*)(\d+)([.;]), you will need to use return "{0}{1}{2}".format(match.group(1),str(int(match.group(2))+2000), match.group(3)). Or, return "{}{}{}".format(match.group(1),str(int(match.group(2))+2000), match.group(3)).
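Outside Notepad++, the same callback idea can be tried in plain Python with re.sub. This is only a minimal sketch: the names PATTERN, add_offset and the sample svg string are made up for illustration, and the pattern is the one from the answer above.

```python
import re

# Group 1 is the unchanged prefix up to the second number,
# Group 2 is the number that should be incremented.
PATTERN = re.compile(r'(transform="translate\(\d+,\s*)(\d+)')

def add_offset(match, offset=2000):
    # Rebuild the match: unchanged prefix + incremented number.
    return match.group(1) + str(int(match.group(2)) + offset)

# Hypothetical sample input taken from the question.
svg = '''<g transform="translate(0,0)">
<g transform="translate(0,2000)">
<g transform="translate(0,4000)">'''

result = PATTERN.sub(add_offset, svg)
print(result)
```

re.sub calls add_offset once per match and inserts whatever string it returns, which is exactly the "callback as replacement" mechanism the answer describes.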

You can use the extension Regex Text Generator.
Select the numbers with Multi Cursor, can be done with Regex Find and Alt+Enter in find box
Run command: Generate text based on regular expression
As Match Expression use: (\d+)
As generator expression use: {{=N[1]+2000}}
You get a preview of the result.
Press Enter if OK, or Esc to abort
You can save this type of search/replace as a predefined entry in the setting regexTextGen.predefined (the backslash has to be doubled inside the JSON string):
"regexTextGen.predefined": {
"Add/Subtract a number" : {
"originalTextRegex": "(\\d+)",
"generatorRegex": "{{=N[1]+1}}"
}
}
You can edit the expressions (change the 1) if you choose a predefined.

SublimeText3 with the Text-Pastry add-in can also do \i

I wrote an extension, Find and Transform, that makes these math operations in regex find-and-replace quite simple (and much more, like path variables, conditionals, string operations, etc.). In this case, this keybinding (in your keybindings.json) will do what you want:
{
"key": "alt+r", // whatever keybinding you want
"command": "findInCurrentFile",
"args": {
"find": "(?<=translate\\(\\d+,\\s*)(\\d+)", // double-escaped
"replace": "$${ return $1 + 2000 }$$",
"isRegex": true,
// "restrictFind": "document", // or line/once/selections/etc.
}
}
That could also be a setting in your settings.json if you wanted that - see the README.
(?<=translate\\(\\d+,\\s*) a positive lookbehind, you can use non-fixed length items in the lookbehind, like \\d+.
(\\d+) capture group 1
The replace: $${ return $1 + 2000 }$$
$${ <your string or math operation here> }$$
return $1 + 2000 add 2000 to capture group 1

Related

How to find non-matching strings with Regex in Sublime?

I have this text:
<Path Fill="None"
PathData="M244.87,363.97 L245.38,363.91 M245.38,363.91 L245.46,363.84 M245.46,363.84 L245.52,363.75 M245.52,363.75 L245.54,363.7 M245.54,363.7 L246.07,370.18 M246.07,370.18 L245.95,370.25 M245.95,370.25 L245.8,370.37 M245.8,370.37 L245.63,370.54 M245.63,370.54 L245.52,370.73 M245.52,370.73 L245.42,370.9 M245.42,370.9 L245.17,368.03 M245.17,368.03 L244.87,363.97"
Stroke="#898989" StrokeWidth="0.5"/>
<Path Fill="None"
PathData="M247.4,371.21 L247.49,371.16 M247.49,371.16 L247.91,371.13 M247.91,371.13 L249.74,371.01 M249.74,371.01 L252.52,370.82 M252.52,370.82 L252.72,370.83 M252.72,370.83 L252.72,370.84 M252.72,370.84 L252.71,370.89 M252.71,370.89 L252.72,370.95 M252.72,370.95 L252.75,371.38 M252.75,371.38 L251.86,371.44 M251.86,371.44 L249.62,371.63 M249.62,371.63 L247.55,371.79 M247.55,371.79 L247.51,371.35 M247.51,371.35 L247.47,371.28 M247.47,371.28 L247.42,371.22 M247.42,371.22 L247.4,371.21"
Stroke="#878787" StrokeWidth="0.5"/>
<Path Fill="None"
PathData="M246.46,372.67 L246.47,372.05 M246.47,372.05 L246.47,372.05 M246.47,372.05 L246.52,372.07 M246.52,372.07 L246.58,372.09 M246.58,372.09 L247.44,372.02 M247.44,372.02 L248.68,371.91 M248.68,371.91 L248.81,373 M248.81,373 L248.07,373.06 M248.07,373.06 L247.88,373.07 M247.88,373.07 L248.54,379.11 M248.54,379.11 L247.62,379.18 M247.62,379.18 L247.2,379.21 M247.2,379.21 L247.15,379.24 M247.15,379.24 L247.12,379.27 M247.12,379.27 L247.06,379.17 M247.06,379.17 L246.83,376.84 M246.83,376.84 L246.46,372.67"
Stroke="#898989" StrokeWidth="0.5"/>
And I am trying to find and delete the paths which are not of a certain color, i.e. - #898989. I would like to use regex to find the non-matching strings.
I am trying the following:
.*(<Path Fill).*(\r\n|\r|\n).*(\r\n|\r|\n).*(?!#898989).*(\r\n|\r|\n)
But this returns the same as the one I would use to find the matching strings:
.*(<Path Fill).*(\r\n|\r|\n).*(\r\n|\r|\n).*(#898989).*(\r\n|\r|\n)
I thought the ?! was a negative lookahead, and would exclude those strings. It seems to not change the results, though.
Any help?
There are many regex solutions to your problem. Let's first discuss why the regex you proposed does not work as expected.
Problem
.*(<Path Fill).*(\r\n|\r|\n).*(\r\n|\r|\n).*(?!#898989).*(\r\n|\r|\n)
The problem occurs at the part
.*(?!#898989).*(\r\n|\r|\n)
The regex simply says: match as much of anything as you can, then check that #898989 does not start at the current position, then again match as much as you can.
The "match as much of anything as you can" part is what causes the problem. The first .* actually consumes the whole line:
Stroke="#898989" StrokeWidth="0.5"/>
Then (?!#898989) comes into play which will succeed since after > there is no #898989. To make it obvious, change the regex to -
.*(?:<Path Fill).*[\r\n].*[\r\n](.*)(?!#898989).*
This regex does the same thing. In this regex, (\r\n|\n|\r) is replaced with [\r\n]. Nothing is being captured by the starting brackets (?:<Path Fill). However, this time the .* before #898989 is surrounded by (...) to highlight the text being captured by it.
Observe the yellow lines to see what is being captured by the .* before the #898989. Here is the link: https://regex101.com/r/2R54uW/1
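The same effect can be reproduced in a few lines of Python. This is only an illustrative sketch (Python's re engine instead of Sublime's, and the variable names are made up), but the greedy behaviour is the same:

```python
import re

line = 'Stroke="#898989" StrokeWidth="0.5"/>'

# Greedy .* consumes the whole line; the lookahead is then tested at the
# end of the line, where "#898989" cannot follow, so it succeeds.
loose = re.match(r'.*(?!#898989)', line)
print(loose.group(0) == line)

# Anchoring the lookahead right after Stroke=" makes it test the colour itself.
strict = re.match(r'.*Stroke="(?!#898989)', line)
print(strict)
```

The first pattern matches the full line even though it contains #898989, while the second one finds no match at all on this line.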
Correction
As already mentioned in the comments, the regex can be corrected by forcing the .* to stop at Stroke=" and then making the position check.
.*(?:<Path Fill).*[\r\n].*[\r\n].*Stroke=\"(?!#898989).*[\r\n]
Here is another regex that does the same thing -
.*(?:<Path Fill).*[\r\n].*[\r\n]((?!#898989).)*/>
Final Thoughts
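Try using [\r\n] in place of (\r\n|\r|\n), since a character class is faster than alternation. Note that [\r\n] matches a single character, so a Windows \r\n line ending needs [\r\n]+ or \r?\n to be consumed in full.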
If you have any additional doubts please comment.
If there can not be a < and > char in the Path, you could assert that the color does not occur before the closing />
<Path Fill(?![^<>]*#898989[^<>]*/>)[^<>]*/>
Regex demo
If it should be the Stroke specifically
<Path Fill(?![^<>]*Stroke="#898989"[^<>]*/>)[^<>]*/>
Regex demo
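As a rough check of the Stroke-specific pattern, here is a small Python sketch that deletes every Path whose stroke is not #898989. The PathData values are shortened from the question, the names OTHER_COLOUR, unwanted and kept are made up, and the trailing \n? is an addition so that the line break after a removed element goes away too:

```python
import re

xml = '''<Path Fill="None"
 PathData="M244.87,363.97 L245.38,363.91"
 Stroke="#898989" StrokeWidth="0.5"/>
<Path Fill="None"
 PathData="M247.4,371.21 L247.49,371.16"
 Stroke="#878787" StrokeWidth="0.5"/>
<Path Fill="None"
 PathData="M246.46,372.67 L246.47,372.05"
 Stroke="#898989" StrokeWidth="0.5"/>'''

# [^<>]* cannot cross a tag boundary, so the lookahead only inspects
# the attributes of the current <Path .../> element.
OTHER_COLOUR = re.compile(
    r'<Path Fill(?![^<>]*Stroke="#898989"[^<>]*/>)[^<>]*/>\n?'
)

unwanted = OTHER_COLOUR.findall(xml)   # elements with another stroke colour
kept = OTHER_COLOUR.sub('', xml)       # document with those elements removed
print(kept)
```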

Regex in Notepad++ to select on string length between specific XML tags

I'm working with Emergency Services data in the NEMSIS XSD. I have a field, which is constrained to only 50 characters. I've searched this site extensively, and tried many solutions - Notepad++ rejects all of them, saying not found.
Here's an XML Sample:
<E09>
<E09_01>-5</E09_01>
<E09_02>-5</E09_02>
<E09_03>-5</E09_03>
<E09_04>-5</E09_04>
<E09_05>this one is too long Non-Emergency - PT IS BEING DISCHARGED FROM H AFTER BEING ADMITTED FOR FAILURE TO THRIVE AND ALCOHOL WITHDRAWAL</E09_05>
</E09>
<E09>
<E09_01>-5</E09_01>
<E09_02>-5</E09_02>
<E09_03>-5</E09_03>
<E09_04>-5</E09_04>
<E09_05>this one is is okay</E09_05>
</E09>
I've tried solutions naming the E09_05 tag in different ways, using <\/E09_05> for the closing tag as I've seen in some examples, and as just </E09_05> as I've seen in others. I've tried ^.{50,}$ between them, or [a-zA-Z]{50,}$ between them, I've tried wrapping those in-between expressions in () and without. I even tried just [\s\S]*? in between the tags. The only thing that Notepad++ finds is when I use ^.{50,}$ by itself with no XML tags ... but then I wind up hitting on all the E13_01 tags (which are EMS narratives, and always > 50 characters) -- making for painstaking and wrist-aching clicks.
I wanted to use XSLT for this, but each E09_05 field needs too much individual, hands-on tweaking to automate it. Perl is not an option in this environment (and not a tool I know at all anyway).
To be truly sublime, both E09_05 and E09_08 fields with string lengths >50 need to be what is selected on the search ... but no other elements of any kind or length.
Thanks in advance. I'm sure I'm just missing some subtle \, or () or [] somewhere ... hopefully ...
The following regex will find the text content of <E09_05> elements with more than 50 characters.
(?<=<E09_05>).{51,}?(?=</E09_05>)
Explanation
(?<=<E09_05>) Start matching right after <E09_05>
.{51,}? Match 51 or more characters (in a single line)
The ? makes it reluctant, so it'll stop at first </E09_05>
(?=</E09_05>) Stop matching right before </E09_05>
For truly sublime matching, i.e. both E09_05 and E09_08 fields with string lengths >50, use:
(?<=<(E09_0[58])>).{51,}?(?=</\1>)
Explanation
<(E09_0[58])> Match <E09_05> or <E09_08>, and capture the name as group 1
</\1> Use \1 backreference to match name inside </name>
If you want to shorten the text with ellipsis at the end, e.g. Hello World with max length 8 becomes Hello..., use:
Find what: (?<=<(E09_0[58])>)(.{47}).{4,}(?=</\1>)
Replace with: \2...
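If a scripting language is available, the ellipsis replacement can be sketched in Python as well. This is a variation, not the exact Notepad++ pattern: it captures the tag name in the match instead of using a lookbehind, and it uses [^<] instead of . on the assumption that the field text never contains a < character. TOO_LONG, truncate and the sample string are hypothetical names.

```python
import re

# <E09_05> or <E09_08>, 47 characters to keep, at least 4 more characters
# (so 51+ in total), then the matching closing tag via the \1 backreference.
TOO_LONG = re.compile(r'<(E09_0[58])>([^<]{47})[^<]{4,}</\1>')

def truncate(xml):
    # Keep the first 47 characters and append "..." for a total of 50.
    return TOO_LONG.sub(r'<\1>\2...</\1>', xml)

sample = ('<E09_05>this one is too long Non-Emergency - PT IS BEING '
          'DISCHARGED FROM H AFTER BEING ADMITTED</E09_05>\n'
          '<E09_05>this one is is okay</E09_05>')

fixed = truncate(sample)
print(fixed)
```

Fields of 50 characters or fewer are left untouched because the pattern requires at least 51 characters between the tags.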

How to combine multiple RegEx commands for Notepad++ using capture groups and alternations?

I am converting exported SQL views as files to a different syntax using a separate specialized conversion tool. This tool can't handle certain commands and formatting so I'm using Notepad++ with RegEx to alter the files ahead of time.
So far I am getting the results that I want, but it takes three separate Find/Replace actions. I'd like to reduce these three RegEx actions down to one if possible.
Find: (.*)(CREATE VIEW.*\nGO)(.*)
Replace: \2
Find: (CREATE VIEW )(.*)(\r\nAS)
Replace: \1"\2"\3
Find: (oldschema1\.|\[oldschema1\]\.|\[|\]|TOP \(100\) PERCENT|oldschema2\.)|(^GO$)|(\A^(.*?))
Replace: (?1)(?2\;)(?3SET SCHEMA schemaname\; \n\n\1)
I'm using Notepad++ 7.7.1 64-bit, Find/Replace with Regular Expression search mode - ". matches newline" check on.
You'll see in my code that I'm already using capture groups with alternation. I thought I could combine the first two RegEx steps as additional capture groups to Step 3 but it doesn't work out, possibly because they are nested.
I tried referencing the nested groups by incrementing the referencing number accordingly, but it doesn't work (blanks out the result).
Here is an example SQL view file. It's not a working view because I added "oldschema2" so the RegEx would have something to find for one of the replacements, but it's representative as an example here.
garbage
text
beforehand
CREATE VIEW [oldschema1].[viewname]
AS
SELECT DISTINCT
TOP (100) PERCENT oldschema1.TABLENAME.FIELD1, oldschema1.TABLENAME.FIELD2
FROM oldschema1.TABLENAME
WHERE (oldschema1.TABLENAME.FIELD3 = N'Z003') AND oldschema2.TABLENAME.FIELD2 = 1
ORDER BY oldschema1.TABLENAME.FIELD1
GO
garbage
text
after
Here is some additional details of what I'm trying to achieve with each pass.
Notepad++ RegEx Step 1 - isolate view block from CREATE VIEW to GO
Find:
(.*)(CREATE VIEW.*\nGO)(.*)
Replace:
\2
Step 2 - put quotes around view name
Find:
(CREATE VIEW )(.*)(\r\nAS)
Replace:
\1"\2"\3
Step 3 - remove/replace various texts and insert a line at the beginning of the file
Find:
(oldschema1\.|\[oldschema1\]\.|\[|\]|TOP \(100\) PERCENT|oldschema2\.)|(^GO$)|(\A^(.*?))
Replace:
(?1)(?2\;)(?3SET SCHEMA schemaname\; \n\n\1)
The expected output from the above example would be:
SET SCHEMA schemaname;
CREATE VIEW "viewname"
AS
SELECT DISTINCT
TABLENAME.FIELD1, TABLENAME.FIELD2
FROM TABLENAME
WHERE (TABLENAME.FIELD3 = N'Z003') AND TABLENAME.FIELD2 = 1
ORDER BY TABLENAME.FIELD1
;
which I achieve with the above three steps, but I'd like to do it in one Find/Replace if possible.
I'm pretty new to RegEx, and StackOverflow for that matter. Your help is greatly appreciated.
Step 1
I'm not so sure about it, but I'm guessing that maybe we would want an expression similar to:
[\s\S]*?(CREATE VIEW[\s\S]*GO\s*)[\s\S]*
to be replaced with $1, where our desired data is in this capturing group:
(CREATE VIEW[\s\S]*GO\s*)
and we can even remove \s*:
(CREATE VIEW[\s\S]*GO)
and just try:
[\s\S]*?(CREATE VIEW[\s\S]*GO)[\s\S]*
with an m flag.
In the right panel of this demo, the expression is further explained, if you might be interested.
Step 2
We can likely try:
(CREATE VIEW)(.*)
and replace with:
SET SCHEMA schemaname;\n\n$1 "viewname"
Demo
Step 3
This step would probably be done with an expression similar to:
TOP \(100\) PERCENT |oldschema1\.
being replaced with an empty string.
Demo
Step 4:
\s*GO being replaced with \n; or just ; and we might likely have the desired output, not sure though.
Demo
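If running a script before the conversion tool is acceptable, the steps can also be chained in one place. The following Python sketch is only one possible reading of the question: convert_view is a made-up helper, it assumes the file uses \n line endings (Python's text mode normalises \r\n when reading), and it hard-codes the schema names from the example.

```python
import re

def convert_view(sql, schema="schemaname"):
    # Isolate the block from CREATE VIEW up to the first line that is just GO.
    block = re.search(r'CREATE VIEW.*?^GO$', sql, flags=re.S | re.M)
    if block is None:
        raise ValueError("no CREATE VIEW ... GO block found")
    text = block.group(0)
    # Drop schema prefixes, TOP (100) PERCENT and any remaining brackets.
    text = re.sub(
        r'\[oldschema1\]\.|oldschema[12]\.|TOP \(100\) PERCENT ?|[\[\]]',
        '', text)
    # Put double quotes around the view name.
    text = re.sub(r'^(CREATE VIEW )(\S+)', r'\1"\2"', text)
    # GO becomes ';' and the SET SCHEMA line goes first.
    text = re.sub(r'^GO$', ';', text, flags=re.M)
    return f"SET SCHEMA {schema};\n\n{text}"

example = """garbage
text
beforehand
CREATE VIEW [oldschema1].[viewname]
AS
SELECT DISTINCT
TOP (100) PERCENT oldschema1.TABLENAME.FIELD1, oldschema1.TABLENAME.FIELD2
FROM oldschema1.TABLENAME
WHERE (oldschema1.TABLENAME.FIELD3 = N'Z003') AND oldschema2.TABLENAME.FIELD2 = 1
ORDER BY oldschema1.TABLENAME.FIELD1
GO
garbage
text
after"""

converted = convert_view(example)
print(converted)
```

Doing the bracket and schema clean-up before quoting the view name keeps each pattern simple, which sidesteps the nested-group numbering problem described in the question.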

Google sheets REGEXREPLACE() replace text with itself

In Google Sheets, I'd like to replace the following text:
The ANS consists of fibers that stimulate ((1smooth muscle)), ((2cardiac muscle)), and ((3glandular cells)).
with this text below:
The ANS consists of fibers that stimulate {{c1::smooth muscle}}, {{c2::cardiac muscle}}, and {{c3::glandular cells}}.
I know if I use =REGEXREPLACE(E3, "\(\([0-9]*", "{{c::") I can get here:
The ANS consists of fibers that stimulate {{c::smooth muscle)), {{c::cardiac muscle)), and {{c::glandular cells)).
BUT I don't know how to keep the original numbers
Nvm, figured it out.
Putting parentheses around the term allows you to reference it again in your replacement string.
For example, my solution for my problem was this:
=REGEXREPLACE(E3, "\(\(([0-9]*)", "{{c$1::")
This works because putting [0-9]* in parentheses like so: ([0-9]*) allowed it to be referenced as $1 in my substitution string.
I assume that if I had another phrase enclosed in parentheses after that it would be able to be referenced with $2.
Hope this helps someone in the future.
Pass 1:
Search Pattern: \(\((\d)(.+)
Replacement: {{C$1::$2
Pass 2:
Search Pattern: \)\)
Replacement: }}
I played around with it some more and this will do it in one pass:
Search Pattern: \(\((\d)(.+?)\b\)\)
Replacements: {{C$1::$2}}
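For comparison, the same capture-group substitution can be tried outside Google Sheets. This is a small Python sketch (the variable names are made up); note that Python's re.sub refers to groups as \1 and \2 in the replacement instead of $1 and $2:

```python
import re

text = ("The ANS consists of fibers that stimulate ((1smooth muscle)), "
        "((2cardiac muscle)), and ((3glandular cells)).")

# Group 1: the digits after "((", Group 2: the shortest text up to "))".
cloze = re.sub(r'\(\((\d+)(.+?)\)\)', r'{{c\1::\2}}', text)
print(cloze)
```

The lazy (.+?) is what keeps each match from running on to the last )) in the sentence.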

Using regex multiple capture groups to split up a string

I have a file that looks like this...
"1234567123456","V","0","0","BLAH","BLAH","BLAH","BLAH"
"1234567123456","D","TEST1 "
"1234567123456","D","TEST 2~TEST3"
"1234567123456","R","TEST4~TEST5"
"1234567123457","V","0","0","BLAH","BLAH","BLAH","BLAH"
"1234567123457","D","TEST 6"
"1234567123457","D","TEST7"
"1234567123457","R","TEST 8~TEST9~TEST,10"
All I'm trying to do is parse the D and R lines. The ~ is used in this case as a separator. So the end results would be...
"1234567123456","V","0","0","BLAH","BLAH","BLAH","BLAH"
"1234567123456","D","TEST1 "
"1234567123456","D","TEST 2"
"1234567123456","D","TEST3"
"1234567123456","R","TEST4"
"1234567123456","R","TEST5"
"1234567123457","V","0","0","BLAH","BLAH","BLAH","BLAH"
"1234567123457","D","TEST 6"
"1234567123457","D","TEST7"
"1234567123457","R","TEST 8"
"1234567123457","R","TEST9"
"1234567123457","R","TEST,10"
I'm using regex on applications like Textpad and Notepad++. I have not figured out how to use a regex like /.+/g because the applications do not like the forward slashes. So I don't think I can use things like the global modifier. I currently have the following regex...
//In a program like Textpad/Notepad++
<FIND> "(.{13})","D","([^~]*)~(.*)
<REPLACE> "\1","D","\2"\n"\1","D","\3
Now if I run a find and replace with the above params a few times it would work fine (for the D lines only). The problem is there is an unknown number of lines to be made. For example...
"1234567123456","D","TEST1~TEST2~TEST3~TEST4~TEST5"
"1234567123457","D","TEST1~TEST2~TEST3"
"1234567123458","D","TEST1~TEST2"
"1234567123459","D","TEST1~TEST2~TEST3~TEST4"
I was hoping to be able to use a MULTI capture group to make this work. I found this PAGE talking about the common mistake between repeating a capturing group and capturing a repeated group. I need to capture a repeated group. For some reason I just could not make mine work right though. Anyone else have an idea?
Note: If I could get rid of the leading and trailing spaces EX: "1234567123456","D","TEST1 " ending up as "1234567123456","D","TEST1" that would be even better but not necessary.
RESOURCES:
http://www.regular-expressions.info/captureall.html
http://regex101.com/
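One way to get a variable number of output lines is to let a replacement callback do the splitting instead of trying to capture a repeated group. The sketch below uses Python rather than Textpad/Notepad++ (in Notepad++ it would need something like the Python Script plugin), the names ROW, expand and split_rows are made up, and it assumes every key is exactly 13 characters and that the third field is the last one on D and R lines, as in the samples. It also trims leading and trailing spaces, per the note above.

```python
import re

# A D or R line: 13-character key, record type, then one ~-separated field.
ROW = re.compile(r'^"(.{13})","([DR])","(.*)"$', re.M)

def expand(match):
    key, kind, payload = match.groups()
    # Emit one output line per ~-separated part, with spaces trimmed.
    return "\n".join(
        f'"{key}","{kind}","{part.strip()}"' for part in payload.split("~")
    )

def split_rows(text):
    return ROW.sub(expand, text)

data = '''"1234567123456","V","0","0","BLAH","BLAH","BLAH","BLAH"
"1234567123456","D","TEST1 "
"1234567123456","D","TEST 2~TEST3"
"1234567123457","R","TEST 8~TEST9~TEST,10"'''

print(split_rows(data))
```

V lines are left alone because the pattern only accepts D or R in the second field.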