Regex in Sublime Text tmLanguage file doesn't use multiline

I'm trying to create a custom syntax language file to highlight and help with creating new documents in Sublime Text 2. I have come pretty far, but I'm stuck at a specific problem regarding Regex searches in the tmLanguage file. I simply want to be able to match a regex over multiple lines within a YAML document that I then convert to PList to use in Sublime Text as a package. It won't work.
This is my regex:
/(foo[^.#]*bar)/
And this is how it looks inside the tmLanguage YAML document:
patterns:
- include: '#test'
repository:
test:
comment: Tester pattern
name: constant.numeric.xdoc
match: (foo[^.#]*bar)
If I build this YAML to a tmLanguage file and use it as a package in Sublime Text, I create a document that uses this custom syntax, try it out and the following happens:
This WILL match:
foo 12345 bar
This WILL NOT match:
foo
12345
bar
In a Regex tester, they should and will both match, but in my tmLanguage file it does not work.
I also already tried to add modifiers to my regex in the tmLanguage file, but the following either don't work or break the document entirely:
match: (/foo[^.#]*bar/gm)
match: /(/foo[^.#]*bar/)/gm
match: /foo[^.#]*bar/gm
match: foo[^.#]*bar
Note: My Regex rule works in the tester, this problem occurs in the tmLanguage file in Sublime Text 2 only.
Any help is greatly appreciated.
EDIT: The reason I use a match instead of begin/end clauses is because I want to use capture groups to give them different names. If someone has a solution with begin and end clauses where you can still name 'foo', '12345' and 'bar' differently, that's fine by me too.

I found that this is impossible to do. This is directly from the manual for TextMate, the editor whose grammar format Sublime Text's tmLanguage files use:
12.2 Language Rules
<...>
Note that the regular expressions are matched against only a single
line of the document at a time. That means it is not possible to use a
pattern that matches multiple lines. The reason for this is technical:
being able to restart the parser at an arbitrary line and having to
re-parse only the minimal number of lines affected by an edit. In most
situations it is possible to use the begin/end model to overcome this
limitation.
My situation is one of the few in which a begin/end model cannot overcome the limitation. Unfortunate.
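The difference between a regex tester and the grammar engine can be illustrated with a small Python sketch. It only mimics the "one line at a time" rule quoted above and is not how Sublime Text is implemented; the function names are made up for the example.

```python
import re

# The pattern from the question, without delimiters or flags.
PATTERN = re.compile(r"foo[^.#]*bar")

def match_whole_document(text):
    """Roughly what a regex tester does: search the whole text at once."""
    return [m.group(0) for m in PATTERN.finditer(text)]

def match_line_by_line(text):
    """Roughly what a TextMate-style grammar does: search each line on its own."""
    hits = []
    for line in text.splitlines():
        hits.extend(m.group(0) for m in PATTERN.finditer(line))
    return hits

single = "foo 12345 bar"
multi = "foo\n12345\nbar"

print(match_whole_document(single), match_line_by_line(single))
print(match_whole_document(multi), match_line_by_line(multi))
```

The negated class [^.#] happily matches newlines when the whole text is searched, which is why the tester reports a match; once each line is searched separately, no single line contains both foo and bar.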

Long time since asked, but are you sure you can't use begin/end? I had similar problems with begin/end until I got a better grasp of the syntax/logic. Here's a rough example from a json tmLanguage file I'm doing (don't know the proper YAML syntax).
"repository": {
"foobar": {
"begin": "foo(?=[^.#]*)", // not sure about what's needed for your circumstance. the lookahead probably only covers the foo line
"end": "bar",
"beginCaptures": {
"0": {
"name": "foo"
}
},
"endCaptures": {
"0": {
"name": "bar"
}
},
"patterns": [
{"include": "#test-after-foobarmet"}
]
},
"test-after-foobarmet": {
"comment": "this can apply to many lines before next bar so you may need more testing",
"comment2": "you could continue to have captures here that go to another deeper level...",
"name": "constant.numeric.xdoc",
"match": "anyOtherRegexNeeded?"
}
}
I didn't follow your "i need to number the different sections between the '#' and '.' characters.", but you should be able to have a test in test-after-foobarmet with more captures if needed for naming different groups between foo and bar.
There's a good explanation of TextMate grammars here. It may still contain some errors, but it explains the topic in a way that was helpful for me when I didn't know anything about it.
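To see why begin/end can span lines while match cannot, here is a rough Python sketch of the idea: the scanner still looks at one line at a time, but it remembers between lines whether it is inside a foo...bar region. This is a simplified illustration, not the real grammar engine; scope_lines and the digit pattern used for the inner rule are assumptions made for the example.

```python
import re

BEGIN = re.compile(r"foo")   # plays the role of "begin"
END = re.compile(r"bar")     # plays the role of "end"
INNER = re.compile(r"\d+")   # stand-in for the nested "match" rule

def scope_lines(lines):
    """Return (text, scope) pairs, carrying the inside/outside state across lines."""
    inside = False
    out = []
    for line in lines:
        pos = 0
        while pos < len(line):
            if not inside:
                m = BEGIN.search(line, pos)
                if not m:
                    break
                out.append((m.group(0), "foo"))
                inside = True
                pos = m.end()
            else:
                end = END.search(line, pos)
                limit = end.start() if end else len(line)
                # Inner rules only apply between begin and end.
                for m in INNER.finditer(line, pos, limit):
                    out.append((m.group(0), "constant.numeric.xdoc"))
                if not end:
                    break  # region stays open; continue on the next line
                out.append((end.group(0), "bar"))
                inside = False
                pos = end.end()
    return out

print(scope_lines(["foo", "12345", "bar"]))
```

Each regex is still applied to a single line; only the inside flag survives from one line to the next, which is what lets foo, 12345 and bar receive three different names even when they sit on separate lines.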

Related

VS Code snippet regex for relative path

I'm coding in Elixir/Phoenix Framework using VS Code and trying to transform the following relative path
lib/shop_web/live/product_live/index.ex
into
ShopWeb.Live.ProductLive.Index
using snippets.
The closest to that was the regex below
"${RELATIVE_FILEPATH/^(lib\\/|test\\/)(\\w)|(.ex|.exs)$|\\/(\\w)|_(\\w)/${2:/upcase}${4:/upcase}${5:/upcase}/g}"
which gives me the following output
ShopWebLiveProductLiveIndex
I could not find a way to insert the missing dots.
Can anyone help me?
Thanks in advance!
Try this:
"test7": {
"prefix": "al",
"body": [
// my version
"${RELATIVE_FILEPATH/^([^\\/\\\\]+[\\/\\\\])|(\\.ex|\\.exs)$|([^._\\/\\\\]+)|_|([\\/\\\\])/${3:/capitalize}${4:+.}/g}",
// your version tweaked
"${RELATIVE_FILEPATH/^(lib[\\/\\\\]|test[\\/\\\\])(\\w)|(\\.ex|\\.exs)$|([\\/\\\\])(\\w)|_(\\w)/${2:/upcase}${4:+.}${5:/upcase}${6:/upcase}/g}",
],
"description": "alert line"
}
[Note: I made these work for both path.separators / and \. If you don't need that you could shorten the snippet by a lot.]
Your version was very close. I changed it to \\.ex just to make the dots explicit.
I also added a 4th capturing group ([\\/\\\\]) just before the 5th as in ([\\/\\\\])(\\w).
Now that 4th group can be used in a conditional ${4:+.} to add the .'s where the path separators were.
My version is a little shorter - it matches but doesn't use whatever directory is first, be it lib or test or whatever. If that doesn't work for you it is easy to modify that bit of the regexp. I shortened it to 4 capture groups.
The end of my version, ([^._\\/\\\\]+)|_|([\\/\\\\]), breaks down as follows:
([^._\\/\\\\]+) : match a run of characters other than ., _, / and \ (group 3), or
_ : match it but we aren't using it so no need for a capture group, or
([\\/\\\\]) : match just the path separator in group 4 to use in the conditional.
${4:+.} : conditional, if there is a group 4 (a path separator), add a dot.
Thanks to @Mark, my snippet to create a module in Elixir or Phoenix Framework looks like this now:
"Module": {
"prefix": "defmodule",
"description": "Create a module by the Elixir naming convention",
"body": [
"defmodule ${RELATIVE_FILEPATH/^([^\\/\\\\]+[\\/\\\\])|(\\.ex|\\.exs)$|([^._\\/\\\\]+)|_|([\\/\\\\])/${3:/capitalize}${4:+.}/g} do",
"\t$1",
"end"
],
}
Following the naming convention, the output for the file in my question, lib/shop_web/live/product_live/index.ex, will be:
defmodule ShopWeb.Live.ProductLive.Index do
end
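For anyone who wants to check the four-alternative regex outside the editor, the same logic can be sketched in Python with a replacement callback. This mirrors the snippet's capture groups but is only an approximation: the function name module_name is hypothetical, and the VS Code /capitalize modifier is imitated by upper-casing the first character.

```python
import re

# Same alternatives as the snippet: leading directory | extension | word | "_" | separator
PATTERN = re.compile(r"^([^/\\]+[/\\])|(\.ex|\.exs)$|([^._/\\]+)|_|([/\\])")

def module_name(relative_path):
    def transform(m):
        if m.group(3):                 # a word: capitalize its first letter
            word = m.group(3)
            return word[:1].upper() + word[1:]
        if m.group(4):                 # a path separator becomes a dot
            return "."
        return ""                      # leading dir, extension, underscore: dropped
    return PATTERN.sub(transform, relative_path)

print(module_name("lib/shop_web/live/product_live/index.ex"))
print(module_name("test\\shop_web\\cart_test.exs"))
```

The callback returns an empty string for everything that matched without filling group 3 or 4, which is the same effect as leaving those groups out of the snippet's replacement part.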

search and replace with regex to increment numbers in Visual Studio Code

I'm currently working on a big svg sprite.
The different images are always 2000px apart.
What I have is:
<g transform="translate(0,0)">
<g transform="translate(0,2000)">
<g transform="translate(0,4000)">
After the regex replacement I want this, i.e. 2000 added to the second number:
<g transform="translate(0,2000)">
<g transform="translate(0,4000)">
<g transform="translate(0,6000)">
My problem now is that some new images have to be added at the top of the document, which means I would need to change all the numbers, and there are quite a lot of them.
I was thinking about using regular expressions and found out that they work in the search bar of VS Code. The thing is, I have never worked with regex and I'm a bit confused.
Could someone give me a solution and an explanation for incrementing all the sample numbers by 2000?
I hope to understand it afterwards so I can get a foothold in the topic.
I'm also happy with links to tutorials, either general ones or ones for my specific use case.
Thank you very much :)
In VSCode, you can't replace with an incremented value inside a match/capture. You can only do that inside a callback function passed as the replacement argument to a regex replace function/method.
You may use Notepad++ to perform these replacements after installing Python Script plugin. Follow these instructions and then use the following Python code:
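def increment_after_openparen(match):
    # Keep Group 1 unchanged and add 2000 to the number captured in Group 2.
    return "{0}{1}".format(match.group(1), str(int(match.group(2)) + 2000))
# `editor` is provided by the Notepad++ Python Script plugin.
editor.rereplace(r'(transform="translate\(\d+,\s*)(\d+)', increment_after_openparen)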
See the regex demo.
Note:
(transform="translate\(\d+,\s*)(\d+) captures into Group 1 transform="translate( followed by one or more digits, a comma, and zero or more whitespace characters (the (transform="translate\(\d+,\s*) part), and then captures into Group 2 one or more digits (the (\d+) part).
match.group(1) is the Group 1 contents, match.group(2) is the Group 2 contents.
Basically, any group is formed with a pair of unescaped parentheses and the group count starts with 1. So, if you use a pattern like (Item:\s*)(\d+)([.;]), you will need to use return "{0}{1}{2}".format(match.group(1),str(int(match.group(2))+2000), match.group(3)). Or, return "{}{}{}".format(match.group(1),str(int(match.group(2))+2000), match.group(3)).
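If Notepad++ is not available, the same callback idea works in plain Python with re.sub, which accepts a function as the replacement. The sketch below is a standalone variant of the script above; the function name shift_translate and the offset parameter are assumptions for the example, and reading and writing the actual SVG file is left out.

```python
import re

PATTERN = re.compile(r'(transform="translate\(\d+,\s*)(\d+)')

def shift_translate(svg_text, offset=2000):
    """Add `offset` to the second number of every translate(x,y)."""
    def bump(match):
        # Group 1 is kept as-is; Group 2 is the number to increment.
        return match.group(1) + str(int(match.group(2)) + offset)
    return PATTERN.sub(bump, svg_text)

sample = (
    '<g transform="translate(0,0)">\n'
    '<g transform="translate(0,2000)">\n'
    '<g transform="translate(0,4000)">'
)
print(shift_translate(sample))
```

Because every match is replaced in a single pass, a value that has already been incremented is never matched and incremented a second time.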
You can use the extension Regex Text Generator.
Select the numbers with Multi Cursor; this can be done with a regex Find and Alt+Enter in the find box
Run command: Generate text based on regular expression
As Match Expression use: (\d+)
As generator expression use: {{=N[1]+2000}}
You get a preview of the result.
Press Enter if OK, or Esc to abort
You can save this kind of search-and-replace as a predefined entry in the setting regexTextGen.predefined:
"regexTextGen.predefined": {
"Add/Subtract a number" : {
"originalTextRegex": "(\\d+)",
"generatorRegex": "{{=N[1]+1}}"
}
}
You can edit the expressions (change the 1) if you choose a predefined.
SublimeText3 with the Text-Pastry add-in can also do \i
I wrote an extension, Find and Transform, to make math operations like this in regex find-and-replace quite simple (it does much more, such as path variables, conditionals, string operations, etc.). In this case, this keybinding (in your keybindings.json) will do what you want:
{
"key": "alt+r", // whatever keybinding you want
"command": "findInCurrentFile",
"args": {
"find": "(?<=translate\\(\\d+,\\s*)(\\d+)", // double-escaped
"replace": "$${ return $1 + 2000 }$$",
"isRegex": true,
// "restrictFind": "document", // or line/once/selections/etc.
}
}
That could also be a setting in your settings.json if you wanted that - see the README.
(?<=translate\\(\\d+,\\s*) a positive lookbehind, you can use non-fixed length items in the lookbehind, like \\d+.
(\\d+) capture group 1
The replace: $${ return $1 + 2000 }$$
$${ <your string or math operation here> }$$
return $1 + 2000 add 2000 to capture group 1

Using RegEx select line based on positive criteria but exclude certain lines based on negative

I apologize if there is an answer for this somewhere, but my search skills have failed me if there is.
I'm using UltraEdit and there are lines I need to remove from some JSON schemas to make comparing them easier.
So given the following:
"PaymentMethod": {
"$id": "/properties/PaymentMethod",
"items": {
"$ref": "PaymentMethod.json"
},
"type": "array"
}
Using this RegEx:
^.*\".*\"\: \".*$\r\n
Selects these lines:
"$id": "/properties/PaymentMethod",
"$ref": "PaymentMethod.json"
"type": "array"
What I need to do is skip the $ref line.
I've tried to get negative lookaround to work using (?!json) in various ways with the selection criteria and have failed miserably.
The purpose of this is to delete the selected lines.
Thanks for any help.
Update:
To clarify, there are lines I want to delete that match the criteria my Regex finds, but I do not want to delete the $ref line.
So I was hoping to find a relatively easy way to do this using straight up perl regex within UltraEdit.
I've created a workaround with a Python script so I can get my work done, but it would still be interesting to find out if there is a way to do this. :)
Don't write your own broken parser; use an existing one.
use Cpanel::JSON::XS qw( decode_json );
my $json_utf8 = '...';
my $data = decode_json($json_utf8);
my $payment_method = $data->{PaymentMethod};
my $id = $payment_method->{'$id'};
my $item = $payment_method->{items}{'$ref'};
my $type = $payment_method->{type};
Using a JSON module to manipulate the data directly would be a more robust solution, but a quick and dirty way to edit the JSON file would be to run this on the command line.
Again, this is not a good way to manipulate JSON, but it may be suitable for your ad hoc case.
perl -ne 'print unless /"\$ref":/' file.json > new_file.json
If you don't want $ref after the initial quotes:
^.*\"(?!\$ref).*\"\: \".*$\r\n
see it here: https://regex101.com/r/yxTwck/1
or to exclude based on .json" at the end of the line:
^.*\".*\"\: \".*(?<!\.json")$\r\n
Also note that you are using greedy quantifiers (.* vs. .*?). If your idea is to have the first .* stop at the first ", you should probably use [^\n"]* which will prevent line feeds or "s from being consumed. Your regex matches """""""""""""" "type": "array", for example.
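The negative lookahead can be checked quickly outside UltraEdit. The sketch below applies it in Python, line by line, to the sample from the question. It is an adaptation, not the exact UltraEdit expression: the trailing $\r\n is dropped because the text is split into lines first, and the names DELETABLE and strip_lines are made up for the example.

```python
import re

# Lines of the form  "key": "value...  unless the key starts with $ref.
DELETABLE = re.compile(r'^.*"(?!\$ref).*": ".*$')

SAMPLE = '''"PaymentMethod": {
  "$id": "/properties/PaymentMethod",
  "items": {
    "$ref": "PaymentMethod.json"
  },
  "type": "array"
}'''

def strip_lines(text):
    """Drop every line the pattern selects; keep everything else."""
    kept = [line for line in text.splitlines() if not DELETABLE.match(line)]
    return "\n".join(kept)

print(strip_lines(SAMPLE))
```

The $ref line survives because the only quote that could start the match is immediately followed by $ref, so the lookahead rejects it and there is no other ": " sequence on the line to fall back on.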

VSCode Code-Snippets transform: downcase and capitalize at the same time

I have this:
${1/([A-Z]*)(?:_)([A-Z]+)*/${1:/downcase}${2:/downcase}/g}
How can I apply both downcase and capitalize to the same (second) group? Something like this:
${1/([A-Z]*)(?:_)([A-Z]+)*/${1:/downcase}${2:/downcase/capitalize}/g}
I want to transform ZXC_ASD to zxcAsd.
Try it like this:
"camelCaseSnail": {
"scope": "javascript,typescript",
"prefix": "log",
"body": "${1/([A-Z]*)(?:_)(?:([A-Z])([A-Z]+))*/${1:/downcase}${2:/capitalize}${3:/downcase}/g}"
}
Basically, I've changed the second capture group ([A-Z]+)* to a non-capture group that has two inner capture groups (?:([A-Z])([A-Z]+))*, a single letter for camel-case and the rest, which I refer to in the replace/transform part: /downcase}${2:/capitalize}${3:/downcase}/
Apparently coming to vscode v1.58 is a /camelcase modifier. So your case is as easy as
"${1/(.*)/${1:/camelcase}/}"
Tested in the Insiders Build. See Add a camelCase transform for Snippet variables. See also https://stackoverflow.com/a/51228186/836330 for another example.
Old answer:
Using the undocumented (see Snippet regex: match arbitrary number of groups and transform to CamelCase) /pascalcase transform, it is quite easy:
"${1/([A-Z]*)(?:_)([A-Z]+)*/${1:/downcase}${2:/pascalcase}/g}"
as the /pascalcase will do both the /capitalize and the /downcase at once.
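The intended transformation itself is easy to state in ordinary code, which can help when checking what a snippet transform should produce. The following Python sketch lower-cases everything and then upper-cases the letter after each underscore; the function name to_camel is hypothetical and this is not how VS Code implements /camelcase.

```python
import re

def to_camel(name):
    """ZXC_ASD -> zxcAsd: downcase everything, then capitalize after each '_'."""
    lowered = name.lower()
    return re.sub(r"_([a-z])", lambda m: m.group(1).upper(), lowered)

print(to_camel("ZXC_ASD"))
print(to_camel("ONE_TWO_THREE"))
```

Doing the downcase first and the capitalize second is the same two-step idea as splitting the second group into a first letter and a remainder in the snippet above.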

Grok debugging - Match first only regex not working as intended

So I have the following log message:
[localhost-startStop-1] SystemPropertiesConfigurer$ExportingPropertyOverrideConfigurer loadProperties > Loading properties file from class path resource [SystemConfiguration.overrides]
I'm trying to match the first thread ( [localhost-startStop-1] ) with the following pattern:
EVENT_THREAD (\[.+?\])
This works when I pass it into regex101.com but doesn't work when I represent it as
%{(\[.+?\]):EVENT_THREAD} on grokdebugger for reasons unknown to me...
Can someone help me understand this?
Thanks,
See Grok help:
Sometimes logstash doesn’t have a pattern you need. For this, you have a few options.
First, you can use the Oniguruma syntax for named capture which will let you match a piece of text and save it as a field:
(?<field_name>the pattern here)
So, use (?<EVENT_THREAD>\[.+?\]).
Alternately, you can create a custom patterns file.
Create a directory called patterns with a file in it called extra (the file name doesn’t matter, but name it meaningfully for yourself)
In that file, write the pattern you need as the pattern name, a space, then the regexp for that pattern.
# contents of ./patterns/extra:
EVENT_THREAD (?:\[.+?\])
Then use the patterns_dir setting in this plugin to tell logstash where your custom patterns directory is:
filter {
grok {
patterns_dir => ["./patterns"]
match => { "message" => "%{EVENT_THREAD:evt_thread}" }
}
}
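The named-capture idea can be tried outside Logstash as well. The Python sketch below extracts the first bracketed token from the log line in the question into a field called EVENT_THREAD. Note the syntax difference: Python's re module writes named groups as (?P<name>...), whereas the Oniguruma syntax used by grok is (?<name>...). The variable names are made up for the example.

```python
import re

LOG = (
    "[localhost-startStop-1] "
    "SystemPropertiesConfigurer$ExportingPropertyOverrideConfigurer "
    "loadProperties > Loading properties file from class path resource "
    "[SystemConfiguration.overrides]"
)

# Lazy .+? stops at the first closing bracket, so only the thread is captured.
EVENT_THREAD = re.compile(r"(?P<EVENT_THREAD>\[.+?\])")

match = EVENT_THREAD.search(LOG)
fields = match.groupdict() if match else {}
print(fields)
```

search returns only the first match, so the second bracketed token at the end of the line is ignored, which is the "match first only" behaviour the question asks for.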