Finding and replacing key: value pairs - regex

I'm in the process of porting over a Python library to JavaScript / TypeScript. To help myself out, I'm trying to develop various regex rules that I can apply to files that will automatically convert a lot of the syntax and at least get me close, cleaning up where needed.
I've got the following example:
https://regex101.com/r/mIr0pl/1
this.mk(attrs={keyCollection.key: 40}))
this.mk(attrs={keyCollection.key: 50, override.key: override.value})
this.mk(attrs={keyCollection.key: 60,
override.key: override.value})
I am trying to do a Find/Replace in my editor, to find all key: value pairs associated with attrs dictionaries. Here's the regex I've got:
/attrs={(.+?):\s*(.+?)}/gms
I want to convert it to this:
this.mk(attrs=[[keyCollection.key, 40]]))
this.mk(attrs=[[keyCollection.key, 50], [override.key, override.value]])
this.mk(attrs=[[keyCollection.key, 60],
[override.key, override.value]])
I'm having trouble first nailing down the regex to get the repeated key: value groups, and then also how I would go about utilizing those repeated groups in a replace.
(my editor is VSCode, but I'm using this nifty extension to run these modifications: https://marketplace.visualstudio.com/items?itemName=bhughes339.replacerules)
Any help would be appreciated :)

Since VS Code supports infinite-width lookbehinds, you can use the following rules:
"replacerules.rules": {
    "Wrap the attrs with square brackets first": {
        "find": "(attrs=){([^:{]+:*[^}]*)}",
        "replace": "$1[[$2]]"
    },
    "Format attributes inside attrs": {
        "find": "(?<=attrs=\\[\\[[^\\]]*(?:](?!])[^\\]]*)*),(\\s*)",
        "replace": "],$1["
    },
    "Replace colons with commas inside attrs": {
        "find": "(?<=attrs=\\[\\[[^\\]]*(?:](?!])[^\\]]*)*):",
        "replace": ","
    }
},
"replacerules.rulesets": {
    "Revamp attrs": {
        "rules": [
            "Wrap the attrs with square brackets first",
            "Format attributes inside attrs",
            "Replace colons with commas inside attrs"
        ]
    }
}
Output:
this.mk(attrs=[[keyCollection.key, 40]]))
this.mk(attrs=[[keyCollection.key, 50], [override.key, override.value]])
this.mk(attrs=[[keyCollection.key, 60],
[override.key, override.value]])
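If running the conversion as a script is an option, the same three steps can be sketched in Python with a replacement callback instead of chained lookbehind rules. This is only a minimal sketch: the names convert_attrs, ATTRS and PAIR are made up for illustration, and it assumes that keys and values contain no commas, colons or nested braces.

```python
import re

# Hypothetical helper names; assumes no commas, colons or nested braces
# inside individual keys or values.
ATTRS = re.compile(r"attrs=\{([^{}]*)\}")
PAIR = re.compile(r"(\s*)([^:,]+?)\s*:\s*([^,]+)")


def convert_attrs(source: str) -> str:
    """Rewrite attrs={k: v, ...} as attrs=[[k, v], ...]."""

    def rewrite(match: re.Match) -> str:
        body = match.group(1)
        # Turn each "key: value" into "[key, value]", keeping the leading
        # whitespace (including line breaks) in front of the pair.
        pairs = PAIR.sub(
            lambda m: f"{m.group(1)}[{m.group(2)}, {m.group(3)}]", body
        )
        return f"attrs=[{pairs}]"

    return ATTRS.sub(rewrite, source)


print(convert_attrs("this.mk(attrs={keyCollection.key: 50, override.key: override.value})"))
```

The callback sees one whole attrs dictionary at a time, so the repeated key: value groups can be handled with an inner substitution rather than a single pattern with repeated capture groups.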

Perhaps
(?<=attrs={|,)([^:}]*):([^:},]*)(?=}|,)
would get somewhat closer.
If the file contains other attrs that should not be converted, you may want to filter those out first.
If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com. If you'd like, you can also watch in this link how it matches against some sample inputs.

Related

Regex to capture between two words, and then within that result

I have the following text:
"BOONS": ["Debarrier+Rainbow Shift"
},
"CLUTCH_BOONS": [
"Boost+Wall"
],
Regex:
(?<=[A-Z a-z])(\+)(?=[A-Z a-z])/g
Using that, I can capture all of the + signs, which is great, but I only want to capture the + signs inside of "CLUTCH_BOONS". I have tried really hard with little success.
I also want to close the "BOONS" bracket. I managed to get the left side working properly but cannot match the right quote:
(?<=.*)(\")(?=.*\})
The end result should look like this:
"BOONS": ["Debarrier","Rainbow Shift"]
},
"CLUTCH_BOONS": [
"Boost","Wall"
],
(I was trying to use Atom / regexr to fix problematic json)
For the plus signs, you can use this regex:
"\w+": \[\s*"\w+\K\+
see here:
https://regex101.com/r/fJSl37/1
and for the second one:
"(\s*)},
see here:
https://regex101.com/r/Oy0CiJ/1
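For reference, here is a rough Python sketch of both fixes applied to the sample. The answer above only gives the search patterns, so the replacement strings are an assumption, and the function names are made up. Python's re module has no \K, so the plus-sign step uses the lookaround pattern from the question instead, which (like the expected output) changes the + in both lists.

```python
import re


def split_plus(text: str) -> str:
    # Replace a + that sits between letters/spaces with "," so that one
    # string becomes two. Replacement string is an assumption.
    return re.sub(r'(?<=[A-Za-z ])\+(?=[A-Za-z ])', '","', text)


def close_bracket(text: str) -> str:
    # Insert the missing ] after a quote that is followed directly by "},".
    # Replacement string is an assumption.
    return re.sub(r'"(\s*)\},', r'"]\1},', text)


broken = '"BOONS": ["Debarrier+Rainbow Shift"\n},\n"CLUTCH_BOONS": [\n"Boost+Wall"\n],'
print(close_bracket(split_plus(broken)))
```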

Sublime Workflow for replacing quotes

I use the text editor Sublime Text 3 to edit code, and very often I'll have a string literal wrapped in double quotes that I want to change to single quotes, or vice versa. Right now I scroll to each quotation mark and replace it with the one I want. Is there a faster workflow for this? Say, highlighting the word, or a hotkey or something? I would find it super useful.
If you have a large number of such strings in a file and you want to convert all of them at once, you could use a regex find/replace operation to find and replace them all. You would use Find > Replace... or Find > Find in files... to search for a matching regex that captures the text in the quotes.
For example you could use \"([^"\n]*)\" as a search term and '\1' as the replacement text to swap all double quoted strings for single quotes.
You can't bind something like that to a key directly because Find/Replace can't be used in a Macro, but you could use the RegReplace package to do this if you want to go that route.
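Outside the editor, the same search term and replacement can be tried out in Python to see what they do. This is just a sketch with a made-up function name; note that a string containing an apostrophe (for example "it's") would come out broken, which applies to the editor find/replace as well.

```python
import re


def to_single_quotes(text: str) -> str:
    # Same pattern as above: a double-quoted run with no embedded double
    # quote or newline, re-wrapped in single quotes.
    return re.sub(r'"([^"\n]*)"', r"'\1'", text)


print(to_single_quotes('greeting = "hello" + "world"'))
```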
You can potentially speed up the workflow that you're currently using by taking advantage of multiple cursors, if you're not already doing that.
You could, for example, select the first quote, then press Ctrl+D (Cmd+D on macOS) to select the other one. Now that you have two cursors, press Backspace to delete both quotes and press the new quote character to insert the new ones.
This can't be macro-ized and bound to a key because the find_under_expand command can't be used in a macro, though.
For a full key press solution, as far as I'm aware you would need a plugin of some sort to do this for you. One such example appears to be ChangeQuotes, although I've never personally used it.
It's also possible to write your own small plugin such as the following:
import sublime
import sublime_plugin


class SwapQuotesCommand(sublime_plugin.TextCommand):
    pairs = ["'", '"']

    def run(self, edit):
        # Grow every selection to the enclosing scope (the whole string).
        self.view.run_command("expand_selection", {"to": "scope"})
        for sel in self.view.sel():
            self.toggle(edit, sel)

    def toggle(self, edit, region):
        begin = self.view.substr(region.begin())
        end = self.view.substr(region.end() - 1)

        # Only swap when both ends are the same, known quote character.
        if begin == end and begin in self.pairs:
            index = self.pairs.index(begin) + 1
            new = self.pairs[index % len(self.pairs)]

            for point in (region.begin(), region.end() - 1):
                self.view.replace(edit, sublime.Region(point, point+1), new)
This expands the selection in all of the cursors out by the current scope, and then if both ends of the selection are a matching quote, the quote in use is swapped.
In use, you would use a key binding such as the following, which includes a context to make the key only trigger while the cursor is inside of a string so that it doesn't mess up your selection in cases where it definitely won't work.
{
    "keys": ["ctrl+shift+'"], "command": "swap_quotes",
    "context": [
        { "key": "selector", "operator": "equal", "operand": "string.quoted", "match_all": true }
    ]
},

Regex match text followed by curly brackets

I have a text like this:
"entity"
{
"id" "5040044"
"classname" "weapon_defibrillator_spawn"
"angles" "0 0 0"
"body" "0"
"disableshadows" "0"
"skin" "0"
"solid" "6"
"spawnflags" "3"
"origin" "449.47 5797.25 2856"
editor
{
"color" "0 0 200"
"visgroupshown" "1"
"visgroupautoshown" "1"
"logicalpos" "[-13268 14500]"
}
}
What regular expression would select only this part in Notepad++?
editor
{
"color" "0 0 200"
"visgroupshown" "1"
"visgroupautoshown" "1"
"logicalpos" "[-13268 14500]"
}
First word is always "editor", but the number of lines and content in curly brackets may vary.
editor\s*{\s*(?:\"[a-z]*\"\s*\".*\"\s*)*\}
I also tested it in Notepad++, and it works fine.
The simplest way to find everything between curly brackets would be \{[^{}]*\} (example 1).
You can prepend editor\s* on it so it limits the search to only that specific entry: editor\s*\{[^{}]*\} (example 2).
However... if any of the keys or value strings within editor {...} contain a { or }, you're going to have edge cases.
You'll need to find double-quoted values and essentially ignore them. This example shows how you would stop before the first double quote within the group, and this example shows how to match up through the first key-value pair.
You essentially want to repeatedly match those key-value pairs until no more remain.
If your keys or values can contain \" within them, such as "help" "this is \"quoted\" text", you need to look for that \ character as well.
If there are nested groups within this group, you'll need to handle those recursively. Most regex engines (Notepad++ included) don't handle recursion, though, so to get around this, you copy-paste what you have so far into the pattern at the point where it may come across a nested { and }. This does not handle more than one level of nesting, though.
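To make the difference concrete, here is a small Python sketch comparing the simple pattern with a quote-aware one. The quote-aware pattern is one possible way to write the idea described above, not necessarily the exact expression from the linked examples; the names SIMPLE and QUOTE_AWARE are made up, and nesting is not handled.

```python
import re

# Simple version: anything between the braces except other braces.
SIMPLE = re.compile(r'editor\s*\{[^{}]*\}')

# Quote-aware version (an assumption about how to express the idea):
# inside the braces, accept either a non-brace, non-quote character or a
# whole double-quoted string, in which \" and braces are allowed.
QUOTE_AWARE = re.compile(r'editor\s*\{(?:[^{}"]|"(?:[^"\\]|\\.)*")*\}')

plain = '"solid" "6"\neditor\n{\n"color" "0 0 200"\n"logicalpos" "[-13268 14500]"\n}\n}'
tricky = 'editor\n{\n"help" "a } and \\"quoted\\" text"\n}'

print(SIMPLE.search(plain).group(0))
print(QUOTE_AWARE.search(tricky).group(0))
```

On the plain sample both patterns select the same block; on the tricky sample the simple pattern stops at the } inside the quoted value, while the quote-aware one runs to the real closing brace.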
TL;DR
For Notepad++, this is a single line regex you could use.

vscode regex sub match evaluate instead of concatenate?

Test 300
Test 301
Test 302
I can use regex find to loop through these:
Test (3[0-9]*)
When I try to replace with math, it concatenates instead of evaluating:
Test $1-100
So, it becomes:
Test 300-100
Is it possible to evaluate instead of concatenate, so it becomes:
Test 200
Thanks.
You can use the VS Code Super Replace extension to achieve this.
The Find field is the regular expression.
The Replace field is the replacement expression. Sub-matches written with the $$index syntax are resolved using the function in the Processing function field.
Here is an example of use that answers your question :
There are more extensions that can do this now, including one I wrote, Find and Transform.
With this keybinding:
{
    "key": "alt+m", // whatever keybinding you want
    "command": "findInCurrentFile",
    "args": {
        "find": "(?<=Test\\s)(3\\d\\d)", // get 300+ in capture group 1
        "replace": "$${ return $1 - 100 }", // subtract 100 from capture group 1
        "isRegex": true
    }
}
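The underlying idea, computing the replacement from the captured text instead of pasting it in, can also be sketched outside VS Code. In Python, re.sub accepts a function as the replacement; the function name below is made up for illustration, and the pattern is the one from the keybinding above.

```python
import re


def subtract_100(text: str) -> str:
    # The callback receives each match and returns the evaluated result
    # as a string, rather than concatenating "$1-100".
    return re.sub(
        r"(?<=Test\s)(3\d\d)",
        lambda m: str(int(m.group(1)) - 100),
        text,
    )


print(subtract_100("Test 300\nTest 301\nTest 302"))
```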

Regex get between 2 string with spaces

How could I parse this response text using regex?
info = {
"title": "Developers",
"image": "http://i.ytimg.com/vi/KMU0tzLwhbE/default.jpg",
"length": "3",
"status": "serving",
"progress_speed": "",
"progress": "",
"ads": "",
"pf": "http://70efd.pf.aclst.com/ping.php/10754233/KMU0tzLwhbE?h=882634",
"h": "87d0670f6822946338a610a6b9ec5322",
"px": ""
};
The outcome I need should look like this: "87d0670f6822946338a610a6b9ec5322". However, I can't get the correct syntax. I'm new to regex, and what I have tried is "\s+". Can anyone point me in the right direction?
If you must use a regex, you could use one along the lines of:
"h": "(.+?)",
You can see an example of it here. Just read from the first capture group and that would select your text.
That looks like JSON aside from the info = prefix. If you are working in a language that can parse JSON, that might be a better way of handling that input.
You could also use (?<="h": ")[a-z0-9]+(?="), which will match any sequence of lowercase letters and numbers, as long as the sequence is preceded by "h": " and followed by ". I made an explanation and demonstration here.
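Both suggestions, the lookaround regex and JSON parsing, can be sketched in Python as follows. The question does not say which language is in use, so Python is an assumption here, the function names are made up, and the response is a shortened copy of the sample.

```python
import json
import re

# Shortened copy of the response from the question.
response = '''info = {
"title": "Developers",
"pf": "http://70efd.pf.aclst.com/ping.php/10754233/KMU0tzLwhbE?h=882634",
"h": "87d0670f6822946338a610a6b9ec5322",
"px": ""
};'''


def h_by_regex(text):
    # Lookbehind/lookahead keep the surrounding quotes out of the match.
    match = re.search(r'(?<="h": ")[a-z0-9]+(?=")', text)
    return match.group(0) if match else None


def h_by_json(text):
    # Drop the "info =" prefix and the trailing semicolon, then parse.
    payload = text.split("=", 1)[1].strip().rstrip(";")
    return json.loads(payload)["h"]


print(h_by_regex(response))
print(h_by_json(response))
```

The JSON route does not depend on the exact spacing around the colon, which is where hand-written patterns tend to break.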