I have the following text:
"BOONS": ["Debarrier+Rainbow Shift"
},
"CLUTCH_BOONS": [
"Boost+Wall"
],
Regex:
(?<=[A-Z a-z])(\+)(?=[A-Z a-z])/g
Using that, I can capture all of the + signs, which is great, but I only want to capture the + signs inside "CLUTCH_BOONS". I have tried hard with little success.
I also want to close the "BOONS" bracket. I managed to get the left side working properly but cannot match the right quote:
(?<=.*)(\")(?=.*\})
The end result should look like this:
"BOONS": ["Debarrier","Rainbow Shift"]
},
"CLUTCH_BOONS": [
"Boost","Wall"
],
(I was trying to use Atom / regexr to fix problematic json)
For the plus signs, you can use this regex:
"\w+": \[\s*"\w+\K\+
see here:
https://regex101.com/r/fJSl37/1
and for the second one:
"(\s*)},
see here:
https://regex101.com/r/Oy0CiJ/1
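If the fix is scripted rather than done in an editor, the same two steps can be sketched in Python. This is only a rough sketch: the helper names `split_plus_in_array` and `close_bracket` are made up for illustration, and because Python's `re` module does not support `\K`, the first step uses a capture group and a callback instead.

```python
import re

broken = '''"BOONS": ["Debarrier+Rainbow Shift"
},
"CLUTCH_BOONS": [
"Boost+Wall"
],'''

def split_plus_in_array(text, key):
    # Hypothetical helper: find the array that follows `"key": [` and stop
    # at the first ] or }, then replace + with "," inside that span only.
    pattern = re.compile(r'("' + re.escape(key) + r'":\s*\[)([^\]}]*)')
    return pattern.sub(
        lambda m: m.group(1) + m.group(2).replace('+', '","'), text)

def close_bracket(text):
    # Hypothetical helper: insert ] after a quote that is directly followed
    # by optional whitespace and "},", keeping that whitespace as it was.
    return re.sub(r'"(\s*)\},', r'"]\1},', text)

step1 = split_plus_in_array(broken, "CLUTCH_BOONS")  # only CLUTCH_BOONS changes
step2 = close_bracket(step1)                         # BOONS gets its closing ]
print(step2)
```

Calling `split_plus_in_array(step2, "BOONS")` afterwards would also split the BOONS entry, which is what the desired end result in the question shows.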
I am pretty bad at regex and need some help implementing my idea with the already complicated if-else syntax being used for user-defined snippets in VS Code.
I want to achieve the following:
Whenever I enter a number for variable $1, I want the snippet to create the
text "MAXVALUE $1" at the placeholder position.
If anything else is entered, nothing should be printed.
My current line for this with $1 being the variable I enter is:
"\t ${1/([0-9])|([a-zA-Z])/${1:+MAXVALUE }${2:+ }/}"
In this state I can capture the entire number EXCEPT the FIRST CHARACTER entered, so what gets printed is MAXVALUE _mynumber_minus_char_at_index_0, LOL?!
If I enter a text, MAXVALUE won't be printed, but again the value from $1 minus the character at index 0 is being printed on screen.
Any help would be highly appreciated. If you got some useful links that explain advanced snippet creation for those kinda cases, I would be thankful as well.
For RegEx, well, time to learn them, so why not starting with a crazy-ass example like this - at least for me it is like rocket-science atm :D
Thanks in advance and best regards.
Using this snippet:
"Maxvalue": {
"prefix": "cll",
"body": [
"\t ${1/([0-9]+)|([a-zA-Z]+)/${1:+MAXVALUE }$1${2:+ }/}",
],
"description": "maxvalue"
},
([0-9]+) captures all the numbers you type; or
([a-zA-Z]+) captures all the letters you type
You were using ([0-9]), which captures, but more importantly
matches, only the first digit. Anything the regex does not match is not transformed by the snippet transform; it just remains
untouched. That is why you were seeing everything but the first
digit in the output.
You weren't actually outputting $1 anywhere - you see I added it to the transform after the MAXVALUE conditional.
${1:+MAXVALUE } is a conditional which means if there is a capture group 1, do something, in this case output MAXVALUE. That 1 in ${1:+MAXVALUE } is not a reference to your $1 tabstop. It is only a reference to the first capture group of your regex.
So you correctly outputted MAXVALUE when you had a capture group 1, but you didn't follow that up by outputting capture group 1 anywhere.
${2:+ } is another conditional where the 2 refers to capture group 2, if any, here ([a-zA-Z]+). So if there is a capture group 2, a space will be output. If there is no capture group 2, the conditional will fail and provide no output of its own. If you want nothing printed when you type letters, then match them and do nothing with them, as in the following:
"\t ${1/([0-9]+)|[a-zA-Z]+/${1:+MAXVALUE }$1/}", this will match all the letters you type (before tabbing to complete the transform) and they will disappear because you matched them and then didn't output them in the transform part anywhere.
If you simply want those letters to remain, don't match them as in
"\t ${1/([0-9]+)/${1:+MAXVALUE }$1/}"
If there is something you don't understand let me know.
[By the way, your question title mentions if/else conditions but you are using only if conditionals.]
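To see why the single-character pattern leaves the rest of the input behind, the transform can be imitated outside VS Code. The following is only a rough Python emulation, not VS Code's implementation: it assumes a transform without the g flag replaces just the first match, and `transform`, `original_format` and `fixed_format` are names made up for this sketch.

```python
import re

def transform(value, pattern, render):
    # Replace only the first match and leave everything else untouched,
    # which is the behaviour this sketch assumes for a transform without g.
    return re.sub(pattern, render, value, count=1)

def original_format(m):
    # Mirrors ${1:+MAXVALUE }${2:+ } : the captured text itself is never output.
    return ("MAXVALUE " if m.group(1) else "") + (" " if m.group(2) else "")

def fixed_format(m):
    # Mirrors ${1:+MAXVALUE }$1${2:+ } : capture group 1 is output as well.
    return (("MAXVALUE " if m.group(1) else "")
            + (m.group(1) or "")
            + (" " if m.group(2) else ""))

# The single-character pattern consumes only the first digit.
print(transform("123", r"([0-9])|([a-zA-Z])", original_format))  # MAXVALUE 23
# The + quantifier consumes the whole number, and $1 puts it back.
print(transform("123", r"([0-9]+)|([a-zA-Z]+)", fixed_format))   # MAXVALUE 123
```

The first call reproduces the symptom from the question: the first digit is swallowed by the match and the remaining digits are simply left where they were.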
I'm in the process of porting over a Python library to JavaScript / TypeScript. To help myself out, I'm trying to develop various regex rules that I can apply to files that will automatically convert a lot of the syntax and at least get me close, cleaning up where needed.
I've got the following example:
https://regex101.com/r/mIr0pl/1
this.mk(attrs={keyCollection.key: 40}))
this.mk(attrs={keyCollection.key: 50, override.key: override.value})
this.mk(attrs={keyCollection.key: 60,
override.key: override.value})
I am trying to do a Find/Replace in my editor, to find all key: value pairs associated with attrs dictionaries. Here's the regex I've got:
/attrs={(.+?):\s*(.+?)}/gms
I want to convert it to this:
this.mk(attrs=[[keyCollection.key, 40]]))
this.mk(attrs=[[keyCollection.key, 50], [override.key, override.value]])
this.mk(attrs=[[keyCollection.key, 60],
[override.key, override.value]])
I'm having trouble first nailing down the regex to get the repeated key: value groups, and then also how I would go about utilizing those repeated groups in a replace.
(my editor is VSCode, but I'm using this nifty extension to run these modifications: https://marketplace.visualstudio.com/items?itemName=bhughes339.replacerules)
Any help would be appreciated :)
Since VS Code already supports the infinite-width lookbehind construct, you may use
"replacerules.rules": {
"Wrap the attrs with square brackets first": {
"find": "(attrs=){([^:{]+:*[^}]*)}",
"replace": "$1[[$2]]"
},
"Format attributes inside attrs": {
"find": "(?<=attrs=\\[\\[[^\\]]*(?:](?!])[^\\]]*)*),(\\s*)",
"replace": "],$1["
},
"Replace colons with commas inside attrs": {
"find": "(?<=attrs=\\[\\[[^\\]]*(?:](?!])[^\\]]*)*):",
"replace": ","
}
},
"replacerules.rulesets": {
"Revamp attrs": {
"rules": [
"Wrap the attrs with square brackets first",
"Format attributes inside attrs",
"Replace colons with commas inside attrs"
]
}
}
Step #1 regex demo
Step #2 regex demo
Step #3 regex demo
Output:
this.mk(attrs=[[keyCollection.key, 40]]))
this.mk(attrs=[[keyCollection.key, 50], [override.key, override.value]])
this.mk(attrs=[[keyCollection.key, 60],
[override.key, override.value]])
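If a script is an option instead of editor rules, the same conversion can be sketched with a replacement callback. This is a minimal sketch under stated assumptions: `convert_attrs` is a made-up name, and the approach assumes that keys and values contain no commas, colons or braces of their own.

```python
import re

def convert_attrs(source):
    # Hypothetical helper: rewrite attrs={k: v, ...} as attrs=[[k, v], ...].
    def rewrite(match):
        body = match.group(1)
        # Split on commas, keeping the whitespace (including newlines)
        # that follows each comma so the original layout survives.
        parts = re.split(r',(\s*)', body)
        pairs, gaps = parts[0::2], parts[1::2]
        out = []
        for i, pair in enumerate(pairs):
            key, value = pair.split(':', 1)
            out.append('[' + key.strip() + ', ' + value.strip() + ']')
            if i < len(gaps):
                out.append(',' + gaps[i])
        return 'attrs=[' + ''.join(out) + ']'
    return re.sub(r'attrs=\{([^}]*)\}', rewrite, source)

sample = "this.mk(attrs={keyCollection.key: 50, override.key: override.value})"
print(convert_attrs(sample))
# this.mk(attrs=[[keyCollection.key, 50], [override.key, override.value]])
```

Doing the pair splitting in code avoids the need for repeated capture groups, which a single find/replace pattern cannot reference individually.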
Maybe,
(?<=attrs={|,)([^:}]*):([^:},]*)(?=}|,)
might be somewhat closer.
If the files contain other attrs that should not be converted, you may want to filter those out first.
If you wish to explore/simplify/modify the expression, it's been explained on the top right panel of regex101.com. If you'd like, you can also watch in this link how it would match against some sample inputs.
I have a CLI calculator, and I am adding a square root function. I have this regex that parses the user's input:
string.scan(/\d*\.?\d+\^?|[-+\/*%()]|sqrt\(\d*\.?\d+\)/)
It works with these inputs as expected:
calc -o "sqrt(9)" #=> ["sqrt(9)"]
calc -o "sqrt(9) + sqrt(9)" #=> ["sqrt(9)", "+", "sqrt(9)"]
However, my regex has not accounted for nested sqrt. With this,
calc -o "sqrt(6+3)"
I desire the output:
["sqrt(6+3)"]
because when the program finds sqrt while searching, it will simply recursively apply the scan method with the regex until it gets into the deepest nested formula and work its way back. But I get:
["(", "6", "+", "3", ")"]
I have tried to capture all but within the sqrt brackets, but it also captures everything in every other bracket as well. So I am having trouble capturing sqrt(9) and sqrt(6+3) without one messing with the other.
Any guidance is much appreciated.
UPDATE: So following on from an answer provided, perhaps I need to explain my program more so you get an idea of what is going on.
Say I have the input 2 * (3 + 5), this will be interpreted into the following array:
["2", "*", "(", "3", "+", "5", ")"]
The program follows PEDMAS, so it first looks for parentheses, and in this situation it finds them. A basic loop looks like this:
function find_brackets
  start_i, end_i
  for each token, index in array do
    if token == "("
      start_i = index
      find_brackets
    end
    if token == ")"
      end_i = index
      # end of nest
    end
  end
end
I can then pass my start and end locations within the array to a function that will iterate over each nested operation. So the above can interpret this just fine:
calc -o "2 * (6 + (2 * 2))"
#=> ["2", "*", "(", "6", "+", "(", "2", "*", "2", ")", ")"]
My idea is that when it comes across the sqrt function, it will simply reuse the same regex used for the user's input, create a whole new array, and do the above on it. Then once it's done, I take index 0 and place it where the sqrt used to be.
EDIT: I didn't actually mention it, but I basically want to capture the entirety of a sqrt call. So anything and everything in something like sqrt(5+5*(6/2+sqrt(9))
UPDATE: I think I have found a solution
So I did some more reading to learn how *, + and ? work, experimented a bit more, and I think (at least so far) this is working:
string.scan(/\d*\.?\d+\^?|[-+\/*%()^]|sqrt\(.+?\)+|pi/)
calc -o "sqrt(9)" #=> ["sqrt(9)"]
calc -o "sqrt(3+6)" #=> ["sqrt(3+6)"]
calc -o "sqrt(9) + sqrt(9)" #=> ["sqrt(9)", "+", "sqrt(9)"]
calc -o "sqrt(9) + 2" #=> ["sqrt(9)", "+", "2"]
Will update in a bit
There are a few issues which are getting in your way:
First, a regular expression like this one does not handle recursive searching, so it cannot find matching parentheses at arbitrary depth. If you want to be able to accept parenthetical expressions inside of sqrt() you're going to need to attack it from a different angle (the answer there points to this algorithm).
If you're only expecting to match simple expressions inside the sqrt(), then the next problem is: in your sqrt sub-expression, you're optionally matching a literal period character \.? between digits, but you're not allowing any operators. You can approach this directly by adding a match for the operators and an optional second float into that sub-expression. In the following example, I wrapped the addition in a non-capturing group (?:_expression_) and used a * to match it 0 or more times.
sqrt\(\d*\.?\d+\) becomes sqrt\(\d*\.?\d+(?: *?[-+\/*%]? *?\d*\.?\d*)*\)
Last, you will most likely want to evaluate the contents of sqrt() before evaluating the sqrt() itself. To do this, you'll want to make use of capture groups. There are a few ways you could approach this, but one way is to have the entire expression wrapped in unescaped parentheses (capture group 1), then the contents of sqrt() should also be wrapped in unescaped parentheses (capture group 2).
/(\d*\.?\d+\^?|[-+\/*%()]|sqrt\((\d*\.?\d+(?: *?[-+\/*%]? *?\d*\.?\d*)*)\))/
The results from your scan will be an array of capture group arrays. Running it against "sqrt(9) + sqrt(9)" will return [["sqrt(9)", "9"], ["+", nil], ["sqrt(9)", "9"]] so anytime capture group 2 is not nil, it contains the contents of a sqrt().
You can see this regex in action at Regexr
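For arbitrarily nested input such as sqrt(5+5*(6/2+sqrt(9))), one different angle is to keep the regex for simple tokens and count parentheses by hand whenever sqrt( is seen. The following is a rough Python sketch of that idea rather than a drop-in for the Ruby program; `tokenize` and `TOKEN` are names invented here. A lazy pattern such as sqrt\(.+?\)+ seems to stop too early on an input like sqrt((1+2)*3), which is the case the depth counter is meant to cover.

```python
import re

# Simple tokens: numbers, single-character operators and parentheses, and pi.
TOKEN = re.compile(r'\d*\.?\d+|[-+/*%()^]|pi')

def tokenize(expr):
    tokens = []
    i = 0
    while i < len(expr):
        if expr[i].isspace():
            i += 1
        elif expr.startswith('sqrt(', i):
            # Walk forward from the opening parenthesis, tracking depth,
            # until the parenthesis that closes this sqrt call is found.
            depth = 0
            j = i + 4  # index of the '(' in 'sqrt('
            while j < len(expr):
                if expr[j] == '(':
                    depth += 1
                elif expr[j] == ')':
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            if depth != 0:
                raise ValueError('unbalanced parentheses in sqrt call')
            tokens.append(expr[i:j + 1])  # the whole sqrt(...) as one token
            i = j + 1
        else:
            m = TOKEN.match(expr, i)
            if m is None:
                raise ValueError('unexpected character at index %d' % i)
            tokens.append(m.group())
            i = m.end()
    return tokens

print(tokenize("sqrt(6+3)"))    # ['sqrt(6+3)']
print(tokenize("2 * (3 + 5)"))  # ['2', '*', '(', '3', '+', '5', ')']
```

Each sqrt(...) token can then be stripped of its wrapper and passed back through `tokenize`, which matches the recursive plan described in the question.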
I have the following string:
#Function1['param1', -100, 'param3'] * 120 + #Function2 - 15.5
Here #Function1 and #Function2 are name of functions and may be any word ([a-z]+). Function name will always begin with #.
How do we write a RegEx expression so that I get both the functions:
#Function1['param1', -100, 'param3']
#Function2
If a function name is followed by [ then I need everything up to the next ], otherwise just the function name.
I have tried the following regex, but it works for up to only 1 parameter. It stops at the first comma after the function name which has parameters.
#[^\s\]]+?[\s\]]
Edit:
I think this does the job, but I'm not sure whether it will miss any cases:
#\w+?(?:\[.+?\])?(?:\s)
Another possible solution:
\#\w*(\s|\[.*])
With the global flag set
I think I should answer my own question. (Sorry for posting the question without trying too hard)
This is the best I could get to:
#\w+?(?:\[[^]]+?\])?(?=\s)
It matches the function name, then anything between [ and ] is made optional. And the search ends with the next space.
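A quick way to check a pattern like this against the sample string is a short script. The sketch below uses Python and a slightly different pattern, chosen for this example: it drops the trailing whitespace lookahead, on the assumption that a function name at the very end of the string should still be found.

```python
import re

# '#' plus a name, optionally followed by one [...] block without nested brackets.
FUNCTION_CALL = re.compile(r"#\w+(?:\[[^\]]*\])?")

text = "#Function1['param1', -100, 'param3'] * 120 + #Function2 - 15.5"
print(FUNCTION_CALL.findall(text))
# ["#Function1['param1', -100, 'param3']", '#Function2']
```

Because `\w+` is greedy and stops at the first non-word character, no lookahead is needed to end the match.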
How could I parse this response text using Regex?
info = {
"title": "Developers",
"image": "http://i.ytimg.com/vi/KMU0tzLwhbE/default.jpg",
"length": "3",
"status": "serving",
"progress_speed": "",
"progress": "",
"ads": "",
"pf": "http://70efd.pf.aclst.com/ping.php/10754233/KMU0tzLwhbE?h=882634",
"h": "87d0670f6822946338a610a6b9ec5322",
"px": ""
};
The outcome I need should look like this: "87d0670f6822946338a610a6b9ec5322". However, I can't get the correct syntax. I'm new to regex, and what I have tried is "\s+". Can anyone point me in the right direction?
If you must use a regex, you could use a regex along the lines of:
"h"\s*:\s*"(.+?)",
You can see an example of it here. Just read from the first capture group and that would select your text.
That looks like JSON aside from the info = prefix. If the language you are working in can parse JSON, that might be a better way of handling that input.
You could also use (?<="h": ")[a-z0-9]+(?="), which will match any sequence of lowercase letters and numbers, as long as the sequence is preceded by "h": " and followed by ". I made an explanation and demonstration here.
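As an illustration of the JSON route, here is a minimal Python sketch. It assumes the response always has the shape shown in the question, a single object wrapped in an info = prefix and a trailing semicolon; `parse_info` is a name made up for the example, and the sample is shortened to a few of the fields.

```python
import json

response = '''info = {
"title": "Developers",
"length": "3",
"h": "87d0670f6822946338a610a6b9ec5322",
"px": ""
};'''

def parse_info(response_text):
    # Keep only the text from the first { to the last }, then parse it as JSON.
    start = response_text.index('{')
    end = response_text.rindex('}')
    return json.loads(response_text[start:end + 1])

info = parse_info(response)
print(info["h"])  # 87d0670f6822946338a610a6b9ec5322
```

Parsing the object this way does not depend on the order of the keys or on the spacing around the colon, which a hand-written regex has to account for.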