I tried various combinations but was unsuccessful at figuring out the correct regex pattern.
Basically I want to capture patterns like examples below:
{{variable}}
{{variable.function1{param1}}}
{{variable.function1{param1}.function2{param2}}}
and so on..
I want to capture variable, function1, param1, function2 and param2 from this.
So far I have the regex below, which does not work completely:
\{\{([^{}.]+)(\.([^{}]+)\{([^{}]+)\})*\}\}
If I try to apply above pattern on example 3, I get below groups
Group#1 - variable
Group#2 - .function2{param2}
Group#3 - function2
Group#4 - param2
I was expecting something like this:
Group#1 - variable
Group#2 - .function1{param1}
Group#3 - function1
Group#4 - param1
Group#5 - .function2{param2}
Group#6 - function2
Group#7 - param2
PS: you can check without writing code at http://regexr.com/3e4st
The reason your pattern doesn't work is that it matches the whole expression once, and within a single match each capture group can only return one value. What you need is the global flag, or its equivalent in whatever language you're using, so that each piece is returned as a separate match.
Example: https://regex101.com/r/pO8xN2/3
The number of groups in a regex match is fixed. See an older post of mine with more explanation. In your case that number is 4.
When a group matches repeatedly, you will usually only be able to access the value of the last occurrence in the string. That's what you see with your issue: function2, param2.
Some regex engines allow accessing previous group captures (for example the one in .NET). The majority don't. Whether you can solve your issue easily or not strictly depends on your regex engine.
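For engines that only keep the last capture of a repeated group, one workaround is to match in two passes: capture the whole chain of calls as a single group, then scan that group for the individual calls. Below is a minimal Python sketch of that idea, assuming input shaped like the examples in the question. The names TEMPLATE, CALL and parse are illustrative, and the outer pattern is a small variation on the one in the question, not the original.

```python
import re

# Outer pattern: group 1 is the variable, group 2 is the whole chain of
# ".function{param}" calls (possibly empty), captured as one string.
TEMPLATE = re.compile(r"\{\{([^{}.]+)((?:\.[^{}.]+\{[^{}]+\})*)\}\}")

# Inner pattern: one ".function{param}" call.
CALL = re.compile(r"\.([^{}.]+)\{([^{}]+)\}")


def parse(text):
    """Return (variable, [(function, param), ...]) or None if text does not match."""
    m = TEMPLATE.fullmatch(text)
    if m is None:
        return None
    # findall with two groups returns a list of (function, param) tuples.
    return m.group(1), CALL.findall(m.group(2))


print(parse("{{variable.function1{param1}.function2{param2}}}"))
# ('variable', [('function1', 'param1'), ('function2', 'param2')])
```

The second pass is what gets around the fixed number of groups: the number of calls found is decided by how many times CALL matches, not by how many groups the pattern declares.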
I'm a real rookie with regex. I usually get my way through it by back-referencing a static word in the string and then using basic functions to find what I need, but this one has me stuck.
I have this address string "MITCHAM SA 5062", and to get it through this parser I need to split out the suburb, state and postcode.
I can get "MITCHAM" using /\w+/
And postcode "5062" using /\d+/
The state I'm struggling with, though. I think I'm close; I'm currently using (?!\w+) (\w+). The issue is that it still picks up the whitespace before the suburb, which won't be allowed in the database.
Halp pls!
Edit - A few people asked whether the state will ever be more than two letters. Correct, it could be. It won't always be SA.
Edit 2 - Another person asked whether one whole regex can capture it all. No, the way our SaaS product works, I need to map each bit of data to the correct place separately (using a GUI).
If MITCHAM SA 5062 is the full string, and you want to capture each group in one regex, then this will work:
^(\w+)\s*?(\w+)\s*(\d+)
If you are trying to capture the middle section only you can try:
\s(\w+)\s
Or if for some reason you cannot use capturing groups, this will work for the middle portion.
(?<=\s)(\w+)(?=\s+)
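As a quick check of the patterns above, here is a small Python sketch run against the sample string from the question. It assumes the suburb is a single word, as in the example; a multi-word suburb would need a different pattern. The variable names are illustrative.

```python
import re

address = "MITCHAM SA 5062"

# All three parts in one pass, using the first pattern from the answer.
suburb, state, postcode = re.match(r"^(\w+)\s*?(\w+)\s*(\d+)", address).groups()

# Middle part only: the lookarounds require whitespace on both sides
# but keep it out of the match itself.
state_only = re.search(r"(?<=\s)\w+(?=\s)", address).group()

print(suburb, state, postcode)  # MITCHAM SA 5062
print(repr(state_only))         # 'SA'
```

Because the lookbehind and lookahead are zero-width, the matched text has no leading or trailing space, which is the property the question asks for.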
As you know, Google links can be pretty unwieldy:
https://www.google.com/search?q=some+search+here&source=hp&newwindow=1&ei=A_23ssOllsUx&oq=some+se....
I have MANY Google links saved that I would like to clean up to make them look like so:
https://www.google.com/search?q=some+search+here
The only issue is that I cannot figure out the correct regex pattern for Vim to do this.
I figure it must be something like this:
:%s/&source=[^&].*//
:%s/&source=[^&].*[^&]//
:%s/&source=.*[^&]//
But none of these are working; they start at &source, and replace until the end of the line.
Also, the search?q=some+search+here can appear anywhere after the .com/, so I cannot rely on it being in the same place every time.
So, what is the correct Vim regex pattern to use in order to clean up these links?
Your example can easily be dealt with by using a very simple pattern:
:%s/&.*
because you want to keep everything that comes before the second parameter, which is marked by the first & in the string.
But, if the q parameter can be anywhere in the query string, as in:
https://www.google.com/search?source=hp&newwindow=1&q=some+search+here&ei=A_23ssOllsUx&oq=some+se....
then no amount of capturing or whatnot will be enough to cover every possible case with a single pattern, let alone a readable one. At this point, scripting is really the only reasonable approach, preferably with a language that understands URLs.
--- EDIT ---
Hmm, scratch that. The following seems to work across the board:
:%s#^\(https://www.google.com/search?\)\(.*\)\(q=.\{-}\)&.*#\1\3
We use # as separator because of the many / in a typical URL.
We capture a first group, up to and including the ? that marks the beginning of the query string.
We match whatever comes between the ? and q= in a second group, which is not used in the replacement.
We capture a third group, the q parameter, up to and excluding the next &.
We replace the whole thing with the first capture group followed by the third capture group.
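For comparison, the scripting route mentioned earlier in this answer could look roughly like the Python sketch below, which parses the URL with the standard library and does not rely on a regex. The function name keep_only_q is made up for illustration, and the sketch assumes every link has a q parameter worth keeping.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def keep_only_q(url):
    """Drop every query parameter except q, wherever it appears in the query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key == "q"  # exact name comparison, so oq= is not kept
    ]
    # Rebuild the URL with the filtered query and no fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(keep_only_q(
    "https://www.google.com/search?source=hp&newwindow=1&q=some+search+here&ei=A_23ssOllsUx"
))
# https://www.google.com/search?q=some+search+here
```

Note that the query value is decoded and re-encoded along the way, so unusual percent-encodings in the original link may come out normalised and not byte-for-byte identical.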
I am using a regular expression to determine when to fire a tracking tag or not.
If a visitor to one of the sites is on one of these three domains the tag should fire:
- www.grousemountainlodge.com
- www.glacierparkinc.com
- reserveglacierdenali.com
I actually have a regular expression that works. But I'm not confident and wanted to bounce it off the folk on this board.
This is what I have. Is there a simpler, more elegant or more robust regex to use for matching the 3 domains?
^(www\.)?((glacierparkinc|grousemountainlodge)\.com)$|(^reserveglacierdenali\.com)$
Following some answers: this regex should exclude other domains, e.g. cats.glacierparkinc.com or similar.
I'm not sure whether glacierparkinc.com should match without the www. prefix. From your list it seems not, but your regex will match it.
In either case I guess you can simplify it a bit:
^(?:www\.(?:glacierparkinc|grousemountainlodge)|reserveglacierdenali)\.com$
Note the use of (?:) instead of just (): this makes the group non-capturing. It's best practice not to capture when you don't need to, which saves time and memory.
The domain must start at the beginning of the string, with or without www., so:
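To see how that simplified pattern behaves, here is a short Python sketch that runs it against a few hostnames. The helper name should_fire and the test hostnames other than the three listed in the question are illustrative, and the real tag manager may apply the regex differently.

```python
import re

# The simplified pattern from above, compiled once.
ALLOWED = re.compile(
    r"^(?:www\.(?:glacierparkinc|grousemountainlodge)|reserveglacierdenali)\.com$"
)


def should_fire(hostname):
    """True if the tag should fire for this hostname."""
    # fullmatch makes sure nothing (not even a trailing newline) follows ".com".
    return ALLOWED.fullmatch(hostname) is not None


for host in (
    "www.grousemountainlodge.com",
    "www.glacierparkinc.com",
    "reserveglacierdenali.com",
    "cats.glacierparkinc.com",
    "glacierparkinc.com",
):
    print(host, should_fire(host))
```

With this pattern the first three hostnames pass and the last two are rejected, so it is stricter than the regex in the question, which also accepts glacierparkinc.com without the www. prefix.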
^(?:www\.)?(?:glacierparkinc|grousemountainlodge|reserveglacierdenali)\.
If it matches, then do something.
Regex live here.
Hope it helps.
I am trying to understand the inner pipings of express.js, but I'm having a little trouble on one thing.
If you add a new route, like so:
app.get("/hello/darkness/myold/:name", ...)
The string I provided internally becomes a regular expression. Now, I worked out what I thought the regex should be internally, and I came up with:
^\/hello\/darkness\/myold\/([^\/]+?)\/?$
The ([^\/]+?) will capture the name parameter, \/? is present if strict routing is disabled, and the whole thing is encapsulated in ^...$. However, when I went and looked what is actually stored inside express, it's actually this:
^\/hello\/darkness\/myold\/(?:([^\/]+?))\/?$
As you can see, there is a non-capturing group around the capturing group. My question is: what is the purpose of this non-capturing group?
The method I used to see what regex express.js was using internally was simply to make an invalid regex and view the error console:
app.get('/hello/darkness/myold/:friend/[', function(req, res){});
yields
SyntaxError: Invalid regular expression: ^\/hello\/darkness\/myold\/(?:([^\/]+?))\/[\/?$
The answer to this question is that the non-capturing group is a relic of the case where a parameter is optional. Consider the difference between the following two routes:
/hello/:world/goodbye
/hello/:world?/goodbye
They will generate, respectively:
^\/hello\/(?:([^\/]+?))\/goodbye\/?$
^\/hello(?:\/([^\/]+?))?\/goodbye\/?$
Note the important but subtle change that happens to the non-capturing group when an optional parameter is present.
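The effect of moving the slash inside the non-capturing group can be checked directly. The sketch below runs the two generated patterns quoted above through Python's re module, not through JavaScript; that is an assumption of convenience, and it works here because these particular patterns use syntax both engines share.

```python
import re

# The two patterns quoted above, copied verbatim.
required = re.compile(r"^\/hello\/(?:([^\/]+?))\/goodbye\/?$")
optional = re.compile(r"^\/hello(?:\/([^\/]+?))?\/goodbye\/?$")

# With the parameter present, both routes capture it.
print(required.match("/hello/world/goodbye").group(1))  # world
print(optional.match("/hello/world/goodbye").group(1))  # world

# With the parameter missing, only the optional route still matches,
# and its capture group is simply unset.
print(required.match("/hello/goodbye"))           # None
print(optional.match("/hello/goodbye").group(1))  # None
```

In the optional form the leading slash sits inside the group, so the slash and the parameter are skipped together; that is why /hello/goodbye matches without needing a double slash.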
Hi, I'm trying to capture the following data to export to another part of the program.
Ideally I would use regular expressions, as TOKEN could be problematic (it's for names, so the string will vary, especially for users abroad; I've seen people with 4+ different names).
Sample data which I want to capture from would be in this format
New Starter - First Last - test
I'd want to capture everything between the hyphens rather than the entire thing
So far I have the following regex: -([^-]+)-
Which just captures
- First Last -
(?<=-\s).+(?=\s-)
If you don't want something to appear in the match but need to check that it's there, you can use lookahead/lookbehind.
More info here
This is assuming the same format will appear on all other inputs.
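A short Python sketch of the lookaround pattern above, run against the sample line from the question. The helper name extract_name and the second sample line are made up for illustration, and the sketch assumes the separators are always a hyphen with a space on each side.

```python
import re

# Lookbehind for "- " and lookahead for " -": both are checked but not returned.
BETWEEN = re.compile(r"(?<=-\s).+(?=\s-)")


def extract_name(line):
    """Return the text between the ' - ' separators, or None if there is no match."""
    m = BETWEEN.search(line)
    return m.group() if m else None


print(extract_name("New Starter - First Last - test"))        # First Last
print(extract_name("New Starter - Anne-Marie Smith - test"))  # Anne-Marie Smith
```

Because the lookbehind requires a hyphen followed by whitespace, a hyphen inside a name does not cut the match short, which the [^-]+ approach from the question would do.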