Regex match multiple elements - regex

I have a regex that I am trying to match against specific function parameters. I want to be able to style them a certain way in a language package.
Here is the text I am trying to match:
addFill(path:svgjs.Element, pattern:Pattern, docMaxSide:number) {
pathFillId(path)
}
In this example, I want to match the words "path" "pattern" and "docMaxSide" from the parameters. I want to make sure it does NOT match the word "path" in the second line (where I am calling pathFillId).
Here is my current regex: \(.*?(\w+):.*?\)
Broken down:
\( Find open parens
.*? It may have stuff before it, but after the parens
(\w+): Capture a word before a colon
.*? There may be more stuff after the colon
\) Close parens
Right now, it will only match the first item, "path". But I need it to match all the words I mentioned above.
UPDATE: I should have been more specific. It should only match if it's a function parameter. For example, I don't want path1 matched in the following: var path1:string. The difficulty is coming up with regex that matches items only between parens.
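The behaviour described in the question can be reproduced with a short sketch. It uses Python's re module purely for illustration (the question does not say which engine the language package uses), and the helper name inspect_matches is hypothetical. It suggests that the single match already consumes the whole parameter list, which is why only the first word is captured.

```python
import re

# Sample text and pattern copied from the question.
TEXT = "addFill(path:svgjs.Element, pattern:Pattern, docMaxSide:number) {"
ORIGINAL = re.compile(r"\(.*?(\w+):.*?\)")


def inspect_matches(text):
    """Return (full match, captured word) for every match of ORIGINAL."""
    return [(m.group(0), m.group(1)) for m in ORIGINAL.finditer(text)]


if __name__ == "__main__":
    for full, word in inspect_matches(TEXT):
        # The full match runs from the opening paren to the closing paren.
        print(repr(full), "->", word)
```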

Try this:
\w+(?=:)
with the g modifier (the global modifier finds all matches instead of stopping at the first one)
Also see the example
UPDATE
If you want to match only the parameters inside the parentheses, you can do this:
\w+(?=:[\w.]+\s*[,)])
Here is the example for this regex
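As a rough check of the two patterns above, here is a small sketch in Python (chosen only for illustration; re.findall plays the role of the g modifier). The sample text is the question's snippet plus the var path1:string line from the update. Note that the second pattern does not really test for parentheses: it only requires the type to be followed by a comma or a closing paren, which happens to exclude the var declaration here.

```python
import re

# Question's sample plus the declaration that must NOT be matched.
SOURCE = """addFill(path:svgjs.Element, pattern:Pattern, docMaxSide:number) {
pathFillId(path)
}
var path1:string"""

ANY_WORD_BEFORE_COLON = re.compile(r"\w+(?=:)")
PARAMS_ONLY = re.compile(r"\w+(?=:[\w.]+\s*[,)])")


def find_names(pattern, text):
    """Return every non-overlapping match, like the g modifier would."""
    return pattern.findall(text)


if __name__ == "__main__":
    print(find_names(ANY_WORD_BEFORE_COLON, SOURCE))
    print(find_names(PARAMS_ONLY, SOURCE))
```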

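Your problem is this part of your regex: .*?. Specifying any character (.) is correct. But check what the two quantifiers do together: * means {0,}, ? on its own means {0,1}, and a ? placed directly after * makes the * lazy (match as few characters as possible) rather than optional. Here the final .*?\) still has to run to the closing paren, so the whole parameter list is consumed by a single match and only the first word is captured.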
If that doesn't help, you might test your regex with regexe.com or similar.

Related

Mixing Lookahead and Lookbehind in 1 Regexp

I'm trying to match first occurrence of window.location.replace("http://stackoverflow.com") in some HTML string.
Especially I want to capture the URL of the first window.location.replace entry in whole HTML string.
So for capturing the URL I formulated these 2 rules:
it should be after this string: window.location.redirect("
it should be before this string ")
To achieve it I think I need to use lookbehind (for 1st rule) and lookahead (for 2nd rule).
I end up with this Regex:
.+(?<=window\.location\.redirect\(\"?=\"\))
It doesn't work. I'm not even sure that it is legal to mix both rules like I did.
Can you please help me with translating my rules to Regex? Other ways of doing this (without lookahead(behind)) also appreciated.
The pattern you wrote is really not the one you need as it matches something very different from what you expect: text window.location.redirect("=") in text window.location.redirect("=") something. And it will only work in PCRE/Python if you remove the ? from before \" (as lookbehinds should be fixed-width in PCRE). It will work with ? in .NET regex.
If it is JS, you cannot rely on a lookbehind: JavaScript regex engines did not support lookbehinds before ES2018, so older engines reject them.
Instead, use a capturing group around the unknown part you want to get:
/window\.location\.redirect\("([^"]*)"\)/
or
/window\.location\.redirect\("(.*?)"\)/
See the regex demo
Omitting the /g modifier makes the regex match only the first occurrence. Access the value you need inside Group 1.
The ([^"]*) captures 0+ characters other than a double quote (URLs you need should not have it). If these URLs you have contain a ", you should use the second approach as (.*?) will match any 0+ characters other than a newline up to the first ").
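For comparison, here is a minimal sketch of the capturing-group approach in Python, alongside a lookaround variant that works there because the lookbehind is fixed-width. The HTML string, the second URL and both function names are made up for the example.

```python
import re

# Hypothetical HTML with two redirects; only the first URL is wanted.
HTML = (
    '<script>window.location.redirect("http://stackoverflow.com");</script>\n'
    '<script>window.location.redirect("http://example.com");</script>'
)

WITH_GROUP = re.compile(r'window\.location\.redirect\("([^"]*)"\)')
# Fixed-width lookbehind plus lookahead: the match itself is just the URL.
WITH_LOOKAROUNDS = re.compile(r'(?<=window\.location\.redirect\(")[^"]*(?="\))')


def first_redirect_url(html):
    """Return the URL from Group 1 of the first match, or None."""
    m = WITH_GROUP.search(html)
    return m.group(1) if m else None


def first_redirect_url_lookaround(html):
    """Same result, but the URL is the whole match."""
    m = WITH_LOOKAROUNDS.search(html)
    return m.group(0) if m else None
```

re.search stops at the first match, which corresponds to leaving out the /g modifier.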

regex to start matching from last letter in previous match

I have following regex
(\{\w*\}\s*[^{}]+\s*)\{?
and I am testing it on this string
this {match} is cool{match} but {match} this one is more cool
Currently I am able to capture 2 groups, {match} is cool and {match} this one is more cool, so as you can see the group {match} but is missing.
The reason is that the last matched character is {, so on the next attempt the engine skips that { and cannot match again until a new { occurs.
Does anyone know how to force it to match the middle group as well?
Debugging: http://regex101.com/r/hM5xE6/2
You can probably just remove the \{? (and the \s* too); you also don't need the capturing parentheses:
\{\w*\}[^{}]+
Test it live on regex101.com.
If you want to enforce the match to end before a { or at the end of the string, you can use a positive lookahead assertion for that:
\{\w*\}[^{}]+(?=\{|$)
But you would only need that if you wanted to avoid a match completely if there are nested braces, like in {{match} whatever}, where the first regex would find {match} whatever.
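The difference between the three patterns can be checked with a short sketch. Python is used only for illustration, and the helper all_matches is hypothetical.

```python
import re

TEXT = "this {match} is cool{match} but {match} this one is more cool"
NESTED = "{{match} whatever}"

ORIGINAL = re.compile(r"(\{\w*\}\s*[^{}]+\s*)\{?")    # consumes the next {
SIMPLE = re.compile(r"\{\w*\}[^{}]+")                  # no trailing \{?
WITH_LOOKAHEAD = re.compile(r"\{\w*\}[^{}]+(?=\{|$)")  # must end before { or at the end


def all_matches(pattern, text):
    """Return the full text of every non-overlapping match."""
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    # Group 1 of the original pattern: the middle piece is skipped.
    print([m.group(1) for m in ORIGINAL.finditer(TEXT)])
    print(all_matches(SIMPLE, TEXT))
    print(all_matches(SIMPLE, NESTED), all_matches(WITH_LOOKAHEAD, NESTED))
```

Note that the middle match keeps its trailing space ("{match} but "), since [^{}]+ also matches whitespace.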
You can use the following regular expression to start matching from the last letter of the previous match:
\{\w*\}[^{}]+(?=\{|$)
You are including the next { in the regex (but not in the match), so it begins the next match on the character after, skipping the first { and not matching until you get to the second.
There's no need for lookaheads or anything like that.
If you remove the trailing check for \{?, you get all 3 matches (can also remove the } from the brackets and the last \s*):
(\{\w*\}\s*[^{]+)
(http://regex101.com/r/hM5xE6/7)
You can also use the following regex, depending on how specific you need to be with the capture:
(\{\w*\}[\w\s]*)
http://regex101.com/r/hM5xE6/5
(\{\w*\}\s*[^{}]+\s*)(?=\{|$)
Try this. Use a lookahead for a zero-width assertion. See the demo.
http://regex101.com/r/qC9cH4/18

looking for a regex pattern key:value pairs

I am trying to find a regex pattern for the Outlook search. I am looking for a grouping pattern to handle this:
from:Jack subject:(sending invoice) title:ibm
I used this pattern, but I do not get the words after the first word:
(?<name>\\w+):[(](?<value>\\w*)[)]*
\w doesn't handle spaces, change your regex to:
(?<name>\\w+):[(](?<value>[^)]*)[)]
[^)]* means 0 or more characters that are not a right paren.
Maybe you'd prefer to use [^)]+, which means ONE or more characters that are not a right paren.
If the parenthesis are optional, use:
(?<name>\\w+):[(]?(?<value>[^)]+)[)]?
The first bracket in 'value' is not optional in your regex
from:Jack subject:(sending invoice) title:ibm
As for whitespaces, I'd do it this way, as the brackets are not present in every key value pair anyway:
(?<name>\\w+):[(]*(?<value>(?:\\w|\\s)+)[)]*
However, it seems that the value is either one word or a sequence of words but within enclosing brackets - let's re-write the regex then, otherwise you'll get the whole thing after ':' as the first value:
(?<name>\\w+):(?:(?<value>\\w+)|(?:\\((?<value>(?:\\w|\\s)+)))
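The same idea (a value is either one bare word or a parenthesised run of words) can be sketched in Python. This is an adaptation, not the pattern above: Python writes named groups as (?P<name>...) and does not allow two groups with the same name, so the hypothetical group names paren and bare stand in for the two value groups, and the parenthesised branch uses [^)]* from the earlier answer.

```python
import re

QUERY = "from:Jack subject:(sending invoice) title:ibm"

# Either "(anything but a right paren)" or a single bare word.
PAIR = re.compile(r"(?P<name>\w+):(?:\((?P<paren>[^)]*)\)|(?P<bare>\w+))")


def parse_query(query):
    """Return a dict mapping each key to its value."""
    pairs = {}
    for m in PAIR.finditer(query):
        # Exactly one of the two value groups participates in each match.
        value = m.group("paren") if m.group("paren") is not None else m.group("bare")
        pairs[m.group("name")] = value
    return pairs


if __name__ == "__main__":
    print(parse_query(QUERY))
```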

Regex matching only words inside custom tag

I'm fairly new to regex and trying to figure out a pattern that will only match the instance of the word inside my custom tag.
In the example below both words match the condition of being after a | and before a ]
Pattern: (?=|)singleline(?=.*])
Sample: [if #sample|singleline second] <p>Another statement singleline goes here</p> [/if]
words that match the condition of being after a | and before a ]
The .*, which means "anything, zero or more times, and be greedy about it", will race to the end of the string and back up only enough to get to a ] (the last one). Also, your lookbehind is actually a lookahead.
If you really want to match what you say you want to match (see quote), then this is it:
Pattern: (?<=|)(\w+)(?=])
Edit: or this one if you want to "match alphanumerics and spaces inside | and ]":
Pattern: (?<=|)([\w\s]+?)(?=])
(?=|) asserts that the next thing in the string is either nothing or nothing. That will always evaluate to true; it's always possible to match nothing. I think sweaver2112 is correct that you meant to use a lookbehind there, but you also need to escape the pipe: (?<=\|). Or just match a pipe in the normal way; I don't see any need to use lookarounds for that part.
The other part probably does need to be a lookahead, but you need to expand it a bit. You want to assert that the word is followed by a closing bracket, but not if there's an opening bracket first. Assuming the brackets are always correctly paired, that should mean the word is between a pair of them. Like this:
Pattern: \|singleline(?=[^\]\[]*\])
[^\]\[]*\] matches zero or more of any characters except ] or [, followed by a ]. The backslashes escaping the "real" brackets may or may not be necessary depending on the regex flavor, but escaping them is always safe.
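A quick way to compare the question's pattern with the one above is to count matches on the sample string. The sketch below uses Python for illustration, and the helper count_matches is hypothetical.

```python
import re

SAMPLE = "[if #sample|singleline second] <p>Another statement singleline goes here</p> [/if]"

# Pattern from the question: (?=|) is always true and .* is greedy.
QUESTION = re.compile(r"(?=|)singleline(?=.*])")
# Pattern from this answer: a literal pipe, then no [ or ] before the closing ].
ANSWER = re.compile(r"\|singleline(?=[^\]\[]*\])")


def count_matches(pattern, text):
    """Return how many non-overlapping matches the pattern finds."""
    return len(pattern.findall(text))


if __name__ == "__main__":
    print(count_matches(QUESTION, SAMPLE))  # both occurrences of the word
    print(count_matches(ANSWER, SAMPLE))    # only the one inside the tag
```

Because the pipe is matched rather than asserted, the match text is |singleline; wrap the word in a group if only the word is needed.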

Regex: optimal syntax for optional combined expression?

I want to match a combination of expressions that is optional. In this specific example, I want to match on the word through. Also, if the words jump or swim precede through (with whitespace) then match on the whole phrase. So that combination of expressions preceding through must be optional.
I want all the following lines to be positive matches:
swim through <-- match entire phrase
jump through <-- match entire phrase
hike through <-- match only the word "through"
To do this, I can use the following expression:
(jump\W|swim\W)?through
However, is it possible to accomplish the same thing without having to add \W after jump and swim? I was trying something like this:
(jump|swim)?\W?through
But that wasn't working properly because it would include the space that precedes through on the 3rd example. I only want the word through, not the whitespace around it.
What about this one: (?:(jump|swim)\W)?through
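A small sketch of how this pattern behaves on the three lines from the question, using Python for illustration (the helper name matched_text is hypothetical):

```python
import re

# The verb and the following non-word character are optional as one unit.
PATTERN = re.compile(r"(?:(jump|swim)\W)?through")


def matched_text(line):
    """Return the text of the first match in the line, or None."""
    m = PATTERN.search(line)
    return m.group(0) if m else None


if __name__ == "__main__":
    for line in ("swim through", "jump through", "hike through"):
        print(line, "->", repr(matched_text(line)))
```

Because \W sits inside the optional non-capturing group, the space is only consumed when jump or swim is present, so the third line yields just through.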