how to get individual parameters using regular expression

Given the string func("abc", "def", "ghi"), I want to extract each parameter with a regular expression. I tried '"\"".*"\""', but it doesn't work: it matches all of the arguments at once. How can I match each parameter individually?

This one may give you what you want:
"[^"]*"
The above is just the regex; you may need to quote or escape it for your programming language.

.* is greedy, i.e. it matches as much as possible.
You have to make .* match lazily by using .*?
.*? is lazy, i.e. it matches as little as possible.
So your regex would be
".*?"
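To make the difference concrete, here is a minimal Python sketch (Python is an assumption; the question does not name a language) that runs the greedy, lazy, and negated-class patterns against the string from the question. The variable names are only for illustration.

```python
import re

text = 'func("abc", "def", "ghi")'

# Greedy: .* runs to the last quote, so all arguments come back as one match.
greedy = re.findall(r'".*"', text)

# Lazy: .*? stops at the first closing quote it can reach.
lazy = re.findall(r'".*?"', text)

# Negated class: [^"]* can never cross a quote, so no laziness is needed.
negated = re.findall(r'"[^"]*"', text)

# A capture group returns the parameters without their quotes.
params = re.findall(r'"([^"]*)"', text)

print(greedy)   # one match covering all three arguments
print(lazy)     # three separate matches, quotes included
print(negated)  # same three matches as the lazy version
print(params)   # the bare parameter values
```

The negated-class form also works in engines that have no lazy quantifiers.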

Related

How to match with regexp any occurence of a specific char within a string delimited by specific delimiters? [duplicate]

My regex pattern looks something like
<xxxx location="file path/level1/level2" xxxx some="xxx">
I am only interested in the part in quotes assigned to location. Shouldn't it be as easy as below without the greedy switch?
/.*location="(.*)".*/
Does not seem to work.

You need to make your regular expression lazy/non-greedy, because by default, "(.*)" will match all of "file path/level1/level2" xxxx some="xxx".
Instead you can make your dot-star non-greedy, which will make it match as few characters as possible:
/location="(.*?)"/
Adding a ? on a quantifier (?, * or +) makes it non-greedy.
Note: this is only available in regex engines which implement the Perl 5 extensions (Java, Ruby, Python, etc) but not in "traditional" regex engines (including Awk, sed, grep without -P, etc.).

location="(.*)" will match from the " after location= until the " after some="xxx unless you make it non-greedy.
So you either need .*? (i.e. make it non-greedy by adding ?) or better replace .* with [^"]*.
[^"] Matches any character except for a " <quotation-mark>
More generic: [^abc] - Matches any character except for an a, b or c
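As a rough illustration in Python (an assumed language; the question uses /.../ delimiters and does not say which engine it runs in), the three variants behave like this on the sample tag:

```python
import re

tag = '<xxxx location="file path/level1/level2" xxxx some="xxx">'

# Greedy: the capture runs on to the last quote in the tag.
greedy = re.search(r'location="(.*)"', tag).group(1)

# Lazy: the capture stops at the first closing quote.
lazy = re.search(r'location="(.*?)"', tag).group(1)

# Negated class: the capture cannot contain a quote at all.
negated = re.search(r'location="([^"]*)"', tag).group(1)

print(greedy)
print(lazy)
print(negated)
```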

How about
.*location="([^"]*)".*
This avoids the unlimited search with .* and will match exactly to the first quote.

Use non-greedy matching, if your engine supports it. Add the ? inside the capture.
/location="(.*?)"/

Using a lazy quantifier (?) without the global flag is the answer.
With the global flag /g, the same pattern would instead return every shortest match in the input, not just the first one.
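Python has no /g flag, but as a loose analogy re.search returns only the first match while re.findall returns all of them. The sketch below uses a made-up input with two location attributes to show the difference.

```python
import re

# Hypothetical input with two location attributes, for illustration only.
text = '<a location="one" some="x"> <b location="two" some="y">'

pattern = r'location="(.*?)"'

# Roughly "no global flag": only the first shortest match.
first = re.search(pattern, text).group(1)

# Roughly "with /g": every shortest match, left to right.
every = re.findall(pattern, text)

print(first)
print(every)
```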

Here's another way.
Here's the one you want. This is lazy: [\s\S]*?
The first item:
[\s\S]*?(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/ZcqcUm/2
For completeness, this gets the last one. This is greedy: [\s\S]*
The last item:
[\s\S]*(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/LXSPDp/3
The only difference between these two regular expressions is the ?
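A small Python sketch of the same replace-with-group-1 idea, assuming both patterns wrap the attribute value in a capture group (Python writes the replacement as \1 rather than $1). The input string is invented for the example.

```python
import re

# Hypothetical input with three location attributes.
text = 'location="first" junk location="second" more location="third"'

# Lazy prefix: the engine settles on the earliest location= it can find.
first_pat = r'[\s\S]*?(?:location="([^"]*)")[\s\S]*'

# Greedy prefix: the engine backtracks from the end, so it finds the last one.
last_pat = r'[\s\S]*(?:location="([^"]*)")[\s\S]*'

# Each pattern consumes the whole input and replaces it with group 1.
first = re.sub(first_pat, r'\1', text, count=1)
last = re.sub(last_pat, r'\1', text, count=1)

print(first)
print(last)
```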

The other answers here fail to spell out a full solution for regex versions which don't support non-greedy matching. The non-greedy quantifiers (.*?, .+? etc) are a Perl 5 extension which isn't supported in traditional regular expressions.
If your stopping condition is a single character, the solution is easy; instead of
a(.*?)b
you can match
a[^ab]*b
i.e. specify a character class which excludes the starting and ending delimiters.
In the more general case, you can painstakingly construct an expression like
start(|[^e]|e(|[^n]|n(|[^d])))end
to capture a match between start and the first occurrence of end. Notice how the subexpression with nested parentheses spells out a number of alternatives which between them allow e only if it isn't followed by nd and so forth, and also take care to cover the empty string as one alternative which doesn't match whatever is disallowed at that particular point.
Of course, the correct approach in most cases is to use a proper parser for the format you are trying to parse, but sometimes, maybe one isn't available, or maybe the specialized tool you are using is insisting on a regular expression and nothing else.
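For the single-character case, here is a short Python sketch comparing the lazy pattern with the character-class replacement. Python supports both forms, so it is used here only to compare results; the sample strings are made up.

```python
import re

# Sample input in which "a" and "b" appear only as delimiters.
text = 'xx a123b yy a456b'

lazy = re.findall(r'a(.*?)b', text)
negated = re.findall(r'a([^ab]*)b', text)

print(lazy)
print(negated)

# The two forms differ once the start delimiter repeats before the end:
# the lazy version starts at the first "a", while the class version can
# only match from the last "a" before the "b".
nested = 'a1a2b'
lazy_nested = re.findall(r'a(.*?)b', nested)
negated_nested = re.findall(r'a([^ab]*)b', nested)

print(lazy_nested)
print(negated_nested)
```

If the text between the delimiters may contain the start delimiter, a[^b]*b is the closer equivalent of a.*?b.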

Because you are using a quantified subpattern. As described in the Perl documentation:
By default, a quantified subpattern is "greedy", that is, it will
match as many times as possible (given a particular starting location)
while still allowing the rest of the pattern to match. If you want it
to match the minimum number of times possible, follow the quantifier
with a "?" . Note that the meanings don't change, just the
"greediness":
*? //Match 0 or more times, not greedily (minimum matches)
+? //Match 1 or more times, not greedily
Thus, to make your quantified pattern match as little as possible, add a ? after the quantifier:
/location="(.*?)"/

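import regex  # third-party module; the standard-library re module accepts the same pattern
text = 'ask her to call Mary back when she comes back'
p = r'(?i)(?s)call(.*?)back'
for match in regex.finditer(p, text):
    # group(1) is ' Mary ' including the surrounding spaces, so strip them
    print(match.group(1).strip())
Output:
Mary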

I'm trying to create a javascript flavored Regex expression that will returned multiple of a named group [duplicate]

Regular expression to extract string between a word and dot followed by strings [duplicate]

Why is this regular expression matching so much?

I am trying to use http://www.regexr.com/ to create a regular expression.
Basically I am looking to replace something that matches <Openings>any other tags/text</Openings>
<Openings><opening><item><x>3</x><y>3</y><width>10.5</width><height>13.5</height><type>rectangle</type><clipX>0</clipX><clipY>0</clipY><imgsrc></imgsrc></item></opening></Openings>
I started with ([\<Openings\>])\w+ (http://regexr.com/393mv ) but it seems to be matching too many things. Right now that regular expression should only match <Openings>.

A regex that matches the whole Openings element is:
<Openings>.*?<\/Openings>
If you want to capture the contents inside the Openings element, try this instead:
<Openings>(.*?)<\/Openings>

([\<Openings\>])\w+
The square brackets define a character class, which means "match any single one of the characters listed here". You should use
(\<Openings\>)\w+
which matches the literal text "<Openings>" followed by one or more word characters.
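The following Python sketch (the question uses regexr, so Python is an assumption, and the XML is shortened for readability) shows why the character class over-matches and how the lazy pattern isolates the element.

```python
import re

# Shortened version of the XML from the question, plus a trailing element.
xml = '<Openings><opening><item><x>3</x></item></opening></Openings><p>tail</p>'

# Character class: any one of < O p e n i g s > followed by word characters,
# so it matches at many places, not just at "<Openings>".
class_hits = re.findall(r'[<Openings>]\w+', xml)

# Literal opening tag, lazy body, literal closing tag.
whole = re.search(r'<Openings>.*?</Openings>', xml).group(0)
inner = re.search(r'<Openings>(.*?)</Openings>', xml).group(1)

# Replacing the element, which is what the question is after.
stripped = re.sub(r'<Openings>.*?</Openings>', '', xml)

print(len(class_hits))
print(whole)
print(inner)
print(stripped)
```

If the element spans several lines, the dot needs re.DOTALL to match newlines as well.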

How to get first match of string by Regular Expression?

I have the following text string:
$ABCD(file="somefile.txt")$' />Some more text followed by a dollar like this one)$. Some more random text
I am trying to match the $ABCD(file="somefile.txt")$ part of the string using a regular expression.
I am using this (?=[$]ABCD[(]file=).*(?<=[)][$]) regular expression pattern to make the intended match. It's not working as expected because I am getting a match all the way to the second )$ in the string.
For example, the match will be as follows:
$ABCD(file="somefile.txt")$' />Some more text followed by a dollar like this one)$
How should I modify the pattern to match to the end of the first occurrence of the )$?
Here is a good online regular expression engine tester:
http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx

Try appending a ? to the greedy *:
(?=[$]ABCD[(]file=).*?(?<=[)][$])
Lazy quantification
The standard quantifiers in regular expressions are greedy, meaning
they match as much as they can. Modern regular expression tools allow a quantifier to be specified as lazy (also known as non-greedy, reluctant, minimal, or ungreedy) by putting a question mark after the quantifier.

You could just use this:
\$ABCD\(file="[a-z.]+"\)\$
to get $ABCD(file="somefile.txt")$.
Your problem was the .* part: it is too general and thus matched everything up to the last )$.

I would advise you to use the second quote to define the end of the searched pattern: [^"]* will match anything except ".
So the pattern for the file name would be: \$ABCD\(file="([^"]*)
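The suggestions above can be compared side by side. This is a sketch in Python, although the linked tester suggests the question targets .NET; the patterns used here behave the same way in Python's re module. The third pattern is a slightly extended variant that also matches the closing quote and )$.

```python
import re

text = ('$ABCD(file="somefile.txt")$\' />Some more text followed by '
        'a dollar like this one)$. Some more random text')

# Original pattern: greedy .* backtracks only as far as the last ")$".
greedy = re.search(r'(?=[$]ABCD[(]file=).*(?<=[)][$])', text).group(0)

# Lazy .*? grows one character at a time and stops at the first ")$".
lazy = re.search(r'(?=[$]ABCD[(]file=).*?(?<=[)][$])', text).group(0)

# Explicit pattern: the quote bounds the file name, no lookarounds needed.
explicit = re.search(r'\$ABCD\(file="([^"]*)"\)\$', text)

print(greedy)
print(lazy)
print(explicit.group(0))
print(explicit.group(1))
```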
