RegEx: Match nth occurrence

I have the following string:
_name=aVlTcWRjVG1YeDhucWdEbVFrN3pSOHZ5QTRjOEJZZmZUZXNIYW1PV2RGOWYrczBhVWRmdVJTMUxYazVBOE8zQ3JNMmNVKzJLM2JJTzFON3FiLzFHUE0xY0pkdz09LS1jbkkwaWoxUUl3YVhMMkhtZHpaOW13PT0"%"3D--57356371d167f"
I want to match everything between = and the end " (note there are other quotes after this so I can't just select the last ").
I tried using _name=(.*?)" but there are other quotes in the string as well. Is there a way to match the 3rd quote? I tried _name=(.*?)"{3} but the {3} matches for the quotes back to back, i.e. """

You can use this regex:
\b_name=(?:[^"]*"){3}
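RegEx details:
\b_name: Match _name, with a word boundary before the underscore
=: Match a =
(?:[^"]*"){3}: Match 0 or more non-" characters followed by a ". Repeat this group 3 times, so the match ends at the third ".
The overall match includes _name= and the third quote; wrap the part you need in a capturing group if you only want the value.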
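As a rough illustration of the same idea in Python (an assumption, since the question does not name a language), the sketch below captures only the value. The input is a shortened, made-up stand-in for the string in the question, and the group stops just before the third quote.

```python
import re

# Shortened, made-up stand-in for the string in the question.
text = '_name=aVlT"%"3D--57356371d167f" other="quoted"'

# Two "non-quotes followed by a quote" repetitions, then the run of
# non-quote characters that precedes the third quote. The third quote
# itself is matched outside the capturing group.
pattern = re.compile(r'\b_name=((?:[^"]*"){2}[^"]*)"')

match = pattern.search(text)
value = match.group(1) if match else None
print(value)
```

Changing the {2} to {n-1} stops the capture before the nth quote.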

If you want to match everything between the = and the third(!) double quote (the third isn't necessarily the last, as you said), you can use a pattern like this:
$string = '_name=foo"bar"test" more text"';
// This pattern will not include the last " (note the 2, not 3)
$pattern = '/_name=((.*?"){2}.*?)"/';
preg_match($pattern, $string, $m);
echo $m[1];
Output:
foo"bar"test
Original answer:
I'm not sure I understood you correctly, but it sounds like you want to perform a so-called greedy match, meaning you want to match the string up to the last ", regardless of whether the string contains multiple "s.
To perform a greedy match, just drop the ?, like this:
_name=(.*)"
You can try it here: https://regex101.com/r/uC5eO9/2
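The difference between the lazy and the greedy quantifier can be checked quickly. This is a small Python sketch (the language is an assumption; the answer itself uses PHP) that reuses the sample string from the PHP snippet above.

```python
import re

text = '_name=foo"bar"test" more text"'

# Lazy: .*? stops at the first quote it can.
lazy = re.search(r'_name=(.*?)"', text).group(1)

# Greedy: .* runs to the end, then backtracks to the last quote.
greedy = re.search(r'_name=(.*)"', text).group(1)

print(lazy)    # foo
print(greedy)  # foo"bar"test" more text
```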

Related

Powershell Drop the last part of a string with multiple "."

I'm trying to write a regex in PowerShell to get only a specific part of a string. I know a way to do this without a regex, but it could definitely be done more efficiently with one. I have a string that looks like this:
Some/Stuff/Here/Then.drop.last
Ideally, I want to write a regex that gets me just:
Then.drop

PS> 'Some/Stuff/Here/Then.drop.last' -replace '.*/(.+)\..*', '$1'
Then.drop
.*/ greedily matches everything up to the last /
(.+)\. greedily matches everything up to the last literal . and captures everything before that . in the first capture group ($1) - which is your string of interest.
.* matches the remaining part of the string.
Using $1 as the replacement string then replaces the overall match - the entire input string - with what the first capture group matched.
For more information about PowerShell's -replace operator, see this answer.
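The same pattern behaves identically in other regex engines. As a sketch, here is the equivalent substitution in Python (not PowerShell, so the replacement reference is written \1 instead of $1):

```python
import re

path = 'Some/Stuff/Here/Then.drop.last'

# .*/ consumes up to the last slash, (.+)\. captures up to the last dot,
# and the trailing .* consumes the rest, so the whole string is replaced
# by the captured group.
result = re.sub(r'.*/(.+)\..*', r'\1', path)
print(result)  # Then.drop
```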

Getting the value inside specific quotes with preg_match()

I am trying to get the value inside specific quotes from Wordpress content string. I managed to get content but I couldn't use the correct preg_match function.
Example content is:
[vc_single_image image="1667" img_size="full" alignment="center" onclick="img_link_large" css_animation="fadeIn"]
From this string, I am trying to get the value "1667" by using preg_match function.
I have this code right now:
$regex = '/"([^"]+)"/';
$output = preg_match($regex, $post->post_content, $matches);
$first_img_id = trim($matches[0][21], '"');
When I echo out this code, it is working properly. But I am getting the value by checking only quotes. If I add more strings with quotes, then my code will not work properly. Therefore, I want to get the image id with "image=" part. How should I change my regex and matches array indexes?

Your string is a set of key-value pairs inside square brackets. You may match any specific key value by using a preg_match function with a regex like
"~[\s[]$key=\"\K[^\"]*~"
See the regex demo for the current scenario with image.
The [\s[] matches either a whitespace, or [ char, that is, it matches the left boundary of the key. $key is a variable that should not contain non-word chars, and in real life, keys usually consist of word chars. =" matches an equal sign and a quote after it, but the \K match reset operator clears the match value, and only the text that is matched by [^"]* (zero or more chars other than ") gets put into the match memory buffer, and that is what you get by accessing the first item in the resulting $matches array.
See the online PHP demo:
$key = "image";
$regex = "~[\s[]$key=\"\K[^\"]*~";
$content = '[vc_single_image image="1667" img_size="full" alignment="center" onclick="img_link_large" css_animation="fadeIn"]';
if (preg_match($regex, $content, $matches)) {
    echo $matches[0];
}
Output: 1667
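In engines without \K (Python's built-in re module is one), the same lookup can be done with a capturing group. The following is a minimal sketch under that assumption; the helper name shortcode_attr is hypothetical and is not part of WordPress or of the answer above.

```python
import re

def shortcode_attr(content, key):
    """Return the quoted value of `key`, or None if the key is absent."""
    # [\s\[] is the left boundary of the key, as in the PHP pattern.
    # re.escape guards against keys that contain regex metacharacters.
    pattern = r'[\s\[]' + re.escape(key) + r'="([^"]*)"'
    match = re.search(pattern, content)
    return match.group(1) if match else None

content = '[vc_single_image image="1667" img_size="full" alignment="center"]'
print(shortcode_attr(content, 'image'))     # 1667
print(shortcode_attr(content, 'img_size'))  # full
```

The left boundary keeps the "image" inside vc_single_image from being mistaken for the key.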

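For the record, you could get all key/value pairs at once with the following pattern (it is split across lines for readability, which assumes the x free-spacing flag):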
(?:\G(?!\A)|\[)
[^][]*?\K
(?P<key>\w+)="(?P<value>\w+)"
See a demo on regex101.com.
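Python's built-in re module has no \G or \K, so the pattern above cannot be used there as written. A simplified sketch that collects the pairs with re.findall is shown below; unlike the \G version, it does not check that the pairs sit inside square brackets.

```python
import re

content = ('[vc_single_image image="1667" img_size="full" alignment="center" '
           'onclick="img_link_large" css_animation="fadeIn"]')

# Each match yields a (key, value) tuple; dict() turns the list into a lookup.
pairs = dict(re.findall(r'(\w+)="(\w+)"', content))

print(pairs['image'])          # 1667
print(pairs['css_animation'])  # fadeIn
```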

It might be an option to first use trim to remove the [ and ] from the beginning and end of the string, and then use explode with a space as the delimiter.
You could then loop over the array returned by explode and use a regex such as ^image="(\d+)"$ to match image="1667" in the loop.
If it matches, capturing group 1 (\d+) will contain your value.
^image="(\d+)"$ will match:
^ Assert position at the start of the string
image=" Match literally
(\d+) Match one or more digits in a capturing group
" Match literally
$ Assert position at the end of the string
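The trim/explode idea can be sketched in Python as follows (strip and split stand in for PHP's trim and explode). It assumes that attribute values never contain whitespace, which holds for the sample string but not necessarily for every shortcode.

```python
import re

content = '[vc_single_image image="1667" img_size="full" alignment="center"]'

image_id = None
# Drop the surrounding brackets, then split on whitespace.
for token in content.strip('[]').split():
    # fullmatch plays the role of the ^ and $ anchors.
    m = re.fullmatch(r'image="(\d+)"', token)
    if m:
        image_id = m.group(1)
        break

print(image_id)  # 1667
```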

Remove match characters before output

My regex already works well but I would like to remove the " character at the output. Is this possible with Regex?
Regex: (?>\".*?\")
Link: https://regex101.com/r/G7OQ0a/2/
"SharedKeys" = "0","1","2","3","4","5","6","7","8","9"
"BroadCastKeys" = "0","1","2","3","4","5","6","7","8","9"
"A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"
"ProgramPath" = "D:\Games\WoW\World of Warcraft\Wow.exe"
Match: "BroadCastKeys" or "L" and so on
My target: BroadCastKeys or L and so on

You can do it with this pattern:
(?!\G)"\K[^"]*
The idea is to skip the position of the closing quote (without consuming it with the pattern). To do that, (?!\G) forbids matches from being consecutive. (\G matches at the position where the previous match ended, or at the start of the string.)
Note that if your string may start with a double quote, you need to change the pattern to (?!\G(?!\A))"\K[^"]* to allow the first match.
You can also make it simpler and use a capture group:
"([^"]*)"

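For the capture-group variant, a short Python sketch (the language is an assumption, since the question only links to regex101) shows that the quotes are consumed by the match but left out of the result:

```python
import re

line = '"SharedKeys" = "0","1","2"'

# findall returns only the contents of the capturing group.
values = re.findall(r'"([^"]*)"', line)
print(values)  # ['SharedKeys', '0', '1', '2']
```

Because each match consumes both the opening and the closing quote, the text between two quoted items (such as the comma) is never mistaken for a value.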
Is there a way to "recall" a char sequence already matched in the regex itself?

The regex I'm searching has the following constraints:
it starts with "//"
then "[" a non number sequence (called delimiter in this list) and "]"
next line "\n"
"[" 0 or more number separated by the delimiter previously found "]".
For example the following text matches the regex:
//[*#*]
[1*#*34*#*64]
and the following text doesn't match the regex:
//[*#*]
[1#34#64]
because the delimiter is not the same matched in the first row
The regex I have written so far is
^//\[(\D)+\]\n\[[(\d)+(\D)+]*(\d)+\]$|^//\[(\D)+\]\n\[\]$|^//\[(\D)+\]\n\[(\d)+\]$
but obviously this regex matches both of the previous examples.
Is there a way to "recall" a char sequence already matched in the regex itself?

You need something called a back-reference: \1 matches the same text that the first capturing group matched.
Use this regex in Python:
r'^//\[([^\]]+)\]\n\[\d+(\1\d+)*\]'
Sample run, showing that it matches your string:
>>> import re
>>> string = """//[*#*]
... [1*#*34*#*64]"""
>>> print(re.search(r'^//\[([^\]]+)\]\n\[\d+(\1\d+)*\]', string).group(0))
//[*#*]
[1*#*34*#*64]

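You need to use a back-reference. In most languages you can refer to the text matched by a capturing group using \N, where N is the group number (for example, \1).
This pattern will work:
//\[([^]]++)]\n\[(?>\d++\1?)+]
To break it down:
// just matches the literal
\[([^]]++)] matches one or more characters other than ] inside square brackets and captures them as group 1
\n matches the newline
\[(?>\d++\1?)+] matches one or more repetitions of digits optionally followed by the text captured by group 1, all inside square brackets. (?>...) is an atomic group.
Note that because \1 is optional after every number, this pattern also accepts a trailing delimiter such as [1*#*].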
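A small Python check of the back-reference approach against the two examples from the question is sketched below. The pattern is a variant written for this illustration rather than either answer's exact regex: it wraps the numbers in an optional group so that an empty list [] is accepted too, matching the "0 or more numbers" requirement. The helper name is_valid is hypothetical.

```python
import re

# Group 1 captures the delimiter; \1 must then repeat it exactly.
# The outer (?:...)? makes the whole number list optional, so "[]" passes.
PATTERN = re.compile(r'//\[([^\]]+)\]\n\[(?:\d+(?:\1\d+)*)?\]')

def is_valid(text):
    # fullmatch requires the pattern to cover the entire input.
    return PATTERN.fullmatch(text) is not None

print(is_valid('//[*#*]\n[1*#*34*#*64]'))  # True
print(is_valid('//[*#*]\n[1#34#64]'))      # False
print(is_valid('//[*#*]\n[]'))             # True
```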

Ignoring Whitespace with Regex(perl)

I am using Perl regular expressions.
How would I go about ignoring whitespace and still test whether a string matches?
For example:
$var = " hello "; # I want $var to ignore whitespace and still match
if($var =~ m/hello/)
{
}

What you have there should match just fine. The regex will match any occurrence of the pattern hello, so as long as it sees "hello" somewhere in $var, it will match.
On the other hand, if you want to be strict about what you ignore, you should anchor your pattern at the start and end:
if($var =~ m/^\s*hello\s*$/) {
}
and if you have multiple words in your pattern
if($var =~ m/^\s*hello\s+world\s*$/) {
}
\s* matches 0 or more whitespace characters, and \s+ matches 1 or more. Without the /m modifier, ^ matches at the beginning of the string and $ matches at the end of the string (or just before a trailing newline).
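The difference between an unanchored and an anchored match is not specific to Perl. As a sketch in Python (a different language from the question, chosen only for illustration), re.search corresponds to the unanchored m/hello/ and re.fullmatch to the ^...$ form:

```python
import re

var = ' hello '

# Unanchored search: succeeds because "hello" occurs somewhere in the string.
found_anywhere = re.search(r'hello', var) is not None

# fullmatch must consume the whole string, like ^...$ in the Perl examples.
only_hello = re.fullmatch(r'\s*hello\s*', var) is not None
with_extra = re.fullmatch(r'\s*hello\s*', ' hello there ') is not None
two_words = re.fullmatch(r'\s*hello\s+world\s*', 'hello   world ') is not None

print(found_anywhere, only_hello, with_extra, two_words)  # True True False True
```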

As others have said, Perl matches anywhere in the string, not the whole string. I found this confusing when I first started, and I still get caught out. I try to train myself to think about whether I need to look at the start of the line, the whole string, and so on.
Another useful tip is to use \b. This looks for word boundaries, so /\bbook\b/ matches
"book. "
"book "
"-book"
but not
"booking"
"ebook"

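The word-boundary behaviour listed above can be verified with a short sketch. It is written in Python for illustration; \b has the same meaning there as in Perl for these ASCII samples.

```python
import re

word = re.compile(r'\bbook\b')
samples = ['book. ', 'book ', '-book', 'booking', 'ebook']

# True where "book" appears as a whole word, False where it is part of a longer word.
results = {s: bool(word.search(s)) for s in samples}
for sample, matched in results.items():
    print(repr(sample), matched)
```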
This is slightly off-topic, but if you want to collapse each run of whitespace in your string into a single space before passing it to the if, you can use:
s/[\h\v]+/ /g;
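A comparable substitution in Python is sketched below. Python's re module has no \h class, so \s is used as a stand-in for [\h\v]; that is an approximation, not an exact equivalent.

```python
import re

raw = '  hello \t\n  world  '

# Replace every run of whitespace with a single space.
collapsed = re.sub(r'\s+', ' ', raw)

print(repr(collapsed))          # ' hello world '
print(repr(collapsed.strip()))  # 'hello world'
```

Leading and trailing runs become single spaces rather than disappearing, so a final strip is still needed if they should be removed.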

/^\shello\s$/
