Getting the value inside specific quotes with preg_match() - regex

I am trying to get the value inside specific quotes from a WordPress content string. I managed to get the content, but I couldn't write the correct preg_match call.
Example content is:
[vc_single_image image="1667" img_size="full" alignment="center" onclick="img_link_large" css_animation="fadeIn"]
From this string, I am trying to get the value "1667" by using preg_match function.
I have this code right now:
$regex = '/"([^"]+)"/';
$output = preg_match($regex, $post->post_content, $matches);
$first_img_id = trim($matches[0][21], '"');
When I echo the result, it works. But I am getting the value by matching quotes only, so if I add more quoted strings my code will break. Therefore, I want to find the image id via the image= part. How should I change my regex and the $matches array indexes?

Your string is a set of key-value pairs inside square brackets. You may match any specific key value by using a preg_match function with a regex like
"~[\s[]$key=\"\K[^\"]*~"
See the regex demo for the current scenario with image.
The [\s[] matches either a whitespace or a [ char, that is, the left boundary of the key. $key is a variable that should contain only word chars (in real life, keys usually do). =" matches an equal sign and the quote after it; the \K match reset operator then discards everything matched so far, so only the text matched by [^"]* (zero or more chars other than ") is kept as the match value. That is what you get in $matches[0], the first item of the resulting $matches array.
See the online PHP demo:
$key="image";
$regex = "~[\s[]$key=\"\K[^\"]*~";
$content='[vc_single_image image="1667" img_size="full" alignment="center" onclick="img_link_large" css_animation="fadeIn"]';
if (preg_match($regex, $content, $matches)) {
echo $matches[0];
}
Output: 1667
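For comparison, here is a minimal Python sketch of the same lookup. Python's built-in re module has no \K, so this version uses a capturing group instead. get_attribute is a hypothetical helper name, and the re.escape call is an extra precaution that is not part of the PHP answer above.

```python
import re

def get_attribute(content, key):
    """Return the quoted value of `key` in a shortcode string, or None."""
    # [\s\[] is the left boundary of the key (whitespace or an opening bracket).
    # re.escape guards against keys that contain regex metacharacters.
    pattern = r'[\s\[]' + re.escape(key) + r'="([^"]*)"'
    match = re.search(pattern, content)
    return match.group(1) if match else None

content = ('[vc_single_image image="1667" img_size="full" alignment="center" '
           'onclick="img_link_large" css_animation="fadeIn"]')

print(get_attribute(content, "image"))     # 1667
print(get_attribute(content, "img_size"))  # full
```

Because of the left boundary, a partial key such as size does not match inside img_size.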

For the record, you could get all key/value pairs immediately with
(?:\G(?!\A)|\[)
[^][]*?\K
(?P<key>\w+)="(?P<value>\w+)"
See a demo on regex101.com.
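A rough Python equivalent, as a sketch only: the built-in re module does not support \G, so this version simply collects every key="value" pair with re.findall and does not check that the pairs sit inside the same pair of brackets. It also accepts any non-quote characters as the value, which is looser than the \w+ used above.

```python
import re

content = ('[vc_single_image image="1667" img_size="full" alignment="center" '
           'onclick="img_link_large" css_animation="fadeIn"]')

# Each match yields a (key, value) tuple; dict() turns the list into a lookup table.
pairs = dict(re.findall(r'(\w+)="([^"]*)"', content))

print(pairs["image"])  # 1667
print(list(pairs))     # keys in order of appearance
```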

It might be an option to first use trim to remove the [ and ] from the beginning and end of the string, and then use explode with a whitespace as the delimiter.
Then you could loop over the array from explode and use a regex such as ^image="(\d+)"$ to match image="1667" in the loop.
If it matches, capturing group 1 (\d+) will contain your value.
^image="(\d+)"$ will match:
^ Assert position at the start of the string
image=" Match literally
(\d+) Match one or more digits in a capturing group
" Match literally
$ Assert position at the end of the string
Demo php
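The same trim / explode / loop idea can be sketched in Python, assuming no attribute value contains whitespace (a value with a space would be split into two tokens). str.strip and str.split stand in for PHP's trim and explode here, and re.fullmatch plays the role of the ^...$ anchors.

```python
import re

content = ('[vc_single_image image="1667" img_size="full" alignment="center" '
           'onclick="img_link_large" css_animation="fadeIn"]')

# Drop the surrounding brackets, then split on whitespace.
tokens = content.strip("[]").split()

image_id = None
for token in tokens:
    # fullmatch requires the whole token to be image="<digits>".
    m = re.fullmatch(r'image="(\d+)"', token)
    if m:
        image_id = m.group(1)
        break

print(image_id)  # 1667
```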

Related

Find strings and add html tags to the beginning and end [duplicate]

I am using the following expression:
Find what: [0-9]
But what should I write in Replace with field if I want to add specific sup tag to all the digits?
Thanks in advance!
The replacement can be
<sup>$0</sup>
or
<sup>$&</sup>
Note that the $0 / ${0} / $&, or even $MATCH and ${^MATCH} backreference inserts the whole match.
See the Substitutions section:
$&, $MATCH, ${^MATCH}
The whole matched text.
and
$n, ${n}, \n
Returns what matched the subexpression numbered n. Negative indices are not allowed.
Note that a match value is usually stored as Group 0 inside a match object.
However, \0 does not work as of now (Notepad++ v.6.9): it appears to be treated as a NUL character, which truncates the replacement pattern at the point where it occurs.
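Outside Notepad++, the same whole-match replacement can be sketched in Python, where the whole match is written as \g<0> in the replacement string. The sample text below is made up for illustration.

```python
import re

text = "H2O and CO2"

# \g<0> inserts the entire match, like $0 or $& in the Notepad++ replacement.
result = re.sub(r'[0-9]', r'<sup>\g<0></sup>', text)

print(result)  # H<sup>2</sup>O and CO<sup>2</sup>
```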

Powershell Drop the last part of a string with multiple "."

I'm trying to write a regex in PowerShell to get only a specific part of a string. I know a way to do this without regex, but it can definitely be more efficient with one. I have a string that looks like this:
Some/Stuff/Here/Then.drop.last
Ideally, I want to write a regex that gets me just:
Then.Drop
PS> 'Some/Stuff/Here/Then.drop.last' -replace '.*/(.+)\..*', '$1'
Then.drop
.*/ greedily matches everything up to the last /
(.+)\. greedily matches everything up to the last literal . and captures everything before that . in the first capture group ($1) - which is your string of interest.
.* matches the remaining part of the string.
Using $1 as the replacement string then replaces the overall match - the entire input string - with what the first capture group matched.
For more information about PowerShell's -replace operator, see this answer.
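To see the greedy backtracking at work outside PowerShell, here is the same pattern as a small Python sketch. Python writes the first capture group as \1 in the replacement instead of $1.

```python
import re

path = "Some/Stuff/Here/Then.drop.last"

# .*/ consumes up to the last slash, (.+)\. captures up to the last dot,
# and the trailing .* swallows the final segment.
result = re.sub(r'.*/(.+)\..*', r'\1', path)

print(result)  # Then.drop
```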

Perl: How to substitute the content after pattern CLOSED

So I can't use the $' variable.
But I need to find the pattern in a file that starts with the string “by: ” followed by any characters, then replace whatever characters come after “by: ” with an existing string $foo.
I'm using $^I and a while loop since I need to update multiple fields in a file.
I was thinking something along the lines of [s///]
s/(by\:[a-z]+)/$foo/i
I need help. Yes, this is an assignment question, but I'm 5 hours in and I've lost many brain cells in the process.
Some problems with your substitution:
You say you want to match by: (space after colon), but your regex will never match the space.
The pattern [a-z]+ means to match one or more occurrences of letters a to z. But you said you want to match "any characters". That might be zero characters, and it might contain non-letters.
You've replaced the match with $foo, but have lost by:. The entire matched string is replaced with the replacement.
No need to escape : in your pattern.
You're capturing the entire match in parentheses, but not using that anywhere.
I'm assuming you're processing the file line by line. You want "starts with the string by: followed by any characters". This is the regex:
/^by: .*/
^ matches the beginning of the line. Then by: matches exactly those characters. . matches any character except a newline, and * means zero or more of the preceding item. So .* matches all the rest of the characters on the line.
"replace whatever characters that come after by: with an existing string $foo". I assume you mean the contents of the variable $foo and not the literal characters $foo. This is:
s/^by: .*/by: $foo/;
Since we matched by:, I repeated it in the replacement string because you want to preserve it. $foo will be interpolated in the replacement string.
Another way to write this would be:
s/^(by: ).*/$1$foo/
Here we've captured the text by: in the first set of parentheses. That text will be available in the $1 variable, so we can interpolate that into the replacement string.
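For readers who want to check the logic outside Perl, here is a minimal Python sketch of the capture-group version. replace_after_by is a hypothetical helper name and the sample lines are invented. A function is used as the replacement so that the contents of foo are inserted literally, which mirrors how Perl interpolates $foo.

```python
import re

def replace_after_by(line, foo):
    """Replace everything after a leading 'by: ' with foo; other lines pass through."""
    # group(1) is the preserved 'by: ' prefix, like $1 in the Perl version.
    return re.sub(r'^(by: ).*', lambda m: m.group(1) + foo, line)

lines = ["title: Example", "by: someone else", "standby: unchanged"]
foo = "Jane Doe"

updated = [replace_after_by(line, foo) for line in lines]
print(updated)  # ['title: Example', 'by: Jane Doe', 'standby: unchanged']
```

The ^ anchor is what keeps a line such as standby: unchanged from being rewritten.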

RegEx: Match nth occurrence

I have the following string:
_name=aVlTcWRjVG1YeDhucWdEbVFrN3pSOHZ5QTRjOEJZZmZUZXNIYW1PV2RGOWYrczBhVWRmdVJTMUxYazVBOE8zQ3JNMmNVKzJLM2JJTzFON3FiLzFHUE0xY0pkdz09LS1jbkkwaWoxUUl3YVhMMkhtZHpaOW13PT0"%"3D--57356371d167f"
I want to match everything between = and the end " (note there are other quotes after this so I can't just select the last ").
I tried using _name=(.*?)" but there are other quotes in the string as well. Is there a way to match the 3rd quote? I tried _name=(.*?)"{3} but the {3} matches for the quotes back to back, i.e. """
You can try it here
You can use this regex:
\b_name=(?:[^"]*"){3}
RegEx Demo
RegEx Details:
\b_name: Match full word _name:
=: Match a =
(?:[^"]*"){3}: Match 0 or more non-" characters followed by a ". Repeat this group 3 times.
If you want to match everything between the = and the third(!) double quote (the third isn't necessarily the last, as you said), you can use a pattern like this:
$string = '_name=foo"bar"test" more text"';
// This pattern will not include the last " (note the 2, not 3)
$pattern = '/_name=((.*?"){2}.*?)"/';
preg_match($pattern, $string, $m);
echo $m[1];
Output:
foo"bar"test
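The "repeat a non-quote run plus a quote" idea generalizes to any n. Below is a small Python sketch under that assumption. value_up_to_nth_quote is a hypothetical helper name, and it combines the [^"]*" repetition from the first answer with the capture group from the second.

```python
import re

def value_up_to_nth_quote(text, n):
    """Return the text between '_name=' and the n-th double quote, or None."""
    # Consume n-1 complete 'non-quotes + quote' runs, then one more run of
    # non-quotes; the n-th quote itself stays outside the capture group.
    pattern = r'\b_name=((?:[^"]*"){%d}[^"]*)"' % (n - 1)
    m = re.search(pattern, text)
    return m.group(1) if m else None

s = '_name=foo"bar"test" more text"'

print(value_up_to_nth_quote(s, 3))  # foo"bar"test
print(value_up_to_nth_quote(s, 1))  # foo
```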
Original answer:
I'm not sure if I understood you correctly, but it sounds like you want to perform a so-called greedy match, meaning you want to match the string up to the last " regardless of whether the string contains multiple "s.
To perform a greedy match, just drop the ?, like this:
_name=(.*)"
You can try it here: https://regex101.com/r/uC5eO9/2

Powershell regex

Is there a PowerShell regex command I could use to replace the last of the consecutive leading zeros in a text string with an "M"? For example:
$Pattern = @("000123456", "012345678", "000000001", "000120000")
Final result:
00M123456
M12345678
0000000M1
00M120000
Thanks.
Search for the following regex:
"^(0*)0"
The regex matches the run of consecutive 0s at the beginning (^) of the string. It captures all of them except the last one, which is the one to replace. "^0(0*)" also works, since all we need to preserve is the number of 0s that we don't touch.
With the replacement string:
'$1M'
Note that $1 denotes the text captured by the first capturing group, which is (0*) in the regex.
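As a quick cross-check of the pattern against all four sample strings from the question, here is a Python sketch. Python writes the group reference as \1 rather than $1.

```python
import re

values = ["000123456", "012345678", "000000001", "000120000"]

# (0*) keeps every leading zero except the last; that last zero becomes M.
results = [re.sub(r'^(0*)0', r'\1M', v) for v in values]

for r in results:
    print(r)
# 00M123456
# M12345678
# 0000000M1
# 00M120000
```

A string with no leading zero is left unchanged, because the pattern requires at least one 0 at the start.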
Example by @SegFault:
"000120000" -replace "^(0*)0", '$1M'