This question already has answers here:
Regex match entire words only
(7 answers)
Closed 4 years ago.
I'm trying to get the file name between two parentheses that can contain spaces between the name and any parentheses.
example:
( file_name )
I used the regex:
(([A-Za-z_][A-Za-z0-9_]*)[ \t]*)
The problem is that it matches file_name together with the spaces before and after it.
I want to match the file_name without the spaces.
Any help is appreciated.
Just add \s* outside the capturing parentheses. You also need to escape the outermost parentheses if you want them to match literal parentheses:
\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*\)
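As a minimal Python sketch of this pattern (the extract_name helper and the sample strings are illustrative, not from the question), the name is read from capture group 1, so the surrounding spaces never end up in the result:

```python
import re

# Literal parentheses are escaped; \s* sits outside the capture group,
# so group 1 holds only the identifier.
PATTERN = re.compile(r"\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*\)")


def extract_name(text):
    """Return the name between parentheses, or None if nothing matches."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(extract_name("( file_name )"))
print(extract_name("(file_name)"))
```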
You could use
\(\s*(\S+)\s*\)
and take the first group, see a demo on regex101.com.
Explained:
\( # match ( literally
\s* # zero or more whitespaces
(\S+) # capture anything not a whitespace, at least one character
\s* # zero or more whitespaces, same as above
\) # match ) literally
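A small Python sketch of the \S+ variant follows; the first_group helper and the second input are assumptions made for illustration. Because \S+ accepts any non-whitespace character, including parentheses, it is looser than the identifier pattern, which the second call shows:

```python
import re

# \S+ is greedy and also matches '(' and ')', so it only gives back the
# final ')' that the pattern needs in order to finish.
LOOSE = re.compile(r"\(\s*(\S+)\s*\)")


def first_group(text):
    """Return capture group 1 of the first match, or None."""
    match = LOOSE.search(text)
    return match.group(1) if match else None


print(first_group("( file_name )"))  # file_name
print(first_group("(a)(b)"))         # a)(b
```

If the names can only be identifiers, the stricter character classes from the question avoid this over-matching.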
This question already has answers here:
Regex to extract nth token of a string separated by pipes
(2 answers)
Closed 2 years ago.
So I'm looking at a string like this with a | delimiter:
XL231241424|AB|ABCDE|LK|Word|A Phrase|Another random phrase|GH49|Example|31/02/2020|05/03/2020|N/A|N/A|N/A
The end goal is to pluck specific items from this string and concatenate them, giving a more concise, readable bit of text in a reporting platform.
I tried this and this but couldn't get it to work for my string. The match is simple, it's just:
([^\|]*) but how do I then just pull a match in position x? Specifically, if I want to get Another random phrase which is position 7, how do I do that?
Any help is much appreciated and an explanation is even better!
Thanks.
You could use
^(?:[^|]*\|){6}([^|]*)
Demo
In your example Another random phrase would be saved to capture group 1.
The regex engine performs the following operations.
^ # match beginning of line
(?: # begin a non-capture group
[^|]*\| # match 0+ chars other than '|' followed by '|'
) # end non-capture group
{6} # execute non-capture group 6 times
([^|]*) # match 0+ chars other than '|' in capture group 1
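The same idea can be sketched in Python, where the repeat count is derived from the wanted position. The nth_field helper is hypothetical; the sample line is the one from the question.

```python
import re

LINE = ("XL231241424|AB|ABCDE|LK|Word|A Phrase|Another random phrase|"
        "GH49|Example|31/02/2020|05/03/2020|N/A|N/A|N/A")


def nth_field(text, n):
    """Return the n-th (1-based) pipe-delimited field, or None if absent."""
    # Skip n-1 "field|" chunks, then capture the next field in group 1.
    pattern = r"^(?:[^|]*\|){%d}([^|]*)" % (n - 1)
    match = re.match(pattern, text)
    return match.group(1) if match else None


print(nth_field(LINE, 7))
# Without a regex, splitting on the delimiter gives the same field.
print(LINE.split("|")[6])
```

Where a plain split is available it is usually simpler; the regex form is mainly useful when the tool only accepts a pattern.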
If your regex engine supports \K (as do PCRE (PHP), Perl and others), you don't need a capture group:
^(?:[^|]*\|){6}\K[^|]*
Demo
In your example this matches Another random phrase.
The regex engine performs the following operations.
^ # match beginning of line
(?: # begin a non-capture group
[^|]*\| # match 0+ chars other than '|' followed by '|'
) # end non-capture group
{6} # execute non-capture group 6 times
\K # discard everything matched so far
[^|]* # match 0+ chars other than '|'
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 2 years ago.
Example: search for the pattern man but only as the beginning of a word (i.e. not preceded directly by letters).
This pattern would be found in the strings man, spider-man, manpower, iron_man. But not in woman or human.
I have assumed that if the word "man" is preceded by a hyphen or underscore, to achieve a match the hyphen or underscore must be preceded by a letter (e.g., "-man" would not be matched).
The \K escape sequence resets the starting point of the reported match to the current position in the string, discarding everything matched so far. If the regex engine supports it, the following regular expression (with the case-insensitive flag set) could be used.
(?:^| |[a-z][-_])\Kman
Demo
The selected answer to this SO question provides a list of regex engines that support \K. That list was last updated in August 2019.
The regex engine performs the following operations.
(?: # begin non-capture group
^ # match beginning of line
| # or
# match a space
| # or
[a-z] # match a letter
[-_] # match '-' or '_'
) # end non-capture group
\K # discard everything matched so far
man # match 'man'
Alternatively, a capture group could be used.
(?:^| |[a-z][-_])(man)
Demo
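Python's built-in re module has no \K, so the capture-group variant is the one that carries over. A minimal sketch, with find_man as a hypothetical helper that reports where the captured "man" starts:

```python
import re

# Case-insensitive, matching the flag assumed in the answer.
PATTERN = re.compile(r"(?:^| |[a-z][-_])(man)", re.IGNORECASE)


def find_man(text):
    """Return the index where a qualifying 'man' starts, or None."""
    match = PATTERN.search(text)
    return match.start(1) if match else None


for word in ["man", "spider-man", "manpower", "iron_man", "woman", "human", "-man"]:
    print(word, find_man(word))
```

Under the stated assumption, "-man" yields None because the hyphen is not preceded by a letter.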
You can use positive lookbehind to achieve this:
(?<=^|[a-z][_-]|\s)man
regex101 demo
Add a word boundary \b or look behind for underscore to the start:
((?<=_)|\b)man
See live demo.
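This variant also runs in Python's re module, since the lookbehind has a fixed width. The sketch below uses a non-capturing group and a hypothetical starts_word helper; note that it accepts "-man", because \b matches between a hyphen and a letter.

```python
import re

# Either an underscore sits right before 'man', or 'man' starts at a
# word boundary.
PATTERN = re.compile(r"(?:(?<=_)|\b)man")


def starts_word(text):
    """Return True if 'man' occurs at the start of a word in text."""
    return PATTERN.search(text) is not None


for word in ["man", "spider-man", "manpower", "iron_man", "woman", "human", "-man"]:
    print(word, starts_word(word))
```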
This question already has answers here:
How to extract a substring using regex
(14 answers)
Closed 2 years ago.
Looking for some help with a regular expression. I want to parse a line similar to the one below, capturing the number after the letter Q:
Q232.1232 K1232.232323
would ideally become 232.1232 in the output.
The expression (^Q)[0-9.-]* gives me Q232.1232; however, I do not want the Q in the output.
Appreciate anyone who could assist!
You can use
/(?<=Q)(?:\d+(?![.\d])|\d+\.\d+(?![.\d]))/gm
demo
(?<=Q) is a positive lookbehind. It requires the match to be immediately preceded by "Q", but "Q" is not part of the match.
I've made assumptions about which strings can be matched. Those are reflected at the demo.
The regular expression can be written in free-spacing mode to make it self-documenting:
/
(?<=Q) # match 'Q' in a positive lookbehind
(?: # begin non-capture group
\d+ # match representation of an integer (1+ digits)
(?![.\d]) # do not match a period or digit (negative lookahead)
| # or
\d+\.\d+ # match representation of a float
(?![.\d]) # do not match a period or digit (negative lookahead)
) # end non-capture group
/gmx # global, multiline and free-spacing regex definition modes
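For comparison, here is a sketch of the same pattern in Python, where re.VERBOSE plays the role of free-spacing mode. The name NUMBER_AFTER_Q and the extra test inputs are illustrative assumptions.

```python
import re

NUMBER_AFTER_Q = re.compile(
    r"""
    (?<=Q)          # the match must be immediately preceded by 'Q'
    (?:
        \d+         # an integer (1+ digits)
        (?![.\d])   # not followed by a period or digit
      |
        \d+\.\d+    # a float
        (?![.\d])   # not followed by a period or digit
    )
    """,
    re.VERBOSE,
)

print(NUMBER_AFTER_Q.findall("Q232.1232 K1232.232323"))
```

The number after K is ignored because it is not preceded by Q.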
Any part of the pattern enclosed in parentheses is captured. You can then use back-references to refer to the captured groups in the replacement expression.
So for your example, keep the Q outside the group:
^Q([0-9.-]*)
Back-references are normally written as $ or \ followed by the group number, so $1 in your case.
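A short Python sketch of the back-reference approach, assuming the pattern ^Q([0-9.-]*) with the Q kept outside the group. Python writes the back-reference as \1, and the trailing .* is an addition here so that the rest of the line is dropped by the replacement.

```python
import re

line = "Q232.1232 K1232.232323"

# Group 1 captures the number after the leading Q; ".*" consumes the rest
# of the line so that only the captured text remains.
number = re.sub(r"^Q([0-9.-]*).*", r"\1", line)
print(number)
```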
This question already has an answer here:
Notepad++ regular expression replace
(1 answer)
Closed 3 years ago.
I want to upgrade a project from one version to another, so I need to change thousands of lines that follow the same pattern.
Example:
From this
$this->returnData['status']
To this
$this->{returnData['status']}
Using the following regex I found all the matches, but I am unable to replace them with the braced form.
->[a-zA-z]{5,15}\['[a-zA-z]{5,15}'\]
I used following to replace
->\{[a-zA-z]{5,15}\['[a-zA-z]{5,15}'\]\}
Try using the following find and replace, in regex mode:
Find: \$this->([A-Za-z]{5,15}\['[A-Za-z]{5,15}'\])
Replace: $this->{$1}
Demo
The regex says to:
\$this-> match "$this->"
( then match and capture
[A-Za-z]{5,15} 5-15 letters
\[ [
'[A-Za-z]{5,15}' 5-15 letters in single quotes
\] ]
) stop capture group
The replacement is just $this-> followed by the first capture group, but now wrapped in curly braces.
If you want to just match the arrow operator, regardless of what precedes it, then just use this pattern, and keep everything else the same:
->([A-Za-z]{5,15}\['[A-Za-z]{5,15}'\])
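Outside an editor, the same find and replace can be scripted. The following Python sketch assumes the first pattern above; add_braces is a hypothetical helper, and \g<1> is Python's spelling of the $1 back-reference.

```python
import re

FIND = re.compile(r"\$this->([A-Za-z]{5,15}\['[A-Za-z]{5,15}'\])")


def add_braces(source):
    """Wrap the captured property access in curly braces."""
    return FIND.sub(r"$this->{\g<1>}", source)


print(add_braces("$x = $this->returnData['status'];"))
```

Running it a second time leaves the text unchanged, because `$this->` is then followed by `{` rather than a letter.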
You could also use negated character classes and use \K to forget the match. In the replacement use $0 to insert the full match surrounded by { and }
Match:
->\K[^[\n]+\['[^\n']+']
That will match
-> match ->
\K Forget what was matched
[^[\n]+ Match 1+ chars other than [ or a newline (or use [a-zA-Z]{5,15} to be exact)
\[ Match [
'[^\n']+' Match ', then 1+ chars other than ' or a newline, then ' (or use [a-zA-Z]{5,15} to be exact)
] Match ] literally
Regex demo
Replace with:
{$0}
Result:
$this->{returnData['status']}
A more exact version would look like:
->\K[a-zA-Z]{5,15}\['[a-zA-Z]{5,15}']
Regex demo
Note that [a-zA-z] matches more than [a-zA-Z]: the range A-z also covers the ASCII characters between Z and a, namely [ \ ] ^ _ and `.
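The difference between the two character classes can be checked directly. This Python sketch lists the ASCII characters between 'Z' and 'a' and confirms that only the A-z class accepts them; the variable names are illustrative.

```python
import re

# Characters that sit between 'Z' and 'a' in ASCII.
between = [chr(code) for code in range(ord("Z") + 1, ord("a"))]
print(between)

sloppy = re.compile(r"[a-zA-z]")  # range A-z, as in the question
strict = re.compile(r"[a-zA-Z]")  # letters only

# Characters accepted by the sloppy class but rejected by the strict one.
extra = [ch for ch in between if sloppy.fullmatch(ch) and not strict.fullmatch(ch)]
print(extra)
```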
This question already has answers here:
Regular expression to skip character in capture group
(6 answers)
Closed 7 years ago.
I have a regex capture, and I would like to exclude a character (a space, in this particular case) from the middle of the captured string. Can this be done in one step, by modifying the regex?
(Quick and dirty) example:
Text: Key name = value
My regex: (.*) = (.*)
Output: \1 = "Key name" and \2 = "value"
Desired output: \1 = "Keyname" and \2 = "value"
Update: I'm not sure what regex engine will run this regex, since it's part of a larger software product. If you have a solution, please specify which engines it will run on, and on which it will not.
Update2: The aforementioned product takes a regex as an input, and then uses the matched values further, which is the reason for which a one-step solution is asked for. There is no opportunity to insert an intermediate processing step in the pipeline.
This is a possible theoretical pure-regex implementation using the end-of-previous-match \G anchor:
/(?:\G(\w+)\h*(?:(?:=\h*)(\w+))?)+/g
Online demo
Legend
(?: # Non capturing group 1
\G # Matches at the position where the previous match ended
(\w+) # capture group 1: a regex word of 1+ chars
\h* # zero or more horizontal spaces (space, tabs)
(?: # Non capturing group 2
=\h* # literal '=' followed by zero or more hspaces
(\w+) # capture group 2: a regex word of 1+ chars
)? # make the non capturing group 2 optional
)+ # repeat the non capturing group 1, one or more
In the substitution section of the demo:
\1 actually contains Keyname (the 2 terms are separated by a fake space)
\2 is value
NOTE: I don't recommend using this unless it is actually needed.
There are several possible two-step approaches. The simplest, as others have surely already stated, is to strip the spaces from the first capture group of the OP's regex.
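Where a second step is possible, the two-step approach might look like the following Python sketch. It uses the OP's regex unchanged; parse_pair is a hypothetical helper.

```python
import re


def parse_pair(line):
    """Split 'key = value' and strip all whitespace from the key."""
    match = re.match(r"(.*) = (.*)", line)
    if match is None:
        return None
    # Step two: remove the spaces from the first capture group.
    key = re.sub(r"\s+", "", match.group(1))
    return key, match.group(2)


print(parse_pair("Key name = value"))
```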
I would come up with something like:
(?<key>[\w]+)\s*=\s*(?<value>.+)
# match one or more word characters and capture them in a group called "key"
# followed by zero or more whitespace characters (\s)
# followed by an equals sign
# followed by zero or more whitespace characters (\s)
# capture the rest in a group called "value"
... and process the captured output afterwards. Note that the \w character class matches no whitespace (do you have keys with whitespace in them?).
See a working demo here. But as mentioned in the comments, it depends on your programming language.
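As one example of the language dependence, Python's re module spells named groups (?P<name>...) rather than (?<name>...). The sketch below is an assumption-laden illustration of the pattern above; it also shows the whitespace caveat, since with the OP's sample input the key group only picks up the last word before the equals sign.

```python
import re

PAIR = re.compile(r"(?P<key>\w+)\s*=\s*(?P<value>.+)")

# \w+ cannot span the space in "Key name", so the search settles on "name".
spaced = PAIR.search("Key name = value")
print(spaced.group("key"), spaced.group("value"))

# A key without whitespace is captured whole.
plain = PAIR.search("Keyname = value")
print(plain.group("key"), plain.group("value"))
```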