I need to catch all redefined variables in my.cnf.
In my case, they look like this:
#basedir = /usr/local/mysql
basedir = /usr
So I need to extract all redefined parameters.
A parameter counts as redefined when the file contains both a line starting with #param and a line starting with param.
Please advise me on the correct regexp.
You may use
^\h*#\K([_$a-zA-Z0-9]+)(?=\s+=\s.+\R\h*\1\s)
See the regex demo
For the regex to work, use the m multiline modifier and read the file into memory as a single string (you can do it with -0777 options).
Pattern details
^ - start of a line
\h* - 0+ horizontal whitespaces
# - a # char
\K - match reset operator
([_$a-zA-Z0-9]+) - Group 1: any 1 or more ASCII letters, digits, _ and $
(?=\s+=\s.+\R\h*\1\s) - a positive lookahead that requires the name to be immediately followed with:
\s+ - 1+ whitespaces
= - a = char
\s - whitespace
.+ - 1+ chars other than line break chars
\R - a line break sequence
\h* - 0+ horizontal whitespaces
\1 - same value as in Group 1
\s - whitespace.
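For readers who want to try the idea outside Perl, here is a minimal Python sketch. Python's re module does not support \K, \h or \R, so this is an adaptation rather than the pattern above: a capturing group stands in for \K, [ \t] for \h, and (?:\r\n|\n|\r) for \R. The function name redefined_params and the sample text are hypothetical.

```python
import re

# Adapted pattern: group 1 holds the parameter name (stand-in for \K),
# [ \t] replaces \h, and (?:\r\n|\n|\r) replaces \R.
REDEFINED = re.compile(
    r"^[ \t]*#([_$a-zA-Z0-9]+)(?=\s+=\s.+(?:\r\n|\n|\r)[ \t]*\1\s)",
    re.MULTILINE,
)

def redefined_params(cnf_text):
    """Return names that are commented out on one line and set on the next."""
    return REDEFINED.findall(cnf_text)

sample = (
    "#basedir = /usr/local/mysql\n"
    "basedir = /usr\n"
    "#port = 3306\n"
    "datadir = /var/lib/mysql\n"
)
print(redefined_params(sample))  # ['basedir']
```

Like the original pattern, this only detects a redefinition when the active line directly follows the commented one.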
I am looking for a regex that rejects input consisting only of spaces when there is more than one of them. A single blank space is allowed.
I have something like .*\S.* or .*[^ ].*, but I want to allow a single space while rejecting input made up of two or more spaces only.
You can use
pattern="\S*(?:\s\S*)?"
The pattern will get parsed as a ^(?:\S*(?:\s\S*)?)$ pattern and will match
^ - start of string
(?: - start of a non-capturing group:
\S* - zero or more chars other than whitespace
(?:\s\S*)? - an optional sequence of a whitespace and zero or more non-whitespace chars
) - end of a non-capturing group
$ - end of string.
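The anchored form can be checked quickly outside the browser. This is a small Python sketch that uses re.fullmatch to mimic the implicit ^(?:...)$ wrapping of the HTML pattern attribute; the function name at_most_one_whitespace is hypothetical.

```python
import re

def at_most_one_whitespace(value):
    """True when the whole value contains zero or one whitespace character."""
    # fullmatch plays the role of the ^(?:...)$ anchoring added by the browser.
    return re.fullmatch(r"\S*(?:\s\S*)?", value) is not None

for candidate in ["", " ", "  ", "a b", "a b c"]:
    print(repr(candidate), at_most_one_whitespace(candidate))
```

Note that the pattern accepts at most one whitespace character anywhere in the value, so "a b c" is rejected along with "  ".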
I have fields which contain data in the following possible formats (each line is a different possibility):
AAA - Something Here
AAA - Something Here - D
Something Here
Note that the first group of letters (AAA) can be of varying lengths.
What I am trying to capture is the "Something Here" or "Something Here - D" (if it exists) using PCRE, but I can't get the Regex to work properly for all three cases. I have tried:
- (.*) which works fine for cases 1 and 2 but obviously not 3;
(?<= - )(.*) which also works fine for cases 1 and 2;
(?! - )(.+)| - (.+) works for cases 2 and 3 but not 1.
I feel like I'm on the verge of it but I can't seem to crack it.
Thanks in advance for your help.
Edit: I realized that I was unclear in my requirements. If there is a trailing " - D" (the letter in the data is arbitrary but should only be a single character), that needs to be captured as well.
About the patterns that you tried:
- (.*) This pattern matches the first occurrence of - and then the rest of the line. It matches too much for the second example, as the .* also matches the second occurrence of -
(?<= - )(.*) This pattern matches the same text as the first one, but without the leading - because the lookbehind only asserts that it occurs directly to the left
(?! - )(.+)| - (.+) This pattern uses a negative lookahead (?! - ), which asserts that what is directly to the right is not - . As none of the examples start with - , the whole line is matched by .+ right after the lookahead, and the second part after the alternation | is never evaluated
If the first group of letters can be of varying length, you could make the match either specific matching 1 or more uppercase characters [A-Z]+ or 1+ word characters \w+.
To get a more broad match, you could match 1 or more non whitespace characters using \S+
^(?:\S+\h-\h)?\K\S+(?:\h(?!-\h)\S+)*
Explanation
^ Start of string
(?:\S+\h-\h)? Optionally match the first group of non whitespace chars followed by - between horizontal whitespace chars
\K Clear the match buffer (Forget what is currently matched)
\S+ Match 1+ non whitespace characters
(?: Non capture group
\h(?!-\h) Match a horizontal whitespace char and assert what is directly to the right is not - followed by another horizontal whitespace char
\S+ Match 1+ non whitespace chars
)* Close non capture group and repeat 0+ times to match more "words" separated by spaces
Regex demo
Edit
To match an optional hyphen and trailing single character, you could add an optional non capturing group (?:-\h\S\h*)?$ and assert the end of the string if the pattern should match the whole string:
^(?:\S+\h-\h)?\K\S+(?:\h(?!-\h)\S+)*\h*(?:-\h\S\h*)?$
Regex demo
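A rough Python equivalent of the final pattern, for anyone not using PCRE. Python's re module has no \K or \h, so this sketch (an adaptation, not the pattern above) puts the wanted text in group 1 and uses [ \t] for horizontal whitespace. The function name extract_label is hypothetical.

```python
import re

# Group 1 replaces \K: the optional "AAA - " prefix is matched but not captured.
LABEL = re.compile(
    r"^(?:\S+[ \t]-[ \t])?"
    r"(\S+(?:[ \t](?!-[ \t])\S+)*[ \t]*(?:-[ \t]\S[ \t]*)?)$"
)

def extract_label(line):
    """Return the text after the optional prefix, or None if nothing matches."""
    m = LABEL.search(line)
    return m.group(1) if m else None

print(extract_label("AAA - Something Here"))      # Something Here
print(extract_label("AAA - Something Here - D"))  # Something Here - D
print(extract_label("Something Here"))            # Something Here
```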
You may use
^(?:.*? - )?\K.*?(?= - | *$)
^(?:.*?\h-\h)?\K.*?(?=\h-\h|\h*$)
See the regex demo
Details
^ - start of string
(?:.*? - )? - an optional non-capturing group matching any 0+ chars other than line break chars, as few as possible, up to and including the first space-hyphen-space sequence
\K - match reset operator
.*? - any 0+ chars other than line break chars as few as possible
(?= - | *$) - a space-hyphen-space sequence, or 0+ spaces followed by the end of string, should follow immediately on the right.
Note that \h matches any horizontal whitespace chars.
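To see what the \h variant returns on the three sample inputs, here is a Python sketch. It is an adaptation under the same assumptions as any port to Python's re module: group 1 replaces \K and [ \t] replaces \h. The function name middle_part is hypothetical.

```python
import re

# Lazy match of the middle part, stopping before " - " or trailing spaces.
MIDDLE = re.compile(r"^(?:.*?[ \t]-[ \t])?(.*?)(?=[ \t]-[ \t]|[ \t]*$)")

def middle_part(line):
    """Return the text between the optional prefix and the next ' - ' or end."""
    m = MIDDLE.search(line)
    return m.group(1) if m else None

print(middle_part("AAA - Something Here"))      # Something Here
print(middle_part("AAA - Something Here - D"))  # Something Here
print(middle_part("Something Here"))            # Something Here
```

With this pattern the trailing " - D" is left out of the result, because the lazy match stops at the first space-hyphen-space sequence after the prefix.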
^(?:[A-Z]+ - \K)?.*\S
demo
Since "Something Here" can be anything, there's no reason to specially describe the eventual last letter in the pattern. You don't need something more complicated.
With this pattern I assume that you are not interested in the trailing spaces, which is why I ended it with \S. If you want to keep them, remove the \S and change the previous quantifier to +.
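The same approach can be sketched in Python, with the caveat that Python's re module lacks \K, so a capturing group is used instead. The function name after_prefix is hypothetical.

```python
import re

# The uppercase prefix and " - " are consumed but stay outside group 1.
AFTER_PREFIX = re.compile(r"^(?:[A-Z]+ - )?(.*\S)")

def after_prefix(line):
    """Return everything after the optional 'AAA - ' prefix, right-trimmed."""
    m = AFTER_PREFIX.search(line)
    return m.group(1) if m else None

print(after_prefix("AAA - Something Here - D"))  # Something Here - D
print(after_prefix("Something Here"))            # Something Here
```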
I have a list in this format
FIRSTTEXT:SECONDTEXT:RANDOMTEXT::::::::RANDOMNUMBERS:NUMBER:
but not all of the text is in this format. I want to keep only FIRSTTEXT:SECONDTEXT.
FIRSTTEXT and SECONDTEXT are in the same position throughout the document.
I have tried this one:
Find what: (.+):(.+)
Replace with: \1:\2
However, it doesn't work.
You may use
Find What: ^(?:([^:\s]+:[^:\s]+).*|.*\R*)
Replace With: $1
Details
^ - start of a line
(?: - start of a non-capturing group:
([^:\s]+:[^:\s]+) - Group 1 ($1 refers to this value):
[^:\s]+ - 1+ chars other than whitespace and :
: - a colon
[^:\s]+ - 1+ chars other than whitespace and :
.* - 0+ chars other than any line break char, as many as possible
| - or
.* - 0+ chars other than any line break char, as many as possible
\R* - 0+ line break sequences
) - end of the non-capturing group.
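The same find-and-replace can be tried outside the editor. The following Python sketch assumes the text is already loaded into a string; \R is spelled out as (?:\r\n|\n|\r) because Python's re module does not support it, and the function name keep_first_two_fields is hypothetical.

```python
import re

# Alternative 1 keeps "FIRST:SECOND" in group 1 and consumes the rest of the line.
# Alternative 2 consumes a non-matching line together with its line break(s).
LINE = re.compile(r"^(?:([^:\s]+:[^:\s]+).*|.*(?:\r\n|\n|\r)*)", re.MULTILINE)

def keep_first_two_fields(text):
    """Keep only the first two colon-separated fields; drop other lines."""
    # group(1) is None when the second alternative matched, so fall back to "".
    return LINE.sub(lambda m: m.group(1) or "", text)

text = (
    "FIRSTTEXT:SECONDTEXT:RANDOMTEXT::::::::12345:6:\n"
    "not in the expected format\n"
    "A:B:C"
)
print(keep_first_two_fields(text))
```

For the sample text this prints FIRSTTEXT:SECONDTEXT and A:B on separate lines.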
I have a multiline text field and need to test if each line matches a pattern.
The field might look like this:
1xABCD
9xDEFGHIJK
7xAJDKSLD
2xA
The pattern is this: \dx\w.*
The number of lines is from 1 to X.
I was trying ^\d+x\w.*${1,} or \d+x\w.*\r\n{1,}
Thank you
You may use
^\d+x\w+(?:\r?\n\d+x\w+)*$
Details
^ - start of string
\d+x\w+ - 1+ digits, x and then 1+ word chars (letters, digits or _)
(?:\r?\n\d+x\w+)* - a non-capturing group ((?:...)) that matches 0 or more (*) occurrences of:
\r?\n - an optional CR and an LF symbol
\d+x\w+ - 1+ digits, x and then 1+ word chars (letters, digits or _)
$ - end of string.
See the regex demo (note the text pasted in the regex101.com has LF only line endings).
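A minimal Python sketch of the whole-field check. It uses re.fullmatch instead of ^ and $, since $ in Python would also match just before a trailing newline. The function name all_lines_valid is hypothetical.

```python
import re

# One item per line; \r?\n accepts both LF and CRLF line endings.
FIELD = re.compile(r"\d+x\w+(?:\r?\n\d+x\w+)*")

def all_lines_valid(field_text):
    """True when every line of the field looks like <digits>x<word chars>."""
    return FIELD.fullmatch(field_text) is not None

print(all_lines_valid("1xABCD\n9xDEFGHIJK\n7xAJDKSLD\n2xA"))  # True
print(all_lines_valid("1xABCD\nnot valid"))                   # False
```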
I want to search for the keyword TIMESTAMP inside CREATE TABLE statements. This is my regex:
(?i)(\s+|^)CREATE\s+TABLE\s+\[\s*\bdbo\b\s*\]\.\[\w+\]\s*\(\s*((.|\n)*)\bTIMESTAMP
But it matches CREATE TABLE in one query and TIMESTAMP in another query.
Can you help me, please?
If you just want to search for CREATE TABLE and TIMESTAMP, you can use this simple regex:
(?i)(CREATE TABLE|TIMESTAMP)
The (?i) is optional and makes the match case-insensitive.
You may use
(?im)^CREATE\s+TABLE\s+\[\s*dbo\s*\]\.\[\w+\]\s*\(\s*(.*(?:\n(?!CREATE\s+TABLE\b).*)*)\bTIMESTAMP\b
See this regex demo
If . does not match a CR char in your regex flavor, add \r? before \n.
Note you do not need \b word boundaries on both ends of dbo as it is inside [...].
Details
(?im) - ignore case and multiline modes on
^ - start of a line
CREATE\s+TABLE\s+\[ - CREATE TABLE [ with any 1+ whitespaces in between words
\s*dbo\s* - a dbo string enclosed with 0+ whitespaces
\]\.\[ - ].[ string
\w+ - 1+ word chars
\] - ] char
\s*\(\s* - a ( enclosed with 0+ whitespaces
(.*(?:\n(?!CREATE\s+TABLE\b).*)*) - Group 1:
.* - any 0+ chars other than line break chars
(?:\n(?!CREATE\s+TABLE\b).*)* - 0 or more sequences of
\n(?!CREATE\s+TABLE\b) - a newline not followed with CREATE TABLE
.* - any 0+ chars other than line break chars
\bTIMESTAMP\b - a whole word TIMESTAMP
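This pattern uses only constructs that Python's re module also supports, so it can be tried as-is. The sketch below is illustrative: the function name find_timestamp_table and the sample SQL are hypothetical, and it assumes each CREATE TABLE starts at the beginning of a line.

```python
import re

# The tempered part (?!CREATE\s+TABLE\b) stops the match from running
# into the next CREATE TABLE statement.
PATTERN = re.compile(
    r"(?im)^CREATE\s+TABLE\s+\[\s*dbo\s*\]\.\[\w+\]\s*\(\s*"
    r"(.*(?:\n(?!CREATE\s+TABLE\b).*)*)\bTIMESTAMP\b"
)

def find_timestamp_table(sql):
    """Return the first CREATE TABLE text that reaches a TIMESTAMP, or None."""
    m = PATTERN.search(sql)
    return m.group(0) if m else None

sql = (
    "CREATE TABLE [dbo].[Users] (\n"
    "    Id INT,\n"
    "    Name VARCHAR(50)\n"
    ")\n"
    "CREATE TABLE [dbo].[Events] (\n"
    "    Id INT,\n"
    "    CreatedAt TIMESTAMP\n"
    ")\n"
)
print(find_timestamp_table(sql))
```

On the sample input the match starts at the [dbo].[Events] statement, not at [dbo].[Users].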
It might be easier to do it in two steps.
Step 1: find "complete" CREATE TABLE statements. Actually, find the span of the outermost parentheses.
(?i)(^ *)CREATE\s+TABLE\s+[^()]*\(([^()]*\([^()]*\))*[^()]*\)
Test here.
Step 2: find timestamp in the resulting found strings.
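The two steps could be sketched in Python roughly as follows. One assumption beyond the answer: the m flag is added so that ^ can match at the start of every line, not only at the start of the input. The function name statements_with_timestamp and the sample SQL are hypothetical, and the step 1 pattern handles only one level of nested parentheses.

```python
import re

# Step 1: a CREATE TABLE statement up to the closing outermost parenthesis.
STATEMENT = re.compile(
    r"(?im)(^ *)CREATE\s+TABLE\s+[^()]*\(([^()]*\([^()]*\))*[^()]*\)"
)
# Step 2: the keyword as a whole word, case-insensitive.
TIMESTAMP = re.compile(r"\bTIMESTAMP\b", re.IGNORECASE)

def statements_with_timestamp(sql):
    """Return each CREATE TABLE statement that contains TIMESTAMP."""
    statements = [m.group(0) for m in STATEMENT.finditer(sql)]
    return [s for s in statements if TIMESTAMP.search(s)]

sql = (
    "CREATE TABLE [dbo].[Users] (\n"
    "    Id INT,\n"
    "    Name VARCHAR(50)\n"
    ")\n"
    "CREATE TABLE [dbo].[Events] (\n"
    "    Id INT,\n"
    "    CreatedAt TIMESTAMP\n"
    ")\n"
)
for statement in statements_with_timestamp(sql):
    print(statement)
```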