Regex search the 2 nearest keywords - regex

I want to search for the keyword TIMESTAMP inside a CREATE TABLE statement. This is my regex:
(?i)(\s+|^)CREATE\s+TABLE\s+\[\s*\bdbo\b\s*\]\.\[\w+\]\s*\(\s*((.|\n)*)\bTIMESTAMP
But it matches CREATE TABLE in one query and TIMESTAMP in another query.
Can you help me, please?

If you just want to search for CREATE TABLE and TIMESTAMP, you can use this simple regex:
(?i)(CREATE TABLE|TIMESTAMP)
The (?i) is optional; it makes the match case insensitive.

You may use
(?im)^CREATE\s+TABLE\s+\[\s*dbo\s*\]\.\[\w+\]\s*\(\s*(.*(?:\n(?!CREATE\s+TABLE\b).*)*)\bTIMESTAMP\b
See this regex demo
If your regex can't match a CR char with . add \r? before \n.
Note you do not need \b word boundaries on both ends of dbo as it is inside [...].
Details
(?im) - ignore case and multiline modes on
^ - start of a line
CREATE\s+TABLE\s+\[ - CREATE TABLE [ with any 1+ whitespaces in between words
\s*dbo\s* - a dbo string enclosed with 0+ whitespaces
\]\.\[ - ].[ string
\w+ - 1+ word chars
\] - ] char
\s*\(\s* - a ( enclosed with 0+ whitespaces
(.*(?:\n(?!CREATE\s+TABLE\b).*)*) - Group 1:
.* - any 0+ chars other than line break chars
(?:\n(?!CREATE\s+TABLE\b).*)* - 0 or more sequences of
\n(?!CREATE\s+TABLE\b) - a newline not followed with CREATE TABLE
.* - any 0+ chars other than line break chars
\bTIMESTAMP\b - a whole word TIMESTAMP
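As a rough check of the pattern above, here is a minimal Python sketch. The sample SQL and the helper name tables_with_timestamp are made up for illustration, and \r? is added before \n as suggested above so CRLF input also works.

```python
import re

# Pattern from the answer; flags replace the inline (?im).
PATTERN = re.compile(
    r"^CREATE\s+TABLE\s+\[\s*dbo\s*\]\.\[\w+\]\s*\(\s*"
    r"(.*(?:\r?\n(?!CREATE\s+TABLE\b).*)*)\bTIMESTAMP\b",
    re.IGNORECASE | re.MULTILINE,
)

# Hypothetical script: only the second table has a TIMESTAMP column.
SQL = """CREATE TABLE [dbo].[First] (
    [Id] INT NOT NULL
)
CREATE TABLE [dbo].[Second] (
    [Id] INT NOT NULL,
    [Stamp] TIMESTAMP NOT NULL
)
"""

def tables_with_timestamp(sql):
    """Return the names of tables whose own statement contains TIMESTAMP."""
    names = []
    for m in PATTERN.finditer(sql):
        # Pull the table name out of the matched statement header.
        header = re.match(
            r"CREATE\s+TABLE\s+\[\s*dbo\s*\]\.\[(\w+)\]", m.group(0), re.IGNORECASE
        )
        names.append(header.group(1))
    return names

print(tables_with_timestamp(SQL))
```

The negative lookahead after each newline is what stops the match from running out of [First] and into [Second].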

It might be easier to do it in two steps.
Step 1: find "complete" CREATE TABLE statements. Actually, find the span of the outermost parentheses.
(?i)(^ *)CREATE\s+TABLE\s+[^()]*\(([^()]*\([^()]*\))*[^()]*\)
Test here.
Step 2: find timestamp in the resulting found strings.
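The two steps could look roughly like this in Python. This is only a sketch: the sample SQL and the function name are invented, and the multiline flag is an assumption so that ^ can match at every line start.

```python
import re

# Step 1 pattern from the answer: CREATE TABLE up to the closing outer
# parenthesis, allowing one level of nested parentheses such as VARCHAR(50).
STATEMENT = re.compile(
    r"(^ *)CREATE\s+TABLE\s+[^()]*\(([^()]*\([^()]*\))*[^()]*\)",
    re.IGNORECASE | re.MULTILINE,
)
# Step 2: whole-word TIMESTAMP search inside each statement found in step 1.
TIMESTAMP = re.compile(r"\bTIMESTAMP\b", re.IGNORECASE)

# Hypothetical script with one nested parenthesis and one TIMESTAMP column.
SQL = """CREATE TABLE [dbo].[First] (
    [Name] VARCHAR(50) NOT NULL
)
CREATE TABLE [dbo].[Second] (
    [Id] INT NOT NULL,
    [Stamp] TIMESTAMP NOT NULL
)
"""

def statements_with_timestamp(sql):
    """Return the CREATE TABLE statements that contain TIMESTAMP."""
    return [
        m.group(0)
        for m in STATEMENT.finditer(sql)
        if TIMESTAMP.search(m.group(0))
    ]

for statement in statements_with_timestamp(SQL):
    print(statement)
```

Note that the step 1 pattern handles only one level of nesting inside the outer parentheses.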

Related

Support required in regex writing in PCRE

I am not very good at regex and am learning it day by day. I need to extract the data after # and before > if they exist in the field value; otherwise the value should be returned as is.
Data example: <abc#xyz.com>, chene.com abc.xyz#xyz.com
Expected output of my regex should be xyz.com, chene.com and xyz.com.
What I wrote is
([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})
but this is not fetching all of the required data.
I suggest capturing the part you need using
(?:<?[\w.-]+#)?\b(?<from_domain>\w[\w.-]*\.[a-zA-Z]{2,5})\b
See the regex demo
Details
(?:<?[\w.-]+#)? - an optional non-capturing group that matches
<? - an optional < char
[\w.-]+ - 1+ word chars, . or - chars
# - a # char
\b - a word boundary
(?<from_domain>\w[\w.-]*\.[a-zA-Z]{2,5}) - Group "from_domain":
\w[\w.-]* - a word char followed with 0 or more word, dot or hyphen chars
\. - a dot
[a-zA-Z]{2,5} - two to five ASCII letters
\b - a word boundary
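For a quick test outside PCRE, the same pattern can be tried in Python. One adaptation is needed: Python's re module spells named groups (?P<name>...), so the group syntax differs from the PCRE version above. The function name is made up for the example.

```python
import re

# Same pattern as above, with the named group written in Python syntax.
DOMAIN = re.compile(
    r"(?:<?[\w.-]+#)?\b(?P<from_domain>\w[\w.-]*\.[a-zA-Z]{2,5})\b"
)

def extract_domains(text):
    """Return every from_domain capture found in text, in order."""
    return [m.group("from_domain") for m in DOMAIN.finditer(text)]

print(extract_domains("<abc#xyz.com>, chene.com abc.xyz#xyz.com"))
# ['xyz.com', 'chene.com', 'xyz.com']
```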

Right regexp for detect changes in mysql config

I need to catch all redefined variables in my.cnf
In my case, they look like
#basedir = /usr/local/mysql
basedir = /usr
So I need to extract all redefined parameters.
The criterion for a redefined parameter: the file has both a line that starts with #param and a line that starts with param.
Please advise me on the correct regexp.
You may use
^\h*#\K([_$a-zA-Z0-9]+)(?=\s+=\s.+\R\h*\1\s)
See the regex demo
For the regex to work, use the m multiline modifier and read the file into memory as a single string (you can do it with -0777 options).
Pattern details
^ - start of a line
\h* - 0+ horizontal whitespaces
# - a # char
\K - match reset operator
([_$a-zA-Z0-9]+) - Group 1: any 1 or more ASCII letters, digits, _ and $
(?=\s+=\s.+\R\h*\1\s) - that is immediately followed with:
\s+ - 1+ whitespaces
= - a = char
\s - whitespace
.+ - 1+ chars other than line break chars
\R - a line break sequence
\h* - 0+ horizontal whitespaces
\1 - same value as in Group 1
\s - whitespace.
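If you want to try the idea in Python, note that its re module has no \h, \K or \R, so the sketch below swaps in [ \t], a plain capturing group, and \r?\n. These substitutions, the sample config and the function name are assumptions for illustration. As with the pattern above, it only detects a redefinition on the line right after the commented one.

```python
import re

# Python adaptation of the PCRE pattern: [ \t] for \h, \r?\n for \R,
# and Group 1 is read directly instead of relying on \K.
REDEFINED = re.compile(
    r"^[ \t]*#([_$a-zA-Z0-9]+)(?=\s+=\s.+\r?\n[ \t]*\1\s)",
    re.MULTILINE,
)

# Hypothetical my.cnf fragment: only basedir is commented out and then redefined.
CONFIG = """#basedir = /usr/local/mysql
basedir = /usr
#port = 3306
datadir = /var/lib/mysql
"""

def redefined_parameters(text):
    """Return parameter names whose commented line is followed by a live one."""
    return [m.group(1) for m in REDEFINED.finditer(text)]

print(redefined_parameters(CONFIG))
```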

Regex help for Event Match that are unique, though the pattern is same

here is my regex: https://regex101.com/r/g56UzY/1
i have this pattern
pdlvkw6v INFO 18:25:03.994 pdlvkw6v WARN 18:25:03.994 pdlvkw6v INFO
18:25:03.994 rg9n9bz7 INFO 18:23:52.987 rg9n9bz7 ERROR 19:23:52.987
rg9n9bz7 INFO 21:23:52.987 5y6n9bz7 WARN 18:23:52.987
and my current regex is: [\w]{8}\s+(INFO|WARN|ERROR)\s+\d\d:\d\d:\d\d\.\d\d\d
I want the regex to match only the first occurrence of each unique string, i.e. show pdlvkw6v, then rg9n9bz7, and then 5y6n9bz7; it should not match the repeated strings.
What I am trying to do is break a multiline input into events based on this fixed string. Since one event can contain the string several times, I want to break at the first matching string and leave the rest inside the event.
You need to capture the word you are interested in and add a negative lookahead check:
(?s)\b(\w{8})\b(?!.*\b\1\b)\s+(?:INFO|WARN|ERROR)\s+\d\d(?::\d\d){2}\.\d{3}
    ^^^^^^^^^^^^^^^^^^^^^^^
Or, if (?s) modifier is not supported:
\b(\w{8})\b(?![\s\S]*\b\1\b)\s+(?:INFO|WARN|ERROR)\s+\d\d(?::\d\d){2}\.\d{3}
See the regex demo
Explanation:
(?s) - a DOTALL modifier making . match any char
\b - a word boundary
(\w{8}) - Group 1: 8 word chars
\b - a word boundary
(?!.*\b\1\b) - the negative lookahead that fails the match if immediately to the right of the current location, after 0+ chars, there is a whole word equal to the one stored in the Group 1 buffer
\s+ - 1+ whitespaces
(?:INFO|WARN|ERROR) - one of the three substrings
\s+ - 1+ whitespaces
\d\d - 2 digits
(?::\d\d){2} - 2 sequences of :, digit, digit
\. - a dot
\d{3} - three digits
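Here is a small Python sketch that runs the pattern on the sample from the question. Note that, as written, the lookahead rejects an id whenever the same id appears later, so each id is reported at its final occurrence. If the position of the first occurrence matters, one alternative (not part of the answer above) is to match every event and keep the first offset per id; the helper names are hypothetical.

```python
import re

LOG = (
    "pdlvkw6v INFO 18:25:03.994 pdlvkw6v WARN 18:25:03.994 pdlvkw6v INFO\n"
    "18:25:03.994 rg9n9bz7 INFO 18:23:52.987 rg9n9bz7 ERROR 19:23:52.987\n"
    "rg9n9bz7 INFO 21:23:52.987 5y6n9bz7 WARN 18:23:52.987"
)

# Pattern from the answer; re.DOTALL plays the role of (?s).
UNIQUE = re.compile(
    r"\b(\w{8})\b(?!.*\b\1\b)\s+(?:INFO|WARN|ERROR)\s+\d\d(?::\d\d){2}\.\d{3}",
    re.DOTALL,
)
# Plain event pattern without the lookahead, used by the alternative below.
EVENT = re.compile(r"\b(\w{8})\s+(?:INFO|WARN|ERROR)\s+\d\d(?::\d\d){2}\.\d{3}")

def unique_ids(text):
    """Ids matched by the lookahead pattern, one per distinct id."""
    return [m.group(1) for m in UNIQUE.finditer(text)]

def first_occurrence_offsets(text):
    """Map each id to the offset of its first event (alternative approach)."""
    offsets = {}
    for m in EVENT.finditer(text):
        offsets.setdefault(m.group(1), m.start())
    return offsets

print(unique_ids(LOG))
print(first_occurrence_offsets(LOG))
```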

Nested regex replacement

I need to create the laravel migrations, so I have converted my SQL script to a laravel migration format using "replacement in files" with regular expressions from Sublime Text.
My problem is that i have to replace in the following string the '#' character by the 'tablename' in about 70 tables:
Schema::table('tablename', function($table) {
$table->dropForeign('#_columnname_foreign');
});
Actually I can do this using the following expression:
(Schema::table\('([a-z]+)',[\s]*function\(\$table\)[\s]*{[\s]*\$table->dropForeign\(')#(_[a-z_]+'\);)
And in the replace field:
$1$2$3
but I don't know how to do it when the table has more than one foreign key:
Schema::table('tablename1', function($table) {
$table->dropForeign('#_field1_foreign');
$table->dropForeign('#_field2_foreign');
$table->dropForeign('#_field3_foreign');
$table->dropForeign('#_field4_foreign');
$table->dropForeign('#_field5_foreign');
$table->dropForeign('#_field6_foreign');
});
I have been using this site to validate my regular expressions RegExr
It is not an easy task for a regex in Sublime Text. The only way to do it with a regex is to make sure you capture the function signature together with an optional number of $table->dropForeign lines (matched lazily), and replace the # on the next line.
The regex below requires clicking Replace All multiple times until all matches are found.
(Schema::table\('([a-z0-9]+)',\s*function\(\$table\)\s*{(?:\s*\$table->dropForeign\('[a-z0-9]+_\w+'\);)*?\s*\$table->dropForeign\(')#(_\w+'\);)
Replacement is $1$2$3. See this regex demo, where you may replace the # in the second block manually with the table name and see how the match goes further.
Details:
(Schema::table\('([a-z0-9]+)',\s*function\(\$table\)\s*{(?:\s*\$table->dropForeign\('[a-z0-9]+_\w+'\);)*?\s*\$table->dropForeign\(') - Group 1 capturing:
Schema::table\(' - literal Schema::table(' substring
([a-z0-9]+) - Group 2 capturing 1+ alphanumerics (do not check Match Case option to also match uppercase ASCII letters)
',\s* - a comma and 0+ whitespaces
function\(\$table\) - a literal text function($table)
\s* - 0+ whitespaces
{ - a literal { (in SublimeText 2, it requires escaping)
(?:\s*\$table->dropForeign\('[a-z0-9]+_\w+'\);)*? - 0+ sequences, but as few as possible, matching:
\s*\$table->dropForeign\(' - 0+ whitespaces and then a literal text $table->dropForeign('
[a-z0-9]+_\w+ - 1+ alphanumerics, _ and 1+ digits, letters or underscores (\w+)
'\); - a literal substring ');
\s* - 0+ whitespaces
\$table->dropForeign\(' - a literal text $table->dropForeign('
# - a matched # symbol to be replaced
(_\w+'\);) - Group 3 capturing:
_ - an underscore
\w+ - 1 or more letters, digits or underscores
'\); - a literal substring ');
NOTE: The issue I thought I found was related to an unescaped { that causes a regex failure in Sublime Text 2. In Sublime Text 3, the { in the regex does not have to be escaped.
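If leaving the editor is an option, the repeated Replace All can be scripted. The Python sketch below is only an illustration of that idea: it reuses the same pattern (with { escaped), runs the substitution until nothing changes, and uses a made-up migration snippet and function name.

```python
import re

# Same pattern as above; \g<1>\g<2>\g<3> mirrors the $1$2$3 replacement.
PATTERN = re.compile(
    r"(Schema::table\('([a-z0-9]+)',\s*function\(\$table\)\s*\{"
    r"(?:\s*\$table->dropForeign\('[a-z0-9]+_\w+'\);)*?"
    r"\s*\$table->dropForeign\(')#(_\w+'\);)",
    re.IGNORECASE,
)

# Hypothetical input with two foreign keys in one block.
MIGRATION = """Schema::table('tablename1', function($table) {
    $table->dropForeign('#_field1_foreign');
    $table->dropForeign('#_field2_foreign');
});
"""

def fill_table_names(source):
    """Replace one # per block on each pass until no # is left to replace."""
    while True:
        source, count = PATTERN.subn(r"\g<1>\g<2>\g<3>", source)
        if count == 0:
            return source

print(fill_table_names(MIGRATION))
```

Each pass fixes the first remaining # in every block, which is the scripted equivalent of clicking Replace All again.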

Matching if all of BCD..n exist after last occurrence of A

I have a source string that looks like this: mID00231mID00008mID00231mID00054mID00013mID00008mID00065
The pattern I am trying to create, using this example, is: For the last occurrence of "mID00231" in the string, one or more occurrences of each of {mID00054, mID00013, mID00008, mID00065} must follow it (in any order).
Examples of matches:
mID00231mID00008mID00231mID00054mID00013mID00008mID00065
mID00231mID00013mID00054mID00008mID00065mID00008
Example of no match because of missing "mID00065":
mID00231mID00054mID00013mID00008
Example of no match because the last occurrence of "mID00231" is not followed by a "mID00054" and a "mID00008":
mID00231mID00013mID00065mID00054mID00008mID00231mID00013mID00065
I am fairly new to regex but usually arrive at something that works. This one has been very difficult. I tried this:
(?:mID00231)(?:(?=.*mID00054)(?=.*mID00013)(?=.*mID00008)(?=.*mID00065).*)
It works if there is only one occurrence of the first element (mID00231). If the element repeats, the pattern fails. Any help is appreciated.
Use a negative lookahead to fail the match if the same value occurs again further to the right:
mID00231((?!.*mID00231)(?=.*mID00054)(?=.*mID00013)(?=.*mID00008)(?=.*mID00065).*)
         ^^^^^^^^^^^^^^
See the regex demo.
Details:
mID00231 - match a literal mID00231 text
( - start of the capturing group
(?!.*mID00231) - there cannot be mID00231 anywhere after 0+ any chars but a newline
(?=.*mID00054) - there must be mID00054 anywhere after 0+ any chars but a newline
(?=.*mID00013) - there must be mID00013 anywhere after 0+ any chars but a newline
(?=.*mID00008) - there must be mID00008 anywhere after 0+ any chars but a newline
(?=.*mID00065) - there must be mID00065 anywhere after 0+ any chars but a newline
.* - 0+ any chars but a newline
) - end of the capturing group.
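To check the pattern against the four examples from the question, a short Python sketch follows. The function name is invented for the example.

```python
import re

# Pattern from the answer: the negative lookahead pins the match to the
# last mID00231, and the positive lookaheads require the other four ids.
PATTERN = re.compile(
    r"mID00231((?!.*mID00231)(?=.*mID00054)(?=.*mID00013)"
    r"(?=.*mID00008)(?=.*mID00065).*)"
)

def matches_rule(source):
    """True if all four ids follow the last occurrence of mID00231."""
    return PATTERN.search(source) is not None

EXAMPLES = [
    "mID00231mID00008mID00231mID00054mID00013mID00008mID00065",
    "mID00231mID00013mID00054mID00008mID00065mID00008",
    "mID00231mID00054mID00013mID00008",
    "mID00231mID00013mID00065mID00054mID00008mID00231mID00013mID00065",
]

for example in EXAMPLES:
    print(matches_rule(example), example)
```

The first two strings should match and the last two should not, in line with the examples given in the question.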